Bug 3579

Summary: URL detection in messages broken
Product: Claws Mail (GTK 2) Reporter: Jazz Fan <jazz_fan_ts>
Component: UI/Message View Assignee: users
Status: RESOLVED PATCHESWELCOME    
Severity: minor CC: jazz_fan_ts
Priority: P3    
Version: 3.11.0   
Hardware: PC   
OS: Linux   

Description Jazz Fan 2015-12-17 08:32:20 UTC
Hi,

I just read a newsgroup post in which there is a website cited:

-----
For many years, my bash page (tiswww.case.edu/~chet/bash/bashtop.html) has
sported a bash logo that someone whose name I have lost donated long ago.
...
-----

Note that the URL in parentheses is wrongly detected as 

www.case.edu/~chet/bash/bashtop.html

although the "tis" is really part of it. The message headers include "Content-Type: text/plain; charset=utf-8" so this is not a "bug" of the author of that post having wrongly specified an <a href...> html tag.
Comment 1 Ricardo Mones 2015-12-17 15:16:40 UTC
This is probably because the URL detector is trying to be smart to help you, but in fact that string is not even a valid URL since it lacks the protocol part.
Comment 2 Jazz Fan 2015-12-18 11:18:55 UTC
(In reply to comment #1)
> This is probably because the URL detector is trying to be smart to help you,
> but in fact that string is not even a valid URL since it lacks the protocol
> part.

Sure, I agree with your analysis. But even though the URL is invalid, its interpretation is, too. I guess the detector uses a regexp containing something like '\bwww\.' which is too restrictive. Protocol-less URLs are very common and usually refer to http. So I think the detector should either accept the full URL from the OP or not accept any protocol-less URLs at all.
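
To illustrate the guess above, here is a minimal Python sketch. It is not taken from the Claws Mail source; the three patterns and the detect() helper are hypothetical and exist only for comparison. It suggests that a plain search for "www." reproduces the truncated match, whereas a pattern anchored with \b would find nothing in this line, because there is no word boundary between "tis" and "www". Allowing hostname characters directly before "www." picks up the whole host name.

```python
import re

# The line quoted in the original report.
TEXT = "For many years, my bash page (tiswww.case.edu/~chet/bash/bashtop.html) has"

# Hypothetical detector A (assumption): plain search for "www." anywhere.
NAIVE = re.compile(r"www\.[^\s()<>]+")

# Hypothetical detector B: the '\bwww\.' guess from the comment above.
BOUNDARY = re.compile(r"\bwww\.[^\s()<>]+")

# Hypothetical detector C: also take hostname characters right before "www.".
EXTENDED = re.compile(r"[A-Za-z0-9-]*www\.[^\s()<>]+")


def detect(pattern, text):
    """Return every substring of text matched by pattern."""
    return pattern.findall(text)


naive_hits = detect(NAIVE, TEXT)
boundary_hits = detect(BOUNDARY, TEXT)
extended_hits = detect(EXTENDED, TEXT)

print(naive_hits)     # truncated host, as in the report
print(boundary_hits)  # no match: "s" and "w" are both word characters
print(extended_hits)  # full host including "tis"
```

If the real detector behaves like detector A, then either extending the match to the left (as in C) or requiring a word boundary (as in B) would remove the misleading partial link.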