Bug 3579 - URL detection in messages broken
Status: RESOLVED PATCHESWELCOME
Alias: None
Product: Claws Mail
Classification: Unclassified
Component: UI/Message View
Version: 3.11.0
Hardware: PC Linux
Importance: P3 minor
Assignee: users
URL:
Depends on:
Blocks:
 
Reported: 2015-12-17 09:32 CET by Jazz Fan
Modified: 2015-12-18 13:38 CET

Description Jazz Fan 2015-12-17 09:32:20 CET
Hi,

I just read a newsgroup post in which there is a website cited:

-----
For many years, my bash page (tiswww.case.edu/~chet/bash/bashtop.html) has
sported a bash logo that someone whose name I have lost donated long ago.
...
-----

Note that the URL in parentheses is wrongly detected as 

www.case.edu/~chet/bash/bashtop.html

although the "tis" is really part of it. The message headers include "Content-Type: text/plain; charset=utf-8", so this is not a case of the post's author having specified a wrong <a href...> HTML tag.
Comment 1 Ricardo Mones 2015-12-17 16:16:40 CET
This is probably because the URL detector is trying to be smart to help you, but in fact that string is not even a valid URL since it lacks the protocol part.
Comment 2 Jazz Fan 2015-12-18 12:18:55 CET
(In reply to comment #1)
> This is probably because the URL detector is trying to be smart to help you,
> but in fact that string is not even a valid URL since it lacks the protocol
> part.

Sure, I agree with your analysis. But even though the URL is invalid, its interpretation is invalid, too. I guess the detector uses a regexp containing something like '\bwww\.' which is too restrictive. Protocol-less URLs are far too common to ignore and usually refer to http. So I think the detector should either accept the URL from the OP or not accept any protocol-less URLs at all.
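
To make the guess above concrete, here is a minimal Python sketch. None of these patterns are taken from the Claws Mail source; NAIVE, BOUNDED and HOSTLIKE are hypothetical names used only for illustration. It suggests that a plain search for "www." reproduces the truncated match reported here, that a pattern anchored with \b would not match this text at all, and that a pattern accepting any dotted host name followed by a path picks up the full string.

```python
import re

# Excerpt from the newsgroup post quoted in the description.
text = "my bash page (tiswww.case.edu/~chet/bash/bashtop.html) has"

# Hypothetical patterns (assumptions, not the real Claws Mail detector).
NAIVE = re.compile(r"www\.[^\s()<>]+")       # "www." anywhere, even mid-word
BOUNDED = re.compile(r"\bwww\.[^\s()<>]+")   # "www." only at a word boundary
HOSTLIKE = re.compile(                       # any dotted host name, optional path
    r"\b(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}(?:/[^\s()<>]*)?"
)

def first_match(pattern, s):
    """Return the first match of pattern in s, or None if there is none."""
    m = pattern.search(s)
    return m.group(0) if m else None

naive_hit = first_match(NAIVE, text)        # truncated: the "tis" prefix is lost
bounded_hit = first_match(BOUNDED, text)    # no match: "s" and "w" are both word characters
hostlike_hit = first_match(HOSTLIKE, text)  # full host name plus path

print(naive_hit)
print(bounded_hit)
print(hostlike_hit)
```

A host-name pattern like HOSTLIKE would also match things that are not meant as links (file names such as "notes.txt", for example), so it is a trade-off rather than a drop-in fix.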