Bug 2787

Summary: Support multipart/related inside multipart/alternative for HTML Views
Product: Claws Mail (GTK 2)
Reporter: Abhay S. Kushwaha <clawsmail>
Component: UI/Message View
Assignee: users
Status: RESOLVED INVALID
Severity: enhancement
CC: stanokopita
Priority: P3
Version: 3.9.1
Hardware: PC
OS: All

Description Abhay S. Kushwaha 2012-11-16 10:38:14 UTC
Creating a separate RFE so that Bug #2569 can be closed.

Andrej Kacian @ http://www.thewildbeast.co.uk/claws-mail/bugzilla/show_bug.cgi?id=2569#c18

I have noticed that our multipart parsing logic is a bit limited. The HTML part
of the following message will not be selected automatically, because the parser
does not look for multipart/related inside a multipart/alternative
(messageview.c, around line 1500).

multipart/alternative
 text/plain
 multipart/related
  text/html
  image/jpg

It could use a rewrite; a truly recursive parser would probably be better. Of
course, this is also outside the scope of this bug. :)
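
As a rough illustration of the recursive approach suggested above, here is a minimal sketch in Python using the standard library's email package. It is not Claws Mail code (that is C, in messageview.c), and the names find_html_part and build_sample are hypothetical. It only shows that a depth-first walk finds the text/html part regardless of how deeply it is nested.

```python
from email.message import Message
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def find_html_part(part: Message):
    """Depth-first search for the first text/html leaf at or below `part`."""
    if part.get_content_type() == "text/html":
        return part
    if part.is_multipart():
        # For multipart containers, get_payload() returns the list of children.
        for child in part.get_payload():
            found = find_html_part(child)
            if found is not None:
                return found
    return None


def build_sample() -> Message:
    """Build the structure from this report (hypothetical sample content)."""
    related = MIMEMultipart("related")
    related.attach(MIMEText("<p>Hello <img src='cid:pic'></p>", "html"))
    # Placeholder bytes, not a real image; the subtype is given explicitly.
    related.attach(MIMEImage(b"\xff\xd8\xff placeholder", _subtype="jpeg"))
    alternative = MIMEMultipart("alternative")
    alternative.attach(MIMEText("Hello", "plain"))
    alternative.attach(related)
    return alternative


html = find_html_part(build_sample())
print(html.get_content_type() if html is not None else "no HTML part")
```

Because the walk recurses into every multipart container, it does not need a special case for each nesting order discussed in this report.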
Comment 1 Tomasz Kalkosiński 2014-06-16 10:26:52 UTC
Is there any work in progress on this issue?
Comment 2 Ricardo Mones 2015-12-16 09:33:05 UTC
*** Bug 3028 has been marked as a duplicate of this bug. ***
Comment 3 Ricardo Mones 2015-12-16 11:37:23 UTC
The reported structure seems invalid according to the RFCs. I think that if the text/plain and text/html parts are alternatives, it should be generated as:

multipart/related
 multipart/alternative
  text/plain
  text/html
 image/jpg

Otherwise, if the message is an HTML message without a text alternative but with a text file attached, I think it should be like:

multipart/mixed
 multipart/related
  text/html
  image/jpg
 text/plain
Comment 4 Abhay S. Kushwaha 2015-12-16 11:55:53 UTC
multipart/related
 multipart/alternative
  text/plain
  text/html
 image/jpg

is invalid. The image/jpg is not related to the text/plain, it's related to text/html. The text/html with its embedded image/jpg is the alternative to text/plain.

multipart/alternative
 text/plain
 multipart/related
  text/html
  image/jpg

That is why this is valid, and that's why HTML mailers produce this structure.
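
To compare real samples against the outlines in this thread, the same indented form can be generated from a raw message. The following is a minimal sketch using Python's standard email package. The function name outline and the RAW message are hypothetical: RAW is a hand-written minimal message, not an actual mail from any mailer.

```python
from email import message_from_string
from email.message import Message
from typing import List


def outline(part: Message, depth: int = 0) -> List[str]:
    """Return one line per MIME part, indented one space per nesting level."""
    lines = [" " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.get_payload():
            lines.extend(outline(child, depth + 1))
    return lines


# Hand-written sample with the nesting discussed in this comment.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="alt"

--alt
Content-Type: text/plain

Hello
--alt
Content-Type: multipart/related; boundary="rel"

--rel
Content-Type: text/html

<p>Hello <img src="cid:pic"></p>
--rel
Content-Type: image/jpeg
Content-ID: <pic>
Content-Transfer-Encoding: base64

/9j/4AAQ
--rel--
--alt--
"""

print("\n".join(outline(message_from_string(RAW))))
```

Running it on a saved message file instead of RAW would show at a glance which of the two structures a given mailer produced.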
Comment 5 Paul 2015-12-16 12:11:09 UTC
Please send me an example privately, or attach it to this item.
Comment 6 Ricardo Mones 2015-12-16 13:01:48 UTC
(In reply to comment #4)
> multipart/related
>  multipart/alternative
>   text/plain
>   text/html
>  image/jpg
> 
> is invalid. 

Maybe you should tell MS, because those are produced exactly that way by Office365's Outlook (and BTW correctly displayed on Claws Mail :-)

> The image/jpg is not related to the text/plain, it's related to
> text/html. The text/html with its embedded image/jpg is the alternative to
> text/plain.

That is correct, but the MIME structure for that is the one above :-)

> multipart/alternative
>  text/plain
>  multipart/related
>   text/html
>   image/jpg
> 
> That is why this is valid, and that's why HTML mailers produce this
> structure.

As said, the mailer mentioned above, which I think is a valid example of an HTML mailer, does not agree, so... ;-)
Comment 7 Abhay S. Kushwaha 2015-12-22 14:58:13 UTC
Ricardo, all I find are

multipart/related
 multipart/alternative
  text/plain
  text/html
 image/jpg

So you're apparently right and I'm mistaken.
Comment 8 Ricardo Mones 2015-12-22 15:10:48 UTC
(In reply to comment #7)
> Ricardo, all I find are
> 
> multipart/related
>  multipart/alternative
>   text/plain
>   text/html
>  image/jpg
> 
> So you're apparently right and I'm mistaken.

No problem, your effort is appreciated anyway.
And many thanks for confirming! :-)
Comment 9 Abhay S. Kushwaha 2016-02-02 10:36:52 UTC
Okay, I'm opening this because I've found a steady source of

multipart/alternative
 text/plain
 multipart/related
  text/html
  image/jpeg
  image/png

That source is

User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:38.0) Gecko/20100101
 Thunderbird/38.5.1

Apparently they subscribe to my interpretation of Comment #4 (second half). :)
Comment 10 Paul 2016-02-02 11:07:08 UTC
that tells us nothing new.
Comment 11 Abhay S. Kushwaha 2016-02-02 12:21:54 UTC
Paul, what it tells us is that this structure of email is far more common than I thought. Thunderbird produces this structure quite often apparently since I'm getting a crazy amount of mail with this structure from Thunderbird users. So now I can easily provide samples.

The anticipation is that it would be covered in Claws' repertoire of structures it can parse and display in Fancy by selecting/promoting the HTML part automatically when set to do so.

I didn't reopen the bug but does the availability of sample emails to test the scenario make it a case for an open RFE?
Comment 12 Ricardo Mones 2016-02-03 09:09:22 UTC
(In reply to comment #11)
> Paul, what it tells us is that this structure of email is far more common
> than I thought. Thunderbird produces this structure quite often apparently
> since I'm getting a crazy amount of mail with this structure from
> Thunderbird users. So now I can easily provide samples.

Being produced just by one version of one MUA is not exactly my definition of "common" :-)

Given that uniqueness and the number of bugs such MUA has in its MailCore's MIME component, I'd say it's just another one.

> The anticipation is that it would be covered in Claws' repertoire of
> structures it can parse and display in Fancy by selecting/promoting the HTML
> part automatically when set to do so.

That reminds me of something... can you test the patch on #3028 and see if it improves promotion of those wrong structures for you?

> I didn't reopen the bug but does the availability of sample emails to test
> the scenario make it a case for an open RFE?

The samples are not good, as discussed before ;-)

Anyway, in my view a jpeg can never be an alternative to a text part; look at these (reiterated) definitions in RFC 2046 (arrows are mine):
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈
(1)   multipart -- data consisting of multiple entities of
          independent data types.  Four subtypes are initially
          defined, including the basic "mixed" subtype specifying
          a generic mixed set of parts, "alternative" for
→         representing the same data in multiple formats,
          "parallel" for parts intended to be viewed
          simultaneously, and "digest" for multipart entities in
          which each part has a default type of "message/rfc822".
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈

Those jpegs are not an alternative format for the text part unless they are jpegs of the written text, which I've never seen.

Another, more precise one, later in the same RFC:
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈
5.1.4.  Alternative Subtype

   The "multipart/alternative" type is syntactically identical to
   "multipart/mixed", but the semantics are different.  In particular,
→  each of the body parts is an "alternative" version of the same
→  information.
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈

The key is "same information": the jpeg doesn't carry the same information as the text part, but the HTML does, and that's why it's an alternative.

I hope this clarifies it for you.
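
For reference, the same section of RFC 2046 (5.1.4) also says that alternatives appear in order of increasing faithfulness and that the best choice is generally the last part of a type the recipient supports. A minimal sketch of that selection rule in Python follows. It is not the patch from #3028 (which is not shown in this thread), and the names displayable_type, pick_alternative and nested_sample are hypothetical. It assumes a multipart/related child is judged by its first part, which is the root when no "start" parameter is given (RFC 2387); the "start" parameter itself is ignored here.

```python
from email.message import Message
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def displayable_type(part: Message) -> str:
    """Content type a viewer would render for `part`.

    Assumption: a multipart/related part is represented by its first child.
    """
    if (part.get_content_type() == "multipart/related"
            and part.is_multipart() and part.get_payload()):
        return displayable_type(part.get_payload()[0])
    return part.get_content_type()


def pick_alternative(alternative: Message, supported: set):
    """Return the last child whose displayable type is supported, or None."""
    for child in reversed(alternative.get_payload()):
        if displayable_type(child) in supported:
            return child
    return None


def nested_sample() -> Message:
    """multipart/alternative containing text/plain and multipart/related."""
    related = MIMEMultipart("related")
    related.attach(MIMEText("<p>Hi</p>", "html"))
    # Placeholder bytes, not a real image.
    related.attach(MIMEImage(b"\xff\xd8\xff placeholder", _subtype="jpeg"))
    alternative = MIMEMultipart("alternative")
    alternative.attach(MIMEText("Hi", "plain"))
    alternative.attach(related)
    return alternative


chosen = pick_alternative(nested_sample(), {"text/plain", "text/html"})
print(chosen.get_content_type())
```

With only "text/plain" in the supported set, the same call falls back to the plain-text part.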
Comment 13 Abhay S. Kushwaha 2016-02-03 13:57:15 UTC
Ricardo, I applied the patch to the latest Git. It works well for the mails I have with this structure. So I think we're good here. Thank you so much. I hope you would push the patch to the main repo soon!

Also, since you spent that much time on this, quoting the RFC too, let me also say that I'm not arguing for the validity of this structure, and your discussion of the valid structure in light of the RFC excerpts seems well founded. Reading the RFC excerpts only made me appreciate the complexity involved a bit more.

I suppose, for me as an individual, whether or not it's right, whether or not it's popular, I identify the broken format as something that causes plain-text display when I expect the HTML part to be loaded in Fancy, as a problem *I* am facing, as that individual. So if it gets addressed by the dev team of Claws, it certainly makes my individual experience (and _hopefully_ somebody else's experience) with Claws that much better. And that in turn makes me deeply appreciate the time and effort that have been spent in making it happen. I can't code myself, so I treat the skill, and even more the time and effort that's donated by the Claws dev team to this project, especially on itches they don't even personally have, with utmost respect.