Bug 481 - extraneous imap commands slow down imap filtering
Status: RESOLVED FIXED
Alias: None
Product: Sylpheed-Claws (GTK1)
Classification: Unclassified
Component: Folders/IMAP
Version: 0.9.10
Hardware: PC Linux
Importance: P3 enhancement
Assignee: sylpheed-claws-users
Blocks: 704
Reported: 2004-04-21 01:48 UTC by Wouter Van Hemel
Modified: 2005-09-11 09:24 UTC

Description Wouter Van Hemel 2004-04-21 01:48:33 UTC
As I mentioned on the list:

Sylpheed-claws seems to do a full imap fetch (BODY.PEEK[]) when
filtering messages from an imap inbox to another imap folder. I filter (on
an X-Spam-Flag header) spam from my imap inbox to my imap spam folder, and
most clients use a simple imap copy command for this (including
sylpheed-main). This is very fast. But Sylpheed-claws has to do a full
body fetch somewhere, for some reason. This means all those spam and virus
email bodies still get downloaded for no reason, which is exactly what I
don't want.
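
To illustrate the difference described above, here is a minimal sketch of the commands a client can send to move messages between two folders of the same account without downloading them. It only builds the command strings (no network access); the function name and the "Spam" folder are assumptions for the example, not taken from the Sylpheed-Claws source.

```python
def server_side_move(uids, dest_folder):
    """Return the untagged IMAP commands for a move done entirely on the server.

    Hypothetical helper for illustration. The folder name is quoted naively,
    so names containing quotes or backslashes would need escaping.
    """
    uid_set = ",".join(str(u) for u in uids)
    return [
        # Copy on the server: no message data travels to the client.
        f'UID COPY {uid_set} "{dest_folder}"',
        # Flag the originals as deleted in the source folder.
        f"UID STORE {uid_set} +FLAGS.SILENT (\\Deleted)",
        # Note: plain EXPUNGE removes every message flagged \Deleted
        # in the selected mailbox, not only the ones flagged above.
        "EXPUNGE",
    ]


for command in server_side_move([94190], "Spam"):
    print(command)
```

None of these commands returns message bodies, which is why a copy-based move stays fast regardless of message size.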

Sample:

* 139 UID FETCH 94190 BODY.PEEK[]..
                                                                                
* 112 FETCH (UID 94190 BODY[] {6038}..
Return-Path: [...headers...]
You should not see this message's body...You should not see this message's
body...You should not see this message's body...You should not see this
message's body...You should not see this message's body...You should not see
this message's body...You should not see this message's body...You should not
see this message's body...You should not see this message's body...You should
not see this message's body...You should not see this message's body...You
should not see this message's body...You should not see this message's
body...You should not see this message's body...You should not see this
message's body........)..139 OK UID FETCH completed..

(email snipped, 3 tcp packets containing 6038 bytes of data).

That is my filtered message that passes my sniffer. This was a set-up
message, but it also does that to my spam. I don't want to download my
spam while it gets IMAP-to-IMAP copied in the same account. It makes
everything a lot slower, I'm talking about orders of magnitude here.
Besides, I don't want to waste my bandwidth on downloading spam. Why does
it do this? Does clicking 'Check all' download everything, even Trash and
other folders?
                                                                                
The sylpheed filter used here filters on 'Subject', the sylpheed spam
filter uses 'X-Spam-Flag'. I don't understand why it needs to download the
body, because the IMAP command just before that already fetched the headers
for its filtering.
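
As a sketch of the header-only approach the report asks for: IMAP lets a client request just the named header fields with BODY.PEEK[HEADER.FIELDS (...)], and a header rule can then be evaluated on that small response. The helper names below are hypothetical and the rule matching is a simple case-insensitive substring test, which is an assumption, not a description of how Sylpheed-Claws matches.

```python
from email.parser import BytesHeaderParser


def header_fetch_command(uid, header_names):
    """Build an untagged UID FETCH that asks only for the listed headers."""
    fields = " ".join(header_names)
    return f"UID FETCH {uid} (BODY.PEEK[HEADER.FIELDS ({fields})])"


def matches_rule(raw_headers, header_name, needle):
    """Return True if the named header contains needle (case-insensitive)."""
    headers = BytesHeaderParser().parsebytes(raw_headers)
    value = headers.get(header_name, "")
    return needle.lower() in str(value).lower()


# Made-up header block standing in for a server response.
raw = b"Subject: test\r\nX-Spam-Flag: YES\r\n\r\n"
print(header_fetch_command(94190, ["Subject", "X-Spam-Flag"]))
print(matches_rule(raw, "X-Spam-Flag", "yes"))
```

With this kind of request, a non-standard header such as X-Spam-Flag can be checked without transferring the message body.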
Comment 1 Christoph Hohmann 2004-04-21 12:39:24 UTC
I told you this is not a bug. If you filter on a non-standard header, Sylpheed
has to fetch it!
Comment 2 Wouter Van Hemel 2004-04-21 19:01:36 UTC
... but not the whole body.

You can check headers easily, standard or not (by the way, if subject isn't a
standard header, then what is?). There is no need to download the whole body,
that's just terrible programming practice. Besides, it already downloaded the
headers, so there is no reason whatsoever to download the whole body during an
IMAP-to-IMAP copy. You should not do a full body fetch if there are only
header filter rules.

At least read the bug report before you close it.
Comment 3 Christoph Hohmann 2004-04-21 19:26:42 UTC
You don't understand, that's all, and I won't explain how Sylpheed works in bug
reports.
Comment 4 Alfons Hoogervorst 2004-04-21 20:08:32 UTC
Hmm, I have a hunch that storing just the headers in the message file would
get us going in the right direction. (Probably need to insert a Sylpheed-specific
header too.)
It's a bit hacky though, and only applies to certain "transports" (IMAP, 
NNTP). 
What about that? 
Comment 5 Wouter Van Hemel 2004-04-22 00:43:19 UTC
I understand IMAP-to-IMAP copy is a special case. It's a bit silly to both
download and store email messages locally on a move from imap to imap folder
though, especially when it's spam you deliberately do not want to clutter up
both your bandwidth and hd space. (In fact, with pine, an IMAP move for about
100 spam messages takes less than 1 second, while downloading these 100 messages
to a local folder takes at least a minute. It's not fun to wait minutes after
opening your email program, before you're able to do something.)

What about deleting the local cached headers - or not saving anything? I
presume, upon filtering, the header/message is placed in an imap cache by id -
that's probably what Christoph Hohmann tried to explain concisely. But when the
message is moved from one remote mailbox to another remote mailbox, there is no
need to have a local copy, only to do a simple (and *fast*) imap move command.
When opening the folder the message was filtered to, the whole message could be
downloaded just like any other new message (because it is a new message to the
user anyway). That way, the message won't be downloaded unless it is opened
specifically from its new location - when/if the user actually wants to see it.

If possible, it seems most logical to just filter it, but pretend you never saw
it, remove any data from the mailbox cache (since it's moved away anyway) and
treat it as if it's a new message in its new location. You don't need to
download it when the user doesn't request to see it, and a header filter doesn't
need the body anyway.

Of course, like you said, you could also come up with a way to mark partially
cached messages, but maybe this slows down cache routines... Unless this method
would be useful in some other situations, too. Or you could add a mark to the
imap cache filename to indicate that it's only a partial message, if that's faster.
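
A rough sketch of the filename-marker idea, purely for illustration: header-only cache entries get a distinguishing suffix, so the client can tell cheaply whether a full copy is already on disk. The ".hdr" suffix, the one-file-per-UID layout and the function names are all assumptions here, not the actual Sylpheed-Claws cache format.

```python
from pathlib import Path

# Hypothetical marker for entries that hold only the headers.
PARTIAL_SUFFIX = ".hdr"


def cache_path(cache_dir, uid, partial):
    """Path of the cached file for a message UID (assumed layout)."""
    name = f"{uid}{PARTIAL_SUFFIX}" if partial else str(uid)
    return Path(cache_dir) / name


def needs_full_fetch(cache_dir, uid):
    """True unless a complete (non-partial) copy is already cached."""
    return not cache_path(cache_dir, uid, partial=False).exists()
```

A header-only entry never satisfies needs_full_fetch, so the body is downloaded only when the user actually opens the message.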
Comment 6 Colin Leroy 2005-06-13 21:37:49 UTC
Fixed in 1.9.11cvs64
Comment 7 Wouter Van Hemel 2005-09-11 09:24:34 UTC
Thanks! Do you think you could come up with a similar fix for 'rebuild folders'?

When I do a 'rebuild folders', sylpheed-claws-gtk2 always dives into my spam
archive and doesn't come out for a long, long time. I think it's caching
headers. I don't want it to go there. It can update the folders (which is what I
asked by issuing a 'rebuild folders'), but I really don't want to let it cache
tens of thousands of old spam messages that are only there to train spamfilters. It
takes over 15 minutes.

I'm very happy with the 'check for new mail' checkbox that can stop this caching
of irrelevant folders every time I check for new mail. (Claws probably shouldn't
check this by default for non-standard folders, so it searches for new messages
only in the standard folders and in folders manually marked so... but that's my
opinion, of course. Claws could be a lot faster if it doesn't do what's not
requested and possibly not even desired.)

Perhaps this 'check for new mail' feature could be used by 'rebuild folders' to
know what to cache and what not. It's more logical that way, because now
'rebuild folders' does cache and check for new email on folders that are marked
not to need a 'check for new email'.

Spam --> [ ] check for new mail --> do not cache or check folder on rebuild
Drafts --> [x] check for new mail --> cache and check folder on rebuild
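
The two cases above could be expressed roughly as follows. This is only a sketch of the proposal: the Folder class and its check_for_new_mail field are hypothetical stand-ins for the per-folder checkbox, not names from the Claws code.

```python
from dataclasses import dataclass


@dataclass
class Folder:
    name: str
    check_for_new_mail: bool  # mirrors the per-folder checkbox


def folders_to_scan_on_rebuild(folders):
    """Only folders marked for new-mail checks get cached and checked."""
    return [f.name for f in folders if f.check_for_new_mail]


folders = [Folder("Spam", False), Folder("Drafts", True)]
print(folders_to_scan_on_rebuild(folders))  # prints ['Drafts']
```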

I think this case belongs in the same bug report, as it's also caching of
messages (whole folders, actually) the user never asked for, but I'll leave it
up to one of you to decide if this is worth pursuing and opening a new bug
report for.
