Bug 2833 - RSSyl generates too much IO
Summary: RSSyl generates too much IO
Status: REOPENED
Alias: None
Product: Claws Mail (GTK 2)
Classification: Unclassified
Component: Plugins/RSSyl
Version: 3.9.1
Hardware: PC Linux
Importance: P3 enhancement
Assignee: users
 
Reported: 2012-12-08 20:32 UTC by Daniel Mota Leite
Modified: 2016-03-18 01:28 UTC

Description Daniel Mota Leite 2012-12-08 20:32:09 UTC
I have set up RSS feeds for Slashdot and Freshmeat/Freecode. As they have a high volume of changes and I don't expire old news, their folders have ~10,000 items each.

When RSSyl fetches or updates a feed, the machine load jumps to ~50 and I can see heavy I/O from claws. Tracing the process to see what is happening, I get this:
write(1, "feed.c:911:", 11)             = 11
write(1, "Appending 'Greg KH Leaves SUSE F"..., 53) = 53
write(1, "feed.c:909:", 11)             = 11
write(1, "RSSyl: starting to parse '42143'"..., 33) = 33
write(1, "feed.c:662:", 11)             = 11
write(1, "RSSyl: parsing '/home/higuita/.c"..., 64) = 64
open("/home/higuita/.claws-mail/RSSyl/Slashdot/42143", O_RDONLY) = 31
fstat(31, {st_mode=S_IFREG|0600, st_size=3859, ...}) = 0
read(31, "Date: Sun, 19 Feb 2012 20:11:00 "..., 3859) = 3859
close(31)                               = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3439, ...}) = 0
write(1, "feed.c:719:", 11)             = 11
write(1, "RSSyl: got date \n", 17)      = 17
write(1, "feed.c:712:", 11)             = 11
write(1, "RSSyl: got author 'samzenpus'\n", 30) = 30
write(1, "feed.c:725:", 11)             = 11
write(1, "RSSyl: got title 'Canada's Onlin"..., 84) = 84
write(1, "feed.c:768:", 11)             = 11
write(1, "RSSyl: updated title to 'Canada'"..., 100) = 100
write(1, "feed.c:732:", 11)             = 11
write(1, "RSSyl: got link 'http://rss.slas"..., 148) = 148
write(1, "feed.c:741:", 11)             = 11
write(1, "RSSyl: got id 'http://rss.slashd"..., 146) = 146
write(1, "feed.c:697:", 11)             = 11
write(1, "RSSyl: finished parsing headers\n", 32) = 32
write(1, "feed.c:789:", 11)             = 11
write(1, "Leading html tag found at line 1"..., 34) = 34
write(1, "feed.c:796:", 11)             = 11
write(1, "Trailing html tag found at line "..., 35) = 35
write(1, "feed.c:911:", 11)             = 11
write(1, "Appending 'Canada's Online Surve"..., 86) = 86
write(1, "feed.c:909:", 11)             = 11
write(1, "RSSyl: starting to parse '43813'"..., 33) = 33
write(1, "feed.c:662:", 11)             = 11
write(1, "RSSyl: parsing '/home/higuita/.c"..., 64) = 64
open("/home/higuita/.claws-mail/RSSyl/Slashdot/43813", O_RDONLY) = 31
fstat(31, {st_mode=S_IFREG|0600, st_size=3829, ...}) = 0
read(31, "Date: Tue,  3 Apr 2012 18:31:00 "..., 3829) = 3829
close(31)                               = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3439, ...}) = 0
write(1, "feed.c:719:", 11)             = 11
write(1, "RSSyl: got date \n", 17)      = 17
write(1, "feed.c:712:", 11)             = 11
write(1, "RSSyl: got author 'samzenpus'\n", 30) = 30

So it looks like RSSyl is trying to read ALL the items in those folders, so the more items there are, the bigger the issue.
Comment 1 Andrej Kacian 2012-12-09 00:40:18 UTC
This is inevitable, since every new item from the feed update currently being parsed needs to be checked against the existing items to see whether it is one of them (so it can either be ignored or updated with new content).
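
To illustrate why this costs one file read per stored item on every update, here is a minimal sketch in Python. It is not RSSyl's code: the header name `X-Item-Id` and all function names are assumptions made for illustration only.

```python
import os

ID_HEADER = "x-item-id"  # hypothetical header name, not RSSyl's on-disk format


def read_item_id(path):
    """Open one stored item and return its ID header value, or None."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if line.strip() == "":  # a blank line ends the header block
                break
            name, _, value = line.partition(":")
            if name.strip().lower() == ID_HEADER:
                return value.strip()
    return None


def load_existing_ids(folder):
    """Read every stored item once; return (ids, number_of_files_opened)."""
    ids, opened = set(), 0
    for entry in os.scandir(folder):
        if entry.is_file():
            opened += 1
            item_id = read_item_id(entry.path)
            if item_id is not None:
                ids.add(item_id)
    return ids, opened


def split_update(folder, incoming_ids):
    """Classify incoming feed IDs as already stored or new."""
    existing, opened = load_existing_ids(folder)
    known = [i for i in incoming_ids if i in existing]
    new = [i for i in incoming_ids if i not in existing]
    return known, new, opened
```

In this sketch `opened` equals the number of files in the folder, so a folder with ~10,000 items would mean ~10,000 open/read/close cycles per refresh, no matter how few entries the feed update carries.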

You could create a cleanup processing rule (e.g. condition "age_greater 7 & ~unread", action "delete") to get rid of old stuff. I understand that might not be desirable, though.

The only possible solution I can think of would be an optional age cutoff setting, so that updates are not checked against older items.
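
A rough sketch of what such an age cutoff could look like, in Python. This is only an illustration under stated assumptions, not an existing RSSyl option: it assumes that a file's modification time is a good enough proxy for item age, and the function name and `max_age_days` parameter are hypothetical.

```python
import os
import time


def candidates_within_cutoff(folder, max_age_days, now=None):
    """Return paths of stored items modified within the last max_age_days.

    Assumption: file mtime approximates item age. Only stat() is used here,
    so items older than the cutoff are never opened or parsed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    recent = []
    for entry in os.scandir(folder):
        if entry.is_file() and entry.stat().st_mtime >= cutoff:
            recent.append(entry.path)
    return sorted(recent)
```

Only the returned paths would then be read and compared against the incoming update. The trade-off is that an item older than the cutoff that reappears in the feed would be stored again as new rather than recognized as existing.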
Comment 2 Dan 2016-03-18 01:28:33 UTC
Any workaround for this? I'm using Claws 3.13.2 and RSSyl is still I/O intensive...
