Bug 3069 - Crash with the new libetpan callback logger
Summary: Crash with the new libetpan callback logger
Status: RESOLVED INVALID
Alias: None
Product: Claws Mail (GTK 2)
Classification: Unclassified
Component: NNTP (show other bugs)
Version: 3.10.0
Hardware: PC Linux
: P3 major
Assignee: users
URL:
Depends on:
Blocks:
 
Reported: 2014-01-25 20:11 UTC by Dan
Modified: 2014-05-03 09:11 UTC (History)
1 user (show)

See Also:


Attachments
Patch to fix nntp segmentation fault in libetpan v1.2/1.3 (615 bytes, patch)
2014-03-29 04:49 UTC, Michael Shell
no flags Details | Diff

Description Dan 2014-01-25 20:11:25 UTC
I'm using Claws-mail (version 3.9.3-11-g845d06) with libetpan (latest GIT) and sometimes (not always) when I select a newsgroup it crashes. Here's the backtrace:

folderview.c:2157:Folder gmane.announce is selected
folderview.c:2173:Opening folder gmane.announce...
folder.c:2164:Scanning folder gmane.announce for cache changes.
** Message: Account 'NNTP': Connecting to NNTP server: news.gmane.org:119...

[New Thread 0x7fffe25dd700 (LWP 21222)]
nntp-thread.c:248:found nntp 0xac9800

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe25dd700 (LWP 21222)]
0x0000000000d6c9c0 in ?? ()
(gdb) bt
#0  0x0000000000d6c9c0 in ?? ()
#1  0x00007ffff0a7e760 in mailstream_logger_internal (size=20, buffer=0x7ffff0ae7fab "<<<<<<< read <<<<<<\n", direction=0, is_stream_data=0, s=0x7fffdc0024b0) at mailstream_low.c:408
#2  mailstream_low_read (s=0x7fffdc0024b0, buf=0x7fffdc002550, count=<optimized out>) at mailstream_low.c:232
#3  0x00007ffff0a7fb08 in mailstream_feed_read_buffer (s=s@entry=0x7fffdc0024f0) at mailstream.c:322
#4  0x00007ffff0a7dd50 in mailstream_read_line_append (stream=0x7fffdc0024f0, line=line@entry=0xed2c1e0) at mailstream_helper.c:101
#5  0x00007ffff0a7dda1 in mailstream_read_line (stream=<optimized out>, line=line@entry=0xed2c1e0) at mailstream_helper.c:62
#6  0x00007ffff0a7ddc9 in mailstream_read_line_remove_eol (stream=<optimized out>, line=0xed2c1e0) at mailstream_helper.c:116
#7  0x00007ffff0ab8309 in read_line (f=0xac9800, f=0xac9800) at newsnntp.c:1832
#8  newsnntp_connect (f=0xac9800, s=<optimized out>) at newsnntp.c:195
#9  0x0000000000619c60 in connect_run (op=0xed45bf0) at nntp-thread.c:334
#10 0x0000000000642282 in thread_run (data=0xecfdf90) at etpan-thread-manager.c:351
#11 0x00007ffff0d21003 in start_thread () from /lib64/libpthread.so.0
#12 0x00007fffeef2dbcd in clone () from /lib64/libc.so.6

********************************************************

If you need more information, just ask.
Comment 1 Dan 2014-01-30 19:09:32 UTC
I asked the libetpan to look at this bug and confirm if it's a libetpan bug or Claws-mail bug. Here is his answer:

"Hi Daniel,
Sorry, I'm behind on my github issues.
I had a look at the stack trace and I think claws mail is probably doing something unsafe in their logger.

-- 
Hoa V. Dinh"

***

So, anyone can fix this? I'm available to test patches etc. Unfortunately I'm not a programmer and I'm unable to detect where's the problem. Thanks again.
Comment 2 Dan 2014-02-09 09:02:25 UTC
Nobody?
Comment 3 Paul 2014-02-09 09:44:10 UTC
nobody yet
Comment 4 Dan 2014-02-21 09:05:32 UTC
I found a workaround:

Metadata handling: Safer

This avoid the crash. If I use "Faster" it crashes. So the problem is with the "Faster" option.
Comment 5 Dan 2014-02-22 23:51:02 UTC
(In reply to comment #4)
> I found a workaround:
> 
> Metadata handling: Safer
> 
> This avoid the crash. If I use "Faster" it crashes. So the problem is with
> the "Faster" option.

Forget it! The problem returned :(
Comment 6 Paul 2014-02-23 11:16:24 UTC
Evidently, since the metadata handling option  can't have anything to do with this :)
Comment 7 Dan 2014-02-23 11:25:00 UTC
(In reply to comment #6)
> Evidently, since the metadata handling option  can't have anything to do
> with this :)

Yes, it was something as "wishful thinking". ;) If at least I had a workaround...
Comment 8 Dan 2014-03-04 19:56:04 UTC
I did what I should have done from the beginning: bisect!

1) all the lastest versions of claws-mail show this bug, so I suspect of libetpan

2) bisecting libetpan from 1.2 to master gave me the following:

7e51aed0564cc5f6a6890f0ffa2927c77a061d2a is the first bad commit
commit 7e51aed0564cc5f6a6890f0ffa2927c77a061d2a
Author: Hoa V. DINH <dinh.viet.hoa@gmail.com>
Date:   Sun Jun 23 16:00:07 2013 -0700

    Implemented logger callback on mailstream to provide a better logging system.

:040000 040000 5e640b60778753f7cd604a96b37a1c5ad19279e0 8e206fc5b0d80cdc98726b3a912337953433a13f M      src

https://github.com/dinhviethoa/libetpan/commit/7e51aed0564cc5f6a6890f0ffa2927c77a061d2a

So is there a way now to see why Claws-mail is crashing? Since the logger was changed in libetpan, it affected Claws-mail. 

Does it help or not? If you use libetpan with this commit, does it crash there too?

Thank you!
Comment 9 Michael Shell 2014-03-29 04:49:06 UTC
Created attachment 1360 [details]
Patch to fix nntp segmentation fault in libetpan v1.2/1.3

I recently ran into what I believe is the same problem under Claws Mail
3.9.3 with libetpan 1.3.pre2. Although in my case the seg fault happens
reliably and immediately upon trying to read a newsgroup and my gdb
backtrace is slightly different:


Thread 2 (Thread -1243321456 (LWP 2762)):
#0  0x000004ea in ?? ()
#1  0xb68fb232 in nntp_logger (s=0x8470cf0, log_type=0, 
    str=0xb6931fcb "<<<<<<< read <<<<<<\n", size=20, context=0x86a4b18)
    at newsnntp.c:2552
#2  0xb68be232 in low_logger (s=0x8470e30, log_type=0, 
    str=0xb6931fcb "<<<<<<< read <<<<<<\n", size=20, context=0x8470cf0)
    at mailstream.c:287



The difference in our backtraces may be due to different mail
configurations and/or libetpan versions. The fault in my system occurs
in the nntp_logger() of libetpan (not the function of the same name in
Claws Mail).


nntp_logger() is a simple function around line 2543 in
src/low-level/nntp/newsnntp.c of libetpan:



static inline void nntp_logger(mailstream * s, int log_type,
    const char * str, size_t size, void * context)
{
  newsnntp * session;

  session = context;
  if (session->nntp_logger == NULL)
    return;

  session->nntp_logger(session, log_type, str, size, session->nntp_logger_context);
}



The problem seems to be that session->nntp_logger is pointing to
something invalid (and not NULL) which would explain why gbd is unable
to provide a name for it:


#0  0x000004ea in ?? ()


It appears that session->nntp_logger is not being set to NULL during
initialization. And indeed this is done in the other protocols
(e.g., smtp, imap, etc.).

For example, in src/low-level/smtp/mailsmtp.c there is a function
mailsmtp_new() starting around line 107 that has at around line 141:



        session->smtp_timeout = 0;

  session->smtp_logger = NULL;
  session->smtp_logger_context = NULL;

  return session;



However, look at the corresponding code for the function newsnntp_new()
starting around line 88 in src/low-level/nntp/newsnntp.c:



newsnntp * newsnntp_new(size_t progr_rate, progress_function * progr_fun)
{
  newsnntp * f;

  f = malloc(sizeof(* f));
  if (f == NULL)
    goto err;
  
  f->nntp_stream = NULL;
  f->nntp_readonly = FALSE;

  f->nntp_progr_rate = progr_rate;
  f->nntp_progr_fun = progr_fun;

  f->nntp_stream_buffer = mmap_string_new("");
  if (f->nntp_stream_buffer == NULL)
    goto free_f;

  f->nntp_response_buffer = mmap_string_new("");
  if (f->nntp_response_buffer == NULL)
    goto free_stream_buffer;

        f->nntp_timeout = 0;

  return f;

 free_stream_buffer:
    mmap_string_free(f->nntp_stream_buffer);
 free_f:
    free(f);
 err:
    return NULL;




If I am understanding things correctly, there is supposed be lines to
initialize the pointers for the nntp_logger to NULL just after the
setting of the nntp_timeout line (around line 110) in newsnntp_new():



         f->nntp_timeout = 0;

  f->nntp_logger = NULL;
  f->nntp_logger_context = NULL;

  return f;



In this way, libetpan's nntp_logger() will not attempt to transfer
execution to the pointer until after the application actually sets up
a logger with libetpan (if ever). (Claws Mail does its own nntp logging
because libetpan did not even provide a logging framework for nntp until
version 1.2.)


The attached patch file, fix_libetpan_nntp_logger_seg_fault.patch, that
I wrote for libetpan 1.3.pre2 adds these two null setting lines. It fixes
the problem on my system.


Please do let us know if it works (or not) for anyone else so we can confirm
it. If indeed it is a libetpan problem, upstream should be notified and this
bug report for Claws Mail can be closed.



  Cheers,

  Mike Shell
Comment 10 Colin Leroy 2014-03-29 12:14:35 UTC
Thanks for your work, Mike, this seems to be the missing thing.
Comment 11 Colin Leroy 2014-03-29 17:34:17 UTC
Mike,

Hoa included your patch in Libetpan, so I'm marking this fixed :)

https://github.com/dinhviethoa/libetpan/commit/b153299848ddba34200ff954d4751198ef1e3d9d

Thanks again!
Comment 12 Paul 2014-05-03 09:11:41 UTC
it was a Libetpan bug not a Claws bug, for the record marking as INVALID

Note You need to log in before you can comment on or make changes to this bug.