Re: (ITS#6310) Slapd with pcache crashes under load
masarati@aero.polimi.it wrote:
>
> Thanks for collecting this info. The valgrind output could be of some
> use, but unfortunately I don't have time right now to set up a working
> RDBMS and extensively debug things. I'll keep this on my todo list.
>
> Please re-run valgrind with --num-callers=30 or more, because in some
> cases the errors occur in functions nested too deeply to get a clear idea
> of whether the issue is caused by garbage fed by slapd/back-sql or by
> errors inside the RDBMS/ODBC layers. The fact that valgrind systematically
> complains about internals of the RDBMS/ODBC reading past the end of memory
> chunks malloc'ed by slapd could be related to passing some non-NUL
> terminated bervals that are dealt with as strings. Having a longer call
> stack could help track down those occurrences. However, those issues should
> not be critical, since there are no invalid writes.
>
> Also, you should walk through the list of attributes being returned, to
> provide a hint about whether back-sql is computing a corrupted attribute
> list. Continuing your current gdb session, go to frame #5,
> refresh_merge() in pcache.c, and print *e->e_attrs,
> *e->e_attrs->a_desc, *e->e_attrs->a_vals[0]; then move to
> e->e_attrs->a_next and repeat the prints to the end of the list. The fact
> that you get a value of "a" equal to 0x500000000 looks definitely odd to
> me, as that attr list should result from be_entry_get_rw(), which in turn
> should collect it from the local database. Unless valgrind reveals some
> oddity in back-sql, the behavior you notice should not depend on the
> specific remote database you're using, but rather on the local one.
>
> p.
Hello,
Tomorrow I will set up a pure sql process and a separate pure pcache
daemon that reads from the first over a Unix domain socket. That way
it will be clear whether the crash is related to back-sql and the
database drivers/ODBC manager or not.
Meanwhile, you can find the requested debugging session here:
http://purgatory.spnet.net/~karavelov/attr_list/gdb-1
It seems that the "e" pointer is corrupted. Tomorrow I will run it
under valgrind with more stack frames, as requested.
Best regards
Luben