Re: (ITS#4659) Core dump after MOD
quanah@stanford.edu wrote:
> After looking at the timing of everything, it is possible that I did kill
> -9 on a syncrepl consumer right in the middle of the MOD where the master
> died. That may have triggered this core, if the timing was all right down
> to the nanosecond...
>
Well, that's not a good reason to core anyway :)
OK, before actually destroying the connection, connection_close() waits
for the operations queue to be empty; from your dump, it seems that the
connection being destroyed has no (pending) ops, but the c_write_mutex
is locked and c_writewaiter is set. This means that send_ldap_ber() in
result.c was waiting on
    ldap_pvt_thread_cond_wait( &conn->c_write_cv, &conn->c_mutex );
but the operation was somehow removed from the operations queue. I'm
trying to figure out how this could happen.
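
To make the suspected sequence concrete, here is a toy model of the state seen in the dump. It is only a sketch: toy_conn, toy_drop_op(), toy_close_ready() and toy_stuck_writer() are hypothetical names made up for illustration, not slapd code, and the real Connection structure and locking are far more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for slapd's Connection. */
typedef struct toy_conn {
    size_t n_ops;        /* stands in for the number of entries in c_ops */
    bool   writewaiter;  /* stands in for c_writewaiter */
} toy_conn;

/* Models the close path described above: it only waits for the
 * operations queue to become empty. */
static bool toy_close_ready(const toy_conn *c)
{
    return c->n_ops == 0;
}

/* Models an operation being unlinked from the queue by someone other
 * than the thread that is still blocked writing its result. */
static void toy_drop_op(toy_conn *c)
{
    if (c->n_ops > 0)
        c->n_ops--;
}

/* The inconsistent state from the dump: no ops queued, yet a writer
 * is still marked as waiting. */
static bool toy_stuck_writer(const toy_conn *c)
{
    return c->n_ops == 0 && c->writewaiter;
}
```

Starting from one queued op with writewaiter set, a single toy_drop_op() makes toy_close_ready() true while toy_stuck_writer() is also true, which is the combination the dump seems to show.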
One point is that even if your kill -9 took out a persistent search op
(which I doubt), no writewaiter should be involved in a persistent
search... However, I note that syncprov actually removes operations from
c_ops, in syncprov_drop_search(). I wonder if by chance it's being
called in a few cases with lock erroneously unset...
It would be great to know what operation caused this, but since there is
nothing left in any c_*ops in the Connection structure, it's going to be
impossible.
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.n.c.
Via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
------------------------------------------
Office: +39.02.23998309
Mobile: +39.333.4963172
Email: pierangelo.masarati@sys-net.it
------------------------------------------