Yes, it does do that; nevertheless, conn->c_writing == 1 in all threads
when the deadlock occurs.
Which thread owns the write1 mutex at that point?
I don’t know how to determine that. Here’s a dump of *conn in one of the
threads:
$21 = {c_struct_state = SLAP_C_USED, c_conn_state = SLAP_C_ACTIVE, c_conn_idx = 39, c_sd = 39,
c_close_reason = 0x5add3b "?", c_mutex = {wrapped = {__data = {__lock = 0, __count = 0, __owner = 0,
__nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = '\000' <repeats 16 times>, "\002", '\000' <repeats 22 times>, __align = 0}, usage = {
magic = ldap_debug_magic, self = 18446603377522038887, mem = {ptr = 0x0, num = 0},
state = ldap_debug_state_inited}, owner = 18446744073709551615}, c_sb = 0x7ff5f0114c30,
c_starttime = 1356066580, c_activitytime = 1356066580, c_connid = 1370, c_peer_domain = {bv_len = 7,
bv_val = 0x7ff630001140 "unknown"}, c_peer_name = {bv_len = 18, bv_val = 0x7ff630001120 "IP=127.0.0.1:42441"},
c_listener = 0x1325390, c_sasl_bind_mech = {bv_len = 0, bv_val = 0x0}, c_sasl_dn = {bv_len = 0, bv_val = 0x0},
c_sasl_authz_dn = {bv_len = 0, bv_val = 0x0}, c_authz_backend = 0x13767e0, c_authz_cookie = 0x0, c_authz = {
sai_method = 128, sai_mech = {bv_len = 0, bv_val = 0x0}, sai_dn = {bv_len = 24,
bv_val = 0x7ff624111980 "cn=manager,dc=alu,dc=com"}, sai_ndn = {bv_len = 24,
bv_val = 0x7ff6241086c0 "cn=manager,dc=alu,dc=com"}, sai_ssf = 0, sai_transport_ssf = 0, sai_tls_ssf = 0,
sai_sasl_ssf = 0}, c_protocol = 3, c_ops = {stqh_first = 0x7ff62813db60, stqh_last = 0x7ff6241108c8},
c_pending_ops = {stqh_first = 0x7ff6240020c0, stqh_last = 0x7ff624139028}, c_write1_mutex = {wrapped = {
__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 7, __kind = 2, __spins = 0, __list = {
__prev = 0x0, __next = 0x0}},
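On the ownership question: the __data layout in the dump looks like glibc's NPTL mutex, and if that assumption holds, the __owner field is the kernel thread ID of the holder (the "LWP" number gdb shows in "info threads"), or 0 when the mutex is free. In gdb that would be something like "print conn->c_write1_mutex.wrapped.__data.__owner". By that reading, the c_write1_mutex in the dump above (__owner = 0) was not held by any thread when the dump was taken. The following is only a sketch to check that behaviour on a given system; __data.__owner is a glibc internal, not a public API, and the helper names are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Kernel thread ID of the caller (hypothetical helper name). */
static long current_tid(void)
{
    return (long)syscall(SYS_gettid);
}

/* ASSUMPTION: glibc/NPTL on Linux. __data.__owner is an internal field
 * that glibc sets to the locking thread's TID and clears on unlock. */
static long mutex_owner_tid(pthread_mutex_t *m)
{
    assert(m != NULL);
    return (long)m->__data.__owner;
}

/* Returns 1 if the owner field follows lock/unlock as assumed, else 0. */
int owner_field_demo(void)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    long before, during, after;

    before = mutex_owner_tid(&m);   /* expected: 0, nobody holds it */
    pthread_mutex_lock(&m);
    during = mutex_owner_tid(&m);   /* expected: our own TID */
    pthread_mutex_unlock(&m);
    after = mutex_owner_tid(&m);    /* expected: 0 again */

    return before == 0 && during == current_tid() && after == 0;
}
```

If owner_field_demo() returns 1 on the machine in question, the __owner values in a gdb dump can be matched against the LWP numbers of the stuck threads.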
As for the rest - "reverse engineering" would be something like taking a
compiled binary and trying to decompile it. Reading source code is simply
that, it's not reverse engineering.
This is a matter of definition. To me, reverse engineering is when you
take insufficiently commented code and try to divine the organizing
principles behind it; it doesn’t matter if the code you’re looking at
is machine code, assembly, or C (and I’ve done plenty of all of those).
The variables are documented with their purpose and usage, the comments
show what a block of code aims to do. What more do you expect?
Perhaps I have missed some documentation. What would help here is some
high-level description of the various synchronization primitives involved,
their consumers, what critical regions they aim to protect and how their
usage is supposed to avoid deadlock, along with detailed comments to that
effect. Here are some examples. This code in send_ldap_ber():
    conn->c_writers++;
    while ( conn->c_writers > 0 && conn->c_writing ) {
        ldap_pvt_thread_cond_wait( &conn->c_write1_cv, &conn->c_write1_mutex );
    }
    /* connection was closed under us */
    if ( conn->c_writers < 0 ) {
        /* we're the last waiter, let the closer continue */
        if ( conn->c_writers == -1 )
            ldap_pvt_thread_cond_signal( &conn->c_write1_cv );
        conn->c_writers++;
        ldap_pvt_thread_mutex_unlock( &conn->c_write1_mutex );
        return 0;
    }
There are almost no comments here. conn->c_writers is mysteriously
incremented, perhaps to indicate that we are now intending to write?
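For contrast, here is a minimal sketch of the level of commenting that would answer such questions for a "one writer at a time" gate. It is not slapd's code and makes no claim about what c_writers or c_writing actually mean; gate_t, gate_enter and gate_leave are hypothetical names, and the connection-close handling is left out entirely.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical gate that lets at most one thread write at a time. */
typedef struct {
    pthread_mutex_t mu;  /* protects every field below */
    pthread_cond_t  cv;  /* signalled each time `writing` drops to 0 */
    int waiters;         /* threads that called gate_enter() and have not
                          * yet called gate_leave(); includes the writer */
    int writing;         /* 1 while one thread is between enter and leave */
} gate_t;

#define GATE_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

/* Block until no other thread is writing, then claim the gate. */
void gate_enter(gate_t *g)
{
    pthread_mutex_lock(&g->mu);
    g->waiters++;                  /* announce intent to write */
    while (g->writing)             /* loop: wakeups may be spurious */
        pthread_cond_wait(&g->cv, &g->mu);
    g->writing = 1;                /* we are now the only writer */
    pthread_mutex_unlock(&g->mu);
}

/* Release the gate and wake one waiting thread, if any. */
void gate_leave(gate_t *g)
{
    pthread_mutex_lock(&g->mu);
    assert(g->writing);            /* only the current writer may leave */
    g->writing = 0;
    g->waiters--;
    pthread_cond_signal(&g->cv);
    pthread_mutex_unlock(&g->mu);
}

/* Single-threaded check of the bookkeeping; returns 1 on success. */
int gate_selftest(void)
{
    gate_t g = GATE_INITIALIZER;
    int ok;

    gate_enter(&g);
    ok = (g.waiters == 1 && g.writing == 1);
    gate_leave(&g);
    return ok && g.waiters == 0 && g.writing == 0;
}
```

With each field's meaning and its protecting mutex written down next to the declaration, a reader can check every increment and wait against a stated invariant instead of guessing at it.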