slapd crash in back-bdb/ctxcsn.c (ITS#3301)
Full_Name: Ralf Haferkamp
Version: 2.2.15
OS: Linux (Kernel 2.6)
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (212.95.102.25)
I ran a slightly modified version of "test008-concurrency" on a test server
with around 10000 entries. The test runs many add, read and modify operations
in parallel (I adapted slapd-modrdn to do modifies instead of modrdn). After a
short while the server crashed. I was able to produce the following backtrace:
#0 0x080c8a09 in bdb_csn_commit (op=0x44219800, rs=0x44088870, tid=0x4623df90,
ei=0x81842a8, suffix_ei=0x440884d0, ctxcsn_e=0x440884cc,
ctxcsn_added=0x440884c8,
locker=2147502168) at ctxcsn.c:62
bdb = (struct bdb_info *) 0x816aad0
ctxcsn_ei = (EntryInfo *) 0x0
ctxcsn_lock = {off = 0, ndx = 938, gen = 135781712, mode = 1075070032}
max_committed_csn = {bv_len = 135421424, bv_val = 0x4620e4d0 "\017"}
suffix_lock = {off = 1176550456, ndx = 0, gen = 1141408840, mode =
135075791}
rc = -30995
ret = 10427
ctxcsn_id = 1176560848
e = (Entry *) 0x46237f18
textbuf = "....."
textlen = 256
eip = (EntryInfo *) 0x0
#1 0x080c4d7e in bdb_add (op=0x44219800, rs=0x44088870) at add.c:441
bdb = (struct bdb_info *) 0x816aad0
pdn = {bv_len = 11, bv_val = 0x46243feb "o=customers"}
p = (Entry *) 0x0
ei = (EntryInfo *) 0x81842a8
textbuf = "....."
textlen = 256
children = (AttributeDescription *) 0x81298b0
entry = (AttributeDescription *) 0x8129720
ltid = (DB_TXN *) 0x4623df90
lt2 = (DB_TXN *) 0x462573a0
opinfo = {boi_bdb = 0x816a9d0, boi_txn = 0x4623df90, boi_lock = {off =
16,
ndx = 1077478705, gen = 1176560872, mode = 1074201072}, boi_err = 0,
boi_locker = 2147502168, boi_acl_cache = 0}
subentry = 0
locker = 2147502168
lock = {off = 298840, ndx = 386, gen = 3273, mode = DB_LOCK_READ}
num_retries = 0
ps_list = (Operation *) 0x10
rc = 1176502288
suffix_ei = (EntryInfo *) 0x0
ctxcsn_e = (Entry *) 0x440884e8
ctxcsn_added = 0
postread_ctrl = (LDAPControl **) 0x0
ctrls = {0x0, 0x4043f4c0, 0x4043f4c0, 0x400, 0x18, 0x4620e4e0}
num_ctrls = 0
#2 0x0806aad2 in do_add (op=0x44219800, rs=0x44088870) at add.c:318
update = 0
textbuf = "....."
textlen = 256
cb = {sc_next = 0x0, sc_response = 0x807106a <slap_replog_cb>,
sc_cleanup = 0,
sc_private = 0x0}
repl_user = 0
ber = (BerElement *) 0x4623df10
last = 0x46230d6f ""
dn = {bv_len = 30, bv_val = 0x46230b32 "cn=James A Jones 5,o=customers"}
len = 36
tag = 4294967295
e = (Entry *) 0x4620bc38
modlist = (Modifications *) 0x46265498
modtail = (Modifications **) 0x4625cbc0
tmp = {sml_mod = {sm_op = 1141409752, sm_desc = 0x40324eb0, sm_type =
{bv_len = 15,
bv_val = 0x46230d4d "telephoneNumber"}, sm_values = 0x46225a90, sm_nvalues
= 0x0},
sml_next = 0x0}
manageDSAit = 0
#3 0x0806445e in connection_operation (ctx=0x44088900, arg_v=0x44219800)
at connection.c:1048
rc = 80
op = (Operation *) 0x44219800
rs = {sr_type = REP_RESULT, sr_tag = 0, sr_msgid = 0, sr_err = 0,
sr_matched = 0x0,
sr_text = 0x0, sr_ref = 0x0, sr_ctrls = 0x0, sr_un = {sru_sasl = {r_sasldata =
0x0},
sru_extended = {r_rspoid = 0x0, r_rspdata = 0x0}, sru_search = {r_entry =
0x0,
r_attrs = 0x0, r_nentries = 0, r_v2ref = 0x0}}, sr_flags = 0}
tag = 104
oldtag = 104
conn = (Connection *) 0x42d274bc
memctx = (void *) 0x819c1e0
memctx_null = (void *) 0x0
memsiz = 1048576
#4 0x4003166d in ldap_int_thread_pool_wrapper (xpool=0x812b520) at tpool.c:467
pool = (struct ldap_int_thread_pool_s *) 0x812b520
ctx = (ldap_int_thread_ctx_t *) 0x8199c70
ltc_key = {{ltk_key = 0x80a3de8, ltk_data = 0x819c1e0,
ltk_free = 0x80a3db8 <sl_mem_destroy>}, {ltk_key = 0x817d210, ltk_data =
0xe,
ltk_free = 0x80c786f <bdb_locker_id_free>}, {ltk_key = 0x817d211, ltk_data =
0x81a1df0,
ltk_free = 0x80c76db <bdb_txn_free>}, {ltk_key = 0x0, ltk_data = 0x0,
ltk_free = 0} <repeats 29 times>}
tid = 1141410736
i = 391
keyslot = 391
hash = 391
#5 0x403239ed in start_thread () from /lib/tls/libpthread.so.0
No symbol table info available.
#6 0x403e59ca in clone () from /lib/tls/libc.so.6
No symbol table info available.
So it looks like dn2entry in back-bdb/ctxcsn.c:62 is returning
DB_LOCK_DEADLOCK (rc = -30995 in the backtrace) and therefore ctxcsn_ei is still
NULL.
Unfortunately I am not very familiar with this code, so I don't know how to
fix it correctly, but returning BDB_CSN_RETRY directly after the dn2entry call
if rc == DB_LOCK_DEADLOCK seems to fix the problem.
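
The workaround described above can be sketched roughly as follows. This is a self-contained model of the control flow only, not the actual back-bdb code: `lookup_stub`, `csn_commit_sketch`, `entry_info` and the `SKETCH_CSN_*` codes are hypothetical stand-ins for dn2entry, bdb_csn_commit, EntryInfo and the BDB_CSN_* return values. The value of DB_LOCK_DEADLOCK is the rc seen in the backtrace, and treating every other error as an abort is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Value taken from the backtrace (rc = -30995). */
#define DB_LOCK_DEADLOCK (-30995)

/* Hypothetical return codes standing in for the BDB_CSN_* values. */
enum { SKETCH_CSN_COMMIT = 0, SKETCH_CSN_ABORT = 1, SKETCH_CSN_RETRY = 2 };

/* Hypothetical stand-in for EntryInfo. */
typedef struct { int id; } entry_info;

/* Hypothetical stand-in for the dn2entry lookup.
 * On any failure the output pointer is left NULL, as in the backtrace. */
static int lookup_stub(int simulated_rc, entry_info *slot, entry_info **out)
{
    *out = NULL;
    if (simulated_rc != 0) {
        return simulated_rc;
    }
    *out = slot;
    return 0;
}

/* Models the proposed check: test rc before touching the looked-up entry. */
int csn_commit_sketch(int simulated_rc)
{
    entry_info slot = { 42 };
    entry_info *ctxcsn_ei = NULL;
    int rc = lookup_stub(simulated_rc, &slot, &ctxcsn_ei);

    /* Proposed fix: a deadlock means the caller should retry the
     * transaction, so return before ctxcsn_ei is dereferenced. */
    if (rc == DB_LOCK_DEADLOCK) {
        return SKETCH_CSN_RETRY;
    }

    /* Assumption for this sketch: any other failure aborts. */
    if (rc != 0 || ctxcsn_ei == NULL) {
        return SKETCH_CSN_ABORT;
    }

    /* Only now is it safe to use the entry. */
    assert(ctxcsn_ei != NULL);
    return ctxcsn_ei->id == 42 ? SKETCH_CSN_COMMIT : SKETCH_CSN_ABORT;
}
```

With the early return in place, `csn_commit_sketch(DB_LOCK_DEADLOCK)` yields the retry code without touching the NULL pointer. Whether the real caller (bdb_add) handles a retry correctly at this point is not something this sketch can confirm.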