[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#7951) slapd deadlocks during mod operation



christianclinton@gmail.com wrote:
> Full_Name: Christian Clinton
> Version: 2.4.36
> OS: CentOS 6.4
> URL: ftp://ftp.openldap.org/incoming/christian-clinton-140925.zip
> Submission from: (NULL) (216.148.0.72)
>
>
> Description: Slapd hangs during LDAP modification operation. Subsequent read or
> write operations hang from all clients until slapd restart.

2.4.36 is over a year old. Bugs reported against old releases will not be 
investigated.

> slapd logs show:
> slapd[11213]: => bdb_idl_insert_key: c_put id failed: BDB0068 DB_LOCK_DEADLOCK:
> Locker killed to resolve a deadlock (-30993)
> slapd[11213]: conn=12787 op=25: attribute "entryCSN" index add failure
> slapd[11213]: => bdb_idl_insert_key: c_put id failed: BDB0068 DB_LOCK_DEADLOCK:
> Locker killed to resolve a deadlock (-30993)
> slapd[11213]: conn=13382 op=3: attribute "entryCSN" index add failure

This is a normal occurrence in BerkeleyDB and does not indicate an actual 
error. The bdb/hdb backend code automatically retries when a BDB deadlock is 
reported.

> This issue has occurred sporadically for several months across different
> OpenLDAP versions. I managed to capture of a backtrace for two occurrences of
> the crash (excerpt below, full backtrace loloaded to FTP). In both instances, a
> couple of threads were acquiring locks which is where I suspect the issue lies.
>
> BT: ftp://ftp.openldap.org/incoming/christian-clinton-140925.zip

When talking about BDB deadlock issues you should always obtain the output 
from db_stat -CA on the db directory, in addition to the full stack trace.

You should consider testing with different BDB versions as well, as issues are 
known to slip into various updates.

> Environment:
> OS: CentOS 6.4
> OpenLDAP 2.4.36
> 1 Master w/ many syncrepl slaves
>
> Thread 129 (Thread 0x7f10ddffb700 (LWP 16586)):
> #0  0x00000037e560b43c in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00007f11a076e623 in __db_hybrid_mutex_suspend () from
> /usr/local/openldap/lib/libdb-5.3.so
> #2  0x00007f11a076d6eb in __db_tas_mutex_lock () from
> /usr/local/openldap/lib/libdb-5.3.so
> #3  0x00007f11a08205e6 in __lock_get_internal () from
> /usr/local/openldap/lib/libdb-5.3.so
> #4  0x00007f11a08208ab in __lock_get () from
> /usr/local/openldap/lib/libdb-5.3.so
> #5  0x00007f11a0848468 in __db_lget () from
> /usr/local/openldap/lib/libdb-5.3.so
> #6  0x00007f11a078dff6 in __bam_search () from
> /usr/local/openldap/lib/libdb-5.3.so
> #7  0x00007f11a0778ed6 in __bamc_search () from
> /usr/local/openldap/lib/libdb-5.3.so
> #8  0x00007f11a077c637 in __bamc_get () from
> /usr/local/openldap/lib/libdb-5.3.so
> #9  0x00007f11a083507d in __dbc_iget () from
> /usr/local/openldap/lib/libdb-5.3.so
> #10 0x00007f11a0845d00 in __dbc_get_pp () from
> /usr/local/openldap/lib/libdb-5.3.so
> #11 0x00000000005489ed in hdb_idl_delete_key (be=0x7f10ddff9480,
> db=0x7f111a3f9610, tid=0x7f10603015b0, key=0x7f10ddff8ec0, id=250418) at
> idl.c:947
> #12 0x000000000054b455 in hdb_key_change (be=0x7f10ddff9480, db=0x7f111a3f9610,
> txn=0x7f10603015b0, k=0x7f1060001ab0, id=250418, op=2) at key.c:97
> #13 0x000000000054a4ee in indexer (op=0x7f10603c9f90, txn=0x7f10603015b0,
> ad=0x1b6d5e0, atname=0x1b6d4b8, vals=0x7f1060001a88, id=250418, opid=2, mask=4)
> at index.c:214
> #14 0x000000000054a978 in index_at_values (op=0x7f10603c9f90,
> txn=0x7f10603015b0, ad=0x1b6d5e0, type=0x1b6d450, tags=0x1b6d600,
> vals=0x7f1060001a88, id=250418, opid=2)
>      at index.c:337
> #15 0x000000000054aaf6 in hdb_index_values (op=0x7f10603c9f90,
> txn=0x7f10603015b0, desc=0x1b6d5e0, vals=0x7f1060001a88, id=250418, opid=2) at
> index.c:386
> #16 0x00000000004dbb44 in hdb_modify_internal (op=0x7f10603c9f90,
> tid=0x7f10603015b0, modlist=0x7f10603f9930, e=0x7f10ddff91d0,
> text=0x7f10ddffa990,
>      textbuf=0x7f10ddff9260 "\300\032\070\220\020\177", textlen=256) at
> modify.c:358
> #17 0x00000000004dcc09 in hdb_modify (op=0x7f10603c9f90, rs=0x7f10ddffa970) at
> modify.c:642
> #18 0x00000000004bbc11 in overlay_op_walk (op=0x7f10603c9f90, rs=0x7f10ddffa970,
> which=op_modify, oi=0x1c9a500, on=0x0) at backover.c:671
> #19 0x00000000004bbe28 in over_op_func (op=0x7f10603c9f90, rs=0x7f10ddffa970,
> which=op_modify) at backover.c:723
> #20 0x00000000004bbf64 in over_op_modify (op=0x7f10603c9f90, rs=0x7f10ddffa970)
> at backover.c:762
> #21 0x000000000044d5c6 in fe_op_modify (op=0x7f10603c9f90, rs=0x7f10ddffa970) at
> modify.c:303
> #22 0x000000000044ce98 in do_modify (op=0x7f10603c9f90, rs=0x7f10ddffa970) at
> modify.c:177
> #23 0x000000000042d068 in connection_operation (ctx=0x7f10ddffaaa0,
> arg_v=0x7f10603c9f90) at connection.c:1155
> #24 0x000000000042d609 in connection_read_thread (ctx=0x7f10ddffaaa0, argv=0xb7)
> at connection.c:1291
> #25 0x0000000000596261 in ldap_int_thread_pool_wrapper (xpool=0x1b77350) at
> tpool.c:688
> #26 0x00000037e5607851 in start_thread () from /lib64/libpthread.so.0
> #27 0x00000037e52e890d in clone () from /lib64/libc.so.6
>
> Thread 71 (Thread 0x7f0fedffb700 (LWP 16648)):
> #0  0x00000037e560b43c in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00007f11a076e623 in __db_hybrid_mutex_suspend () from
> /usr/local/openldap/lib/libdb-5.3.so
> #2  0x00007f11a076d6eb in __db_tas_mutex_lock () from
> /usr/local/openldap/lib/libdb-5.3.so
> #3  0x00007f11a08205e6 in __lock_get_internal () from
> /usr/local/openldap/lib/libdb-5.3.so
> #4  0x00007f11a08208ab in __lock_get () from
> /usr/local/openldap/lib/libdb-5.3.so
> #5  0x00007f11a0848468 in __db_lget () from
> /usr/local/openldap/lib/libdb-5.3.so
> #6  0x00007f11a078dff6 in __bam_search () from
> /usr/local/openldap/lib/libdb-5.3.so
> #7  0x00007f11a0778ed6 in __bamc_search () from
> /usr/local/openldap/lib/libdb-5.3.so
> #8  0x00007f11a077cb1f in __bamc_get () from
> /usr/local/openldap/lib/libdb-5.3.so
> #9  0x00007f11a083507d in __dbc_iget () from
> /usr/local/openldap/lib/libdb-5.3.so
> #10 0x00007f11a0845d00 in __dbc_get_pp () from
> /usr/local/openldap/lib/libdb-5.3.so
> #11 0x000000000053f414 in hdb_dn2id_delete (op=0x7f10883ba740,
> txn=0x7f10a82e7430, eip=0x7f10505a82b0, e=0x7f117b58b968) at dn2id.c:646
> #12 0x000000000053d8f0 in hdb_delete (op=0x7f10883ba740, rs=0x7f0fedffa970) at
> dele.c.c:406
> #13 0x00000000004bbc11 in overlay_op_walk (op=0x7f10883ba740, rs=0x7f0fedffa970,
> which=op_delete, oi=0x1c9a500, on=0x0) at backover.c:671
> #14 0x00000000004bbe28 in over_op_func (op=0x7f10883ba740, rs=0x7f0fedffa970,
> which=op_delete) at backover.c:723
> #15 0x00000000004bbfe2 in over_op_delete (op=0x7f10883ba740, rs=0x7f0fedffa970)
> at backover.c:780
> #16 0x000000000044ff7c in fe_op_delete (op=0x7f10883ba740, rs=0x7f0fedffa970) at
> delete.c:174
> #17 0x000000000044fbef in do_delete (op=0x7f10883ba740, rs=0x7f0fedffa970) at
> delete.c:95
> #18 0x000000000042d068 in connection_operation (ctx=0x7f0fedffaaa0,
> arg_v=0x7f10883ba740) at connection.c:1155
> #19 0x000000000042d609 in connection_read_thread (ctx=0x7f0fedffaaa0, argv=0x3f)
> at connection.cA1A1291
> #20 0x0000000000596261 in ldap_int_thread_pool_wrapper (xpool=0x1b77350) at
> tpool.c:688
> #21 0x00000037e5607851 in start_thread () from /lib64/libpthread.so.0
> #22 0x00000037e52e890d in clone () from /lib64/libc.so.6
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/