[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#7010) Deadlock in Producer/Consumer Nodes



ghola@rebelbase.com wrote:
> Full_Name: Duncan Idaho
> Version: 2.4.25
> OS: Centos 5.5
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (216.148.0.72)
>
>
> We are running a cluster of "synchronization hosts" that act as Producers for a
> large number of Consumers and additionally Consume from a single global
> read/write master.  After several weeks of stable operation synchronization came
> to a halt on both nodes in the Producer/Consumer cluster simultaneously.

Can you also provide the output of db_stat -CA on the database when this 
lockup occurs? It seems that a database operation is preventing syncprov from 
making any progress.
>
> Stack-trace:
>
> Thread 15 (Thread 10745):
> #0  0x000000322ea07b35 in pthread_join () from /lib64/libpthread.so.0
> #1  0x000000000041929d in slapd_daemon () at daemon.c:2922
> #2  0x000000000040730e in main (argc=<value optimized out>, argv=0x7fffd2567cb8)
> at main.c:983
>
> Thread 14 (Thread 8866):
> #0  0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00000000004ec695 in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:672
> #2  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #3  0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
>
> Thread 13 (Thread 14876):
> #0  0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
> #2  0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e48 in syncprov_op_search (op=0x2aaac06ab4a0, rs=0x471a9c10)
> at syncprov.c:2476
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaac06ab4a0, rs=0x471a9c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaac06ab4a0, rs=0x471a9c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaac06ab4a0, rs=0x471a9c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaac06ab4a0, rs=0x471a9c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x471a9d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x471a9d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 12 (Thread 14875):
> #0  0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00000000004ec695 in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:672
> #2  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #3  0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
>
> Thread 11 (Thread 14874):
> #0  0x000000322ea0d605 in __lll_unlock_wake () from /lib64/libpthread.so.0
> #1  0x000000322ea0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
> #2  0x000000322ea0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e5e in syncprov_op_search (op=0x2aaac0897eb0, rs=0x461a7c10)
> at syncprov.c:2471
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaac0897eb0, rs=0x461a7c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaac0897eb0, rs=0x461a7c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaac0897eb0, rs=0x461a7c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaac0897eb0, rs=0x461a7c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x461a7d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x461a7d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 10 (Thread 14873):
> #0  0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
> #2  0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e48 in syncprov_op_search (op=0x2aaab82b63c0, rs=0x459a6c10)
> at syncprov.c:2476
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaab82b63c0, rs=0x459a6c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaab82b63c0, rs=0x459a6c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaab82b63c0, rs=0x459a6c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaab82b63c0, rs=0x459a6c10) at
> search.c:217
>
> ...back 1 page
>
> Thread 15 (Thread 10745):
> #0  0x000000322ea07b35 in pthread_join () from /lib64/libpthread.so.0
> #1  0x000000000041929d in slapd_daemon () at daemon.c:2922
> #2  0x000000000040730e in main (argc=<value optimized out>, argv=0x7fffd2567cb8)
> at main.c:983
>
> Thread 14 (Thread 8866):
> #0  0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00000000004ec695 in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:672
> #2  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #3  0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
>
> Thread 13 (Thread 14876):
> #0  0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
> #2  0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e48 in syncprov_op_search (op=0x2aaac06ab4a0, rs=0x471a9c10)
> at syncprov.c:2476
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaac06ab4a0, rs=0x471a9c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaac06ab4a0, rs=0x471a9c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaac06ab4a0, rs=0x471a9c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaac06ab4a0, rs=0x471a9c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x471a9d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x471a9d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 12 (Thread 14875):
> #0  0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00000000004ec695 in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:672
> #2  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #3  0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
>
> Thread 11 (Thread 14874):
> #0  0x000000322ea0d605 in __lll_unlock_wake () from /lib64/libpthread.so.0
> #1  0x000000322ea0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
> #2  0x000000322ea0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e5e in syncprov_op_search (op=0x2aaac0897eb0, rs=0x461a7c10)
> at syncprov.c:2471
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaac0897eb0, rs=0x461a7c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaac0897eb0, rs=0x461a7c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaac0897eb0, rs=0x461a7c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaac0897eb0, rs=0x461a7c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x461a7d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x461a7d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 10 (Thread 14873):
> #0  0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
> #2  0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e48 in syncprov_op_search (op=0x2aaab82b63c0, rs=0x459a6c10)
> at syncprov.c:2476
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaab82b63c0, rs=0x459a6c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaab82b63c0, rs=0x459a6c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaab82b63c0, rs=0x459a6c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaab82b63c0, rs=0x459a6c10) at
> search.c:217
> [root@su1-admin-syncrepl01 ldap]# cat gdb.txt
>
> Thread 15 (Thread 10745):
> #0  0x000000322ea07b35 in pthread_join () from /lib64/libpthread.so.0
> #1  0x000000000041929d in slapd_daemon () at daemon.c:2922
> #2  0x000000000040730e in main (argc=<value optimized out>, argv=0x7fffd2567cb8)
> at main.c:983
>
> Thread 14 (Thread 8866):
> #0  0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00000000004ec695 in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:672
> #2  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #3  0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
>
> Thread 13 (Thread 14876):
> #0  0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
> #2  0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e48 in syncprov_op_search (op=0x2aaac06ab4a0, rs=0x471a9c10)
> at syncprov.c:2476
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaac06ab4a0, rs=0x471a9c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaac06ab4a0, rs=0x471a9c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaac06ab4a0, rs=0x471a9c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaac06ab4a0, rs=0x471a9c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x471a9d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x471a9d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 12 (Thread 14875):
> #0  0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00000000004ec695 in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:672
> #2  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #3  0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
>
> Thread 11 (Thread 14874):
> #0  0x000000322ea0d605 in __lll_unlock_wake () from /lib64/libpthread.so.0
> #1  0x000000322ea0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
> #2  0x000000322ea0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e5e in syncprov_op_search (op=0x2aaac0897eb0, rs=0x461a7c10)
> at syncprov.c:2471
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaac0897eb0, rs=0x461a7c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaac0897eb0, rs=0x461a7c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaac0897eb0, rs=0x461a7c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaac0897eb0, rs=0x461a7c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x461a7d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x461a7d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 10 (Thread 14873):
> #0  0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
> #2  0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e48 in syncprov_op_search (op=0x2aaab82b63c0, rs=0x459a6c10)
> at syncprov.c:2476
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaab82b63c0, rs=0x459a6c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaab82b63c0, rs=0x459a6c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaab82b63c0, rs=0x459a6c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaab82b63c0, rs=0x459a6c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x459a6d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x459a6d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 9 (Thread 14872):
> #0  0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
> #2  0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e48 in syncprov_op_search (op=0x2aaab89e58c0, rs=0x451a5c10)
> at syncprov.c:2476
> #4  0x000000000047e64a in overlay_op_walk (op=0x2aaab89e58c0, rs=0x451a5c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2aaab89e58c0, rs=0x451a5c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2aaab89e58c0, rs=0x451a5c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2aaab89e58c0, rs=0x451a5c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x451a5d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x451a5d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 8 (Thread 16034):
> #0  0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x000000000042db1c in send_ldap_ber (op=<value optimized out>,
> ber=0x449a43f0) at result.c:372
> #2  0x00000000004308d8 in slap_send_search_entry (op=0x449a45a0, rs=0x449a4af0)
> at result.c:1410
> #3  0x00000000004da80d in syncprov_qplay (ctx=<value optimized out>, arg=<value
> optimized out>) at syncprov.c:884
> #4  syncprov_qtask (ctx=<value optimized out>, arg=<value optimized out>) at
> syncprov.c:1006
> #5  0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #6  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #7  0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #8  0x0000000000000000 in ?? ()
>
> Thread 7 (Thread 16033):
> #0  0x000000322ea0d605 in __lll_unlock_wake () from /lib64/libpthread.so.0
> #1  0x000000322ea0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
> #2  0x000000322ea0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
> #3  0x00000000004d7e5e in syncprov_op_search (op=0x2d554b70, rs=0x441a3c10) at
> syncprov.c:2471
> #4  0x000000000047e64a in overlay_op_walk (op=0x2d554b70, rs=0x441a3c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #5  0x000000000047ec27 in over_op_func (op=0x2d554b70, rs=0x441a3c10,
> which=op_search) at backover.c:721
> #6  0x0000000000420a79 in fe_op_search (op=0x2d554b70, rs=0x441a3c10) at
> search.c:372
> #7  0x0000000000421217 in do_search (op=0x2d554b70, rs=0x441a3c10) at
> search.c:217
> #8  0x000000000041e965 in connection_operation (ctx=0x441a3d60, arg_v=<value
> optimized out>) at connection.c:1113
> #9  0x000000000041f23c in connection_read_thread (ctx=0x441a3d60, argv=<value
> optimized out>) at connection.c:1249
> #10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #13 0x0000000000000000 in ?? ()
>
> Thread 6 (Thread 11937):
> #0  0x000000322e2baa27 in sched_yield () from /lib64/libc.so.6
> #1  0x00000000004d7e7f in syncprov_op_search (op=0x2aaab802af20, rs=0x418f7c10)
> at syncprov.c:2475
> #2  0x000000000047e64a in overlay_op_walk (op=0x2aaab802af20, rs=0x418f7c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #3  0x000000000047ec27 in over_op_func (op=0x2aaab802af20, rs=0x418f7c10,
> which=op_search) at backover.c:721
> #4  0x0000000000420a79 in fe_op_search (op=0x2aaab802af20, rs=0x418f7c10) at
> search.c:372
> #5  0x0000000000421217 in do_search (op=0x2aaab802af20, rs=0x418f7c10) at
> search.c:217
> #6  0x000000000041e965 in connection_operation (ctx=0x418f7d60, arg_v=<value
> optimized out>) at connection.c:1113
> #7  0x000000000041f23c in connection_read_thread (ctx=0x418f7d60, argv=<value
> optimized out>) at connection.c:1249
> #8  0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #9  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #10 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #11 0x0000000000000000 in ?? ()
>
> Thread 5 (Thread 10755):
> #0  0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> #1  0x00002b1d8ae85bd1 in __db_pthread_mutex_lock () from
> /usr/local/openldap/lib/libdb-5.1.so
> #2  0x00002b1d8ae8528a in __db_tas_mutex_lock () from
> /usr/local/openldap/lib/libdb-5.1.so
> #3  0x00002b1d8af1804c in __lock_get_internal () from
> /usr/local/openldap/lib/libdb-5.1.so
> #4  0x00002b1d8af189ea in __lock_vec () from
> /usr/local/openldap/lib/libdb-5.1.so
> #5  0x00002b1d8af1969b in __lock_vec_pp () from
> /usr/local/openldap/lib/libdb-5.1.so
> #6  0x00000000004bc6c1 in hdb_cache_entry_db_relock (bdb=0xe2bc780, txn=<value
> optimized out>, ei=0x2aaadaefc6a0, rw=1, tryOnly=<value optimized out>,
> lock=0x439a1640) at cache.c:198
> #7  0x00000000004bcb70 in hdb_cache_modify (bdb=0x2b1d8f0fa00c,
> e=0x2b1df490e138, newAttrs=0x2b1e071d6918, txn=0x0, lock=0x0) at cache.c:1231
> #8  0x0000000000495d13 in hdb_modify (op=0x439a2120, rs=0x439a1cb0) at
> modify.c:662
> #9  0x000000000047e6c2 in overlay_op_walk (op=0x439a2120, rs=0x439a1cb0,
> which=op_modify, oi=0xe2be010, on=0x0) at backover.c:669
> #10 0x000000000047ec27 in over_op_func (op=0x439a2120, rs=0x439a1cb0,
> which=op_modify) at backover.c:721
> #11 0x000000000047240f in syncrepl_entry (si=0xe2c2800, op=0x439a2120,
> entry=0x2b1df490e408, modlist=0x439a2c68, syncstate=2, syncUUID=<value optimized
> out>, syncCSN=0x2aaade6b6ab0) at syncrepl.c:2694
> #12 0x0000000000479cf7 in do_syncrep2 (ctx=<value optimized out>, arg=<value
> optimized out>) at syncrepl.c:955
> #13 do_syncrepl (ctx=<value optimized out>, arg=<value optimized out>) at
> syncrepl.c:1455
> #14 0x000000000041f7ca in connection_read_thread (ctx=0x439a2d60, argv=<value
> optimized out>) at connection.c:1251
> #15 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #16 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #17 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #18 0x0000000000000000 in ?? ()
>
> Thread 4 (Thread 10752):
> #0  0x000000322e2baa27 in sched_yield () from /lib64/libc.so.6
> #1  0x00000000004d7e7f in syncprov_op_search (op=0xe6f0990, rs=0x410f6c10) at
> syncprov.c:2475
> #2  0x000000000047e64a in overlay_op_walk (op=0xe6f0990, rs=0x410f6c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #3  0x000000000047ec27 in over_op_func (op=0xe6f0990, rs=0x410f6c10,
> which=op_search) at backover.c:721
> #4  0x0000000000420a79 in fe_op_search (op=0xe6f0990, rs=0x410f6c10) at
> search.c:372
> #5  0x0000000000421217 in do_search (op=0xe6f0990, rs=0x410f6c10) at
> search.c:217
> #6  0x000000000041e965 in connection_operation (ctx=0x410f6d60, arg_v=<value
> optimized out>) at connection.c:1113
> #7  0x000000000041f23c in connection_read_thread (ctx=0x410f6d60, argv=<value
> optimized out>) at connection.c:1249
> #8  0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #9  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #10 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #11 0x0000000000000000 in ?? ()
>
> Thread 3 (Thread 10749):
> #0  0x000000322e2baa27 in sched_yield () from /lib64/libc.so.6
> #1  0x00000000004d7e7f in syncprov_op_search (op=0xe5eefa0, rs=0x431a1c10) at
> syncprov.c:2475
> #2  0x000000000047e64a in overlay_op_walk (op=0xe5eefa0, rs=0x431a1c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #3  0x000000000047ec27 in over_op_func (op=0xe5eefa0, rs=0x431a1c10,
> which=op_search) at backover.c:721
> #4  0x0000000000420a79 in fe_op_search (op=0xe5eefa0, rs=0x431a1c10) at
> search.c:372
> #5  0x0000000000421217 in do_search (op=0xe5eefa0, rs=0x431a1c10) at
> search.c:217
> #6  0x000000000041e965 in connection_operation (ctx=0x431a1d60, arg_v=<value
> optimized out>) at connection.c:1113
> #7  0x000000000041f23c in connection_read_thread (ctx=0x431a1d60, argv=<value
> optimized out>) at connection.c:1249
> #8  0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #9  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #10 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #11 0x0000000000000000 in ?? ()
>
> Thread 2 (Thread 10748):
> #0  0x000000322e2baa27 in sched_yield () from /lib64/libc.so.6
> #1  0x00000000004d7e7f in syncprov_op_search (op=0xe5ebbf0, rs=0x429a0c10) at
> syncprov.c:2475
> #2  0x000000000047e64a in overlay_op_walk (op=0xe5ebbf0, rs=0x429a0c10,
> which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
> #3  0x000000000047ec27 in over_op_func (op=0xe5ebbf0, rs=0x429a0c10,
> which=op_search) at backover.c:721
> #4  0x0000000000420a79 in fe_op_search (op=0xe5ebbf0, rs=0x429a0c10) at
> search.c:372
> #5  0x0000000000421217 in do_search (op=0xe5ebbf0, rs=0x429a0c10) at
> search.c:217
> #6  0x000000000041e965 in connection_operation (ctx=0x429a0d60, arg_v=<value
> optimized out>) at connection.c:1113
> #7  0x000000000041f23c in connection_read_thread (ctx=0x429a0d60, argv=<value
> optimized out>) at connection.c:1249
> #8  0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
> tpool.c:685
> #9  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #10 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #11 0x0000000000000000 in ?? ()
>
> Thread 1 (Thread 10747):
> #0  0x000000322e2d4108 in epoll_wait () from /lib64/libc.so.6
> #1  0x000000000041bc8b in slapd_daemon_task (ptr=<value optimized out>) at
> daemon.c:2528
> #2  0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
> #3  0x000000322e2d3d1d in clone () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
>
> Logs at the time of the lock-up:
>
> Producer:
> Jul 28 22:27:08 admin02 slapd[5178]: conn=422733 op=92 MOD
> dn="thing=thing1,ou=things,dc=example,dc=net"
> Jul 28 22:27:08 admin02 slapd[5178]: conn=422733 op=92 MOD attr=stuff
> Jul 28 22:27:08 admin02 slapd[5178]: slap_queue_csn: queing 0x4b0883f0
>   20110728222708.058275Z#000000#000#000000
> Jul 28 22:27:08 admin02 slapd[5178]: syncprov_sendresp: cookie=rid=001
> ,csn=20110728222708.058275Z#000000#000#000000
> Jul 28 22:27:08 admin02 slapd[5178]: syncprov_sendresp: cookie=rid=001
>
> Producer later:
> Jul 29 07:04:55 admin02 slapd[5178]: Entry
> thing=thing1,ou=things,dc=example,dc=net CSN
> 20110728222708.058275Z#000000#000#000000 older or equal to ctx
> 20110728222708.058275Z#000000#000#000000
>
> Producer/Consumer:
> Jul 28 22:27:08 su1-admin-syncrepl01 slapd[10745]: do_syncrep2: rid=001
> cookie=rid=001,csn=20110728222708.058275Z#000000#000#000000
> Jul 28 22:27:08 su1-admin-syncrepl01 slapd[10745]: slap_queue_csn: queing
> 0x2aaadecf4b00 20110728222708.058275Z#000000#000#000000
>
> ^^ No more syncrepl logs after this
>
> Help!
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/