Re: slapd hangs after being up for over a day under load (ITS#2952)
Hi,
Here is a new backtrace from the server, taken while slapd was hung after a
day of uptime under load.
Brian
(gdb) info threads
18 Thread 1088744400 (LWP 23778) 0x4041b5d7 in select ()
from /lib/tls/libc.so.6
17 Thread 1097133008 (LWP 23779) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
16 Thread 1105521616 (LWP 23902) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
15 Thread 1115683792 (LWP 23994) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
14 Thread 1124072400 (LWP 24081) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
13 Thread 1133509584 (LWP 24104) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
12 Thread 1141898192 (LWP 24132) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
11 Thread 1150286800 (LWP 24133) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
10 Thread 1158675408 (LWP 24742) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
9 Thread 1167064016 (LWP 24743) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
8 Thread 1175452624 (LWP 25506) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
7 Thread 1183841232 (LWP 28037) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
6 Thread 1193278416 (LWP 7557) 0x4041b5d7 in select ()
from /lib/tls/libc.so.6
5 Thread 1201667024 (LWP 7558) 0x4041b5d7 in select ()
from /lib/tls/libc.so.6
4 Thread 1210055632 (LWP 7559) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
3 Thread 1218444240 (LWP 4658) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
2 Thread 1226832848 (LWP 4660) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
1 Thread 1078463360 (LWP 23777) 0x40174aad in pthread_join ()
from /lib/tls/libpthread.so.0
(gdb) attach 23777
Attaching to program: /usr/sbin/slapd, process 23777
Reading symbols from /usr/lib/libldap_r.so.2...done.
Loaded symbols for /usr/lib/libldap_r.so.2
Reading symbols from /usr/lib/liblber.so.2...done.
Loaded symbols for /usr/lib/liblber.so.2
Reading symbols from /usr/lib/libdb-4.1.so...done.
Loaded symbols for /usr/lib/libdb-4.1.so
Reading symbols from /usr/lib/libiodbc.so.2...done.
Loaded symbols for /usr/lib/libiodbc.so.2
Reading symbols from /usr/lib/libiodbcinst.so.2...done.
Loaded symbols for /usr/lib/libiodbcinst.so.2
Reading symbols from /lib/tls/libpthread.so.0...done.
[New Thread 1078463360 (LWP 23777)]
[New Thread 1226832848 (LWP 4660)]
[New Thread 1218444240 (LWP 4658)]
[New Thread 1210055632 (LWP 7559)]
[New Thread 1201667024 (LWP 7558)]
[New Thread 1193278416 (LWP 7557)]
[New Thread 1183841232 (LWP 28037)]
[New Thread 1175452624 (LWP 25506)]
[New Thread 1167064016 (LWP 24743)]
[New Thread 1158675408 (LWP 24742)]
[New Thread 1150286800 (LWP 24133)]
[New Thread 1141898192 (LWP 24132)]
[New Thread 1133509584 (LWP 24104)]
[New Thread 1124072400 (LWP 24081)]
[New Thread 1115683792 (LWP 23994)]
[New Thread 1105521616 (LWP 23902)]
[New Thread 1097133008 (LWP 23779)]
[New Thread 1088744400 (LWP 23778)]
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /usr/lib/libslp.so.1...done.
Loaded symbols for /usr/lib/libslp.so.1
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libnsl.so.1...done.
Loaded symbols for /lib/tls/libnsl.so.1
Reading symbols from /usr/lib/libsasl2.so.2...done.
Loaded symbols for /usr/lib/libsasl2.so.2
Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.7...done.
Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.7
Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.7...done.
Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.7
Reading symbols from /lib/tls/libcrypt.so.1...done.
Loaded symbols for /lib/tls/libcrypt.so.1
Reading symbols from /lib/tls/libresolv.so.2...done.
Loaded symbols for /lib/tls/libresolv.so.2
Reading symbols from /usr/lib/libltdl.so.3...done.
Loaded symbols for /usr/lib/libltdl.so.3
Reading symbols from /lib/tls/libdl.so.2...done.
Loaded symbols for /lib/tls/libdl.so.2
Reading symbols from /lib/libwrap.so.0...done.
Loaded symbols for /lib/libwrap.so.0
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/libnss_files.so.2...done.
Loaded symbols for /lib/tls/libnss_files.so.2
Reading symbols from /usr/lib/sasl2/libsasldb.so.2...done.
Loaded symbols for /usr/lib/sasl2/libsasldb.so.2
Reading symbols from /usr/lib/libdb3.so.3...done.
Loaded symbols for /usr/lib/libdb3.so.3
Reading symbols from /usr/lib/sasl2/libcrammd5.so.2...done.
Loaded symbols for /usr/lib/sasl2/libcrammd5.so.2
Reading symbols from /usr/lib/sasl2/libdigestmd5.so.2...done.
Loaded symbols for /usr/lib/sasl2/libdigestmd5.so.2
Reading symbols from /usr/lib/sasl2/libotp.so.2...done.
Loaded symbols for /usr/lib/sasl2/libotp.so.2
Reading symbols from /usr/lib/sasl2/libanonymous.so.2...done.
Loaded symbols for /usr/lib/sasl2/libanonymous.so.2
Reading symbols from /usr/lib/sasl2/libplain.so.2...done.
Loaded symbols for /usr/lib/sasl2/libplain.so.2
Reading symbols from /usr/lib/sasl2/liblogin.so.2...done.
Loaded symbols for /usr/lib/sasl2/liblogin.so.2
Reading symbols from /usr/lib/sasl2/libntlm.so.2...done.
Loaded symbols for /usr/lib/sasl2/libntlm.so.2
Reading symbols from /usr/lib/ldap/back_ldbm.so...done.
Loaded symbols for /usr/lib/ldap/back_ldbm.so
0x40174aad in pthread_join () from /lib/tls/libpthread.so.0
(gdb) thread apply all bt
Thread 18 (Thread 1088744400 (LWP 23778)):
#0 0x4041b5d7 in select () from /lib/tls/libc.so.6
#1 0x4017bb74 in __JCR_LIST__ () from /lib/tls/libpthread.so.0
#2 0x40173964 in start_thread () from /lib/tls/libpthread.so.0
#3 0x0812145c in ?? ()
Thread 17 (Thread 1097133008 (LWP 23779)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 16 (Thread 1105521616 (LWP 23902)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 15 (Thread 1115683792 (LWP 23994)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 14 (Thread 1124072400 (LWP 24081)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 13 (Thread 1133509584 (LWP 24104)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 12 (Thread 1141898192 (LWP 24132)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 11 (Thread 1150286800 (LWP 24133)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 10 (Thread 1158675408 (LWP 24742)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 9 (Thread 1167064016 (LWP 24743)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 8 (Thread 1175452624 (LWP 25506)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 7 (Thread 1183841232 (LWP 28037)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 6 (Thread 1193278416 (LWP 7557)):
#0 0x4041b5d7 in select () from /lib/tls/libc.so.6
#1 0x40124448 in db_xa_switch_4001 () from /usr/lib/libdb-4.1.so
#2 0x400f2c93 in __memp_alloc_4001 () from /usr/lib/libdb-4.1.so
#3 0x400f40c1 in __memp_fget_4001 () from /usr/lib/libdb-4.1.so
#4 0x40095f3a in __bam_search_4001 () from /usr/lib/libdb-4.1.so
#5 0x4008c150 in __bam_c_rget_4001 () from /usr/lib/libdb-4.1.so
#6 0x400895d5 in __bam_c_dup_4001 () from /usr/lib/libdb-4.1.so
#7 0x400aae99 in __db_c_get_4001 () from /usr/lib/libdb-4.1.so
#8 0x400a4d84 in __db_get_4001 () from /usr/lib/libdb-4.1.so
#9 0x40577b55 in ldbm_fetch (ldbm=0x8125548, key=
{data = 0x471fe6b8, size = 4, ulen = 0, dlen = 0, doff = 0, flags = 0})
at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/ldbm.c:443
#10 0x4056c526 in id2entry_rw (be=0x8113d18, id=41867, rw=0)
at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/id2entry.c:220
#11 0x40567753 in ldbm_back_search (be=0x8113d18, conn=0x405fda28,
op=0x81aba70, base=0x471ff8a8, nbase=0x471ff8a0, scope=2, deref=2,
slimit=1746, tlimit=3600, filter=0x41f62d68, filterstr=0x471ff898,
attrs=0x0, attrsonly=0)
at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/search.c:316
#12 0x08059cd4 in do_search (conn=0x405fda28, op=0x81aba70)
at /home/masneyb/openldap-2.1.25/servers/slapd/search.c:401
#13 0x08057cdf in connection_operation (ctx=0x46972a98, arg_v=0x81aba70)
at /home/masneyb/openldap-2.1.25/servers/slapd/connection.c:943
#14 0x40027258 in ldap_int_thread_pool_wrapper (xpool=0x80be690)
at /home/masneyb/openldap-2.1.25/libraries/libldap_r/tpool.c:432
#15 0x40173964 in start_thread () from /lib/tls/libpthread.so.0
#16 0x41f37ed4 in ?? ()
Thread 5 (Thread 1201667024 (LWP 7558)):
#0 0x4041b5d7 in select () from /lib/tls/libc.so.6
#1 0x40124448 in db_xa_switch_4001 () from /usr/lib/libdb-4.1.so
#2 0x400f2c93 in __memp_alloc_4001 () from /usr/lib/libdb-4.1.so
#3 0x400f40c1 in __memp_fget_4001 () from /usr/lib/libdb-4.1.so
#4 0x40095f3a in __bam_search_4001 () from /usr/lib/libdb-4.1.so
#5 0x4008c150 in __bam_c_rget_4001 () from /usr/lib/libdb-4.1.so
#6 0x400895d5 in __bam_c_dup_4001 () from /usr/lib/libdb-4.1.so
#7 0x400aae99 in __db_c_get_4001 () from /usr/lib/libdb-4.1.so
#8 0x400a4d84 in __db_get_4001 () from /usr/lib/libdb-4.1.so
#9 0x40577b55 in ldbm_fetch (ldbm=0x8125548, key=
{data = 0x479fe6b8, size = 4, ulen = 0, dlen = 0, doff = 0, flags = 0})
at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/ldbm.c:443
#10 0x4056c526 in id2entry_rw (be=0x8113d18, id=41847, rw=0)
at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/id2entry.c:220
#11 0x40567753 in ldbm_back_search (be=0x8113d18, conn=0x405fe448,
op=0x81b2648, base=0x479ff8a8, nbase=0x479ff8a0, scope=2, deref=2,
slimit=1747, tlimit=3600, filter=0x49635270, filterstr=0x479ff898,
attrs=0x0, attrsonly=0)
at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/search.c:316
#12 0x08059cd4 in do_search (conn=0x405fe448, op=0x81b2648)
at /home/masneyb/openldap-2.1.25/servers/slapd/search.c:401
#13 0x08057cdf in connection_operation (ctx=0x41f89ba0, arg_v=0x81b2648)
at /home/masneyb/openldap-2.1.25/servers/slapd/connection.c:943
#14 0x40027258 in ldap_int_thread_pool_wrapper (xpool=0x80be690)
at /home/masneyb/openldap-2.1.25/libraries/libldap_r/tpool.c:432
#15 0x40173964 in start_thread () from /lib/tls/libpthread.so.0
#16 0x41f6db6c in ?? ()
Thread 4 (Thread 1210055632 (LWP 7559)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 3 (Thread 1218444240 (LWP 4658)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 2 (Thread 1226832848 (LWP 4660)):
#0 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x00000000 in ?? ()
Thread 1 (Thread 1078463360 (LWP 23777)):
#0 0x40174aad in pthread_join () from /lib/tls/libpthread.so.0
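As a reading aid for the listing above (not part of the original report), here is a minimal sketch that tallies which function each thread is sitting in, given the text of an `info threads` listing. The function name `tally_threads` and the regular expression are illustrative assumptions; the sample lines are copied from the listing in this message.

```python
import re
from collections import Counter

# Matches the "0x... in <function>" part of a gdb `info threads` line.
FRAME_RE = re.compile(r"0x[0-9a-f]+ in ([A-Za-z_][\w@.]*)")

def tally_threads(info_threads_text):
    """Count threads by the function gdb reports for their innermost frame."""
    counts = Counter()
    for line in info_threads_text.splitlines():
        # Only the first line of each entry carries "Thread <id> (LWP <n>)".
        if "(LWP" not in line:
            continue
        match = FRAME_RE.search(line)
        if match:
            # Drop the symbol-version suffix, e.g. "@@GLIBC_2.3.2".
            counts[match.group(1).split("@")[0]] += 1
    return counts

# Three entries copied from the listing above.
sample = """\
18 Thread 1088744400 (LWP 23778) 0x4041b5d7 in select ()
from /lib/tls/libc.so.6
17 Thread 1097133008 (LWP 23779) 0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
1 Thread 1078463360 (LWP 23777) 0x40174aad in pthread_join ()
from /lib/tls/libpthread.so.0
"""

for func, n in tally_threads(sample).most_common():
    print(func, n)
```

Counted over the full `info threads` listing in this message, the totals are 14 threads in pthread_cond_wait, 3 in select, and 1 in pthread_join. Of the three in select, threads 5 and 6 are the ones whose backtraces pass through ldbm_fetch into libdb-4.1.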
On Thu, Feb 05, 2004 at 09:59:13PM -0800, Howard Chu wrote:
> There is not enough information here to draw any useful conclusions. Try
> getting more info out of gdb, e.g.
> info threads
> thread apply all bt
>
> Also, it appears that your symbol information is not intact in this
> backtrace: there's no way "ldap_pvt_thread_pool_destroy" would be an
> ancestor of any of these calls. Please make sure you use the correct binary
> when attaching the debugger. I don't believe the trace you provided bears
> any relation to reality.
>
> -- Howard Chu
> Chief Architect, Symas Corp. Director, Highland Sun
> http://www.symas.com http://highlandsun.com/hyc
> Symas: Premier OpenSource Development and Support
>
> > -----Original Message-----
> > From: owner-openldap-bugs@OpenLDAP.org
> > [mailto:owner-openldap-bugs@OpenLDAP.org]On Behalf Of masneyb@gftp.org
>
> > Full_Name: Brian Masney
> > Version: 2.1.25 (20031217)
> > OS: Debian GNU/Linux
> > URL: ftp://ftp.openldap.org/incoming/
> > Submission from: (NULL) (216.12.23.12)
> >
> >
> > There is a bug in slapd that causes it to hang whenever it has been up
> > for more than a day. It will accept a TCP connection, but it will not
> > perform any reads or writes on it.
> > On our main LDAP master server, after slapd hung on the slave, the entry
> > it hung on was a delete request. I initially ran strace on the hung slapd
> > process, and it showed this:
> >
> > futex(0x40e3ec18, FUTEX_WAIT, 14751, NULL <unfinished ...>
> >
> > Here is a gdb backtrace:
> >
> > (gdb) bt
> > #0 0x080501a1 in ber_memcalloc ()
> > #1 0x080696f5 in ch_calloc ()
> > #2 0x40556b5e in idl_alloc () from /usr/lib/ldap/back_ldbm.so
> > #3 0x40556b8b in idl_allids () from /usr/lib/ldap/back_ldbm.so
> > #4 0x40556c9a in idl_free () from /usr/lib/ldap/back_ldbm.so
> > #5 0x40557d7a in idl_delete_key () from /usr/lib/ldap/back_ldbm.so
> > #6 0x4055ce4f in dn2id_delete () from /usr/lib/ldap/back_ldbm.so
> > #7 0x40561853 in ldbm_back_delete () from /usr/lib/ldap/back_ldbm.so
> > #8 0x080684bb in do_delete ()
> > #9 0x08056517 in connection_done ()
> > #10 0x40026c58 in ldap_pvt_thread_pool_destroy () from /usr/lib/libldap_r.so.2
> > #11 0x40166964 in start_thread () from /lib/tls/libpthread.so.0
> > #12 0x42857854 in ?? ()
> >
> > Also, even though I'm using this on Debian, the source that I am using
> > is the official 2.1.25 (20031217) source without any of the Debian
> > GNU/TLS patches applied.
> >
>
>
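The two gdb commands suggested in the quoted reply (`info threads` and `thread apply all bt`) can also be run non-interactively, so that a long backtrace is not interrupted by gdb's pager. The following is a minimal sketch under stated assumptions, not something taken from the thread: the helper names `build_gdb_command` and `capture_backtraces` are hypothetical, and it assumes a `gdb` on the PATH that accepts the standard `-batch`, `-p` and `-ex` options.

```python
import subprocess
import sys

def build_gdb_command(pid, gdb="gdb"):
    """Return an argv list that attaches to `pid`, dumps all threads, and exits."""
    return [
        gdb,
        "-batch",                      # run the -ex commands, then exit
        "-p", str(pid),                # attach to the running process
        "-ex", "set pagination off",   # no pager prompts in the captured output
        "-ex", "info threads",
        "-ex", "thread apply all bt",
    ]

def capture_backtraces(pid):
    """Run gdb against `pid` and return everything it printed."""
    result = subprocess.run(
        build_gdb_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,      # keep gdb warnings next to the output
        text=True,
        check=False,                   # return whatever gdb printed, even on error
    )
    return result.stdout

# Only touch a real process when a numeric PID is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1].isdigit():
    print(capture_backtraces(int(sys.argv[1])))
```

Saved under a name such as `slapd_bt.py` (hypothetical), it would be run as `python3 slapd_bt.py 23777` by a user permitted to attach to the slapd process. As the quoted reply points out, the output is only meaningful if gdb resolves symbols against the same binary and libraries that are actually running.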