
RE: slapd hangs after being up for over a day under load (ITS#2952)



> -----Original Message-----
> From: masneyb@marge.ntelos.net [mailto:masneyb@marge.ntelos.net] On Behalf Of Brian Masney

> Hi,
>    Here is a new backtrace of the server being hung after a
> day of uptime under load.

This trace is much more enlightening.

I see that you have both BerkeleyDB 4.1 (libdb-4.1.so) and BerkeleyDB 3
(libdb3.so.3, loaded alongside the SASL plugins) linked in here. Since the
symbols are versioned, the two don't conflict, but you're better off
consistently using the newest version of BDB across your entire platform.

The hang occurs inside the BDB library: threads 5 and 6 are both sitting in
select() underneath __memp_alloc, called from ldbm_fetch. That suggests the
dbcachesize in your slapd.conf is too small. BerkeleyDB 4.1 is known to hang
whenever it runs out of cache. Unfortunately, with back-ldbm there is no easy
way to query the existing cache to see how it's performing, but you can still
run db_stat on the individual database files to estimate how much cache you
should be using.
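
As a rough illustration of turning db_stat figures into a cache size, here is
a minimal Python sketch. It assumes the common rule of thumb that the cache
should hold at least all of a B-tree's internal pages, plus some share of its
leaf pages, and that with back-ldbm the dbcachesize setting applies to each
open database file rather than to all of them together. The file names and
page counts are made-up placeholders, not values from this report; substitute
the page size and page counts that db_stat reports for your own files.

```python
# Rough dbcachesize estimate from figures copied by hand out of db_stat
# output. All numbers below are hypothetical placeholders.

def btree_cache_bytes(page_size, internal_pages, leaf_pages=0, leaf_percent=0):
    """Bytes needed to hold every internal page plus a share of the leaf pages."""
    if not 0 <= leaf_percent <= 100:
        raise ValueError("leaf_percent must be between 0 and 100")
    cached_leaf_pages = leaf_pages * leaf_percent // 100
    return page_size * (internal_pages + cached_leaf_pages)

# Hypothetical per-file statistics (page size in bytes, page counts).
databases = {
    "id2entry.dbb": {"page_size": 4096, "internal_pages": 120, "leaf_pages": 9000},
    "dn2id.dbb": {"page_size": 4096, "internal_pages": 40, "leaf_pages": 3000},
}

# Estimate each file separately, keeping 10% of its leaf pages cached.
estimates = {
    name: btree_cache_bytes(s["page_size"], s["internal_pages"], s["leaf_pages"], 10)
    for name, s in databases.items()
}

# Assumption: the setting is per file, so size it for the largest file.
suggested = max(estimates.values())

for name, size in sorted(estimates.items()):
    print(name, size)
print("suggested dbcachesize (bytes):", suggested)
```

Treat the result as a lower bound to start from, then raise it if the server
still stalls under load.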

See the FAQ-o-Matic for more information:
http://www.openldap.org/faq/index.cgi?file=191

This does not appear to be an OpenLDAP bug, so this ITS will be closed.

> (gdb) attach 23777
> Attaching to program: /usr/sbin/slapd, process 23777
> Reading symbols from /usr/lib/libldap_r.so.2...done.
> Loaded symbols for /usr/lib/libldap_r.so.2
> Reading symbols from /usr/lib/liblber.so.2...done.
> Loaded symbols for /usr/lib/liblber.so.2
> Reading symbols from /usr/lib/libdb-4.1.so...done.
> Loaded symbols for /usr/lib/libdb-4.1.so
> Reading symbols from /usr/lib/libiodbc.so.2...done.
> Loaded symbols for /usr/lib/libiodbc.so.2
> Reading symbols from /usr/lib/libiodbcinst.so.2...done.
> Loaded symbols for /usr/lib/libiodbcinst.so.2
> Reading symbols from /lib/tls/libpthread.so.0...done.
> [New Thread 1078463360 (LWP 23777)]
> [New Thread 1226832848 (LWP 4660)]
> [New Thread 1218444240 (LWP 4658)]
> [New Thread 1210055632 (LWP 7559)]
> [New Thread 1201667024 (LWP 7558)]
> [New Thread 1193278416 (LWP 7557)]
> [New Thread 1183841232 (LWP 28037)]
> [New Thread 1175452624 (LWP 25506)]
> [New Thread 1167064016 (LWP 24743)]
> [New Thread 1158675408 (LWP 24742)]
> [New Thread 1150286800 (LWP 24133)]
> [New Thread 1141898192 (LWP 24132)]
> [New Thread 1133509584 (LWP 24104)]
> [New Thread 1124072400 (LWP 24081)]
> [New Thread 1115683792 (LWP 23994)]
> [New Thread 1105521616 (LWP 23902)]
> [New Thread 1097133008 (LWP 23779)]
> [New Thread 1088744400 (LWP 23778)]
> Loaded symbols for /lib/tls/libpthread.so.0
> Reading symbols from /usr/lib/libslp.so.1...done.
> Loaded symbols for /usr/lib/libslp.so.1
> Reading symbols from /lib/tls/libm.so.6...done.
> Loaded symbols for /lib/tls/libm.so.6
> Reading symbols from /lib/tls/libnsl.so.1...done.
> Loaded symbols for /lib/tls/libnsl.so.1
> Reading symbols from /usr/lib/libsasl2.so.2...done.
> Loaded symbols for /usr/lib/libsasl2.so.2
> Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.7...done.
> Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.7
> Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.7...done.
> Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.7
> Reading symbols from /lib/tls/libcrypt.so.1...done.
> Loaded symbols for /lib/tls/libcrypt.so.1
> Reading symbols from /lib/tls/libresolv.so.2...done.
> Loaded symbols for /lib/tls/libresolv.so.2
> Reading symbols from /usr/lib/libltdl.so.3...done.
> Loaded symbols for /usr/lib/libltdl.so.3
> Reading symbols from /lib/tls/libdl.so.2...done.
> Loaded symbols for /lib/tls/libdl.so.2
> Reading symbols from /lib/libwrap.so.0...done.
> Loaded symbols for /lib/libwrap.so.0
> Reading symbols from /lib/tls/libc.so.6...done.
> Loaded symbols for /lib/tls/libc.so.6
> Reading symbols from /lib/ld-linux.so.2...done.
> Loaded symbols for /lib/ld-linux.so.2
> Reading symbols from /lib/tls/libnss_files.so.2...done.
> Loaded symbols for /lib/tls/libnss_files.so.2
> Reading symbols from /usr/lib/sasl2/libsasldb.so.2...done.
> Loaded symbols for /usr/lib/sasl2/libsasldb.so.2
> Reading symbols from /usr/lib/libdb3.so.3...done.
> Loaded symbols for /usr/lib/libdb3.so.3
> Reading symbols from /usr/lib/sasl2/libcrammd5.so.2...done.
> Loaded symbols for /usr/lib/sasl2/libcrammd5.so.2
> Reading symbols from /usr/lib/sasl2/libdigestmd5.so.2...done.
> Loaded symbols for /usr/lib/sasl2/libdigestmd5.so.2
> Reading symbols from /usr/lib/sasl2/libotp.so.2...done.
> Loaded symbols for /usr/lib/sasl2/libotp.so.2
> Reading symbols from /usr/lib/sasl2/libanonymous.so.2...done.
> Loaded symbols for /usr/lib/sasl2/libanonymous.so.2
> Reading symbols from /usr/lib/sasl2/libplain.so.2...done.
> Loaded symbols for /usr/lib/sasl2/libplain.so.2
> Reading symbols from /usr/lib/sasl2/liblogin.so.2...done.
> Loaded symbols for /usr/lib/sasl2/liblogin.so.2
> Reading symbols from /usr/lib/sasl2/libntlm.so.2...done.
> Loaded symbols for /usr/lib/sasl2/libntlm.so.2
> Reading symbols from /usr/lib/ldap/back_ldbm.so...done.
> Loaded symbols for /usr/lib/ldap/back_ldbm.so
> 0x40174aad in pthread_join () from /lib/tls/libpthread.so.0
> (gdb) thread apply all bt
>
> Thread 18 (Thread 1088744400 (LWP 23778)):
> #0  0x4041b5d7 in select () from /lib/tls/libc.so.6
> #1  0x4017bb74 in __JCR_LIST__ () from /lib/tls/libpthread.so.0
> #2  0x40173964 in start_thread () from /lib/tls/libpthread.so.0
> #3  0x0812145c in ?? ()
>
> Thread 17 (Thread 1097133008 (LWP 23779)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 16 (Thread 1105521616 (LWP 23902)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 15 (Thread 1115683792 (LWP 23994)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 14 (Thread 1124072400 (LWP 24081)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 13 (Thread 1133509584 (LWP 24104)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 12 (Thread 1141898192 (LWP 24132)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 11 (Thread 1150286800 (LWP 24133)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 10 (Thread 1158675408 (LWP 24742)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 9 (Thread 1167064016 (LWP 24743)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 8 (Thread 1175452624 (LWP 25506)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 7 (Thread 1183841232 (LWP 28037)):
> #0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/tls/libpthread.so.0
> #1  0x00000000 in ?? ()
>
> Thread 6 (Thread 1193278416 (LWP 7557)):
> #0  0x4041b5d7 in select () from /lib/tls/libc.so.6
> #1  0x40124448 in db_xa_switch_4001 () from /usr/lib/libdb-4.1.so
> #2  0x400f2c93 in __memp_alloc_4001 () from /usr/lib/libdb-4.1.so
> #3  0x400f40c1 in __memp_fget_4001 () from /usr/lib/libdb-4.1.so
> #4  0x40095f3a in __bam_search_4001 () from /usr/lib/libdb-4.1.so
> #5  0x4008c150 in __bam_c_rget_4001 () from /usr/lib/libdb-4.1.so
> #6  0x400895d5 in __bam_c_dup_4001 () from /usr/lib/libdb-4.1.so
> #7  0x400aae99 in __db_c_get_4001 () from /usr/lib/libdb-4.1.so
> #8  0x400a4d84 in __db_get_4001 () from /usr/lib/libdb-4.1.so
> #9  0x40577b55 in ldbm_fetch (ldbm=0x8125548, key=
>       {data = 0x471fe6b8, size = 4, ulen = 0, dlen = 0, doff = 0, flags = 0})
>     at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/ldbm.c:443
> #10 0x4056c526 in id2entry_rw (be=0x8113d18, id=41867, rw=0)
>     at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/id2entry.c:220
> #11 0x40567753 in ldbm_back_search (be=0x8113d18, conn=0x405fda28,
>     op=0x81aba70, base=0x471ff8a8, nbase=0x471ff8a0, scope=2, deref=2,
>     slimit=1746, tlimit=3600, filter=0x41f62d68, filterstr=0x471ff898,
>     attrs=0x0, attrsonly=0)
>     at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/search.c:316
> #12 0x08059cd4 in do_search (conn=0x405fda28, op=0x81aba70)
>     at /home/masneyb/openldap-2.1.25/servers/slapd/search.c:401
> #13 0x08057cdf in connection_operation (ctx=0x46972a98, arg_v=0x81aba70)
>     at /home/masneyb/openldap-2.1.25/servers/slapd/connection.c:943
> #14 0x40027258 in ldap_int_thread_pool_wrapper (xpool=0x80be690)
>     at /home/masneyb/openldap-2.1.25/libraries/libldap_r/tpool.c:432
> #15 0x40173964 in start_thread () from /lib/tls/libpthread.so.0
> #16 0x41f37ed4 in ?? ()
>
> Thread 5 (Thread 1201667024 (LWP 7558)):
> #0  0x4041b5d7 in select () from /lib/tls/libc.so.6
> #1  0x40124448 in db_xa_switch_4001 () from /usr/lib/libdb-4.1.so
> #2  0x400f2c93 in __memp_alloc_4001 () from /usr/lib/libdb-4.1.so
> #3  0x400f40c1 in __memp_fget_4001 () from /usr/lib/libdb-4.1.so
> #4  0x40095f3a in __bam_search_4001 () from /usr/lib/libdb-4.1.so
> #5  0x4008c150 in __bam_c_rget_4001 () from /usr/lib/libdb-4.1.so
> #6  0x400895d5 in __bam_c_dup_4001 () from /usr/lib/libdb-4.1.so
> #7  0x400aae99 in __db_c_get_4001 () from /usr/lib/libdb-4.1.so
> #8  0x400a4d84 in __db_get_4001 () from /usr/lib/libdb-4.1.so
> #9  0x40577b55 in ldbm_fetch (ldbm=0x8125548, key=
>       {data = 0x479fe6b8, size = 4, ulen = 0, dlen = 0, doff = 0, flags = 0})
>     at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/ldbm.c:443
> #10 0x4056c526 in id2entry_rw (be=0x8113d18, id=41847, rw=0)
>     at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/id2entry.c:220
> #11 0x40567753 in ldbm_back_search (be=0x8113d18, conn=0x405fe448,
>     op=0x81b2648, base=0x479ff8a8, nbase=0x479ff8a0, scope=2, deref=2,
>     slimit=1747, tlimit=3600, filter=0x49635270, filterstr=0x479ff898,
>     attrs=0x0, attrsonly=0)
>     at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/search.c:316
> #12 0x08059cd4 in do_search (conn=0x405fe448, op=0x81b2648)
>     at /home/masneyb/openldap-2.1.25/servers/slapd/search.c:401
> #13 0x08057cdf in connection_operation (ctx=0x41f89ba0, arg_v=0x81b2648)
>     at /home/masneyb/openldap-2.1.25/servers/slapd/connection.c:943
> #14 0x40027258 in ldap_int_thread_pool_wrapper (xpool=0x80be690)
>     at /home/masneyb/openldap-2.1.25/libraries/libldap_r/tpool.c:432
> #15 0x40173964 in start_thread () from /lib/tls/libpthread.so.0
> #16 0x41f6db6c in ?? ()

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support