[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
Re: (ITS#5687) corrupted double-linked list on shutdown
--On Saturday, September 06, 2008 12:08 AM +0000 hyc@symas.com wrote:
> ando@sys-net.it wrote:
>> h.b.furuseth@usit.uio.no wrote:
>>>> [quanah@freelancer tests]$ *** glibc detected ***
>>>> /home/quanah/q/openldap-2.4.12/servers/slapd/.libs/lt-slapd: corrupted
>>>> double-linked list: 0x000000000d28a430 ***
>>> This message comes from glibc malloc. Its data structures are
>>> corrupted.
>>>
>>> I couldn't figure out much from the the valgrind output.
>>> But you'll get more frequent mallocs with CPPFLAGS=-DSLAP_NO_SL_MALLOC,
>>> if that code isn't rotted - maybe that'll bring out the problem in
>>> another guise. If that code still works, I don't remember.
>>
>> Here's my output after compiling with -DSLAP_NO_SL_MALLOC=1, running
>> test039 under valgrind and suddenly killing the process:
>>
>> ==9657== Memcheck, a memory error detector.
>> ==9657== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
>> ==9657== Using LibVEX rev 1658, a library for dynamic binary translation.
>> ==9657== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
>> ==9657== Using valgrind-3.2.1, a dynamic binary instrumentation
>> framework. ==9657== Copyright (C) 2000-2006, and GNU GPL'd, by Julian
>> Seward et al. ==9657== For more details, rerun with: -v
>> ==9657==
>> @(#) $OpenLDAP: slapd 2.X (Sep 6 2008 01:35:19) $
>> masarati@odino.intra.sys-net.it:/home/masarati/Lavoro/sysnet/Ldap/ldap-
>> devel/servers/slapd slapd starting
>> daemon: shutdown requested and initiated.
>> slapd shutdown: waiting for 8 threads to terminate
>> ==9657== Thread 3:
>> ==9657== Invalid read of size 4
>> ==9657== at 0x807EF03: slap_listener (daemon.c:1754)
>> ==9657== by 0x807F9AC: slap_listener_thread (daemon.c:1997)
>> ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663)
>> ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so)
>> ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so)
>> ==9657== Address 0x415D3A8 is 32 bytes inside a block of size 164 free'd
>> ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233)
>> ==9657== by 0x82407C9: ber_memfree_x (memory.c:152)
>> ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139)
>> ==9657== by 0x807EDF8: close_listeners (daemon.c:1703)
>> ==9657== by 0x808137F: slapd_daemon_task (daemon.c:2578)
>> ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so)
>> ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so)
>> ==9657==
>> ==9657== Invalid write of size 4
>> ==9657== at 0x807EF22: slap_listener (daemon.c:1759)
>> ==9657== by 0x807F9AC: slap_listener_thread (daemon.c:1997)
>> ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663)
>> ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so)
>> ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so)
>> ==9657== Address 0x415D3A4 is 28 bytes inside a block of size 164 free'd
>> ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233)
>> ==9657== by 0x82407C9: ber_memfree_x (memory.c:152)
>> ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139)
>> ==9657== by 0x807EDF8: close_listeners (daemon.c:1703)
>> ==9657== by 0x808137F: slapd_daemon_task (daemon.c:2578)
>> ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so)
>> ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so)
>> ==9657==
>> ==9657== Invalid read of size 4
>> ==9657== at 0x807EFCD: slap_listener (daemon.c:1781)
>> ==9657== by 0x807F9AC: slap_listener_thread (daemon.c:1997)
>> ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663)
>> ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so)
>> ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so)
>> ==9657== Address 0x415D3A8 is 32 bytes inside a block of size 164 free'd
>> ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233)
>> ==9657== by 0x82407C9: ber_memfree_x (memory.c:152)
>> ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139)
>> ==9657== by 0x807EDF8: close_listeners (daemon.c:1703)
>> ==9657== by 0x808137F: slapd_daemon_task (daemon.c:2578)
>> ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so)
>> ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so)
>> daemon: accept(7) failed errno=9 (Bad file descriptor)
>> ==9657==
>> ==9657== Thread 5:
>> ==9657== Invalid free() / delete / delete[]
>> ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233)
>> ==9657== by 0x82407C9: ber_memfree_x (memory.c:152)
>> ==9657== by 0x80E522B: slap_sl_free (sl_malloc.c:451)
>> ==9657== by 0x80874AF: do_search (search.c:221)
>> ==9657== by 0x8084347: connection_operation (connection.c:1084)
>> ==9657== by 0x8084834: connection_read_thread (connection.c:1211)
>> ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663)
>> ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so)
>> ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so)
>> ==9657== Address 0x5595D60 is 0 bytes inside a block of size 29 free'd
>> ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233)
>> ==9657== by 0x82407C9: ber_memfree_x (memory.c:152)
>> ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139)
>> ==9657== by 0x81E1D64: rwm_op_cleanup (rwm.c:64)
>> ==9657== by 0x80976CD: slap_cleanup_play (result.c:341)
>> ==9657== by 0x8097DEA: send_ldap_response (result.c:522)
>> ==9657== by 0x809850E: slap_send_ldap_result (result.c:642)
>> ==9657== by 0x81855CE: ldap_back_op_result (bind.c:1823)
>> ==9657== by 0x8184C3D: ldap_back_dobind_int (bind.c:1486)
>> ==9657== by 0x8184D41: ldap_back_dobind (bind.c:1520)
>> ==9657== by 0x812C9F7: ldap_back_search (search.c:174)
>> ==9657== by 0x80F7DF4: glue_sub_search (backglue.c:340)
>> ==9657== by 0x80F8458: glue_op_search (backglue.c:467)
>> ==9657== by 0x80FACE3: overlay_op_walk (backover.c:657)
>> ==9657== by 0x80FAF18: over_op_func (backover.c:719)
>> ==9657== by 0x80FAF9C: over_op_search (backover.c:741)
>> ==9657== by 0x8087ADB: fe_op_search (search.c:366)
>> ==9657== by 0x80FAD63: overlay_op_walk (backover.c:667)
>> ==9657== by 0x80FAF18: over_op_func (backover.c:719)
>> ==9657== by 0x80FAF9C: over_op_search (backover.c:741)
>> ==9657==
>> ==9657== Invalid free() / delete / delete[]
>> ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233)
>> ==9657== by 0x82407C9: ber_memfree_x (memory.c:152)
>> ==9657== by 0x80E522B: slap_sl_free (sl_malloc.c:451)
>> ==9657== by 0x80874D3: do_search (search.c:224)
>> ==9657== by 0x8084347: connection_operation (connection.c:1084)
>> ==9657== by 0x8084834: connection_read_thread (connection.c:1211)
>> ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663)
>> ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so)
>> ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so)
>> ==9657== Address 0x5595E60 is 0 bytes inside a block of size 29 free'd
>> ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233)
>> ==9657== by 0x82407C9: ber_memfree_x (memory.c:152)
>> ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139)
>> ==9657== by 0x81E1D90: rwm_op_cleanup (rwm.c:69)
>> ==9657== by 0x80976CD: slap_cleanup_play (result.c:341)
>> ==9657== by 0x8097DEA: send_ldap_response (result.c:522)
>> ==9657== by 0x809850E: slap_send_ldap_result (result.c:642)
>> ==9657== by 0x81855CE: ldap_back_op_result (bind.c:1823)
>> ==9657== by 0x8184C3D: ldap_back_dobind_int (bind.c:1486)
>> ==9657== by 0x8184D41: ldap_back_dobind (bind.c:1520)
>> ==9657== by 0x812C9F7: ldap_back_search (search.c:174)
>> ==9657== by 0x80F7DF4: glue_sub_search (backglue.c:340)
>> ==9657== by 0x80F8458: glue_op_search (backglue.c:467)
>> ==9657== by 0x80FACE3: overlay_op_walk (backover.c:657)
>> ==9657== by 0x80FAF18: over_op_func (backover.c:719)
>> ==9657== by 0x80FAF9C: over_op_search (backover.c:741)
>> ==9657== by 0x8087ADB: fe_op_search (search.c:366)
>> ==9657== by 0x80FAD63: overlay_op_walk (backover.c:667)
>> ==9657== by 0x80FAF18: over_op_func (backover.c:719)
>> ==9657== by 0x80FAF9C: over_op_search (backover.c:741)
>> conn=3 op=1 ldap_back_retry: retrying URI="ldap://localhost:9011"
>> DN="cn=manager,dc=example,dc=com"
>> ==9657== Warning: invalid file descriptor -1 in syscall close()
>> ==9657== Warning: invalid file descriptor -1 in syscall close()
>> ==9657== Warning: invalid file descriptor -1 in syscall close()
>> ==9657== Warning: invalid file descriptor -1 in syscall close()
>> ==9657== Warning: invalid file descriptor -1 in syscall close()
>> ==9657== Warning: invalid file descriptor -1 in syscall close()
>> ==9657== Warning: invalid file descriptor -1 in syscall close()
>> ==9657== Warning: invalid file descriptor -1 in syscall close()
>> slapd stopped.
>> ==9657==
>> ==9657== ERROR SUMMARY: 7 errors from 5 contexts (suppressed: 36 from 1)
>> ==9657== malloc/free: in use at exit: 1,127 bytes in 15 blocks.
>> ==9657== malloc/free: 34,414 allocs, 34,403 frees, 12,376,432 bytes
>> allocated.
>> ==9657== For counts of detected errors, rerun with: -v
>> ==9657== searching for pointers to 15 not-freed blocks.
>> ==9657== checked 774,072 bytes.
>>
>> Apparently, the listener is freed while some thread is still using it.
>
> Since that can only occur during shutdown, I don't think it's really a
> high priority problem. More to the point, it will not cause any heap
> corruption. The serious problem is in thread 5, where rwm_op_cleanup
> frees a DN that do_search also tries to free. It looks like
> rwm_op_cleanup is badly broken here, it apparently should be checking to
> make sure ros->ro_dn is != ros->r_dn before freeing r_dn (and ndn).
With the latest rwm revision, I still get the glibc corruption:
PID=15912 - Search(500): base="o=Example,c=US" scope=sub filter="(cn=Alumni
Assoc Staff)" attrs=cn (more...).
[quanah@tribes tests]$ *** glibc detected *** corrupted double-linked list:
0x0000003bf99316b8 ***
when I hit control-C during the search.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration