RE: ch_malloc of 8388608 bytes failed (ITS#2270)
When ch_malloc fails it calls abort() to kill the process. In your stack
backtrace there are 232 threads, but none of them is in the abort() routine,
which I find very odd. Regardless, your problem is not due to any bug in
OpenLDAP. The fact is, even though you have a 64 bit machine, you have built
a 32 bit binary. So it is limited to a 32 bit address space, and in Solaris
not all of that 32 bit space is available for user memory; only about half of
it (31 bits, 2GB) is available. The default size of a thread stack has grown
in OpenLDAP 2.1, but even in OpenLDAP 2.0 it was 2MB per thread. With the
current 4MB per thread, times 232 threads, you have used 928MB of address
space for thread stacks. You are also using 1GB for your BDB cache. Together
(about 1.9GB) these leave practically nothing of the 2GB for slapd to run
with.
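As a rough check of that arithmetic, here is a minimal sketch. The 2GB limit, 4MB stacks, 232 threads and 1GB cache are the figures from this message; the function name and the use of binary megabytes are assumptions made only for illustration.

```python
def address_space_budget(threads, stack_mb, cache_mb, limit_mb=2048):
    """Return (stacks_mb, used_mb, remaining_mb) for a 32 bit process.

    limit_mb defaults to the roughly 2GB of user address space
    described above; all sizes are in megabytes.
    """
    stacks_mb = threads * stack_mb      # address space reserved for thread stacks
    used_mb = stacks_mb + cache_mb      # stacks plus the BDB cache
    return stacks_mb, used_mb, limit_mb - used_mb


stacks, used, remaining = address_space_budget(threads=232, stack_mb=4, cache_mb=1024)
print(f"thread stacks: {stacks}MB, stacks + cache: {used}MB, remaining: {remaining}MB")
```

With these inputs the remainder is under 100MB, and that has to cover the program text, shared libraries and the entire heap, so a failing request of 8388608 bytes (8MB) is not surprising.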
You should decrease the maximum number of threads; creating more beyond a
certain limit does not enhance concurrency anyway. You can increase your
available address space by building a pure 64 bit executable, but that
doesn't change the fact that having too many threads will slow you down.
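To pick a thread limit that fits, the same budget can be turned around. This is only a sketch: the 512MB reserve for heap, libraries and program text is a hypothetical figure chosen for illustration, not a measured requirement of slapd, and the function name is likewise made up.

```python
def max_threads(limit_mb, cache_mb, reserve_mb, stack_mb):
    """Largest thread count whose stacks fit in what is left of the
    address space after the cache and a reserve for everything else."""
    available_mb = limit_mb - cache_mb - reserve_mb
    return max(available_mb // stack_mb, 0)   # never report a negative count


# 2GB address space, 1GB BDB cache, hypothetical 512MB reserve, 4MB stacks
print(max_threads(limit_mb=2048, cache_mb=1024, reserve_mb=512, stack_mb=4))
```

This only bounds the thread count from the address-space side; as noted above, a smaller number may still perform better.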
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
http://www.symas.com http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support
> -----Original Message-----
> From: owner-openldap-bugs@OpenLDAP.org
> [mailto:owner-openldap-bugs@OpenLDAP.org]On Behalf Of
> joseph.tingiris@cox.net
> Sent: Wednesday, January 15, 2003 9:27 AM
> To: openldap-its@OpenLDAP.org
> Subject: ch_malloc of 8388608 bytes failed (ITS#2270)
>
>
> Full_Name: Joseph Tingiris
> Version: 2.1.12
> OS: Solaris 8
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (206.157.224.254)
>
>
> I've read about some of the other folks, using Solaris, having similar
> problems and I've tried almost everything I could find short of actually
> modifying ch_malloc.c myself. It appears to be specific to multiprocessor
> (3+) Sun installations. The binaries have been compiled with -lmtmalloc
> and the latest versions of all OpenLDAP dependent packages are used. The
> primary authentication mechanism is cleartext.
>
> Some key points:
>
> * This server is a replica.
> * BDB-4.1 with 3.4 million DNs, 6 indexes (eq,sub)
> * process stack 32k (plimit -s), DB cache 1G (via DB_CONFIG)
> * this problem has persisted, on the same hardware, since openldap 2.0.12
> * slapd fails at least once a day with the same error every time,
>   "ch_malloc of 8388608 bytes failed"; it's always the same amount of bytes
> * it appears to happen during a wildcard search, although it may be during
>   some type of replication event
>
> Here is some info on the build environment:
>
> Application - OpenLdap and Dependencies:
>
> openldap-2.1.12
> openssl-0.9.7
> krb5-1.2.7
> cyrus-sasl-2.1.10
> db-4.1.25
>
> Compiler/Dev Tools:
>
> autoconf-2.57
> automake-1.7.2
> binutils-2.11.2
> bison-1.75
> fileutils-4.1
> gawk-3.1.0
> gcc-2.95.3
> gdb-5.0
> gdbm-1.8.0
> gettext-0.10.37
> glib-1.2.10
> gtk+-1.2.10
> libgcc-3.2
> libiconv-1.6.1
> libnet-1.0.2a
> libpcap-0.7.1
> libtool-1.4
> m4-1.4
> make-3.80
> ncurses-5.2
> slang-1.4.4
> tcl-8.4.1
> termcap-1.3
> textutils-2.0
> tk-8.4.1
> zlib-1.1.4
>
> Here's the system info:
>
> System Configuration: Sun Microsystems sun4u Sun Fire 3800
> System clock frequency: 150 MHz
> Memory size: 8192 Megabytes
>
> ========================= CPUs ===============================================
>
>            Port Run  E$   CPU     CPU
> FRU Name   ID   MHz  MB   Impl.   Mask
> ---------- ---- ---- ---- ------- ----
> /N0/SB0/P0 0    750  8.0  US-III  3.4
> /N0/SB0/P1 1    750  8.0  US-III  3.4
> /N0/SB0/P2 2    750  8.0  US-III  3.4
> /N0/SB0/P3 3    750  8.0  US-III  3.4
> /N0/SB2/P0 8    750  8.0  US-III  3.4
> /N0/SB2/P1 9    750  8.0  US-III  3.4
> /N0/SB2/P2 10   750  8.0  US-III  3.4
> /N0/SB2/P3 11   750  8.0  US-III  3.4
>
> ========================= Memory Configuration ===============================
>
>                    Logical Logical Logical
>               Port Bank    Bank    Bank        DIMM   Interleave Interleave
> FRU Name      ID   Num     Size    Status      Size   Factor     Segment
> ------------- ---- ------- ------- ----------- ------ ---------- ----------
> /N0/SB0/P0/B0 0    0       512MB   pass        256MB  8-way      0
> /N0/SB0/P0/B0 0    2       512MB   pass        256MB  8-way      0
> /N0/SB0/P1/B0 1    0       512MB   pass        256MB  8-way      0
> /N0/SB0/P1/B0 1    2       512MB   pass        256MB  8-way      0
> /N0/SB0/P2/B0 2    0       512MB   pass        256MB  8-way      0
> /N0/SB0/P2/B0 2    2       512MB   pass        256MB  8-way      0
> /N0/SB0/P3/B0 3    0       512MB   pass        256MB  8-way      0
> /N0/SB0/P3/B0 3    2       512MB   pass        256MB  8-way      0
> /N0/SB2/P0/B0 8    0       512MB   pass        256MB  8-way      1
> /N0/SB2/P0/B0 8    2       512MB   pass        256MB  8-way      1
> /N0/SB2/P1/B0 9    0       512MB   pass        256MB  8-way      1
> /N0/SB2/P1/B0 9    2       512MB   pass        256MB  8-way      1
> /N0/SB2/P2/B0 10   0       512MB   pass        256MB  8-way      1
> /N0/SB2/P2/B0 10   2       512MB   pass        256MB  8-way      1
> /N0/SB2/P3/B0 11   0       512MB   pass        256MB  8-way      1
> /N0/SB2/P3/B0 11   2       512MB   pass        256MB  8-way      1
>
> ========================= IO Cards =========================
>
>                                Bus  Max
>            IO   Port Bus       Freq Bus  Dev,
> FRU Name   Type ID   Side Slot MHz  Freq Func State Name                              Model
> ---------- ---- ---- ---- ---- ---- ---- ---- ----- --------------------------------- ----------------------
> /N0/IB6/P0 cPCI 24   B    2    33   33   1,0  ok    pci-pci1011,46.1/pci108e,1000     pci-bridge
> /N0/IB6/P0 cPCI 24   B    2    33   33   0,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB6/P0 cPCI 24   B    2    33   33   0,1  ok    SUNW,hme-pci108e,1001             SUNW,cheerio
> /N0/IB6/P0 cPCI 24   B    2    33   33   4,0  ok    SUNW,isptwo-pci1077,1020/sd (blo+ QLGC,ISP1040B
> /N0/IB6/P0 cPCI 24   B    3    33   33   2,0  ok    network-pci108e,abba.11           SUNW,cpci-ce
> /N0/IB6/P1 cPCI 25   B    4    33   33   1,0  ok    pci-pci1011,46.1/pci108e,1000     pci-bridge
> /N0/IB6/P1 cPCI 25   B    4    33   33   0,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB6/P1 cPCI 25   B    4    33   33   0,1  ok    SUNW,qfe-pci108e,1001             SUNW,cpci-qfe
> /N0/IB6/P1 cPCI 25   B    4    33   33   1,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB6/P1 cPCI 25   B    4    33   33   1,1  ok    SUNW,qfe-pci108e,1001             SUNW,cpci-qfe
> /N0/IB6/P1 cPCI 25   B    4    33   33   2,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB6/P1 cPCI 25   B    4    33   33   2,1  ok    SUNW,qfe-pci108e,1001             SUNW,cpci-qfe
> /N0/IB6/P1 cPCI 25   B    4    33   33   3,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB6/P1 cPCI 25   B    4    33   33   3,1  ok    SUNW,qfe-pci108e,1001             SUNW,cpci-qfe
> /N0/IB6/P1 cPCI 25   A    1    66   66   1,0  ok    fibre-channel-pci10df,f900.10df.+
> /N0/IB8/P0 cPCI 28   B    2    33   33   1,0  ok    network-pci108e,abba.11           SUNW,cpci-ce
> /N0/IB8/P1 cPCI 29   B    4    33   33   1,0  ok    pci-pci1011,46.1/pci108e,1000     pci-bridge
> /N0/IB8/P1 cPCI 29   B    4    33   33   0,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB8/P1 cPCI 29   B    4    33   33   0,1  ok    SUNW,qfe-pci108e,1001             SUNW,cpci-qfe
> /N0/IB8/P1 cPCI 29   B    4    33   33   1,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB8/P1 cPCI 29   B    4    33   33   1,1  ok    SUNW,qfe-pci108e,1001             SUNW,cpci-qfe
> /N0/IB8/P1 cPCI 29   B    4    33   33   2,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB8/P1 cPCI 29   B    4    33   33   2,1  ok    SUNW,qfe-pci108e,1001             SUNW,cpci-qfe
> /N0/IB8/P1 cPCI 29   B    4    33   33   3,0  ok    pci108e,1000-pci108e,1000.1
> /N0/IB8/P1 cPCI 29   B    4    33   33   3,1  ok    SUNW,qfe-pci108e,1001             SUNW,cpci-qfe
> /N0/IB8/P1 cPCI 29   A    1    66   66   1,0  ok    fibre-channel-pci10df,f900.10df.+
>
> ========================= Active Boards for Domain ===========================
>
> Power Fault HotPlug Board
> FRU Name LED LED LED Cond.
> -------- ----- ----- ------- -------
> /N0/SB0 on off off ok
> /N0/SB2 on off off ok
> /N0/IB6 on off off ok
> /N0/IB8 on off off ok
>
> ========================= Available Boards/Slots for Domain ==================
>
> Power Fault HotPlug Board/Slot Board/Slot
> FRU Name LED LED LED Condition Assigned
> -------- ----- ----- ------- ---------- ----------
> There are currently no Boards/Slots available to this Domain
>
> ========================= Hardware Failures ==================================
> No Hardware failures found in System
>
> Need any more info? I still have pmap, lsof, truss, cores, and additional
> debug data. Anyone have any ideas?
>
> Any help would be greatly appreciated.
>
> Thanks!
>
>
>