Hi Quanah,
I moved to OpenLDAP 2.4.18 and rebuilt Berkeley DB 4.7.25 with all 4 patches from Oracle.
I didn't change slapd.conf at all.
I reduced the number of entries to a total of 3,437,278.
[root@l01lnp2 ~]# du -c -h /var/lib/ldap/*.bdb
200K /var/lib/ldap/bestMatchPrefix.bdb
982M /var/lib/ldap/dn2id.bdb
2.4G /var/lib/ldap/id2entry.bdb
1.8M /var/lib/ldap/objectClass.bdb
1.2M /var/lib/ldap/originatorPrefixID.bdb
48M /var/lib/ldap/uniqueID.bdb
3.4G total <= interesting ... almost the same as number of entries :)
I changed DB_CONFIG to use a 7 GB cache:
set_cachesize 7 0 1
set_lg_regionmax 262144
set_lg_bsize 2097152
My system has 10 GB of RAM, and the situation now is:
[root@l01lnp2 ~]# free
             total       used       free     shared    buffers     cached
Mem:      10234924   10176544      58380          0       2144    3786596
-/+ buffers/cache:    6387804    3847120
Swap:      4096564     753572    3342992
[root@l01lnp2 ~]#
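For reference, the "-/+ buffers/cache" row in the free output above can be re-derived from the "Mem:" row. The following is a minimal Python sketch, assuming the older procps layout shown here with values in KiB; the function name buffers_cache_row is only illustrative and not part of any tool.

```python
def buffers_cache_row(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row of older `free` output (KiB)."""
    # Memory held by processes: used minus reclaimable buffers and page cache.
    used_by_apps = used - buffers - cached
    # Memory that could be made available: free plus the reclaimable parts.
    available = free + buffers + cached
    return used_by_apps, available


# Values copied from the "Mem:" row above.
used_by_apps, available = buffers_cache_row(
    used=10176544, free=58380, buffers=2144, cached=3786596
)
print(used_by_apps, available)  # 6387804 3847120
```

By this arithmetic, processes hold about 6.1 GiB of the 10 GB, and most of the rest is counted as cache.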
When I run ldapsearch (time ldapsearch -h localhost -x -b ou=bestMatchPrefixList,ou=sipDirektor,dc=ot,dc=hr -D cn=admin,dc=ot,dc=hr -w pero99) before I actually add anything with ldapadd, the search completes within 40 seconds, and the slapd process takes 24-26% of memory.
After I add new entries (just 2 more) and perform the same search, it hangs after a while. Once ldapsearch finishes returning entries, I see the slapd process memory start growing; it takes almost everything, reaching 97%.
It is always like this: the search returns all entries and then waits for some time, seemingly at random anywhere from 60 seconds to 6 minutes, before it actually exits.
Could you please take a look at the strace logs I attached to my previous e-mail? As soon as ldapsearch stops returning entries, I see a lot of gibberish there.
Here is slapd process memory growth:
top - 16:42:22 up 4 days, 1:02, 2 users, load average: 2.13, 0.67, 0.23
Tasks: 119 total, 1 running, 118 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.8%us, 0.2%sy, 0.0%ni, 70.0%id, 28.8%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 10234924k total, 10177568k used, 57356k free, 6676k buffers
Swap: 4096564k total, 36516k used, 4060048k free, 3603688k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9404 ldap 25 0 13.3g 8.8g 2.8g S 4.0 89.7 1:13.49 slapd
1 root 15 0 10344 372 344 S 0.0 0.0 0:01.69 init
2 root RT -5 0 0 0 S 0.0 0.0 0:00.06 migration/0
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.2%us, 0.7%sy, 0.0%ni, 67.5%id, 24.3%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 10234924k total, 10177968k used, 56956k free, 6656k buffers
Swap: 4096564k total, 36516k used, 4060048k free, 3580356k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9404 ldap 25 0 13.3g 8.9g 2.9g S 30.3 90.9 1:16.76 slapd
325 root 10 -5 0 0 0 S 0.7 0.0 5:37.11 kswapd0
8458 root 15 0 0 0 0 D 0.3 0.0 0:02.02 pdflush
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.0%us, 0.3%sy, 0.0%ni, 72.3%id, 26.1%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 10234924k total, 10180560k used, 54364k free, 6140k buffers
Swap: 4096564k total, 36516k used, 4060048k free, 3488164k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9404 ldap 25 0 13.4g 9.3g 3.2g S 4.7 95.5 1:28.86 slapd
8458 root 15 0 0 0 0 D 0.7 0.0 0:02.20 pdflush
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.9%us, 0.4%sy, 0.0%ni, 70.5%id, 28.0%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 10234924k total, 10177812k used, 57112k free, 3492k buffers
Swap: 4096564k total, 36516k used, 4060048k free, 3481476k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9404 ldap 25 0 13.4g 9.4g 3.2g S 4.3 95.9 1:30.39 slapd
325 root 10 -5 0 0 0 S 0.7 0.0 5:38.08 kswapd0
top - 16:45:01 up 4 days, 1:05, 2 users, load average: 1.91, 1.40, 0.59
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.2%us, 0.2%sy, 0.0%ni, 75.0%id, 21.4%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 10234924k total, 10179744k used, 55180k free, 396k buffers
Swap: 4096564k total, 42328k used, 4054236k free, 3473624k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9404 ldap 25 0 13.5g 9.4g 3.3g S 13.6 96.7 1:33.44 slapd
9490 root 15 0 0 0 0 S 0.3 0.0 0:00.31 pdflush
top - 16:45:33 up 4 days, 1:05, 2 users, load average: 1.55, 1.36, 0.60
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.7%us, 0.2%sy, 0.0%ni, 74.7%id, 22.3%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 10234924k total, 10180100k used, 54824k free, 652k buffers
Swap: 4096564k total, 118616k used, 3977948k free, 3521232k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9404 ldap 25 0 13.5g 9.4g 3.3g S 10.6 96.6 1:37.36 slapd
325 root 10 -5 0 0 0 S 0.3 0.0 5:38.63 kswapd0
This looks like a memory leak bug to me.
Tihomir.

On Thu, Sep 10, 2009 at 9:37 PM, Quanah Gibson-Mount <quanah@zimbra.com> wrote:
--On Thursday, September 10, 2009 8:56 PM +0200 Tihomir Culjaga <tculjaga@gmail.com> wrote:
So, the situation is that I have 2 LDIF files from which I'm recreating the database.
/usr/local/libexec/slapadd -l /home/tculjaga/file2.ldif -f /usr/local/etc/openldap/slapd.conf
/usr/local/libexec/slapadd -l /home/tculjaga/file2.ldif -f /usr/local/etc/openldap/slapd.conf
I would suggest you just make these a single file, so all the work can be done at one time.
I tried to re-index with /usr/local/libexec/slapindex -f /usr/local/etc/openldap/slapd.conf -v, restarted the slapd process, and restarted the machine. It is always the same issue.
Nothing here indicates a problem with your indices. Running slapindex repeatedly is a waste of your time.
[root@l01lnp2 traces]# /usr/local/libexec/slapd -V
@(#) $OpenLDAP: slapd 2.4.16 (Sep 9 2009 14:39:44) $
root@l01lnp2:/home/tculjaga/openldap-2.4.16/servers/slapd
I would strongly urge you to upgrade to 2.4.18 (for reasons I will note further down).
[root@l01lnp2 traces]# /usr/local/BerkeleyDB.4.7/bin/db_stat -V
Berkeley DB 4.7.25: (May 15, 2008) - unpatched!
You need to rebuild BDB 4.7.25 with the 4 patches from Oracle. There are known issues when running BDB 4.7 without them.
[root@l01lnp2 traces]# du -c -h /var/lib/ldap/*.bdb
200K /var/lib/ldap/bestMatchPrefix.bdb
3.8G /var/lib/ldap/dn2id.bdb
6.2G /var/lib/ldap/id2entry.bdb
1.8M /var/lib/ldap/objectClass.bdb
1.2M /var/lib/ldap/originatorPrefixID.bdb
48M /var/lib/ldap/uniqueID.bdb
10G total
Since your database is 10 GB in total, slapadd needs at least 10 GB of cache in your DB_CONFIG file to work at optimum efficiency. Unfortunately, you only have 10 GB of RAM. Essentially, your system is underpowered for your database size.
[tculjaga@l01lnp2 ~]$ cat ot.ldif | grep -c "dn: "
101588
[tculjaga@l01lnp2 ~]$ cat l01sipdir1.ldif | grep -c "dn: "
9994864
[tculjaga@l01lnp2 ~]$
So you have 10,096,452 entries total.
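The count above comes from grepping for "dn: " in each file and adding the results. As a rough cross-check, here is a minimal Python sketch that counts only lines beginning with "dn:", so it also catches base64-encoded "dn:: " lines and ignores "dn: " appearing inside attribute values. The function name count_entries and the sample data are hypothetical.

```python
import io


def count_entries(lines):
    """Count LDIF entries by counting lines that begin with 'dn:'."""
    # Matches both 'dn: ...' and base64-encoded 'dn:: ...' lines.
    return sum(1 for line in lines if line.startswith("dn:"))


# Tiny in-memory stand-in for an LDIF file (illustrative data only).
sample = io.StringIO(
    "dn: dc=ot,dc=hr\n"
    "description: first entry\n"
    "\n"
    "dn: ou=sipDirektor,dc=ot,dc=hr\n"
    "description: second entry\n"
)
print(count_entries(sample))  # 2

# Sum of the two grep counts quoted above.
print(101588 + 9994864)  # 10096452
```

With a real file, an open file object can be passed instead of the StringIO, for example count_entries(open("ot.ldif")).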
[root@l01lnp2 traces]# cat /var/lib/ldap/DB_CONFIG | grep -v "#"
set_cachesize 0 3221225472 1
set_lg_regionmax 262144
set_lg_bsize 2097152
You only have a 3 GB DB cachesize configured here. Expect things to perform suboptimally. It would have been easier to set this with
set_cachesize 3 0 1
which has the same effect, since the first number is the number of gigabytes to allocate.
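To illustrate the equivalence, here is a minimal Python sketch that splits a byte count into the gigabytes-plus-bytes form used by set_cachesize. It assumes a gigabyte here means 2**30 bytes, which is consistent with 3221225472 bytes being exactly 3 GB; the helper name cachesize_directive is hypothetical.

```python
GIB = 1024 ** 3  # assumption: set_cachesize counts a gigabyte as 2**30 bytes


def cachesize_directive(total_bytes, ncache=1):
    """Format a DB_CONFIG set_cachesize line for a cache of total_bytes."""
    # Split into whole gigabytes plus the leftover bytes.
    gbytes, rest = divmod(total_bytes, GIB)
    return f"set_cachesize {gbytes} {rest} {ncache}"


print(cachesize_directive(3221225472))  # set_cachesize 3 0 1
print(cachesize_directive(7 * GIB))     # set_cachesize 7 0 1
```

Writing the value in whole gigabytes makes the intended size visible at a glance and avoids miscounting digits in a long byte value.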
Please find attached slapd.conf
Ok, so the relevant bits from here are:
cachesize 2500000
idlcachesize 7500000
cachefree 1000
Which means you have a cachesize of 2.5 million, an idlcachesize of 7.5 million, and (with OpenLDAP 2.4.16) a dncachesize of 5 million.
I would highly advise you to upgrade to OpenLDAP 2.4.18 and change the slapd.conf settings to:
dncachesize 0 (which means unlimited),
set no cachesize or idlcachesize, and fix your DB_CONFIG. But you also need to buy a substantial amount of RAM for a DB of this size. :P I would advise you to upgrade to at least 32 GB total. Then you can tune the system more optimally.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration