[BUG] OpenLDAP or BDB problem?
hi list,
for several reasons we use BDB as backend for ldap.
the following behaviour is reproducible with the software listed below
(tested on 3 different hardware platforms, all x86, Opterons with 2-4 GB
RAM):
DEBIAN Sarge
- slapd 2.2.23-8
- libdb4.2 4.2.52-18
- db4.2-util 4.2.52-18
SLES9
- db-utils-4.2.52-86.3
- db-4.2.52-86.3
- openldap2-2.2.24-4.5
- openldap2-client-2.2.24-4.8
we have a little "torture" script to simulate traffic for the upcoming
production environment with samba/openldap. the script is written in perl
and simply writes random values to "userPassword" of each entry (~1100).
we started 10-50 simultaneous instances of this script. sometimes it
completes successfully, but very often it crashes openldap and our BDB
database with the following error:
Oct 21 10:39:06 ldapmaster2 slapd[17172]: bdb_modify: retrying...
Oct 21 10:39:06 ldapmaster2 slapd[17172]: bdb(dc=eva,dc=mpg,dc=de):
DB_TXN->abort: Log undo failed for LSN: 3 2173192: DB_NOTFOUND: No
matching key/data pair found
Oct 21 10:39:06 ldapmaster2 slapd[17172]: bdb(dc=eva,dc=mpg,dc=de):
PANIC: DB_NOTFOUND: No matching key/data pair found
Oct 21 10:39:06 ldapmaster2 slapd[17172]: send_ldap_result: conn=16
op=10 p=3
Oct 21 10:39:06 ldapmaster2 slapd[17172]: send_ldap_response: msgid=13
tag=103 err=80
Oct 21 10:39:06 ldapmaster2 slapd[17172]: bdb(dc=eva,dc=mpg,dc=de):
PANIC: fatal region error detected; run recovery
Oct 21 10:39:06 ldapmaster2 slapd[17172]: bdb_cache_entry_db_relock:
entry 552, rw 1, rc -30978
Oct 21 10:39:06 ldapmaster2 slapd[17172]: bdb(dc=eva,dc=mpg,dc=de):
PANIC: fatal region error detected; run recovery
Oct 21 10:39:06 ldapmaster2 slapd[17172]: bdb_modify: txn_commit failed:
DB_RUNRECOVERY: Fatal error, run database recovery (-30978)
Oct 21 10:39:06 ldapmaster2 slapd[17172]: send_ldap_result: conn=17
op=11 p=3
Oct 21 10:39:06 ldapmaster2 slapd[17172]: send_ldap_response: msgid=14
tag=103 err=80
then we have to run db_recover to get the system running again. this is
very annoying because the system is not reliable at all!
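A minimal sketch of the recovery step, assuming slapd has been stopped first (recovery should not run while another process has the environment open). The helper names are hypothetical; `-h` (environment home) and `-v` (verbose) are standard db_recover options. The versioned binary name `db4.2_recover` is an assumption about how Debian's db4.2-util names the tool. The block only prints the command as a dry run.

```python
import shutil

# Candidate tool names. "db_recover" is the upstream Berkeley DB name; the
# versioned "db4.2_recover" spelling is an assumption about Debian packaging.
CANDIDATES = ("db_recover", "db4.2_recover")


def find_recover_tool(candidates=CANDIDATES, which=shutil.which):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def recover_argv(tool, db_home="/var/lib/ldap"):
    """Build the argument list: -h selects the environment home, -v is verbose."""
    return [tool, "-v", "-h", db_home]


if __name__ == "__main__":
    tool = find_recover_tool()
    if tool is None:
        print("no db_recover binary found on PATH")
    else:
        # Dry run: print the command instead of executing it.
        print(" ".join(recover_argv(tool)))
```

To actually run recovery, the printed command could be passed to `subprocess.run(..., check=True)` once slapd is down.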
first we had /var/lib/ldap on the system partition (XFS), after that we
put it on a separate partition (EXT3), but the result stays the same.
we also tested 2 different BDB config files and played around with some
entries (DB_CONFIG, see bottom).
any help would be great!!!
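For anyone who wants to reproduce the load described above, here is a rough sketch of the same idea. It is not the original perl script. It only generates LDIF that replaces "userPassword" with a random value for each entry. The function names and the `ou=People` DN layout are assumptions made for illustration.

```python
import secrets


def random_password(nbytes=12):
    """Return a random URL-safe string to use as the new password value."""
    return secrets.token_urlsafe(nbytes)


def password_change_ldif(dn, new_password):
    """Return one LDIF modify record replacing userPassword on the given DN."""
    return (
        f"dn: {dn}\n"
        "changetype: modify\n"
        "replace: userPassword\n"
        f"userPassword: {new_password}\n"
    )


def torture_ldif(dns):
    """Join one record per DN; records are separated by a blank line."""
    return "\n".join(password_change_ldif(dn, random_password()) for dn in dns)


if __name__ == "__main__":
    # Hypothetical DNs; the real directory layout is not shown in this mail.
    dns = [f"uid=user{i:04d},ou=People,dc=my,dc=company" for i in range(1, 4)]
    print(torture_ldif(dns))
```

The output can be fed to ldapmodify (for example with `-x -D <binddn> -W`), and several such pipelines started in parallel approximate the 10-50 simultaneous writers.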
regards,
micha
# cat /etc/openldap/slapd.conf
##################################################################################
include /etc/openldap/schema/core.schema
include ....
allow bind_v2
pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
loglevel 0
sizelimit -1
concurrency 3
modulepath /usr/lib/openldap/modules
access to attrs=userPassword,sambaNTPassword,sambaLMPassword,sambaPwdLastSet,sambaPwdMustChange
    by dn="uid=sambamanager,dc=my,dc=company" write
    by self write
    by anonymous auth
    by * none
access to dn.regex="cn=[^,]+,dc=my,dc=company"
    by dn="uid=sambamanager,dc=my,dc=company" none
    by * read
access to attrs=objectClass,entry,homeDirectory,uid,uidNumber,gidNumber,memberUid
    by dn="uid=sambamanager,dc=my,dc=company" write
    by * read
access to attrs=description,telephoneNumber,roomNumber,homePhone,loginShell,gecos,cn,sn,givenname
    by dn="uid=sambamanager,dc=my,dc=company" write
    by self write
    by * read
access to attrs=cn,sambaLMPassword,sambaNTPassword,sambaPwdLastSet,sambaLogonTime,sambaLogoffTime,sambaKickoffTime,sambaPwdCanChange,sambaPwdMustChange,sambaAcctFlags,displayName,sambaHomePath,sambaHomeDrive,sambaLogonScript,sambaProfilePath,description,sambaUserWorkstations,sambaPrimaryGroupSID,sambaDomainName,sambaMungedDial,sambaBadPasswordCount,sambaBadPasswordTime,sambaPasswordHistory,sambaLogonHours,sambaSID,sambaSIDList,sambaTrustFlags,sambaGroupType,sambaNextRid,sambaNextGroupRid,sambaNextUserRid,sambaAlgorithmicRidBase,sambaShareName,sambaOptionName,sambaBoolOption,sambaIntegerOption,sambaStringOption,sambaStringListoption
    by dn="uid=sambamanager,dc=my,dc=company" write
    by self read
    by * none
access to dn.base="dc=my,dc=company"
    by dn="uid=sambamanager,dc=my,dc=company" write
    by * none
access to dn="ou=Groups,dc=my,dc=company"
    by dn="uid=sambamanager,dc=my,dc=company" write
    by * none
access to dn="ou=Computers,dc=my,dc=company"
    by dn="uid=sambamanager,dc=my,dc=company" write
    by * none
access to *
    by self read
    by * none
backend bdb
database bdb
checkpoint 1024 5
cachesize 100000
suffix "dc=my,dc=company"
rootdn "uid=cyrus,dc=my,dc=company"
password-hash {SSHA} {CRYPT} {MD5}
rootpw {CRYPT}FFFFFFFFFFFFFFFF
directory /var/lib/ldap
mode 660
index objectClass eq
index sambaSID eq,pres
index uid eq,pres,subany
index sambaPrimaryGroupSID eq,pres
index uidNumber eq,pres
index gidNumber eq,pres
index displayName eq,pres,subany
index cn,sn,givenname,mail,memberuid pres,subany,eq
##################################################################################
# cat /var/lib/ldap/DB_CONFIG
set_cachesize 0 15000000 1
set_lg_bsize 2097152
# cat /var/lib/ldap/DB_CONFIG_DEBIAN
set_cachesize 0 268435456 0
set_lg_regionmax 10048576
set_lg_max 10485760
set_lg_bsize 2097152
set_lg_dir /var/lib/ldap
set_lk_max_objects 1000000
set_lk_max_locks 1000000
set_lk_max_lockers 1000000
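To compare the two DB_CONFIG variants above, a small sketch that parses the file and works out the configured cache size may help. `set_cachesize` takes `gbytes bytes ncache`, so the total is gbytes * 2^30 + bytes. The helper names are hypothetical, and the parser assumes one directive per line with `#` starting a comment line.

```python
def parse_db_config(text):
    """Map each directive name to its list of string arguments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        name, *args = line.split()
        settings[name] = args
    return settings


def cache_bytes(settings):
    """Total cache size in bytes from 'set_cachesize gbytes bytes ncache'."""
    gbytes, nbytes, _ncache = (int(x) for x in settings["set_cachesize"])
    return gbytes * 2**30 + nbytes


if __name__ == "__main__":
    first = parse_db_config("set_cachesize 0 15000000 1\nset_lg_bsize 2097152\n")
    debian = parse_db_config("set_cachesize 0 268435456 0\n")
    for label, cfg in (("DB_CONFIG", first), ("DB_CONFIG_DEBIAN", debian)):
        size = cache_bytes(cfg)
        print(f"{label}: {size} bytes ({size / 2**20:.1f} MiB)")
```

By this arithmetic the first file configures about 14.3 MiB of cache and the Debian variant exactly 256 MiB.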
--
Michael Gasch
Max Planck Institute for Evolutionary Anthropology
Department of Human Evolution (IT)
Deutscher Platz 6
D-04103 Leipzig
Germany
Phone: 49 (0)341 - 3550 137