(ITS#7378) Slapd hangs on bdb write lock
Full_Name: Nikolai Schupbach
Version: 2.4.31
OS: FreeBSD
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (202.78.158.60)
We are experiencing frequent hangs in slapd. Once it has hung we can still
connect, but all searches hang indefinitely until we kill -9 the slapd
process and restart it. The directory is used for mail routing, and we have been
migrating to it from an existing directory server over the last 3 weeks. We
have noted that the busier the directory becomes, the more often it hangs (now
once every 2 days).
We have one master and 10 syncrepl read-only replicas. The master is used
mainly for writes and has not hung yet, but most of the replicas have hung at
least once. The replicas receive anywhere between 50 and 300 searches/sec, while
the master gets only 1/sec. There are 45k entries in the directory.
We are running:
FreeBSD 8.3/9.0 x64
OpenLDAP 2.4.31
Berkeley DB 4.6.21
The old directory we are migrating from has the same load and is also running
OpenLDAP, but has been rock solid for 5 years. It is running Berkeley DB 4.3.29
and OpenLDAP 2.3.27.
We have managed to collect db_stat lock information, which indicates the same
issue each time - a write lock on dn2id.bdb.
Locks grouped by object:
Locker   Mode  Count Status ----------------- Object ---------------
8000a85e READ      1 HELD   0xb26c8 len: 9 data: 60xa800000000000000
8a       READ      1 HELD   id2entry.bdb handle 0
8c       READ      1 HELD   dn2id.bdb handle 0
96       READ      1 HELD   objectClass.bdb handle 0
93       READ      1 HELD   entryCSN.bdb handle 0
90       READ      1 HELD   entryUUID.bdb handle 0
8000a85f WRITE     4 HELD   dn2id.bdb page 219
80000782 READ      1 HELD   dn2id.bdb page 768
80000a45 READ      1 HELD   dn2id.bdb page 768
80000b9e READ      1 HELD   dn2id.bdb page 768
800006a0 READ      1 HELD   dn2id.bdb page 768
80000771 READ      1 HELD   dn2id.bdb page 768
80000534 READ      1 HELD   dn2id.bdb page 768
80000a44 READ      1 HELD   dn2id.bdb page 768
80000641 READ      1 HELD   dn2id.bdb page 768
80001049 READ      1 HELD   dn2id.bdb page 768
8000104a READ      1 HELD   dn2id.bdb page 768
80001048 READ      1 HELD   dn2id.bdb page 768
80000783 READ      1 HELD   dn2id.bdb page 768
80000535 READ      1 HELD   dn2id.bdb page 768
8000066e READ      1 HELD   dn2id.bdb page 768
80000697 READ      1 HELD   dn2id.bdb page 768
8000a85f READ      1 HELD   dn2id.bdb page 768
8000a85e READ      1 HELD   0xb19a8 len: 9 data: 40xa800000000000000
8000a85f READ      1 HELD   dn2id.bdb page 933
8000a85f WRITE     2 HELD   dn2id.bdb page 933
80001047 WRITE     1 HELD   dn2id.bdb page 559
80000782 READ      1 WAIT   dn2id.bdb page 559
80000a45 READ      1 WAIT   dn2id.bdb page 559
80000b9e READ      1 WAIT   dn2id.bdb page 559
800006a0 READ      1 WAIT   dn2id.bdb page 559
80000771 READ      1 WAIT   dn2id.bdb page 559
80000534 READ      1 WAIT   dn2id.bdb page 559
80000a44 READ      1 WAIT   dn2id.bdb page 559
80000641 READ      1 WAIT   dn2id.bdb page 559
80001049 READ      1 WAIT   dn2id.bdb page 559
8000104a READ      1 WAIT   dn2id.bdb page 559
80001048 READ      1 WAIT   dn2id.bdb page 559
80000783 READ      1 WAIT   dn2id.bdb page 559
80000535 READ      1 WAIT   dn2id.bdb page 559
8000066e READ      1 WAIT   dn2id.bdb page 559
80000697 READ      1 WAIT   dn2id.bdb page 559
8000a85f READ      1 WAIT   dn2id.bdb page 559
8000a85f READ      2 HELD   dn2id.bdb page 1362
8000a85f WRITE     2 HELD   dn2id.bdb page 1362
8000a85f READ      2 HELD   dn2id.bdb page 1353
8000a85f WRITE     2 HELD   dn2id.bdb page 1353
b6       READ      1 HELD   uid.bdb handle 0
a5       READ      1 HELD   mail.bdb handle 0
af       READ      1 HELD   mailLocalAddress.bdb handle 0
9b       READ      1 HELD   miLoginid.bdb handle 0
aa       READ      1 HELD   mailHost.bdb handle 0
bb       READ      1 HELD   miDomainName.bdb handle 0
c0       READ      1 HELD   mpMailHost.bdb handle 0
a0       READ      1 HELD   mpMailUserType.bdb handle 0
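For anyone repeating this analysis, the listing above can be summarised mechanically. In it, locker 80001047 holds the WRITE lock on dn2id.bdb page 559 while 16 READ requests wait on the same page. The following is a minimal Python sketch, assuming rows shaped like the "Locks grouped by object" output shown here; the function name `find_blocked` and the regular expression are illustrative and not part of any Berkeley DB tool.

```python
import re
from collections import defaultdict

# One row of the listing: locker id (hex), mode, count, status, object.
# The pattern is an assumption based on the rows shown in this report.
ROW = re.compile(
    r"^\s*([0-9a-f]+)\s+(READ|WRITE)\s+(\d+)\s+(HELD|WAIT)\s+(.+?)\s*$"
)

def find_blocked(lines):
    """Return {object: (holders, waiters)} for objects with at least one waiter."""
    holders = defaultdict(list)
    waiters = defaultdict(list)
    for line in lines:
        m = ROW.match(line)
        if not m:
            continue  # header line or anything that is not a lock row
        locker, mode, _count, status, obj = m.groups()
        target = holders if status == "HELD" else waiters
        target[obj].append((locker, mode))
    return {obj: (holders[obj], waiters[obj]) for obj in waiters}

# Shortened excerpt of the listing above, used only as sample input.
sample = """\
8000a85f WRITE 4 HELD dn2id.bdb page 219
80001047 WRITE 1 HELD dn2id.bdb page 559
80000782 READ 1 WAIT dn2id.bdb page 559
8000a85f READ 1 WAIT dn2id.bdb page 559
""".splitlines()

for obj, (held, waiting) in find_blocked(sample).items():
    print(obj, "held by", held, "with", len(waiting), "waiter(s)")
```

Running it over the full db_stat file would list every object that has waiters together with the lockers holding it, which is the quickest way to see which locker the blocked searches are queued behind.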
We have also collected backtraces for all the threads, which I have uploaded
to:
ftp://ftp.openldap.org/incoming/nikolai-gdb-120902.txt
The full db_stat output is located at:
ftp://ftp.openldap.org/incoming/nikolai-dbstat-120902.txt
Our DB_CONFIG:
# One 512MB cache
set_cachesize 0 536870912 1
# Transaction Log settings
set_lg_regionmax 1048576
set_lg_max 10485760
set_lg_bsize 2097152
set_flags DB_LOG_AUTOREMOVE
# Increase lock maximums
set_lk_max_locks 2000
set_lk_max_lockers 2000
set_lk_max_objects 2000
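As a sanity check on the values above, the directives can be read back and converted to friendlier units. This is a minimal Python sketch, assuming the usual `set_cachesize <gbytes> <bytes> <ncache>` argument order; the helper names `parse_db_config` and `cache_bytes` are hypothetical.

```python
def parse_db_config(text):
    """Collect DB_CONFIG directives as {name: [args]}, skipping comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, *args = line.split()
        settings[name] = args
    return settings

def cache_bytes(settings):
    """Total cache size in bytes from set_cachesize <gbytes> <bytes> <ncache>."""
    gbytes, nbytes, _ncache = (int(x) for x in settings["set_cachesize"])
    return gbytes * 1024**3 + nbytes

db_config = """\
# One 512MB cache
set_cachesize 0 536870912 1
set_lg_max 10485760
set_lg_bsize 2097152
set_lk_max_locks 2000
"""

settings = parse_db_config(db_config)
print("cache MiB:", cache_bytes(settings) // 2**20)
print("log file MiB:", int(settings["set_lg_max"][0]) // 2**20)
print("lock limit:", int(settings["set_lk_max_locks"][0]))
```

The lock limits read this way can then be compared against the maximum counts reported in the full db_stat output.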
Our slapd.conf on our replicas:
# Load the following schema files
include /usr/local/etc/openldap/schema/core.schema
include /usr/local/etc/openldap/schema/cosine.schema
include /usr/local/etc/openldap/schema/nis.schema
include /usr/local/etc/openldap/schema/inetorgperson.schema
include /usr/local/etc/openldap/schema/misc.schema
include /usr/local/etc/openldap/schema/mirapoint.schema
include /usr/local/etc/openldap/schema/smp.schema
# Runtime settings for slapd
pidfile /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
loglevel none
# TLS security options for slapd.
TLSCipherSuite HIGH
TLSCACertificateFile /usr/local/etc/openldap/tls/ca-cert.pem
TLSCertificateFile /usr/local/etc/openldap/tls/server-cert.pem
TLSCertificateKeyFile /usr/local/etc/openldap/tls/server-key.pem
# This option configures one or more hashes to be used in generation
# of user passwords stored in the userPassword attribute during
# processing of LDAP Password Modify Extended Operations (RFC 3062).
password-hash {SSHA}
# Load dynamic backend modules:
modulepath /usr/local/libexec/openldap
moduleload back_bdb
moduleload back_monitor
# Do not limit size or time of requests.
sizelimit unlimited
timelimit unlimited
# Require authentication prior to directory operations
require authc
###############################################################################
# BDB Database Definitions
#
# The following configuration directives relate to bdb database definitions
###############################################################################
# The remaining configuration directives relate to bdb database definitions
database bdb
suffix "o=top"
rootdn "cn=root,o=top"
# Cleartext passwords, especially for the rootdn, should
# be avoided. See slappasswd(8) and slapd.conf(5) for details.
rootpw {SSHA}**********
# The database directory must exist prior to running slapd and
# should only be accessible by the slapd and slap tools.
directory /var/db/openldap-data
# Indices to maintain
index cn eq,sub,pres
index entryUUID eq
index entryCSN eq
index mail eq,sub,pres
index mailHost eq
index mailLocalAddress eq,sub,pres
index miDomainName eq,sub
index miLoginId eq,pres
index mpMailHost eq
index mpMailUserType eq
index mpSystemRole eq
index objectClass eq,pres
index uid eq,pres
# Specify the number of entries which should be held in memory
cachesize 200000
# Set transactional checkpoint
checkpoint 512 60
###############################################################################
# LDAP Sync Replication
#
# A unique replica id number is required for each replication client
###############################################################################
# LDAP sync replication settings
syncrepl rid=36
        provider=ldaps://ldapmaster/
        type=refreshAndPersist
        retry=30,+
        searchbase="o=top"
        filter="(objectClass=*)"
        scope=sub
        attrs="*"
        sizelimit=unlimited
        timelimit=unlimited
        schemachecking=off
        bindmethod=simple
        binddn="cn=replica,ou=users,ou=directory,o=top"
        credentials=**********
# Where to refer ldap updates to
updateref ldaps://ldapmaster/
###############################################################################
# LDAP Statistics
#
# The OpenLDAP server can be configured to provide real time performance
# statistics through the monitor branch.
###############################################################################
# Enable the statistics monitoring database
database monitor
# Allow access to monitoring user only
access to dn.subtree="cn=monitor"
        by dn.exact="cn=monitor,ou=users,ou=directory,o=top" read
        by * none
Sincerely,
Nikolai Schupbach