MMR (delta-syncrepl): CPU at 100% after replication
- To: openldap-technical@openldap.org
- Subject: MMR (delta-syncrepl): CPU at 100% after replication
- From: Liam Gretton <liam.gretton@leicester.ac.uk>
- Date: Mon, 11 May 2015 10:34:59 +0100
- Organization: IT Services, University Of Leicester
- User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
I'm building a new setup with the latest OpenLDAP built from source, using mdb and MMR delta-syncrepl over TLS. I'm using very recent sources.
I have two hosts, and I'm finding that once the secondary host has synchronised with the first (this takes about 10 minutes for around 40000 entries), slapd on each of the peers remains at close to 100% CPU. Replication is working, though.
The sync log at this point on the first system in the set (where the original data was slapadded) shows the following entry endlessly:
554fbe2c do_syncrep2: rid=002 CSN too old, ignoring 20131221210532.737643Z#000000#001#000000 (reqStart=20150509214300.000163Z,cn=log)
contextCSN on both systems looks good. ldap1 is serverID 1, rid 1; ldap2 is serverID 2, rid 2. I guess the SID 0 comes from the original data that was imported into ldap1.
# ldap1search -s base contextCSN
dn: dc=example,dc=com
contextCSN: 20150511090001.208713Z#000000#000#000000
contextCSN: 20150511091334.137305Z#000000#001#000000
# ldap2search -s base contextCSN
dn: dc=example,dc=com
contextCSN: 20150511090001.208713Z#000000#000#000000
contextCSN: 20150511091334.137305Z#000000#001#000000
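For readers following along, here is a minimal Python sketch that splits the CSN values quoted above into their fields (timestamp, change count, SID, modification number) and checks the ignored CSN from the log line against them. The helper names (`parse_csn`, `newest_per_sid`, `is_too_old`) are made up for illustration, and the "too old" rule used here (an incoming CSN is ignored when it is not newer than the CSN already held for the same SID) is an assumption about the consumer's behaviour, not a copy of slapd's code.

```python
from typing import Dict, Iterable, NamedTuple


class CSN(NamedTuple):
    timestamp: str  # e.g. "20150511091334.137305Z"; fixed width, so it sorts as text
    count: int      # change counter (hex in the textual form)
    sid: int        # server ID (hex in the textual form)
    mod: int        # modification number (hex in the textual form)


def parse_csn(text: str) -> CSN:
    """Split a CSN such as 'time#count#sid#mod' into its four fields."""
    timestamp, count, sid, mod = text.strip().split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def newest_per_sid(csns: Iterable[str]) -> Dict[int, CSN]:
    """Keep the newest CSN seen for each server ID."""
    newest: Dict[int, CSN] = {}
    for raw in csns:
        csn = parse_csn(raw)
        if csn.sid not in newest or csn > newest[csn.sid]:
            newest[csn.sid] = csn
    return newest


def is_too_old(incoming: str, newest: Dict[int, CSN]) -> bool:
    """Assumed rule: ignore a CSN that is not newer than the one held for its SID."""
    csn = parse_csn(incoming)
    held = newest.get(csn.sid)
    return held is not None and csn <= held


# contextCSN values reported by both ldap1 and ldap2 in this message
context_csns = [
    "20150511090001.208713Z#000000#000#000000",
    "20150511091334.137305Z#000000#001#000000",
]
# CSN from the "CSN too old, ignoring" log line
ignored = "20131221210532.737643Z#000000#001#000000"

newest = newest_per_sid(context_csns)
print(sorted(newest))               # SIDs present in contextCSN: [0, 1]
print(is_too_old(ignored, newest))  # True: 2013 is older than the 2015 CSN for SID 1
```

Under that assumption, the ignored CSN carries SID 1 and a 2013 timestamp, so it is older than the SID 1 contextCSN held by both servers. The same parse also shows that no contextCSN value with SID 2 is present yet.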
On ldap2 the stats log shows very many corresponding searches of the log DB:
5550763e conn=1000 op=11653 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"
5550763e conn=1000 op=11653 SRCH attr=* +
5550763e conn=1000 op=11653 SEARCH RESULT tag=101 err=0 nentries=0 text=
5550763e conn=1000 op=11654 SRCH base="cn=log" scope=2 deref=0 filter="(&(objectClass=auditWriteObject)(reqResult=0))"
5550763e conn=1000 op=11654 SRCH attr=reqDN reqType reqMod reqNewRDN reqDeleteOldRDN reqNewSuperior entryCSN
5550763e conn=1000 op=11655 ABANDON msg=11655
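To put a number on "very many", a stats log in the format quoted above can be tallied per search base. The following is a small Python sketch; the function name `count_searches_by_base` is hypothetical, and the sample lines are just the excerpt from this message, not a full log.

```python
import re
from collections import Counter

# Matches the first line of a search operation in the stats log format shown above
SRCH_BASE = re.compile(r'conn=(\d+) op=(\d+) SRCH base="([^"]*)"')


def count_searches_by_base(lines):
    """Count SRCH operations per search base; other log lines are skipped."""
    counts = Counter()
    for line in lines:
        match = SRCH_BASE.search(line)
        if match:
            counts[match.group(3)] += 1
    return counts


sample = [
    '5550763e conn=1000 op=11653 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"',
    '5550763e conn=1000 op=11653 SRCH attr=* +',
    '5550763e conn=1000 op=11653 SEARCH RESULT tag=101 err=0 nentries=0 text=',
    '5550763e conn=1000 op=11654 SRCH base="cn=log" scope=2 deref=0 filter="(&(objectClass=auditWriteObject)(reqResult=0))"',
    '5550763e conn=1000 op=11654 SRCH attr=reqDN reqType reqMod reqNewRDN reqDeleteOldRDN reqNewSuperior entryCSN',
    '5550763e conn=1000 op=11655 ABANDON msg=11655',
]

counts = count_searches_by_base(sample)
for base, n in counts.most_common():
    print(base, n)
```

Run against a real log file (for example by passing an open file object instead of `sample`), this gives a rough rate of searches against `cn=log` versus the main suffix.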
Both systems have the host name specified in the -h option to slapd. Clocks are synchronised, DNS is working etc.
I can't get to the bottom of this at all. No doubt I've made an error in my MMR config. Does anyone have a clue as to why this could be happening? I'd be very grateful for any ideas.
Here's (most of) the slapd.conf file, which is identical on both. I must admit I'm not sure whether the serverID settings are global or per-database. Moving them into the mdb section doesn't change the behaviour though.
# Server IDs for replication
serverID 1 ldap://ldap1
serverID 2 ldap://ldap2
#############################################################
#
# Access log database configuration
#
# This is also used for delta-syncrepl replication
#
# See slapd-accesslog(5) for details
#
#############################################################
database mdb
maxsize 209715200
suffix cn=log
directory /db/ldap/accesslog
rootdn cn=log
rootpw secret
index entryCSN eq
index objectClass eq
index reqEnd eq
index reqResult eq
index reqStart eq
overlay syncprov
syncprov-nopresent TRUE
syncprov-reloadhint TRUE
limits dn.exact="cn=replication,ou=special users,dc=example,dc=com"
    time.soft=unlimited
    time.hard=unlimited
    size.soft=unlimited
    size.hard=unlimited
# Replication user can read (not write) everything
access to *
    by dn.exact="cn=replication,ou=special users,dc=example,dc=com" read
    by * none break
#############################################################
#
# Database configuration
#
# see slapd-mdb(5) for details
#
#############################################################
database mdb
monitoring on
suffix dc=example,dc=com
directory /db/ldap/example
rootdn "cn=administrator,ou=special users,dc=example,dc=com"
maxsize 209715200
# Default password hashing scheme
password-hash {SSHA}
# memberOf overlay provides reverse-lookups of group membership
overlay memberof
# sssvlv overlay provides server-side sorting
# Used mainly to allow easy sorting of uidNumber/gidNumber values
overlay sssvlv
sssvlv-max 4
sssvlv-maxkeys 5
sssvlv-maxperconn 4
# unique overlay provides attribute uniqueness
# We use this to enforce unique uidNumber/gidNumber values
overlay unique
unique_uri ldap:///ou=people,dc=example,dc=com?uidNumber?one?
unique_uri ldap:///ou=group,dc=example,dc=com?gidNumber?one?
### CONSUMER configuration
syncrepl
    rid=1
    provider=ldap://ldap1
    type=refreshAndPersist
    bindmethod=simple
    binddn="cn=replication,ou=special users,dc=example,dc=com"
    credentials=password
    syncdata=accesslog
    interval=00:00:00:10
    retry="20 10 60 10 120 +"
    timeout=1
    logbase="cn=log"
    searchbase="dc=example,dc=com"
    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
    sizelimit=unlimited
    timelimit=unlimited
    schemachecking=on
    starttls=yes
syncrepl
    rid=2
    provider=ldap://ldap2
    type=refreshAndPersist
    bindmethod=simple
    binddn="cn=replication,ou=special users,dc=example,dc=com"
    credentials=password
    syncdata=accesslog
    interval=00:00:00:10
    retry="20 10 60 10 120 +"
    timeout=2
    logbase="cn=log"
    searchbase="dc=example,dc=com"
    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
    sizelimit=unlimited
    timelimit=unlimited
    schemachecking=on
    starttls=yes
### PROVIDER configuration
overlay syncprov
syncprov-checkpoint 5 5
syncprov-sessionlog 50
mirrormode on
# Access log - used for delta-syncrepl too
overlay accesslog
logdb cn=log
logops writes
logold (objectClass=*)
logsuccess TRUE
logpurge 28+00:00 1+00:00
# Allow unlimited access for replication user
limits
    dn.exact="cn=replication,ou=special users,dc=example,dc=com"
    size=unlimited
    time=unlimited
--
Liam Gretton liam.gretton@le.ac.uk
Systems Specialist http://www.le.ac.uk/its/
IT Services Tel: +44 (0)116 2522254
University Of Leicester, University Road
Leicestershire LE1 7RH, United Kingdom