Requesting advice for repairing a syncRepl issue
Hello,
I have a pair of OpenLDAP servers that had been replicating flawlessly
with delta syncRepl for about 10 months. Just the other day, I saw
that modifications were no longer being replicated and these messages
were appearing in the syslog on the master server immediately after
the MOD line:
[ID 651871 local0.debug] => bdb_idl_insert_key: c_get next_dup failed: DB_NOTFOUND: No matching key/data pair found (-30990)
[ID 809268 local0.debug] => bdb_dn2id_add: parent (cn=log) insert failed: -30990
I assume that something has become corrupted in the BDB database for
cn=log on the master. Does that seem correct? I'm definitely not
seeing any new entries in the cn=log database since those messages
began appearing.
If it is a corrupted index, I think that running "slapindex -b cn=log
-f .... " after stopping the slapd process will fix that. After that
completes, I should be able to restart the slapd and test that writes
to entries under the baseDN do cause new entries to appear in the
cn=log database.
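To make the plan above concrete, here is a rough sketch of the steps, assuming the problem really is an attribute index; I have not confirmed that slapindex also rebuilds the dn2id data that the error message mentions. SLAPD_CONF and BIND_DN are placeholders (the real config path is omitted above), the run helper and DRY_RUN switch are only there for illustration, and stopping and starting slapd is site-specific, so it appears only as comments.

```shell
#!/bin/sh
# Sketch only. SLAPD_CONF and BIND_DN are placeholders, not real values.
SLAPD_CONF="${SLAPD_CONF:-/path/to/slapd.conf}"
BIND_DN="${BIND_DN:-cn=placeholder}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reindex_logdb() {
    # Precondition: slapd has been stopped (stop command not shown).
    # Regenerate the configured attribute indices of the cn=log database.
    run slapindex -b cn=log -f "$SLAPD_CONF"
    # Afterwards: restart slapd (start command not shown).
}

check_logdb() {
    # After a test modification under the baseDN, list the logged
    # modify operations directly below cn=log.
    run ldapsearch -x -D "$BIND_DN" -W -b cn=log -s one \
        '(reqType=modify)' reqStart reqType reqDN
}
```

With DRY_RUN left at 1, calling reindex_logdb or check_logdb only prints the command that would be run, which makes it easy to review before doing anything to the real database.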
If it's not an index, I don't know how to repair this. I found the
error message in the sources (servers/slapd/back-bdb/idl.c:789 in
version 2.3.30), but honestly, I can't tell what that code is doing.
Once (if) I can repair things, I can begin worrying about getting
changes to the replica again. Since there are changes missing from
the cn=log database on the master, I assume that I'll need to cause a
complete re-sync. Is there a better way to accomplish that than
removing the entire database on the replica, using slapadd to import a
recent backup of the master, and restarting the replica?
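For reference, the full re-sync I describe above would look roughly like the sketch below. SLAPD_CONF and LDIF are placeholder paths, the run helper and DRY_RUN switch are only for illustration, and removing the old database files and stopping or starting slapd on the replica are left as comments because they are site-specific.

```shell
#!/bin/sh
# Sketch only. SLAPD_CONF and LDIF are placeholder paths.
SLAPD_CONF="${SLAPD_CONF:-/path/to/slapd.conf}"
LDIF="${LDIF:-/path/to/master-backup.ldif}"
SUFFIX="dc=our,dc=domain"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

export_master() {
    # On the master: dump the main database to an LDIF file.
    run slapcat -b "$SUFFIX" -f "$SLAPD_CONF" -l "$LDIF"
}

load_replica() {
    # On the replica. Precondition: slapd is stopped and the old
    # database files have been moved out of the database directory.
    run slapadd -b "$SUFFIX" -f "$SLAPD_CONF" -l "$LDIF"
    # Afterwards: restart slapd on the replica.
}
```

export_master is meant to run on the master and load_replica on the replica after the LDIF file has been copied over; with DRY_RUN left at 1 both only print their command.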
Some specifics in case they matter:
Master:
Solaris10 amd64
BDB 4.2.52 + 5 patches
OpenLDAP 2.3.30
Replica:
Solaris10 amd64
BDB 4.2.52 + 5 patches
OpenLDAP 2.3.38 (upgraded from 2.3.33 the day before the problem began
on the Master)
The portions of the slapd.conf file from the Master that I believe to
be relevant (slightly obfuscated) are included at the end of this
message.
Thank you for any help,
-Ben
# access log database (used by syncprov-delta replication)
database bdb
suffix "cn=log"
directory /var/openldap/data/prod/logdb
rootdn "cn=Manager,dc=our,dc=domain"
mode 0660
shm_key 142
index default eq
index objectClass,entryUUID,entryCSN eq
index reqStart,reqEnd,reqResult,reqType eq
access to dn.subtree="cn=log"
    by group.exact="cn=DirectoryAdmins,cn=administrators,dc=our,dc=domain" write
    by dn.onelevel="cn=SyncUsers,cn=administrators,dc=our,dc=domain" read
    by * none
overlay syncprov
syncprov-nopresent TRUE
syncprov-reloadhint TRUE
# This is all one line
limits dn.onelevel="cn=SyncUsers,cn=administrators,dc=our,dc=domain" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited
database hdb
suffix "dc=our,dc=domain"
rootdn "cn=manager,dc=our,dc=domain"
rootpw {SHA}[XXX REMOVED XXX]
directory /var/openldap/data/prod/db
checkpoint 100000 30
mode 0660
shm_key 42
cachesize 500000
idlcacheSize 1500000
index default pres,eq
index givenName,description,uid,cn,sn pres,eq,sub
index objectClass,uniqueMember,member eq
index employeeNumber eq,sub
index entryCSN,entryUUID eq
overlay ppolicy
ppolicy_default cn=standard,cn=policies,dc=our,dc=domain
overlay dynlist
dynlist-attrset groupOfURLs memberURL member
overlay syncprov
syncprov-checkpoint 100000 30
syncprov-sessionlog 300000
overlay accesslog
logdb cn=log
logops writes
logsuccess TRUE
logold (objectClass=inetOrgPerson)
logpurge 28+00:00 01+00:00
# This is all one line
limits dn.onelevel="cn=SyncUsers,cn=administrators,dc=our,dc=domain" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited