[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
Re: (ITS#3421) Replication problem with slurpd
Finally I can reproduce the bug (not always, but often). I am not sure
it is exactly the same bug, but it looks like.
What I have :
1 master (IP : 10.69.247.38) and 2 slaves (10.69.247.57 and
10.69.247.39). All are 2.2.19. I can join my slapd.conf files and my
DB_CONFIG if you want.
What I do :
10 threads add 50000 entries to the master :
dn: uid=userX-Y,o=example,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
sn: userX-Y
cn: userX-Y
uid: userX-Y
with X from 1 to 10 and Y from 1 to 50000
So I should have 50000 entries in the directories + 2 entries :
dn: o=example,c=fr
objectClass: top
objectClass: organization
o: example
dn: cn=repl,o=example,c=fr
cn: repl
uid: repl
sn: repl
objectClass: top
objectClass: person
objectClass: inetOrgPerson
userPassword: repl
But when I do an ldapsearch on the directories, I get :
master :
[...]
# numResponses: 500003
# numEntries: 500002
slave1 :
[...]
# numResponses: 500001
# numEntries: 500000
slave2 :
[...]
# numResponses: 500001
# numEntries: 500000
So it seems that 2 entries are missing. Looking at the .rej files, I have :
ERROR: Invalid syntax: objectClass: value #0 invalid per syntax
replica: 10.69.247.57:10389
time: 1102681517.6
dn: uid=user3-21130,o=example,c=fr
changetype: add
objectClass: top
objectClass: person
objectClass: o
(same thing for the second .rej file)
So it looks like that I have 2 entries missing, but only one in .rej file.
Moreother, this entry is incomplete. Of course it is complete in the
master :
ldapsearch -x -H ldap://10.69.247.38:10389 -b
uid=user3-21130,o=example,c=fr
# user3-21130, example, fr
dn: uid=user3-21130,o=example,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
sn: user3-21130
cn: user3-21130
uid: user3-21130
# search result
search: 2
result: 0 Success
Finally when I look at the other entry missing, it is :
uid=user1-19854,o=example,c=fr
As you suggested, I enabled logs for the master and for one slave (level
256). To enable slurpd logs, I needed to edit servers/slurpd/globals.c
and modify the line :
int ldap_syslog = 0;
into :
int ldap_syslog = 1;
If you want I can attach my logs, but it is a 190M file (logs are
centralized on a single server)...
I think these results are pertinent :
#grep slurpd ldap.log
Dec 10 12:37:09 10.69.247.38 slurpd[7881]: Replica 10.69.247.57:10389,
skip repl record for uid=user1-25962,o=example,c=fr (old)
Dec 10 12:37:09 10.69.247.38 slurpd[7882]: Replica 10.69.247.39:10389,
skip repl record for uid=user1-25962,o=example,c=fr (old)
Dec 10 13:25:17 10.69.247.38 slurpd[7882]: Error: ldap_add_s failed
adding "objectClass: value #0 invalid per syntax":
uid=user3-21130,o=example,c=fr
Dec 10 13:25:17 10.69.247.38 slurpd[7882]: Error: ldap operation failed,
data written to
"/ldap/test-bug-replication/var/openldap-slurp/replica/10.69.247.39:10389.rej"
Dec 10 13:25:17 10.69.247.38 slurpd[7881]: Error: ldap_add_s failed
adding "objectClass: value #0 invalid per syntax":
uid=user3-21130,o=example,c=fr
Dec 10 13:25:17 10.69.247.38 slurpd[7881]: Error: ldap operation failed,
data written to
"/ldap/test-bug-replication/var/openldap-slurp/replica/10.69.247.57:10389.rej"
#grep uid=user3-21130 ldap.log
Dec 10 13:25:16 10.69.247.38 slapd[8047]: conn=2 op=21130 ADD
dn="uid=user3-21130,o=example,c=fr"
Dec 10 13:25:17 10.69.247.38 slurpd[7882]: Error: ldap_add_s failed
adding "objectClass: value #0 invalid per syntax":
uid=user3-21130,o=example,c=fr
Dec 10 13:25:17 10.69.247.57 slapd[29521]: conn=0 op=224452 ADD
dn="uid=user3-21130,o=example,c=fr"
Dec 10 13:25:17 10.69.247.38 slurpd[7881]: Error: ldap_add_s failed
adding "objectClass: value #0 invalid per syntax":
uid=user3-21130,o=example,c=fr
#grep 'slapd\[8047\]' ldap.log | grep -A3 uid=user3-21130
Dec 10 13:25:16 10.69.247.38 slapd[8047]: conn=2 op=21130 ADD
dn="uid=user3-21130,o=example,c=fr"
Dec 10 13:25:17 10.69.247.38 slapd[8047]: conn=2 op=21130 RESULT tag=105
err=0 text=
Dec 10 13:25:17 10.69.247.38 slapd[8047]: conn=6 op=24360 ADD
dn="uid=user8-24360,o=example,c=fr"
Dec 10 13:25:17 10.69.247.38 slapd[8047]: conn=6 op=24360 RESULT tag=105
err=0 text=
#grep 'slapd\[29521\]' ldap.log | grep -A3 uid=user3-21130
Dec 10 13:25:17 10.69.247.57 slapd[29521]: conn=0 op=224452 ADD
dn="uid=user3-21130,o=example,c=fr"
Dec 10 13:25:17 10.69.247.57 slapd[29521]: conn=0 op=224452 RESULT
tag=105 err=21 text=objectClass: value #0 invalid per syntax
Dec 10 13:25:20 10.69.247.57 slapd[29521]: conn=0 op=224454 ADD
dn="uid=user4-25121,o=example,c=fr"
Dec 10 13:25:20 10.69.247.57 slapd[29521]: conn=0 op=224454 RESULT
tag=105 err=0 text=
#grep user1-19854 ldap.log
Dec 10 13:24:34 10.69.247.38 slapd[8042]: conn=9 op=19854 ADD
dn="uid=user1-19854,o=example,c=fr"
#grep 'slapd\[8042\]' ldap.log | grep -A3 uid=user1-19854
Dec 10 13:24:34 10.69.247.38 slapd[8042]: conn=9 op=19854 ADD
dn="uid=user1-19854,o=example,c=fr"
Dec 10 13:24:34 10.69.247.38 slapd[8042]: conn=9 op=19854 RESULT tag=105
err=0 text=
Dec 10 13:24:34 10.69.247.38 slapd[8042]: conn=4 op=25237 ADD
dn="uid=user6-25237,o=example,c=fr"
Dec 10 13:24:34 10.69.247.38 slapd[8042]: conn=4 op=25237 RESULT tag=105
err=0 text=
So what I think :
- the problem is on the master.
- slurpd or slapd have a problem writing replog or slurpd.replog
- maybe it is a lock problem or a flush problem
Hope it helps...
Raphael Ouazana