Re: OpenLDAP syncrepl woes
On Thu, Nov 17, 2011 at 5:50 PM, Howard Chu <hyc@symas.com> wrote:
>
> Jeffrey Crawford wrote:
>>
>> On Wed, Nov 16, 2011 at 1:27 PM, Howard Chu <hyc@symas.com> wrote:
>> Jeffrey Crawford wrote:
>> On Wed, Nov 16, 2011 at 7:40 AM, Jeffrey Crawford <jeffreyc@ucsc.edu> wrote:
>> On Wed, Nov 16, 2011 at 12:09 AM, Howard Chu <hyc@symas.com> wrote:
>> Jeffrey Crawford wrote:
>> I'm trying to stabilize our openldap server farm before
>> going live and
>> am finding that despite the contextCSN matching between
>> providers and
>> replicas, the actual content of the server is getting out
>> of sync.
>> This is most prominent when we are testing our population
>> routine and
>> we need to remove all accounts before starting. Right now
>> it's only
>> about 22000 entries (it will get much larger).
>
>> During the mass delete we got the following sprinkled
>> throughout the
>> logs on all machines:
>> ====
>> Nov 15 15:47:16 idm-prod-ldap-2 slapd[33070]:
>> bdb(dc=domain,dc=name):
>> previous transaction deadlock return not resolved
>>
>>
>> Wow. I've never seen this error message before. What version
>> of OpenLDAP and
>> BerkeleyDB are you using?
>>
>>
>> FreeBSD 8.2 with OpenLDAP 2.4.26; however, as I mentioned before,
>> I think we are squeezing RAM right now. Part of this
>> deployment was to discover how much RAM we needed on the virtual
>> machine, and it was started pretty low.
>>
>>
>> Oh and we are using bdb 4.6 right now (forgot to answer that)
>>
>>
>> Running out of memory would cause an obvious error message ("no memory")
>> so that's not likely to be the problem here. Might be worth upgrading to
>> at least BDB 4.8, but again, never having seen BDB spit out that error
>> before, that's just a guess.
>
>> Not sure if this is significant, but I've been noticing that this error only
>> shows up on deletes. However, it also shows up on deletes on the machine I'm
>> running the ldapdelete against, so perhaps this is more of a software issue.
>> I'll go ahead and run this with more RAM, and I'll check with the sysadmin whether
>> they can compile it against BDB 4.8 to see if that changes anything. But I
>> don't think ITS#7052 applies here, because the machine I'm doing this against
>> does not use syncrepl; it's the provider to others.
>>
>> This is a machine on a VM. Are there any known issues with that?
>
> Way back in the dawn of time, there were some VMware implementations that didn't support mutexes correctly. I don't think that's been an issue for many years. There ought to be other error messages in your log, immediately preceding the one you quoted. Post those too.
>
> --
> -- Howard Chu
> CTO, Symas Corp. http://www.symas.com
> Director, Highland Sun http://highlandsun.com/hyc/
> Chief Architect, OpenLDAP http://www.openldap.org/project/
There really isn't much there, but here is an example with what little
surrounds it (I've modified only the usernames):
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10706 DEL
dn="uid=user1,ou=people,dc=ucsc,dc=edu"
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10706 RESULT
tag=107 err=0 text=
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10707 DEL
dn="uid=user2,ou=people,dc=ucsc,dc=edu"
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10707 RESULT
tag=107 err=0 text=
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10708 DEL
dn="uid=user3,ou=people,dc=ucsc,dc=edu"
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10708 RESULT
tag=107 err=0 text=
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10709 DEL
dn="uid=user4,ou=people,dc=ucsc,dc=edu"
Nov 17 21:11:55 localhost slapd[1912]: bdb(dc=ucsc,dc=edu): previous
transaction deadlock return not resolved
Nov 17 21:11:55 localhost slapd[1912]: => bdb_idl_delete_key: cursor
failed: Invalid argument (22)
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10709 RESULT
tag=107 err=80 text=entry index delete failed
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10710 DEL
dn="uid=user5,ou=people,dc=ucsc,dc=edu"
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10710 RESULT
tag=107 err=0 text=
Nov 17 21:11:55 localhost slapd[1912]: conn=1478 op=10711 DEL
dn="uid=user6,ou=people,dc=ucsc,dc=edu"
Nov 17 21:11:56 localhost slapd[1912]: conn=1478 op=10711 RESULT
tag=107 err=0 text=
Nov 17 21:11:56 localhost slapd[1912]: conn=1478 op=10712 DEL
dn="uid=user7,ou=people,dc=ucsc,dc=edu"
Nov 17 21:11:56 localhost slapd[1912]: conn=1478 op=10712 RESULT
tag=107 err=0 text=
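
For picking the failed deletes out of a longer log, something like the following might help. It is only a rough sketch, not an OpenLDAP tool: the function names (join_wrapped, failed_ops) are made up for illustration, and it assumes the syslog layout shown above, including records wrapped onto a second line by the mail client. Backend lines such as the bdb(...) message carry no conn/op ids, so the sketch simply attaches them to the next RESULT it sees. That assumption holds for a single-connection bulk delete like this one but may misattribute messages on a busy server. (err=80 is the generic LDAP "other" result code.)

```python
import re

# Start of a syslog record, e.g. "Nov 17 21:11:55 localhost slapd[1912]: "
RECORD_START = re.compile(
    r"^[A-Z][a-z]{2}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+slapd\[\d+\]:\s*"
)
# Operation lines: "conn=1478 op=10709 DEL dn=..." or "... RESULT tag=..."
OP_LINE = re.compile(r"conn=(\d+) op=(\d+) (\w+)\s*(.*)")
RESULT_FIELDS = re.compile(r"err=(\d+) text=(.*)")


def join_wrapped(lines):
    """Rejoin records that were wrapped onto a continuation line."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def failed_ops(lines):
    """Return one dict per operation whose RESULT has a non-zero err."""
    requests = {}   # (conn, op) -> request text, e.g. 'DEL dn="..."'
    backend = []    # backend messages seen since the previous RESULT
    failures = []
    for record in join_wrapped(lines):
        body = RECORD_START.sub("", record)
        m = OP_LINE.match(body)
        if m is None:
            # No conn/op id: treat as a backend message (assumption).
            backend.append(body)
            continue
        conn, op, kind, rest = m.groups()
        if kind != "RESULT":
            requests[(conn, op)] = f"{kind} {rest}"
            continue
        fields = RESULT_FIELDS.search(rest)
        if fields and int(fields.group(1)) != 0:
            failures.append({
                "conn": conn,
                "op": op,
                "request": requests.get((conn, op), "?"),
                "err": int(fields.group(1)),
                "text": fields.group(2).strip(),
                "backend": list(backend),
            })
        requests.pop((conn, op), None)
        backend.clear()
    return failures
```

Calling failed_ops() on an open log file (or any iterable of lines) returns a list of failures, each with the DN from the request and whatever backend messages preceded the RESULT, which should make it easier to see whether the deadlock message always accompanies the index delete failure.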