RE: Testing the state of replicates
<quote who="Aaron Richton">
> [Gavin says]
>> Dig the main source. servers/slapd/syncrepl.c and
>> servers/slapd/overlays/syncprov.c
>
> Hmm, wrong source files. Try libraries/liblutil/csn.c, which sayeth:
>
> * These routines are (loosly) based upon draft-ietf-ldup-model-03.txt,
> * A WORK IN PROGRESS. The format will likely change.
> *
> * The format of a CSN string is: yyyymmddhhmmssz#s#r#c
> * where s is a counter of operations within a timeslice, r is
> * the replica id (normally zero), and c is a counter of
> * modifications within this operation. s, r, and c are
> * represented in hex and zero padded to lengths of 6, 3, and
> * 6, respectively. (In previous implementations r was only 2 digits.)
>
Ah, many thanks.
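For anyone who wants to pick a CSN apart in a script, here is a minimal sketch of a parser for the format described in that comment. It is not taken from csn.c. The names `CSN` and `parse_csn` are made up for illustration, and it assumes the "z" in the format is the literal Z of a UTC timestamp. It accepts a 2- or 3-digit replica id because the comment says older implementations used 2 digits.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    """Hypothetical container for the four CSN fields."""
    timestamp: datetime   # yyyymmddhhmmss, assumed to be UTC
    op_counter: int       # s: operations within a timeslice
    replica_id: int       # r: replica id (normally zero)
    mod_counter: int      # c: modifications within this operation


# s and c are 6 hex digits; r is 3 (2 in older implementations).
_CSN_RE = re.compile(
    r"^(\d{14})[Zz]#([0-9a-fA-F]{6})#([0-9a-fA-F]{2,3})#([0-9a-fA-F]{6})$"
)


def parse_csn(text):
    """Split a CSN string into its timestamp and three hex counters."""
    match = _CSN_RE.match(text.strip())
    if match is None:
        raise ValueError("not a CSN in the expected format: %r" % (text,))
    stamp, s, r, c = match.groups()
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return CSN(when, int(s, 16), int(r, 16), int(c, 16))
```

Since `CSN` is a tuple, two parsed values compare field by field (timestamp first, then s, r, c). Whether that matches the ordering slapd itself uses is an assumption on my part.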
>
> We use
> http://www.openldap.org/lists/openldap-software/200602/msg00158.html,
> maybe with a small mod or two (I forget), to check that contextCSN isn't
> wedged. This only works when the syncrepl thread is completely borked. A
> better check would be something along the lines of the Net::LDAP ldifdiff
> to make sure that nothing's different. Of course this has race condition
> issues (not that we make writes all that often, but on paper at least). If
> anybody has something like that as a monitoring plugin, you'd erase one
> line off my perpetual todo list...
;-) Plugin for what?
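To make the "contextCSN isn't wedged" idea concrete, here is a rough sketch of the comparison step only. It is not the script from the linked message. It assumes the contextCSN values have already been read from the provider and the consumer by some other means. The function names and the 300-second threshold are hypothetical.

```python
from datetime import datetime, timezone


def csn_time(csn):
    """Return the UTC timestamp at the front of a CSN string."""
    stamp = csn.split("#", 1)[0]
    # Only the first 14 characters (yyyymmddhhmmss) are used.
    return datetime.strptime(stamp[:14], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def consumer_lag_seconds(provider_csn, consumer_csn):
    """Seconds by which the consumer's contextCSN trails the provider's."""
    return (csn_time(provider_csn) - csn_time(consumer_csn)).total_seconds()


def is_wedged(provider_csn, consumer_csn, max_lag=300):
    """True if the consumer trails by more than max_lag seconds (assumed threshold)."""
    return consumer_lag_seconds(provider_csn, consumer_csn) > max_lag
```

As noted in the quoted text, a check like this only catches the case where the consumer has stopped advancing altogether. It says nothing about individual entries.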
>
> (Yes, that would be of great interest to me. ~93% of syncrepl bugs we've
> seen involve very very very slight errors that only result in an entry or
> two being wrong. contextCSN being wrong...we pretty much only see that in
> the field when tcp keepalives fail to indicate the need for a
> reconnection.)
>
So the entryCSN would be wrong?
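If it is the entryCSN that ends up wrong, a per-entry comparison might look something like the sketch below. It assumes you already have, for each server, a dict mapping normalized DN to entryCSN (how you collect those is left out), and `diff_entry_csns` is a made-up name. It has the same race condition mentioned above if writes land between the two snapshots.

```python
def diff_entry_csns(provider, consumer):
    """Compare two {dn: entryCSN} snapshots and report the differences.

    Keys are assumed to be DNs already normalized the same way on both sides.
    """
    # Entries the provider has that never reached the consumer.
    missing = sorted(dn for dn in provider if dn not in consumer)
    # Entries the consumer still has that the provider does not.
    extra = sorted(dn for dn in consumer if dn not in provider)
    # Entries present on both sides whose entryCSN values disagree.
    stale = sorted(
        dn for dn in provider
        if dn in consumer and provider[dn] != consumer[dn]
    )
    return {"missing": missing, "extra": extra, "stale": stale}
```

An empty result in all three lists would mean the two snapshots agree. Anything listed under "stale" is a candidate for the "entry or two being wrong" case.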