Re: syncrepl
Pierangelo Masarati wrote:
> Francis Swasey wrote:
>> Howard,
>> Given that slurpd will not be included in 2.4, I'm seriously hoping
>> (as one of the few people still using slurpd in production) that these
>> changes to syncrepl will be perfected in 2.3 before we are forced into
>> the syncrepl usage by 2.4.
>
> I don't think slurpd will be wiped out of 2.4 yet. AFAIK, the only
> significant removal in 2.4 is back-ldbm.

On the other hand... If we can iron out the last remaining issues for
syncrepl in 2.4, I see no reason to carry slurpd forward. (There's also an
outstanding issue of turning the syncrepl consumer into an overlay, what
happened to that patch?)
So continuing the discussion of what to do with syncrepl and multiple contexts...
1) The provider must be told about all of the sources of changes living
   within its context. Possible sources are:
   a) local changes
   b) changes received via syncrepl
2) Every source of changes must have a unique sid.
   a) If it's a syncprov, the sid is configured explicitly there.
   b) If it's a syncrepl consumer pulling from elsewhere, it uses the remote
      server's sid.
3) The provider must aggregate all of the cookies for each of these change
   sources and send them to consumers pulling from it.
There are some interesting implications here.
The fact that subordinate/glue can be used to put multiple change sources
under a single provider gets us halfway to a real multi-master setup
already. In this case, we know that changes are in distinct DIT areas so that
there's no possibility of collision.
There's a desire to be able to configure multiple change sources for the same
context, though. E.g., mirrormode is defined to only work with two servers
mirroring each other; it would be nice to be able to extend this to
additional failover servers.
From half-multi-master we can go all the way to multi-master if we add collision
detection and conflict resolution. There's a pretty simple way to handle
collision detection - we just need to pass the entry's old entryCSN along
with the rest of the modification info. On the consumer we check and see if
the oldEntryCSN matches the consumer entry's current entryCSN. If they match,
there is no collision. If they don't match, we need to resolve the conflict.
With the syncrepl protocol, conflict resolution is pretty easy - just compare
the entry's entryCSN to the received modification's CSN and take whichever is
newer (last writer wins). Either we discard the received mod if it's too old,
or we apply it as normal because it's newer. (There's another case too of
course - the received mod's entryCSN is identical to the current CSN, meaning
we already received this change via another route. We just discard the change
then.) So we can have perfect collision detection, and pretty reliable
conflict resolution, just by adding one more field to syncrepl. To me this is
so easy we can't not do it.
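
The consumer-side check described above can be sketched roughly as follows. This is only an illustration in Python, not the syncrepl implementation: `Entry`, `apply_change`, the returned labels, and the CSN strings are hypothetical, CSNs are assumed to compare correctly as strings, and a change is simplified to a full replacement of the entry's attributes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Entry:
    entry_csn: str          # CSN of the last change applied to this entry
    attrs: Dict[str, str]   # simplified attribute set


def apply_change(local: Entry, old_csn: str, new_csn: str,
                 new_attrs: Dict[str, str]) -> str:
    """Apply a received change, or discard it, and report what happened.

    old_csn is the entryCSN the entry had on the sender before the change;
    new_csn is the CSN of the change itself.
    """
    if new_csn == local.entry_csn:
        # Same change already arrived via another route.
        return "duplicate"
    if old_csn == local.entry_csn:
        # No collision: the sender changed the same version we hold.
        local.entry_csn, local.attrs = new_csn, dict(new_attrs)
        return "applied"
    # Collision: both sides changed the entry. Last writer wins.
    if new_csn > local.entry_csn:
        local.entry_csn, local.attrs = new_csn, dict(new_attrs)
        return "conflict-applied"
    return "conflict-discarded"


entry = Entry("20060901120000Z#000000#01", {"mail": "a@example.com"})

# Clean update: the sender started from the version we hold.
print(apply_change(entry, "20060901120000Z#000000#01",
                   "20060901120500Z#000000#02", {"mail": "b@example.com"}))

# Collision with an older change from another server: discarded.
print(apply_change(entry, "20060901120000Z#000000#01",
                   "20060901120300Z#000000#03", {"mail": "c@example.com"}))
```

Note that the decision needs nothing beyond the two CSNs carried with the change and the entryCSN already stored on the consumer, which is why one extra field is enough.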
Of course to be able to compare entryCSNs reliably we need high quality, high
resolution timestamps, and all of the participating servers must have tightly
synchronized clocks. This isn't such a troublesome requirement; you just need
to run NTP on all of the servers.
Ideally all of the servers would be NTP peers of each other; that way you
could also query the local NTP server for the degree of clock skew with each
other peer. I'm not sure we need to go to this extreme, but it's worth
considering. (I'm not sure how useful that information is. E.g., if a server
goes down, and it had a lot of local changes on it that hadn't propagated
yet, and they all have valid timestamps, but the server's clock is way off
when it restarts, what can you tell about it?)
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/