Syncrepl vs. replication
This is rather long, but I thought it best to go into detail.
Our directory sort of evolved over the years, as problems became apparent
and new releases were issued. The limitations of the replication protocol
are now becoming serious, but before I investigate syncRepl I'd like to
know whether it will indeed work in our rather baroque situation.
First, some background:
We have offices all through Asia; some applications are local to that
office whilst others are global. Now, the comms links are not the best
(this is less a slight on the countries than a matter of getting
*this* country to talk to *that* country, i.e. a mixture of DSL/X.25/ISDN
etc.), but real-time comms is less important than a locator service: we
don't care if application ABC on host XYZ cannot be reached, but we do
need to know that application ABC lives on host XYZ, so that a batch job
can be queued for it.
After trying various configurations including some rather long replication
chains I finally decided on something which IMHO was rather elegant: each
office masters its own suffix (e.g. dc=sg,dc=example,dc=com) whilst
carrying slaves for the other zones; a top-level directory hooks them
together by handing out referrals back to the local box and to a central
backup. Each master replicates to this "replication server" which then
replicates back to the other slaves in turn.
Still with me? It was here that the replication protocol began to show
some limitations. Basically, there is an implicit assumption that a
server will handle one suffix: this is apparent in the format of the
slurpd.replog file:
replica: ldap2.au.example.com:389
time: 1117516270.0
dn: ou=Admin,dc=sg,dc=example,dc=com
changetype: modify
replace: st
st: xyzzy
-
Note that the "key" is the host, not the suffix.
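To make the point concrete, here is a minimal Python sketch that pulls one
replog record apart. It assumes the layout shown above (header lines first,
then the change lines after the "dn:" line); the function name
parse_replog_record is made up for illustration and is not part of OpenLDAP.
The only routing information in the record is the replica host:port; the
suffix has to be inferred from the DN.

```python
def parse_replog_record(text):
    """Split one replog record into replica hosts, timestamp, DN and change lines.

    Assumes the header lines (replica:, time:, dn:) come first and that
    everything after the dn: line belongs to the change itself.
    """
    replicas, timestamp, dn, changes = [], None, None, []
    for line in text.strip().splitlines():
        if dn is None:
            key, _, value = line.partition(": ")
            if key == "replica":
                replicas.append(value)   # host:port only, no suffix here
            elif key == "time":
                timestamp = value
            elif key == "dn":
                dn = value
        else:
            changes.append(line)
    return {"replicas": replicas, "time": timestamp, "dn": dn, "changes": changes}


record = """\
replica: ldap2.au.example.com:389
time: 1117516270.0
dn: ou=Admin,dc=sg,dc=example,dc=com
changetype: modify
replace: st
st: xyzzy
-
"""

parsed = parse_replog_record(record)
print(parsed["replicas"], parsed["dn"])
```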
It's assumed that "ldap2" knows about that suffix; the trouble arises when SLURPD
tries to replicate to all known servers in slapd.conf. I tried fiddling
with the port (389) but that got too messy; instead, I partitioned
slapd.conf into several files, fired up a SLURPD for each one so each
instance knows only about its relevant slaves, and glued the lot together
for SLAPD's benefit.
Well, that was two years ago (assuming anyone is still reading this far)
and it's worked ever since. Now, back to the present...
We're now using multiple disjoint directories (representing companies that
we've acquired) served on one host, and I'm seeing this replication
problem again, but from the other end. Specifically, server "Joe" masters
both "dc=au,dc=example,dc=com" and "dc=company,dc=com,dc=au", replicating
to server "Whopper" (fictitious names to protect the guilty, of course).
Whenever I update the first zone, it tries to update both on the slave;
one works, and the other gets "ERROR: Referral" logged. And vice-versa.
It took me a while to figure out what was happening, because I was seeing
replication both working and logging errors.
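What I would like to happen is for each change to go only to the replica
that was declared for the suffix the DN lives under. A rough Python sketch of
that routing rule follows; it is not how slurpd works, the names
(dn_under_suffix, targets_for, REPLICA_MAP) and the "whopper:389" address are
hypothetical, and the DN comparison is naive (it ignores escaped commas and
other DN subtleties).

```python
def normalize(dn):
    """Lower-case a DN and drop whitespace around the commas (naive)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def dn_under_suffix(dn, suffix):
    """True if dn is the suffix itself or sits somewhere below it."""
    dn, suffix = normalize(dn), normalize(suffix)
    return dn == suffix or dn.endswith("," + suffix)


def targets_for(dn, replica_map):
    """Return (host, suffix) pairs that should receive a change to dn."""
    return sorted(
        (host, suffix)
        for suffix, hosts in replica_map.items()
        if dn_under_suffix(dn, suffix)
        for host in hosts
    )


# Hypothetical layout: "Joe" masters both suffixes, "Whopper" is slave for both.
REPLICA_MAP = {
    "dc=au,dc=example,dc=com": ["whopper:389"],
    "dc=company,dc=com,dc=au": ["whopper:389"],
}

print(targets_for("ou=Admin,dc=au,dc=example,dc=com", REPLICA_MAP))
```

Keyed this way, an update to the first zone would be sent to Whopper once,
for that zone only, rather than once per suffix.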
I could do funky things with slapd.conf again, but I suspect I've hit the
limitations of the replication protocol. So, would syncRepl be a better
choice here? Think of the general case of several servers, mastering
several disjoint suffixes, replicating out to several arbitrary slaves
(some of which are the aforesaid servers).
We're running 2.2.26, and will move to 2.3 when it's stable (we don't have
the resources to hammer non-STABLE versions).
--
Dave Horsfall DTM VK2KFU daveh@ci.com.au Ph: +61 2 8425-5508 (d) -5500 (sw)
Corinthian Engineering, Level 1, 401 Pacific Hwy, Artarmon, NSW 2064, Australia