Re: Replication questions
Clowser, Jeff (Contractor) wrote:
Agreed - having a single "active" master and a "hot"/active but unused
standby master solves most HA issues without introducing the conflicts a
full active-active multimaster setup creates. But if that master is
there and accepts writes, it's inevitable that someone will some day
write to it out of ignorance, and *may* write a conflicting change, so I
see conflict resolution as a last ditch fallback for this situation (and
nothing more) to prevent corruption or breakage of replication. (Plus, I
like to close up or at least be fully aware of all the edge cases that
exist, so I know how best to avoid them :) ).
Makes sense.
You said at one point that OpenLDAP (2.4.6?) currently does entry level
conflict resolution, and does not do attribute level conflict resolution
yet - i.e. if the entry was updated on 2 separate servers with different
updates, conflicting or not, the most recently changed version of the
*entry* wins. If I change the cn on one master, and after that (but
before replication has occurred) I change the userpassword on another
master, then the sync up occurs, I won't see the entry with the cn and
password changed on all servers, I'll see the entry as it is in the
master most recently changed (i.e. in my example, I'll see a changed
password, but the cn will revert). Is there a roadmap/timeline for doing
attribute level conflict resolution?
There are no set dates, but I expect it to be later in the 2.4 stream.
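To make the difference concrete, here is a toy model of the cn/userPassword example above. It is only a sketch, not how slapd implements anything: the `AttrValue` class, the integer `changed_at` stamps (standing in for CSNs), and both resolver functions are hypothetical names invented for illustration, and deletes are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttrValue:
    value: str
    changed_at: int  # hypothetical logical timestamp standing in for a CSN


def entry_timestamp(entry):
    """Most recent change time of any attribute in the entry."""
    return max(v.changed_at for v in entry.values())


def resolve_entry_level(a, b):
    """Entry-level resolution: the most recently changed entry wins whole."""
    return dict(a) if entry_timestamp(a) >= entry_timestamp(b) else dict(b)


def resolve_attribute_level(a, b):
    """Attribute-level resolution: each attribute keeps its newest value."""
    merged = {}
    for name in a.keys() | b.keys():
        candidates = [e[name] for e in (a, b) if name in e]
        merged[name] = max(candidates, key=lambda v: v.changed_at)
    return merged


# Both masters start from the same entry at time 1.
base = {"cn": AttrValue("Jeff", 1), "userPassword": AttrValue("old", 1)}
master1 = {**base, "cn": AttrValue("Jeffrey", 2)}        # cn changed first
master2 = {**base, "userPassword": AttrValue("new", 3)}  # password changed later

entry_level = resolve_entry_level(master1, master2)
attr_level = resolve_attribute_level(master1, master2)
```

With entry-level resolution the copy from master2 wins outright, so the password change survives and the cn reverts. With attribute-level resolution both changes survive.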
Also, I was looking at the admin guide and syncprov man pages on how to
set up replication. N-Way multi-mastering details are kinda sparse :).
Is there any documentation elsewhere on setting this up? OR... Is the
setup exactly the same as setting up Mirror-mode (per 2.3.x), but the
2.4.x code just automatically does conflict resolution (i.e. was
mirror-mode a 2.3 feature, with multimaster transparently replacing it
in 2.4 by adding conflict resolution to mirror-mode, using the same
setup?)
Yes, set it up pretty much like MirrorMode. MirrorMode was 2.4.1-2.4.4, which
were only alpha releases, not general/public releases.
Is it possible for a consumer to replicate from multiple masters?
Yes, in 2.4.
I'm thinking along the lines of a master server at 2 locations (for HA/DR
purposes), plus each location also has multiple read-only slave
consumers. My first thought is that these slave servers point to the
local master, but if that master goes down, the slaves under that master
stop getting updates. My second thought is to have a load balancer at
each site, which directs all traffic connecting to a "master ldap" vip
to route connections to the primary master if it's up, or the secondary
master if the primary is unavailable. But... (I'm still absorbing
syncrepl and RFC 4533) will all the contextCSNs and cookies and so forth
match up well enough to allow this kind of failover for *syncrepl*? Is
it possible, and what's the best way to set this up, such that I have
multiple masters for DR purposes, and such that the failure of any
single master does not cause some subset of my read-only slave consumers
to stop getting updated?
Syncrepl (in refreshAndPersist mode), as I understand it, generally has
the slave consumer contacting the master server, retrieving an updated
list of changes since the last time it was running (refresh), then
leaves a persistent search running that gets changed entries from the
master server as they happen (persist), so replication is near
real-time. If the master server crashes and then is restarted or the
connection is broken/dropped (common if a load balancer is in between),
how well does the consumer detect this and reconnect, or do the
consumers tend to have to be restarted after this occurs? (This is a
broken/dropped connection, *not* one cleanly closed by a master server
clean shutdown or idle timeout, and many apps have trouble detecting
this - the client still thinks it has a valid tcp connection, but
nothing is coming over it, so never gets new updates. Does the consumer
send keepalive packets or anything to cause it to realize the connection
has died and to reconnect?)
Currently the consumer relies on TCP keepalives. We've discussed adding
LDAP-level keepalives so we're not dependent on the kernel TCP timers, but
that hasn't been done yet.
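For anyone unfamiliar with what relying on TCP keepalives means at the socket level, the following is a generic sketch of a client turning them on. It is not slapd's code; the `enable_keepalive` helper and its default timer values are assumptions chosen for illustration. The per-socket tuning options are the Linux names and are applied only where the platform exposes them; elsewhere the kernel defaults apply.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive and tune its timers where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux names for the per-socket timers; other platforms may lack some
    # of them, so each option is set only if this Python build defines it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

Without tuning, the kernel's default idle time before the first probe is typically long (two hours on stock Linux), so a silently dropped connection can go unnoticed for a long time.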
When initializing a consumer using an LDIF backup of the master, should
this be a slapcat export to get everything needed to support syncrepl
(such as contextCSN, entryUUIDs, etc)?
That's the fastest way. But you can just bring up a consumer with an empty
database and let it pull the entire DB down during its refresh pass; it will
work regardless. Unlike some other replication schemes you may have used, we
don't require any special considerations for initial load vs reload or
recovery. Turn it on and it works.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/