So, if I have a two-master MMR setup, I assume I would want to point half my
replicas at master A and the other half at master B for their updates.
This leads to a problem in my mind, in that if master A goes down, then
half of my replica pool is now going to remain completely out of sync
with the remaining master until master A is recovered. Putting a load
balancer in front of the two masters, and pointing the replicas at that
instead, is not a viable option because the two masters may be getting
updates in a different sequence, so if a replica disconnects from the LB
and then reconnects, the updates it could get fed from whatever master
the LB is pointing at could lead to inconsistencies.
What inconsistencies? Each master's changes are stamped with its own sid.
Any consumer is going to know about the contextCSNs of each master it
talks to.
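
To make that concrete, here is a toy Python model (not slapd code) of how a
consumer can track one contextCSN per sid and use it to decide whether an
incoming change is new. The CSN layout assumed here is
timestamp#count#sid#mod; the names CSN, parse_csn and is_new are made up for
this illustration, and the comparison is simplified to changes from the same
sid.

```python
from typing import Dict, NamedTuple


class CSN(NamedTuple):
    timestamp: str  # e.g. "20100214123456.000000Z", sorts correctly as text
    count: int
    sid: int        # server ID of the master that originated the change
    mod: int


def parse_csn(text: str) -> CSN:
    """Split a CSN string of the assumed form timestamp#count#sid#mod."""
    timestamp, count, sid, mod = text.split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def is_new(cookie: Dict[int, CSN], csn: CSN) -> bool:
    """True if csn is newer than what the consumer has recorded for its sid."""
    seen = cookie.get(csn.sid)
    if seen is None:
        return True  # nothing from this master has been seen yet
    return (csn.timestamp, csn.count, csn.mod) > (seen.timestamp, seen.count, seen.mod)


# The consumer's state: the latest CSN seen from each master, keyed by sid.
cookie = {
    1: parse_csn("20100214120000.000000Z#000000#001#000000"),
    2: parse_csn("20100214110000.000000Z#000000#002#000000"),
}
```

Because the state is kept per sid, it does not matter which master hands the
consumer a given change: a change stamped by sid 1 is only ever compared
against what the consumer already has from sid 1.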
Neither of these seems like a
good option. I don't see a good solution here to resolve this issue,
either, unless the replica could somehow know which master it had been
talking to,
The replica always knows which master it's talking to...
and drop into refresh mode if it found itself talking to a new
master?
Drop into refresh mode? Obviously in persist mode the consumer keeps a
connection open to a specific master; a load balancer can't move an open
connection. So obviously, if a particular master disappears, all of its
clients are going to lose their connections and any consumers set up to
retry are going to have to initiate new sessions. And every new
replication session starts with a refresh phase. So this recovery is
already automatic; it always has been.
I'm also not clear on what happens if your replicas are
delta-syncrepl based, rather than normal syncrepl, in the LB setup.
Not possible. Current delta-sync requires all updates to be logged in
order; in an MMR setup you can't guarantee order so *nobody* can use
delta-sync in this scenario.
For Mirror Mode, I would assume you could point the replicas at the LB
fronting the two masters, since only one master is ever receiving
changes. I also assume delta-syncrepl would be a completely valid option
for replication to the replicas, again because only one master is
getting the updates, so all updates would be logged in the same sequence
on both servers. However, I don't know if this is correct or not, or if
there are limitations here I haven't considered. When I was first
pondering this on the #openldap-devel channel in IRC, Matt Backes made a
comment about delta-syncrepl not working with Mirror Mode.
For MirrorMode, delta-sync should work since there is only ever one
source of changes, and they will be logged in order. There is a window of
vulnerability where a server crashes after committing changes to its
accesslog, before it replicates them to the mirror. Those changes will be
temporarily lost, and create a gap in the mirror's log. When the original
server comes back up, the mirror will receive those lost changes, but the
strict ordering of its log will be broken. In this case though, the
delta-sync consumer will be fine - if the lost changes caused no
conflicts, they will simply be committed. If they do cause a conflict,
the consumer will just fall back to refresh mode and the conflicts will be
erased.
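
The fallback behaviour described above can be sketched as a toy Python model.
This is only an illustration of the logic, not how slapd implements it: the
directory is a plain dict keyed by DN, a logged change is a hypothetical
(op, dn, value) tuple, and Conflict, apply_logged_change and sync are names
invented for the example.

```python
class Conflict(Exception):
    """Raised when a logged change cannot be applied to the replica as-is."""


def apply_logged_change(replica, change):
    """Apply one logged change; raise Conflict if the replica's state disagrees."""
    op, dn, value = change
    if op == "add":
        if dn in replica:
            raise Conflict(dn)   # entry already exists
        replica[dn] = value
    elif op == "modify":
        if dn not in replica:
            raise Conflict(dn)   # nothing to modify
        replica[dn] = value
    elif op == "delete":
        if dn not in replica:
            raise Conflict(dn)   # nothing to delete
        del replica[dn]
    else:
        raise ValueError("unknown operation: " + op)


def sync(replica, logged_changes, provider_entries):
    """Try the delta path first; on any conflict, fall back to a full refresh."""
    try:
        for change in logged_changes:
            apply_logged_change(replica, change)
        return "delta"
    except Conflict:
        # Refresh: discard local state and copy the provider's current content,
        # which also erases anything a partially applied log left behind.
        replica.clear()
        replica.update(provider_entries)
        return "refresh"
```

For example, replaying a log that tries to add an entry the replica already
holds raises Conflict, and the replica ends up with exactly the provider's
content.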
So, basically, if my understanding of things is correct, I'm at a loss as to
how to provide a consistent replicated environment for my customers
while also providing master/master failover.
This appears to have been a -software question, not a -devel question.
Perhaps you should summarize back to the -software list and end this
thread here.