Re: Replicating from a mirrormode pair to a read-only server

To: Andrew Findlay <andrew.findlay@skills-1st.co.uk>
Subject: Re: Replicating from a mirrormode pair to a read-only server
From: Jonathan CLARKE <jonathan.clarke@normation.com>
Date: Fri, 03 Sep 2010 20:06:31 +0200
Cc: openldap-technical@openldap.org
In-reply-to: <20100903151847.GQ6835@slab.skills-1st.co.uk>
Organization: Normation
References: <20100902142732.GA8937@slab.skills-1st.co.uk> <4C8107AC.4030301@normation.com> <20100903151847.GQ6835@slab.skills-1st.co.uk>
User-agent: Mozilla/5.0 (X11; U; Linux i686; fr; rv:1.9.1.11) Gecko/20100711 Lightning/1.0b1 Thunderbird/3.0.6

On 03/09/2010 17:18, Andrew Findlay wrote:

On Fri, Sep 03, 2010 at 04:35:24PM +0200, Jonathan CLARKE wrote:

DB_LOCK_DEADLOCK errors are only a warning: retries should occur until the
operation completes. Of course, if they can be avoided, best avoid!

Question: is this topology sensible? If it is expected to work I
will gather some debug data for an ITS. If not, I will have to
drop back to plan B...


This is an interesting configuration. I would not have proceeded like this
but, as Marc Patermann suggested, I would set up a virtual address that
points to the currently available master, and configure one syncrepl clause
using this address (and all other LDAP clients, in fact). Could this
approach work for you?


That is what I have done now, and it does work. I am still a little
uncertain about it though: when the normal server fails and the DNS
entry or routing changes to point to the hot standby, will this
confuse the consumer slapd? We are effectively telling it that the
second machine *is* the original one, yet it will have a different
serverID and possibly different contents.


Actually, I just set up a few servers to test this out.

I don't have any problems using the 2 syncrepl statements side-by-side
on the slave. When one master goes offline, replication continues from
the other, etc.


Could your problem be due to an unrelated configuration problem?

If not, and assuming that the simultaneity of the changes is causing
problems, maybe try the seqmod overlay? (Random idea; I don't know this
overlay very well.)

My testing using one syncrepl statement for a single virtual address
also works fine in general (replication picks up where it left off),
except in one case:

- The slave server has a newer CSN for one of the serverIDs than the
  master it's talking to. In this case, replication just fails with a
  "LDAP_RES_SEARCH_RESULT" message.

Of course, this case can only occur if a modification was made, let's
say, on master1, and master2 didn't replicate it before master1 became
unavailable, and master2 was then promoted to use the virtual address
(despite not being up-to-date with master1). But still...

The reason for this error is that the syncprov overlay on master2
detects that one of the slave's CSNs is newer than its own (the first
in this case), and closes the persistent search (syncprov.c, label
"bailout:"), even though the other CSN could be older, and thus syncprov
could provide updates.
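
To make the failure case concrete, here is a rough Python model of the
check described above. It is only a sketch of the behaviour as I
understand it, not the actual syncprov.c logic. The function names are
made up for illustration, the CSN values are invented, and I assume the
usual CSN layout of "timestamp#count#serverID#mod" with hexadecimal
counters. Ignoring serverIDs that the provider has never seen is also an
assumption of this sketch.

```python
from typing import Dict, Iterable, List, NamedTuple


class CSN(NamedTuple):
    timestamp: str  # fixed-width, e.g. "20100903180000.000000Z"
    count: int
    sid: int
    mod: int


def parse_csn(text: str) -> CSN:
    """Split 'timestamp#count#sid#mod' into its four fields."""
    timestamp, count, sid, mod = text.split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def newest_per_sid(csns: Iterable[str]) -> Dict[int, CSN]:
    """Index CSN strings by serverID, keeping the newest one for each."""
    result: Dict[int, CSN] = {}
    for text in csns:
        csn = parse_csn(text)
        if csn.sid not in result or csn > result[csn.sid]:
            result[csn.sid] = csn
    return result


def provider_decision(provider_csns: Iterable[str],
                      consumer_csns: Iterable[str]) -> str:
    """Return 'bailout' if the consumer is ahead of the provider for any
    serverID they both know, otherwise 'serve' (hypothetical labels)."""
    provider = newest_per_sid(provider_csns)
    consumer = newest_per_sid(consumer_csns)
    for sid, consumer_csn in consumer.items():
        provider_csn = provider.get(sid)
        if provider_csn is not None and consumer_csn > provider_csn:
            return "bailout"
    return "serve"


def sids_with_updates(provider_csns: Iterable[str],
                      consumer_csns: Iterable[str]) -> List[int]:
    """List the serverIDs for which the provider holds newer changes."""
    provider = newest_per_sid(provider_csns)
    consumer = newest_per_sid(consumer_csns)
    return sorted(
        sid for sid, provider_csn in provider.items()
        if sid not in consumer or provider_csn > consumer[sid]
    )


# master2 missed the last change made on master1 (serverID 1),
# but has a change of its own (serverID 2) that the slave lacks.
master2 = ["20100903180000.000000Z#000000#001#000000",
           "20100903180600.000000Z#000000#002#000000"]
slave = ["20100903180500.000000Z#000000#001#000000",
         "20100903180100.000000Z#000000#002#000000"]

print(provider_decision(master2, slave))   # bailout
print(sids_with_updates(master2, slave))   # [2]
```

In this model the slave is ahead for serverID 1, so the search is closed,
even though there are changes for serverID 2 that could still be sent.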

I'm not sure if this can be considered a bug, but I think so. However,
what to do in this case, from syncprov's point of view, is unclear to
me...

Using two syncrepl statements is certainly suboptimal, as all modifications
will be replicated twice to all read-only servers. However, I don't see any
reason why it shouldn't work, off the top of my head. Does slapd end up
synchronizing everything?


Not sure - there were only 25000 entries but I gave up and stopped the
consumer server after 30 minutes as it still had not synchronised.

Good point about the double replication, though if it had worked
cleanly it would be OK in the (low modification) environment that I
have. The advantage is that nothing else is needed to manage the
failover / fail-back cases.


Makes sense, and seems like rather an attractive architecture.

Jonathan

--
==========================================
Jonathan CLARKE
------------------------------------------
Normation
44 rue Cauchy, 94110 Arcueil, France
------------------------------------------
Telephone:  +33 (0)1 83 62 26 96
------------------------------------------
Web:        http://www.normation.com/
==========================================

Follow-Ups:
- Re: Replicating from a mirrormode pair to a read-only server
  - From: Andrew Findlay <andrew.findlay@skills-1st.co.uk>

References:
- Replicating from a mirrormode pair to a read-only server
  - From: Andrew Findlay <andrew.findlay@skills-1st.co.uk>
- Re: Replicating from a mirrormode pair to a read-only server
  - From: Jonathan CLARKE <jonathan.clarke@normation.com>
- Re: Replicating from a mirrormode pair to a read-only server
  - From: Andrew Findlay <andrew.findlay@skills-1st.co.uk>
