Re: Syncrepl connections failing
On Thu, Apr 23, 2009 at 05:41:09PM -0500, John Kane wrote:
> I am having a problem with what appears (to me) to be 'stale' TCP
> connections for syncrepl between the master and a pair of slaves. After
> restarting all, I see changes on the master replicated to both slaves.
> BUT, if I wait about 30 minutes or more, then make a change, the
> replication fails (most of the time). netstat on the LDAP port shows the
> connections still established, but queued packets at the master server.
> After about 15 minutes, the master server drops the connection. An
> overnight tcpdump on the master showed LDAP occasionally sending a
> keep-alive, with 2hrs between the keep-alive messages (these keep-alives
> are inconsistent, though, some nights I see none).
> Note: The 2 slaves are running on blades in an IBM chassis, and the
> master is on a 1U Linux server, just 'one-hop' away. Prior to this,
> when I had a master/slave pair running on the blades, syncRepl was
> working fine for several months. The failures did not start until I
> moved the master to another server.
Do you have a firewall or NAT configured on or between any of the
boxes? This sort of problem with long-lived connections is often due
to state being dropped from IP-level devices.
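
If that turns out to be the cause, the usual workaround is to make TCP
keep-alives fire more often than the device's idle timeout. The 2-hour
gap seen in the tcpdump matches the Linux default of 7200 seconds for
net.ipv4.tcp_keepalive_time. As a rough illustration only (this is not
OpenLDAP code, and the function name and timer values below are made
up for the example), a client could shorten the timers on its own
socket like this:

```python
import socket


def enable_keepalive(sock, idle=300, interval=60, count=5):
    """Turn on TCP keep-alive and, where the platform allows, shorten its timers.

    idle, interval and count are illustrative values, not recommendations:
    idle     - seconds of silence before the first probe is sent
    interval - seconds between unanswered probes
    count    - unanswered probes before the connection is declared dead
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The per-socket timer options are not available on every platform,
    # so each one is set only if this Python build exposes it.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    # A non-zero value means keep-alive is switched on for this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

The idle value would need to be lower than whatever timeout the
firewall or NAT box applies, which has to be found out from the device
itself.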
Andrew
--
-----------------------------------------------------------------------
| From Andrew Findlay, Skills 1st Ltd |
| Consultant in large-scale systems, networks, and directory services |
| http://www.skills-1st.co.uk/ +44 1628 782565 |
-----------------------------------------------------------------------