Can't contact LDAP server
- To: openldap-technical@openldap.org
- Subject: Can't contact LDAP server
- From: Mark <mah042@gmail.com>
- Date: Fri, 9 Sep 2011 14:02:14 -0500
I have a 2-way multi-master setup on srv1.foo.com (EDT) and srv2.foo.com (PDT).
For about two hours this morning, srv2 was logging "Can't contact LDAP
server" to syslog while it was in a replication conversation with srv1:
Sep 9 05:29:45 srv2 slapd[9413]: do_syncrep2: rid=001 (-1) Can't contact LDAP server
Sep 9 05:29:45 srv2 slapd[9413]: do_syncrepl: rid=001 rc -1 retrying
Sep 9 05:30:00 srv2 slapd[9413]: do_syncrep2: rid=001 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Sep 9 05:30:01 srv2 last message repeated 34 times
Sep 9 05:30:01 srv2 slapd[9413]: syncrepl_entry: rid=001 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
Sep 9 05:30:01 srv2 slapd[9413]: syncrepl_entry: rid=001 be_search (0)
Sep 9 05:30:01 srv2 slapd[9413]: syncrepl_entry: rid=001 ...
Sep 9 05:30:01 srv2 slapd[9413]: syncprov_matchops: skipping original sid 003
Sep 9 05:30:01 srv2 slapd[9413]: syncprov_matchops: skipping original sid 003
Sep 9 05:30:01 srv2 slapd[9413]: syncrepl_entry: rid=001 be_modify ... (0)
Sep 9 05:30:01 srv2 slapd[9413]: syncrepl_entry: rid=001 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
Sep 9 05:30:01 srv2 slapd[9413]: syncrepl_entry: rid=001 be_search (0)
Sep 9 05:30:01 srv2 slapd[9413]: syncrepl_entry: rid=001 ...
Sep 9 05:30:01 srv2 slapd[9413]: syncprov_matchops: skipping original sid 003
Sep 9 05:30:01 srv2 slapd[9413]: syncprov_matchops: skipping original sid 003
Sep 9 05:30:01 srv2 slapd[9413]: syncrepl_entry: rid=001 be_modify ... (0
...
Sep 9 05:30:29 srv2 slapd[9413]: syncrepl_entry: rid=001 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
Sep 9 05:30:29 srv2 slapd[9413]: syncrepl_entry: rid=001 be_search (0)
Sep 9 05:30:29 srv2 slapd[9413]: syncrepl_entry: rid=001 ...
Sep 9 05:30:29 srv2 slapd[9413]: syncprov_matchops: skipping original sid 003
Sep 9 05:30:29 srv2 slapd[9413]: syncprov_matchops: skipping original sid 003
Sep 9 05:30:29 srv2 slapd[9413]: syncrepl_entry: rid=001 be_modify ... (0)
Sep 9 05:30:29 srv2 slapd[9413]: do_syncrep2: rid=001 (-1) Can't contact LDAP server
Sep 9 05:30:29 srv2 slapd[9413]: do_syncrepl: rid=001 rc -1 retrying
Sep 9 05:30:45 srv2 slapd[9413]: do_syncrep2: rid=001 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Sep 9 05:30:45 srv2 last message repeated 34 times
Sep 9 05:30:45 srv2 slapd[9413]: syncrepl_entry: rid=001 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
Sep 9 05:30:45 srv2 slapd[9413]: syncrepl_entry: rid=001 be_search (0)
Sep 9 05:30:45 srv2 slapd[9413]: syncrepl_entry: rid=001 ...
Sep 9 05:30:45 srv2 slapd[9413]: syncprov_matchops: skipping original sid 003
Sep 9 05:30:45 srv2 slapd[9413]: syncprov_matchops: skipping original sid 003
Sep 9 05:30:45 srv2 slapd[9413]: syncrepl_entry: rid=001 be_modify ... (0)
...
During that time, no errors or connection closes were logged in the
srv2 syslog. I tried bouncing each slapd, but the issue continued.
After about two hours it stopped occurring.
My question is: What would cause the "Can't contact LDAP server"
message? The srv1 side doesn't log any error, nor does it log that the
connection was closed. The text "Can't contact" would seem to imply
that the error occurred during a connection attempt, but these errors
seemed to occur during an active conversation. Does the srv2 side
notice a read or write error on the socket and abandon the
connection? I've been looking through the code trying to determine
what causes that error message. Does it happen on a single read/write
error? Does it retry a few times? I hate to just say "Must be a
network error" without some due diligence.
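
As part of that due diligence, here is a rough sketch of how the retry
events in the srv2 syslog could be counted and timed. It only looks at
the "do_syncrepl: ... retrying" lines from the excerpt above; the
function names and the year argument are my own assumptions (syslog
timestamps carry no year), and nothing here comes from slapd itself.

```python
import re
from datetime import datetime

# Matches the consumer-side retry lines shown in the log excerpt.
RETRY_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+slapd\[\d+\]:\s+"
    r"do_syncrepl: rid=(?P<rid>\d+) rc (?P<rc>-?\d+) retrying"
)


def find_retries(lines, year=2011):
    """Return (timestamp, rid, rc) for every do_syncrepl retry line."""
    events = []
    for line in lines:
        m = RETRY_RE.match(line.strip())
        if m:
            # syslog omits the year, so the caller supplies it (assumption).
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            events.append((ts, m.group("rid"), int(m.group("rc"))))
    return events


def gaps_seconds(events):
    """Seconds elapsed between consecutive retry events."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


# Three lines taken from the excerpt; only two of them are retry lines.
SAMPLE = [
    "Sep 9 05:29:45 srv2 slapd[9413]: do_syncrepl: rid=001 rc -1 retrying",
    "Sep 9 05:30:00 srv2 slapd[9413]: syncrepl_entry: rid=001 be_search (0)",
    "Sep 9 05:30:29 srv2 slapd[9413]: do_syncrepl: rid=001 rc -1 retrying",
]

events = find_retries(SAMPLE)
print(len(events), gaps_seconds(events))
```

Run against the full two hours of syslog, the gaps should show whether
the drops follow a regular period or are scattered, which may help tell
a timeout apart from a flaky link.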
My env:
RHEL 5.5
OpenLDAP 2.4.25
BerkeleyDB 4.8.40
OpenSSL 1.0.0d
Cyrus SASL 2.1.23
2-way Multi-master
====================
#srv1 slapd.conf -> slapd.d
include /appl/openldap/etc/schema/core.schema
include /appl/openldap/etc/schema/cosine.schema
include /appl/openldap/etc/schema/nis.schema
include /appl/openldap/etc/schema/inetorgperson.schema
include /appl/openldap/etc/schema/foo.com.schema
argsfile /appl/openldap/var/run/slapd.args
pidfile /appl/openldap/var/run/slapd.pid
threads 16
tool-threads 4
idletimeout 300
writetimeout 5
reverse-lookup off
timelimit time.soft=30 time.hard=300
sizelimit size.soft=500 size.hard=1000
password-hash {SSHA}
loglevel stats sync
serverid 1 ldap://srv1.foo.com:10806
modulepath /appl/openldap/libexec
moduleload back_monitor.la
moduleload back_hdb.la
moduleload syncprov.la
database config
rootdn cn=manager,dc=foo,dc=com
database monitor
rootdn cn=manager,dc=foo,dc=com
database hdb
rootdn cn=manager,dc=foo,dc=com
suffix dc=foo,dc=com
directory /appl/openldap/var/data/dc=foo,dc=com
cachesize 1000
idlcachesize 3000
cachefree 5
checkpoint 128 15
index objectClass eq
index entryCSN eq
index entryUUID eq
syncrepl rid=001
  provider=ldap://srv1.foo.com:10806
  type=refreshAndPersist
  retry="15 +"
  bindmethod=simple
  binddn="cn=replicator,dc=foo,dc=com"
  credentials="secret"
  searchbase="dc=foo,dc=com"
  starttls=no
  schemachecking=off
syncrepl rid=002
  provider=ldap://srv2.foo.com:10806
  type=refreshAndPersist
  retry="15 +"
  bindmethod=simple
  binddn="cn=replicator,dc=foo,dc=com"
  credentials="secret"
  searchbase="dc=foo,dc=com"
  starttls=no
  schemachecking=off
mirrormode TRUE
overlay syncprov
syncprov-checkpoint 50 10
syncprov-sessionlog 100
limits dn.exact="cn=replicator,dc=foo,dc=com"
  time.soft=unlimited time.hard=unlimited
  size.soft=unlimited size.hard=unlimited
====================
#srv2 slapd.conf -> slapd.d
include /appl/openldap/etc/schema/core.schema
include /appl/openldap/etc/schema/cosine.schema
include /appl/openldap/etc/schema/nis.schema
include /appl/openldap/etc/schema/inetorgperson.schema
include /appl/openldap/etc/schema/foo.com.schema
argsfile /appl/openldap/var/run/slapd.args
pidfile /appl/openldap/var/run/slapd.pid
threads 16
tool-threads 4
idletimeout 300
writetimeout 5
reverse-lookup off
timelimit time.soft=30 time.hard=300
sizelimit size.soft=500 size.hard=1000
password-hash {SSHA}
loglevel stats sync
serverid 2 ldap://srv2.foo.com:10806
modulepath /appl/openldap/libexec
moduleload back_monitor.la
moduleload back_hdb.la
moduleload syncprov.la
database config
rootdn cn=manager,dc=foo,dc=com
database monitor
rootdn cn=manager,dc=foo,dc=com
database hdb
rootdn cn=manager,dc=foo,dc=com
suffix dc=foo,dc=com
directory /appl/openldap/var/data/dc=foo,dc=com
cachesize 1000
idlcachesize 3000
cachefree 5
checkpoint 128 15
index objectClass eq
index entryCSN eq
index entryUUID eq
syncrepl rid=001
  provider=ldap://srv1.foo.com:10806
  type=refreshAndPersist
  retry="15 +"
  bindmethod=simple
  binddn="cn=replicator,dc=foo,dc=com"
  credentials="secret"
  searchbase="dc=foo,dc=com"
  starttls=no
  schemachecking=off
syncrepl rid=002
  provider=ldap://srv2.foo.com:10806
  type=refreshAndPersist
  retry="15 +"
  bindmethod=simple
  binddn="cn=replicator,dc=foo,dc=com"
  credentials="secret"
  searchbase="dc=foo,dc=com"
  starttls=no
  schemachecking=off
mirrormode TRUE
overlay syncprov
syncprov-checkpoint 50 10
syncprov-sessionlog 100
limits dn.exact="cn=replicator,dc=foo,dc=com"
  time.soft=unlimited time.hard=unlimited
  size.soft=unlimited size.hard=unlimited
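
The two configurations above are meant to be identical apart from the
serverid line. A quick way to confirm that, sketched here with
hypothetical helper names and short excerpts rather than the full
files, is to compare the directive lines and list whatever appears on
only one side.

```python
def directives(text):
    """Return non-empty, non-comment lines with surrounding whitespace removed."""
    stripped = (ln.strip() for ln in text.splitlines())
    return [ln for ln in stripped if ln and not ln.startswith("#")]


def config_differences(a_text, b_text):
    """Return (lines only in a, lines only in b), preserving file order."""
    a_lines, b_lines = directives(a_text), directives(b_text)
    only_a = [ln for ln in a_lines if ln not in b_lines]
    only_b = [ln for ln in b_lines if ln not in a_lines]
    return only_a, only_b


# Short excerpts of the two files, for illustration only.
SRV1 = """
#srv1 slapd.conf -> slapd.d
idletimeout 300
writetimeout 5
serverid 1 ldap://srv1.foo.com:10806
mirrormode TRUE
"""

SRV2 = """
#srv2 slapd.conf -> slapd.d
idletimeout 300
writetimeout 5
serverid 2 ldap://srv2.foo.com:10806
mirrormode TRUE
"""

only_srv1, only_srv2 = config_differences(SRV1, SRV2)
print(only_srv1)
print(only_srv2)
```

This is a line-level comparison only; it does not understand
continuation lines or directive ordering, so it is a sanity check and
not a validator.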