
delta-sync replication slave quitting problem - not a single retry



Hello guys,
I have a problem with delta-syncrepl replication (all set up according to the 'official' guide: http://www.openldap.org/doc/admin24/replication.html#Delta-syncrepl).
I have a master instance with logs 'shipped' to a client; it all works fine as long as the connection is good.
While getting ready to move into production I'm trying to emulate connectivity problems, and this is where I ran into trouble.

Specifically, even though I have the mirror instance set up as:
syncrepl rid=101
        provider=ldap://192.168.22.62:389
        type=refreshAndPersist
        bindmethod=simple
        binddn="cn=replicator,xxxxx"
        credentials="xxxxxx"
        searchbase="xxxxxxx"
        filter="(objectClass=*)"
        logbase="cn=accesslog"
        logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
        scope=sub
        attrs="*,+"
        schemachecking=off
        retry="1 +"
        syncdata=accesslog
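
For reference, this is how I read the retry setting, based on slapd.conf(5): the value is a list of "<retry interval> <# of retries>" pairs, and "+" as the count means retry indefinitely, so retry="1 +" should mean "retry every second, forever". A minimal Python sketch of that reading (parse_retry is a hypothetical helper for illustration, not slapd's actual parser):

```python
from typing import List, Optional, Tuple


def parse_retry(spec: str) -> List[Tuple[int, Optional[int]]]:
    """Parse a syncrepl retry string into (interval_seconds, count) pairs.

    A count of None stands for "+", i.e. retry indefinitely.
    """
    tokens = spec.split()
    if not tokens or len(tokens) % 2 != 0:
        raise ValueError("retry must be pairs of <interval> <count>")
    pairs = []
    # Walk the tokens two at a time: interval first, then count.
    for interval, count in zip(tokens[0::2], tokens[1::2]):
        pairs.append((int(interval), None if count == "+" else int(count)))
    return pairs


if __name__ == "__main__":
    print(parse_retry("1 +"))
```

Under that reading, the consumer should keep trying to reconnect once per second rather than give up.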

once the server is disconnected (I simply restart slapd on the master), the client does not even try to reconnect. The log below shows a modification operation at 18:34:18 that went fine; 11 seconds later I restarted the master's LDAP service (which became available again immediately):

Jul 28 18:34:18 newton slapd[20353]: => entry_encode(0x00000032): mail=xxxxxxxxxxxxxxxxxxxxxxxxxx.
Jul 28 18:34:18 newton slapd[20353]: bdb_modify: updated id=00000032 dn="yyyyyyyyyyyyyyyyyyyyyyyy"
Jul 28 18:34:18 newton slapd[20353]: send_ldap_result: conn=-1 op=0 p=0
Jul 28 18:34:18 newton slapd[20353]: send_ldap_result: err=0 matched="" text=""
Jul 28 18:34:18 newton slapd[20353]: syncrepl_entry: rid 101 be_modify (0)
Jul 28 18:34:18 newton slapd[20353]: bdb_modify: xxxxxxxxxxxxxxxxxx.
Jul 28 18:34:18 newton slapd[20353]: bdb_dn2entry("oxxxxxxxxxxxxxxx")
Jul 28 18:34:18 newton slapd[20353]: bdb_modify_internal: 0x00000001: o=xxxxxxxxxxxxxxxxx.
Jul 28 18:34:18 newton slapd[20353]: <= acl_access_allowed: granted to database root
Jul 28 18:34:18 newton slapd[20353]: bdb_modify_internal: replace contextCSN
Jul 28 18:34:18 newton slapd[20353]: => entry_encode(0x00000001): o=xxxxxxxxxxxxxxxxxxxxx.
Jul 28 18:34:18 newton slapd[20353]: bdb_modify: updated id=00000001 dn="xxxxxxxxxxxxxxxx"
Jul 28 18:34:18 newton slapd[20353]: send_ldap_result: conn=-1 op=0 p=0
Jul 28 18:34:18 newton slapd[20353]: send_ldap_result: err=0 matched="" text=""
Jul 28 18:34:18 newton slapd[20353]: daemon: activity on 1 descriptor
Jul 28 18:34:18 newton slapd[20353]: daemon: activity on:
Jul 28 18:34:18 newton slapd[20353]:
Jul 28 18:34:18 newton slapd[20353]: daemon: epoll: listen=7 active_threads=0 tvp=NULL
Jul 28 18:34:29 newton slapd[20353]: daemon: activity on 1 descriptor
Jul 28 18:34:29 newton slapd[20353]: daemon: activity on:
Jul 28 18:34:29 newton slapd[20353]:  14r
Jul 28 18:34:29 newton slapd[20353]:
Jul 28 18:34:29 newton slapd[20353]: daemon: read active on 14
Jul 28 18:34:29 newton slapd[20353]: daemon: epoll: listen=7 active_threads=0 tvp=NULL
Jul 28 18:34:29 newton slapd[20353]: connection_get(14)
Jul 28 18:34:29 newton slapd[20353]: connection_get(14): got connid=0
Jul 28 18:34:29 newton slapd[20353]: =>do_syncrepl rid 101
Jul 28 18:34:29 newton slapd[20353]: =>do_syncrep2 rid 101
Jul 28 18:34:29 newton slapd[20353]: do_syncrep2: rid 101 Can't contact LDAP server
Jul 28 18:34:29 newton slapd[20353]: connection_get(14)
Jul 28 18:34:29 newton slapd[20353]: connection_get(14): got connid=0
Jul 28 18:34:29 newton slapd[20353]: daemon: removing 14
Jul 28 18:34:29 newton slapd[20353]: daemon: activity on 1 descriptor
Jul 28 18:34:29 newton slapd[20353]: daemon: activity on:
Jul 28 18:34:29 newton slapd[20353]:
Jul 28 18:34:29 newton slapd[20353]: daemon: epoll: listen=7 active_threads=0 tvp=NULL
Jul 28 18:34:29 newton slapd[20353]: do_syncrepl: rid 101 quitting
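
The pattern in the log above (a "Can't contact LDAP server" from do_syncrep2, followed by "quitting" from do_syncrepl for the same rid) can be checked for in a longer syslog file with a small Python sketch. The two regular expressions are copied from the messages shown above; the function name quit_after_failure is a hypothetical helper, and other slapd versions may word these messages differently:

```python
import re
from typing import Iterable, List

# Message formats taken from the log excerpt above.
FAIL = re.compile(r"do_syncrep2: rid (\d+) Can't contact LDAP server")
QUIT = re.compile(r"do_syncrepl: rid (\d+) quitting")


def quit_after_failure(lines: Iterable[str]) -> List[str]:
    """Return the rids that logged a connection failure and later quit."""
    failed = set()
    quit_rids = []
    for line in lines:
        m = FAIL.search(line)
        if m:
            failed.add(m.group(1))
            continue
        m = QUIT.search(line)
        if m and m.group(1) in failed:
            quit_rids.append(m.group(1))
    return quit_rids


if __name__ == "__main__":
    import sys
    # Usage: python check_quit.py < /var/log/messages  (path is an assumption)
    print(quit_after_failure(sys.stdin))
```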

I'm running OpenLDAP 2.3.43-12.el5_5.1 from a standard CentOS 5.4 installation.

Am I getting something wrong, and the slave is not supposed to reconnect after the master service restarts, or is this a problem that was fixed in later versions?

Thank you,
Alex