Hi,
I found the reason that slapd was hanging at startup. It turned out to be
a schema that hadn't been properly replicated after being dynamically
added.
So now replication is actually moving entries. However, it seems to
constantly lose its connection (which may be why the schema sometimes
fails to replicate on load).
The setup is 2 mirrormode servers (slapd 2.4.17). Server 1 has the
database and is trying to replicate it to Server 2, which was empty from
the start.
I have syncrepl set up for both cn=config and the actual database, which
means that I should see 4 connections (2 each way) between server 1
and 2. But the last connection (server2->server1) seems to open and close
constantly.
On server 2 I see these messages repeated:
Oct 7 09:47:14 s02 slapd[26723]: do_syncrepl: rid=001 rc -1 retrying
Oct 7 09:47:28 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server
Oct 7 09:47:28 s02 slapd[26723]: do_syncrepl: rid=003 rc -1 retrying
Oct 7 09:48:49 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server
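In case it helps anyone compare notes, here is a rough sketch of how the
failures could be tallied per rid from a syslog excerpt. It is only an
illustration: the function name tally_sync_events and the regular
expression are my own assumptions, based purely on the line format shown
above, and the sample is just the four lines quoted in this mail.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above, e.g.
#   Oct 7 09:47:28 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server
#   Oct 7 09:47:14 s02 slapd[26723]: do_syncrepl: rid=001 rc -1 retrying
SYNC_LINE = re.compile(r"slapd\[\d+\]: (do_syncrep[l2]): rid=(\d+)")


def tally_sync_events(lines):
    """Count log lines per (rid, function name) pair. Hypothetical helper."""
    counts = Counter()
    for line in lines:
        match = SYNC_LINE.search(line)
        if match:
            func, rid = match.group(1), match.group(2)
            counts[(rid, func)] += 1
    return counts


# The four log lines quoted above, with the mail wrapping undone.
sample = [
    "Oct 7 09:47:14 s02 slapd[26723]: do_syncrepl: rid=001 rc -1 retrying",
    "Oct 7 09:47:28 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server",
    "Oct 7 09:47:28 s02 slapd[26723]: do_syncrepl: rid=003 rc -1 retrying",
    "Oct 7 09:48:49 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server",
]

for (rid, func), n in sorted(tally_sync_events(sample).items()):
    print(f"rid={rid} {func}: {n}")
```

Run against a full log file instead of the sample, this should show
whether one rid accounts for most of the retries.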
When adding olcLogLevel: conns sync trace none, I see the log messages I
would expect, mixed with a lot of these:
Oct 7 10:41:52 s02 slapd[26723]: slap_sl_malloc of 48 bytes failed, using ch_malloc
Oct 7 10:41:52 s02 slapd[26723]: slap_sl_malloc of 40 bytes failed, using ch_malloc
... coming in bursts with a varying number of bytes. However, the machine
doesn't look like it's running out of memory.