Re: Issue in syncprov findcsn code
Rein Tollevik wrote:
Well, a serverID of 0 is basically the same as no serverID. For
mirrormode/multimaster the serverID must be non-zero. For
single-master the serverID must be zero.
This is not how I read the doc nor the source. But if it was like this
then it should be what I need :-) To enforce it syncprov must be changed
so that:
If serverID is 0 it should only allow one contextCSN value, and it
should have 0 in the sid field. Maybe not required to enforce, but it
should help to quickly identify incorrectly configured servers.
If serverID is not 0 it should not accept contextCSN values from
syncrepl with 0 in the sid field, to make sure it doesn't receive updates
from a single-master configured server.
If serverID is not 0 it must ignore contextCSN values with 0 in the sid
field read from the database. This is to allow a single-master server
to be promoted to a multi-master without leaving the old sid=0 csn
around forever. Hmm, if a csn with sid=0 is found, but none with the
serverID value, then it could maybe be better to replace the sid in that
csn? More hmm, when starting up it would probably be correct to include
entries with 0 in the sid fields of their entryCSN value in those that
could cause the current server's contextCSN to be updated? I expect I'm
not the only one that forgets to add the -S argument to slapadd...
The serverID in existing mirrormode/multimaster configurations that use
0 as the value must be changed, but this should be all that is needed
when upgrading to this version.
What would be the correct action if a contextCSN with an invalid sid
value is received from syncrepl? Asserting it could be a bit too
strict, better to ignore the value and complain loudly in the logs?
Does this make any sense? If so, I'll volunteer to implement.
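
The rules proposed above can be sketched as a small check. This is only an illustration in Python, not syncprov code: the function names are hypothetical, and it assumes the CSN layout timestamp#count#sid#mod with the sid written as three hex digits. Behaviour the proposal leaves open (a single-master server receiving a non-zero sid from syncrepl) is simply accepted here.

```python
import re

# Assumed CSN layout: YYYYmmddHHMMSS.uuuuuuZ#cccccc#sid#mmmmmm (sid = 3 hex digits).
CSN_RE = re.compile(
    r"^\d{14}\.\d{6}Z#[0-9a-fA-F]{6}#([0-9a-fA-F]{3})#[0-9a-fA-F]{6}$"
)


def csn_sid(csn):
    """Return the sid field of a CSN as an integer."""
    m = CSN_RE.match(csn)
    if m is None:
        raise ValueError(f"malformed CSN: {csn!r}")
    return int(m.group(1), 16)


def filter_stored_csns(server_id, csns):
    """Apply the proposed rules to contextCSN values read from the database.

    Returns (kept, complaints): the values that would be used, and the
    messages that would be logged loudly instead of asserting.
    """
    kept, complaints = [], []
    for csn in csns:
        sid = csn_sid(csn)
        if server_id == 0 and sid != 0:
            # Single-master: the only contextCSN should carry sid 0.
            complaints.append(f"serverID is 0 but contextCSN has sid {sid:03x}")
        elif server_id != 0 and sid == 0:
            # Multi-master: ignore a leftover sid=0 value from single-master days.
            complaints.append("ignoring contextCSN with sid 000")
        else:
            kept.append(csn)
    if server_id == 0 and len(kept) > 1:
        complaints.append("serverID 0 allows only one contextCSN value")
    return kept, complaints


def accept_from_syncrepl(server_id, csn):
    """Proposed rule: a server with a non-zero serverID rejects sid=0 CSNs."""
    return not (server_id != 0 and csn_sid(csn) == 0)
```

With serverID 1, a stored value ending in #000#000000 would be dropped with a complaint, while one ending in #001#000000 would be kept.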
To me, it makes a lot of sense and, if well explained in the docs, would
greatly help troubleshooting (or, even better, help set things up the
right way from the start).
My concerns are:
- do we need to consider all those cases and try to repair them? I'd
say: no. Just complain (and refuse to start) if the problem can be
solved by running "slapadd -S <SID>" or "slapcat | sed | slapadd".
- the problem should not occur at run time in a homogeneous,
well-configured system (== same versions, consistent configuration). If
it happens, just give up replication and/or commence a full refresh
(agreed that assert'ing would be bad).
- slapadd could detect from the configuration whether -S is needed
(I don't think it could determine the right SID, but at least it could
complain, and require a --force (to be implemented) if one believes he
knows what he's doing).
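
As an illustration of the "slapcat | sed | slapadd" repair mentioned above, the middle step could look roughly like the following Python filter. It is a sketch under assumptions, not a tested migration tool: the function name is hypothetical, the CSN layout is assumed to be timestamp#count#sid#mod with a three-hex-digit sid, and each CSN attribute is assumed to sit unfolded and unencoded on a single LDIF line.

```python
import re

# Matches an entryCSN/contextCSN line whose sid field is 000 (assumed layout).
CSN_LINE_RE = re.compile(
    r"^((?:entryCSN|contextCSN): \d{14}\.\d{6}Z#[0-9a-fA-F]{6}#)000"
    r"(#[0-9a-fA-F]{6})$"
)


def rewrite_sid(lines, new_sid):
    """Yield LDIF lines with sid 000 replaced by new_sid in CSN attributes.

    Lines that are not CSN attributes, or whose sid is already non-zero,
    are passed through unchanged.
    """
    if not 1 <= new_sid <= 0xFFF:
        raise ValueError("new_sid must fit in three hex digits and be non-zero")
    sid = f"{new_sid:03x}"
    for line in lines:
        body = line.rstrip("\n")
        ending = line[len(body):]  # keep the original line terminator, if any
        m = CSN_LINE_RE.match(body)
        if m:
            yield m.group(1) + sid + m.group(2) + ending
        else:
            yield line


# Possible use as a pipe filter (reading slapcat output on stdin):
#   import sys
#   sys.stdout.writelines(rewrite_sid(sys.stdin, 1))
```

Only values with sid 000 are touched, so CSNs that already carry another server's sid are preserved.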
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Fax: +39 0382 476497
Email: ando@sys-net.it
-----------------------------------