Re: null_callbacks after initial sync
Nick Geron wrote:
Howard Chu wrote:
Nick Geron wrote:
We're now thinking some of our issues may be attributable to time
granularity issues. We're seeing missing information on the consumer if
multiple successive writes are attempted via a script. If we slow down
to human speed or insert sleeps in our test code, this gets a little
better. I see that A.2.4 N-Way MultiMaster Replication notes that
entryCSNs now record with microseconds, but does this apply to mirrors
as well?
CSNs were extended to microsecond resolution only for the benefit of
conflict resolution. For all other purposes, the changecount field
ensures sufficient granularity.
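To illustrate the point about the changecount field, here is a minimal sketch of how two writes stamped with the same time can still be ordered. It assumes the 2.4-style CSN layout of timestamp#changecount#serverID#modnumber with hexadecimal counters; the sample CSN values and the names CSN and parse_csn are invented for this example and are not taken from slapd.

```python
from typing import NamedTuple


class CSN(NamedTuple):
    """Fields of a CSN, in comparison order (layout assumed, see lead-in)."""
    timestamp: str  # e.g. "20080115120000.000000Z", fixed width, sorts as text
    count: int      # changecount: separates writes sharing one timestamp
    sid: int        # server ID
    mod: int        # modification number


def parse_csn(text: str) -> CSN:
    """Split a CSN string on '#' and decode the three hexadecimal counters."""
    timestamp, count, sid, mod = text.split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


# Two hypothetical writes that landed within the same timestamp.
first = parse_csn("20080115120000.000000Z#000000#001#000000")
second = parse_csn("20080115120000.000000Z#000001#001#000000")

# Tuples compare field by field, so the changecount breaks the tie.
print(first < second)
```

Because the timestamps are identical, the ordering here comes entirely from the changecount, which is why timestamp resolution alone should not affect propagation.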
In that case, why do we see any difference in propagation between
scripted (quick) updates and hand/command line (slow) modifications? Or
are you simply saying time is not the issue?
Timestamps are not the issue for propagation.
For example - manipulating one particular entry:
1) update server 1 adding 1 attribute = propagates to second server
* wait a few seconds
2) update server 1 adding 4 attributes = first of four propagates to
second server
After waiting a second or so, another successful operation on the
'write' server will propagate all modifications over to the second
server as expected. This behavior is why we suspected a time
granularity issue. It should be noted that this doesn't work for us
(and others as I would expect) as there is no guarantee that another
operation on the 'write' server will occur, thereby propagating the
current entry.
OK, this sounds like the background thread to propagate updates isn't getting
scheduled when it should. That could be a bug in the syncprov overlay.
Can I setup a two node N-Way?
"2" is certainly a valid value of "N".
Well, there's that developer 'charm' I've been reading throughout years
of archives. Since the admin doc makes a distinction between the 'hybrid
configuration' of MirrorMode and N-Way Multi-Master, I was more looking
for clarification between the two implementations.
Then that is what you should have asked. "Looking for clarification between
the implementation of MirrorMode and Multi-Master" is a much clearer question
than "Can I setup a two node N-Way", and there is no way one could logically
get from the latter to the former, based on the context of your email. If you
don't ask useful questions, you have only yourself to blame when you don't get
useful answers.
There is no difference now between the MirrorMode and Multi-Master code. The
only difference is purely a matter of usage. In a MirrorMode setup you use an
external frontend that guarantees that writes are only directed to one server.
As long as that guarantee is kept, your servers will have perfect data
consistency. In a Multi-Master setup, you allow writes to any server, and the
data consistency is not guaranteed. In that case the CSNs are used for
conflict resolution; when competing writes are made to the same entries the
last writer wins. (Note - the servers will all eventually converge on a
consistent view of the data, the issue is that the resulting data may not
resemble what you expected. If your servers' clocks are not tightly
synchronized, it's pretty certain to be different from what you expected.)
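The last-writer-wins behaviour and the effect of clock skew can be sketched as follows. This is a simplified model, not the syncrepl code: it assumes competing writes are ordered by comparing their CSN strings as fixed-width text, and the CSN values, server IDs, and the helper name resolve are invented for illustration.

```python
def resolve(writes):
    """Return the (csn, value) pair with the greatest CSN: last writer wins."""
    return max(writes, key=lambda write: write[0])


# Server 1 has an accurate clock and writes at 12:00:05.
write_a = ("20080115120005.000000Z#000000#001#000000", "value from server 1")

# Server 2 writes three seconds later in real time, but its clock runs
# ten seconds slow, so the write is stamped 11:59:58.
write_b = ("20080115115958.000000Z#000000#002#000000", "value from server 2")

winner = resolve([write_a, write_b])
print(winner[1])
```

In this model both servers converge on server 1's value, although server 2's write happened later in real time. That is the "may not resemble what you expected" case described above.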
Syncrepl doesn't write session logs. Read RFC4533.
I'll look into it. Thanks.
Switching gears, what would the devs say are the capabilities, in
operations per second, of 2.4.7?
I've recently run back-hdb with a 5GB database at 20,000 indexed
searches/second concurrent with 13,000 modifies/second on an 8 core Opteron
server (1.9GHz cores). This was tested using slamd and ~80 client threads,
sustained over a 2 hour run.
I'm seeing a number of aborts when
testing under high load. The latest came from running scripted
ldapsearches and ldapmodifies which resulted in a mutex error (or so I
am told by one of our developers).
Specifically:
1) adding about 100 attributes to an entry
2) diffing the output of ldapsearch between the two nodes in a loop
3) once synced, grabbing the attributes, shoving them in a temp file
with delete instructions and using that with ldapmodify.
I compiled with debugging on, which results in an abort with
"connection.c: 676: connection_state_closing: Assertion 'c_struct_state
== 0x02' failed" logged.
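For anyone trying to reproduce this, steps 2 and 3 of the test above might look roughly like the sketch below. It is not the original script: the function names, the DN, and the attribute values are hypothetical, the LDIF text is assumed to have already been captured from ldapsearch on each node, and values are assumed to be plain ASCII with no folded or base64-encoded lines.

```python
def in_sync(ldif_a, ldif_b):
    """Step 2: treat two ldapsearch dumps as equal if they hold the same lines."""
    def lines(text):
        return sorted(line for line in text.splitlines() if line.strip())
    return lines(ldif_a) == lines(ldif_b)


def build_delete_ldif(dn, attrs):
    """Step 3: build ldapmodify input that deletes the given attribute values."""
    out = [f"dn: {dn}", "changetype: modify"]
    for name, values in attrs.items():
        out.append(f"delete: {name}")
        out.extend(f"{name}: {value}" for value in values)
        out.append("-")  # separator between modifications on the same entry
    return "\n".join(out) + "\n"


# Hypothetical entry and attributes, for illustration only.
ldif = build_delete_ldif(
    "cn=test,dc=example,dc=com",
    {"description": ["one", "two"]},
)
print(ldif)
```

The resulting text would be written to a temporary file and passed to ldapmodify with -f, as in the steps listed above.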
Interesting. It would be useful to get a gdb stack trace from that situation.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/