Re: OL 2.3.18 syncrepl vs slurpd
Howard Chu wrote:
Francis Swasey wrote:
Folks,
I'm attempting to convert from using slurpd to using syncrepl.
However, my testing is leading me to believe that syncrepl is
hopelessly unable to keep up.
I have a test setup in which I loaded a 48,819-entry LDIF using
slapadd -q -w on the master and slapadd -q on the replica. I then
performed 12,654 modrdns, 56 modifies, and 961 delete/add actions
in rapid succession.
Did you verify that the syncrepl consumer was actually idle before you
started your tests? syncrepl requires a contextCSN attribute to be
present on both the provider and on the consumer. The "-w" option to
slapadd causes the contextCSN attribute to be written, so that means
your provider's database was immediately usable. But then you need to
copy that value over to the consumer. If the LDIF file that you
slapadd'd on the consumer came from slapcat'ing the provider, then
you're all set, because it contains all the operational attributes,
including the contextCSN attribute. But if you slapadd'd a plain input
LDIF file on the consumer, then it had no contextCSN attribute, and so
it would have to suck the entire database down from the provider before
it considered itself sync'd up.
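One way to check this before loading is to look for the contextCSN attribute in the LDIF file itself. The following is only a minimal sketch: the function name find_context_csn and the sample LDIF (including its CSN value) are invented for illustration, and it assumes plain "name: value" lines with no base64 encoding or line folding.

```python
def find_context_csn(ldif_text):
    """Return every contextCSN value found in the LDIF text (empty list if none)."""
    values = []
    for line in ldif_text.splitlines():
        # Split "name: value" at the first colon; lines without a colon are skipped.
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "contextcsn":
            values.append(value.strip())
    return values


# Hypothetical slapcat-style output; the CSN value is made up.
sample = """dn: dc=example,dc=com
objectClass: domain
dc: example
contextCSN: 20060912155425Z#000000#00#000000
"""

print(find_context_csn(sample))
```

An empty result for the file loaded on the consumer would suggest that the consumer has to pull the entire database from the provider before it is in sync.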
I believe that I verified the syncrepl consumer was idle. I set the
loglevel on the consumer to 16640 (256 stats + 16384 LDAPSync). The
syslog was quiet for several minutes, and no timestamps changed in
the database directory, before I started the test.
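The loglevel arithmetic above can be checked with a short sketch. The two values come from the message; the constant names and the helper set_bits are invented for illustration.

```python
# Loglevel bits as given in the message above.
LOGLEVEL_STATS = 256     # stats logging
LOGLEVEL_SYNC = 16384    # LDAPSync (syncrepl) logging

# slapd loglevels are bit flags, so they combine with a bitwise OR.
combined = LOGLEVEL_STATS | LOGLEVEL_SYNC


def set_bits(level):
    """Break a combined loglevel back down into its individual flag values."""
    return [1 << i for i in range(level.bit_length()) if level & (1 << i)]


print(combined)            # 16640
print(set_bits(combined))  # [256, 16384]
```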
The LDIF that I loaded on both the syncrepl provider and the syncrepl
consumer was generated by a slapcat on the syncrepl provider, after
the original plain LDIF file had been loaded with the -w flag to
generate the contextCSN attribute. I have verified that the contextCSN
is in the LDIF that was loaded. However, since I used the -w flag on
the slapadd, would that have regenerated a (possibly) different
contextCSN value in the syncrepl provider's database?
With that prerequisite aside, it's well understood that syncrepl is
slower than slurpd for a number of reasons.
Ah, it wasn't well understood by me that it was designed to be slower.
Since syncrepl sends whole entries rather than just modifications, it
uses a lot more network
bandwidth than slurpd. It also causes a lot more database update
activity on the consumers. We can take steps to make some of the
database activity more efficient, but the network load is still an
issue. That's why Symas developed the delta-syncrepl mode of operation,
which uses the accesslog data format to propagate modifications instead
of whole entries. Of course, delta-syncrepl has its own performance cost
since it serializes write operations. (The serialization is two-phase,
so you can have two writes in progress at a time.) There's an upside
and a downside to this: the downside is that serialization limits the
number of simultaneous write operations; the upside is that you
generally get zero database deadlocks this way, so every modification
completes much faster.
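The bandwidth point can be illustrated with a rough size comparison. This is only a sketch under loud assumptions: the entry, its attribute values, and the helper ldif_size are all invented, and the simple "name: value" serialization is a stand-in for payload size, not the actual syncrepl or accesslog wire format.

```python
def ldif_size(attrs):
    """Bytes needed to write the attributes as simple 'name: value' lines."""
    return sum(
        len(f"{name}: {value}\n".encode("utf-8"))
        for name, values in attrs.items()
        for value in values
    )


# Hypothetical entry, invented for illustration only.
entry = {
    "objectClass": ["inetOrgPerson"],
    "cn": ["Test User"],
    "sn": ["User"],
    "uid": ["tuser"],
    "mail": ["tuser@example.com"],
    "telephoneNumber": ["+1 555 0100"],
}

# A single-attribute modification to that entry.
changed = {"mail": ["test.user@example.com"]}

# Sending the whole entry means resending every attribute, changed or not.
whole_entry = {**entry, **changed}

print(ldif_size(whole_entry), ldif_size(changed))
```

The gap grows with the size of the entry, since the modification stays the same size while the whole-entry payload carries every unchanged attribute.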
I guess I'd better investigate the delta-syncrepl mode.
Frank