Re: slapadd/slapindex
Howard Chu wrote:
Some observations regarding slapadd performance... The ideal is to
have enough memory to configure a BDB cache large enough for all of
the database files. Failing that, it's best to run slapadd and
slapindex separately.
For my test database with 360MB input LDIF and 285,000 entries and 15
indexed attributes, using a 512MB BDB cache, slapadd -q with indexing
took 1 hour 20 minutes.
With the IDL caching patch in HEAD, and IDL cachesize at 50,000, this
dropped to 1 hour even.
Running slapadd -q with no indexing took only 1 minute 15 seconds.
The resulting id2entry database is about 800MB; with all indexing the
total size is around 2.1GB.
Running slapindex with this BDB environment is pretty slow. But by
setting BDB to mmap files of 800MB or less, and deleting the
environment so that id2entry is mmap'd directly instead of being
double-buffered through the BDB cache, the slapindex -q time drops to
26 minutes without IDL caching, and 20 minutes with caching.
Using two threads (my machine is dual-core) the slapindex -q time is now
only 10 minutes, using a BDB cache of 768MB and no IDL caching. Adding
IDL caching here slows it down to 15 minutes, so I've decided to disable
that bit of code by default.
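
To put the numbers above side by side, here is a rough Python sketch that adds the 1 minute 15 second load to each slapindex time and compares the total against the 1 hour 20 minute combined run. The labels and the helper name total_and_speedup are mine, purely for illustration. The runs did not all use the same cache settings, so treat the ratios as ballpark figures only.

```python
# Timings quoted in this message, in minutes.
COMBINED = 80.0    # slapadd -q with indexing, 512MB BDB cache
LOAD_ONLY = 1.25   # slapadd -q with no indexing

# slapindex -q times for the separate indexing pass.
INDEX_RUNS = {
    "1 thread, mmap, no IDL cache": 26.0,
    "1 thread, mmap, IDL cache": 20.0,
    "2 threads, 768MB cache, no IDL cache": 10.0,
    "2 threads, 768MB cache, IDL cache": 15.0,
}


def total_and_speedup(index_minutes, baseline=COMBINED):
    """Return (load + index time, baseline / that total)."""
    total = LOAD_ONLY + index_minutes
    return total, baseline / total


if __name__ == "__main__":
    for label, minutes in INDEX_RUNS.items():
        total, speedup = total_and_speedup(minutes)
        print(f"{label}: {total:.2f} min total, {speedup:.1f}x vs combined")
```

For the two-thread run without IDL caching, that works out to 11.25 minutes in total, roughly 7 times faster than the combined slapadd with indexing.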
I've added a new global config parameter "tool-threads" (olcToolThreads)
to control how many threads the indexer will use. The default value is
1. For multiple threads, all of the attributes that need indexing in an
entry are divided up among the threads, so only one entry is in
progress at a time. In my tests there was no advantage to using more
threads than there are processors. More testing would be welcome...
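
The division of work described above can be illustrated with a small Python sketch. This is not the slapindex code: the function names, the round-robin split, and the dict-based entries and index are all assumptions made for the example. It only shows the shape of the idea, where each thread takes its own slice of the indexed attributes and the next entry is not started until every thread has finished the current one.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(attrs, n_threads):
    """Split attrs into n_threads round-robin slices (some may be empty)."""
    return [attrs[i::n_threads] for i in range(n_threads)]


def index_entry(entry, slices, pool, index):
    """Index one entry, with each worker handling one slice of attributes."""
    def work(attr_slice):
        # Each attribute lives in exactly one slice, so no two workers
        # ever write to the same per-attribute index.
        for attr in attr_slice:
            for value in entry.get(attr, []):
                index[attr].setdefault(value, []).append(entry["id"])

    # list() waits for every worker, so only one entry is in progress.
    list(pool.map(work, slices))


def build_index(entries, indexed_attrs, n_threads=1):
    """Build {attr: {value: [entry ids]}} using n_threads workers."""
    index = {attr: {} for attr in indexed_attrs}
    slices = partition(indexed_attrs, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for entry in entries:
            index_entry(entry, slices, pool, index)
    return index


if __name__ == "__main__":
    entries = [
        {"id": 1, "cn": ["alice"], "mail": ["alice@example.com"]},
        {"id": 2, "cn": ["bob"], "mail": ["bob@example.com"], "uid": ["bob"]},
    ]
    print(build_index(entries, ["cn", "mail", "uid"], n_threads=2))
```

Because the work is split per attribute within a single entry, the result is the same for any thread count; only the elapsed time should change.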
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/