Re: slapadding a large bdb database
On Mon, Aug 05, 2002 at 11:13:22AM -0700, Howard Chu wrote:
% > Of the 458,904 entries in the ldif, I can only "find" 31,729 of them.
%
% Not sure what to make of that. If you slapcat the database does the output
% match the LDIF files that you fed into slapadd?
Yup, if I slapcat the database, the number of entries matches exactly.
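
The entry-count comparison above can be scripted. This is a minimal sketch, assuming each LDIF entry starts with a "dn:" (or base64 "dn::") line at the beginning of a line; count_ldif_entries is a hypothetical helper name, not part of OpenLDAP. Run it on the lines of both the LDIF fed to slapadd and the slapcat output, then compare the two counts.

```python
from typing import Iterable


def count_ldif_entries(lines: Iterable[str]) -> int:
    """Count entries in LDIF text by counting lines that begin with "dn:".

    Continuation lines in LDIF start with a single space, so a "dn:" that
    appears inside a folded value is never at the start of a line and is
    not counted.
    """
    count = 0
    for line in lines:
        # "dn::" (base64-encoded DN) also starts with "dn:".
        if line[:3].lower() == "dn:":
            count += 1
    return count


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python count_ldif.py input.ldif slapcat.ldif
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as fh:
            print(path, count_ldif_entries(fh))
```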
For laughs, I threw a larger spindle into this machine and added the entire
ldif in one go. It only returns 263,818 entries in this case (out of ~450k).
Removing all the bdb files other than id2entry and reindexing yields the
same results.
What's also interesting is that I get lots (~45,000) of these messages:
Aug 8 16:59:47 oh.roc.frontiernet.net slapadd: bdb(o=frontier): Duplicate data items are not supported with sorted data
when *adding* with slapadd. (I started with a pristine environment - no
files whatsoever.)
Lastly, I noticed that reindexing ~450k entries (330MB of data in ldif
format with about two dozen indices) generates about 6-8GB of BDB
transaction logs. I haven't checked the source: do slap{add,index} perform
all their operations in a single transaction? If so, is there any way to
break these up, perhaps by having slapadd/slapindex use multiple
transactions (i.e., every x entries, recover the database to free up some
transaction logs and start a new transaction)? If not, could I modify the
source to do something like this? I'd like to cut down on disk usage by
purging some transaction logs while slap{add,index} is running.
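
To make the "every x entries" idea concrete, here is a rough sketch of the control flow only. It does not reflect how slapadd or slapindex are actually written, and begin, put, commit and checkpoint are hypothetical callbacks standing in for whatever the backend provides (in Berkeley DB terms, roughly a transaction begin, a write, a transaction commit, and an environment checkpoint after which old log files become candidates for removal).

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_in_batches(
    entries: Iterable[T],
    size: int,
    begin: Callable[[], Any],          # hypothetical: start a transaction
    put: Callable[[Any, T], None],     # hypothetical: write one entry
    commit: Callable[[Any], None],     # hypothetical: commit the transaction
    checkpoint: Callable[[], None],    # hypothetical: checkpoint the environment
) -> int:
    """Write entries using one transaction per batch; return the batch count."""
    committed = 0
    for chunk in batches(entries, size):
        txn = begin()
        for entry in chunk:
            put(txn, entry)
        commit(txn)
        # Checkpointing after each commit is what would let older
        # transaction logs be archived or removed while the load continues.
        checkpoint()
        committed += 1
    return committed
```

With a batch size of x, at most x entries' worth of work is held in any one open transaction, so log space could in principle be reclaimed between batches rather than only at the end.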
john
--
John Morrissey _o /\ ---- __o
jwm@horde.net _-< \_ / \ ---- < \,
www.horde.net/ __(_)/_(_)________/ \_______(_) /_(_)__