slapadd, 1 million entries, some numbers
Being one of the folks who said that openldap was slow, I decided
to run some tests.
Yesterday I tried to add about 1 million entries via slapadd. Results
are below. The machine is a P4 2.4GHz, 1GB RAM, cheap 20GB UDMA IDE disk,
P4PE motherboard.
Configuration:
openldap-2.1.19
db-4.1.25
linux-kernel-2.4.21-pre7-ac1
no indexing, schemacheck off, dbnosync set, DB_TXN_NOSYNC set
# time slapadd -l saida.ldif
real 233m3.877s
user 7m51.880s
sys 1m11.710s
I noticed via top (and according to the time output above) that slapadd
spends a considerable amount of time in the D state, that is, in
uninterruptible sleep, usually waiting for disk I/O to complete. I suppose this
is due to the heavy logging the BDB backend does. I got about 2.7G worth of log files.
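A quick back-of-the-envelope check of the time output is consistent with this: if user plus sys time is only a small fraction of the wall clock, the process was mostly waiting rather than computing. The sketch below only redoes that arithmetic from the three figures printed by time; the helper name to_seconds is made up for illustration.

```python
# Timings copied verbatim from the `time slapadd -l saida.ldif` output.
def to_seconds(stamp: str) -> float:
    """Convert a time(1) value such as '233m3.877s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)

real = to_seconds("233m3.877s")
user = to_seconds("7m51.880s")
sys_time = to_seconds("1m11.710s")

# CPU time actually consumed by the process (user mode + kernel mode).
cpu = user + sys_time
cpu_share = 100 * cpu / real

print(f"wall clock: {real:.1f} s")
print(f"CPU time:   {cpu:.1f} s ({cpu_share:.1f}% of wall clock)")
```

That works out to roughly 4% of the elapsed time on the CPU, so about 96% of the run was spent off the CPU, which fits the D state seen in top.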
I'm about to repeat this test on another machine with two SCSI disks, 2 HT
CPUs, openldap-2.1.20, and the log dir on the other disk.
# grep dn: saida.ldif |wc -l
1081802
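Combining this entry count with the elapsed time gives a load rate, and with the log volume a rough log cost per entry. This is only a sketch of the arithmetic: it treats "about 2.7G" as 2.7 GiB, which is an assumption, and the variable names are made up for illustration.

```python
entries = 1_081_802            # from `grep dn: saida.ldif | wc -l`
elapsed = 233 * 60 + 3.877     # "real" time of the slapadd run, in seconds
log_bytes = 2.7 * 1024**3      # "about 2.7G" of logs; GiB is an assumption

rate = entries / elapsed                 # entries loaded per second
ms_per_entry = 1000 * elapsed / entries  # wall-clock cost of one entry
log_per_entry = log_bytes / entries      # transaction log written per entry

print(f"{rate:.1f} entries/s")
print(f"{ms_per_entry:.1f} ms per entry")
print(f"{log_per_entry / 1024:.1f} KiB of log per entry")
```

So the load ran at roughly 77 entries per second, about 13 ms per entry, and wrote on the order of 2.6 KiB of log for each entry.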
Relevant parts of slapd.conf:
schemacheck off
database bdb
dbnosync
checkpoint 100000 10
DB_CONFIG:
set_flags DB_TXN_NOSYNC
#set_lg_dir /storage/ldap
set_lg_max 104857600
(I will repeat the test with the log dir set to another disk)
Meanwhile, I'm running slapindex on that first machine, and it is also spending
a lot of time in the D state, as I imagine is expected. I will repeat all this on
a SCSI machine as well.
Meanwhile (again :), does anybody see any obvious mistake besides the ones I
already mentioned (IDE disk, log on the same disk as the database)? Is cache
relevant for bulk loading data?
A sample of the LDIF file (900 fictitious branches, 1200 test subjects per branch):
# head -50 saida.ldif
dn: o=Company
o: SP
objectClass: top
objectClass: organization

dn: ou=Branches, o=Company
ou: Escolas
objectClass: top
objectClass: organizationalUnit

dn: ou=Branch-1, ou=Branches, o=Company
ou: Branch-1
objectClass: top
objectClass: organizationalUnit

dn: ou=People, ou=Branch-1, ou=Branches, o=Company
ou: People
objectClass: top
objectClass: organizationalUnit

dn: uid=Emp-1, ou=People, ou=Branch-1, ou=Branches, o=Company
uid: Emp-1
cn: Emp-1-cn
givenName: Emp-1-gn
sn: Emp-1-sn
mail: Emp-1@Branch-1.company.com
uidNumber: 1001
gidNumber: 1001
homeDirectory: /home/Emp-1
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
objectClass: top
objectClass: posixAccount
(...)
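For anyone who wants to reproduce a data set with the same shape, here is a minimal sketch of a generator. It is not the script used to build saida.ldif. Several details are assumptions because the sample does not show them: employees are numbered globally, uidNumber and gidNumber both start at 1001 and increase by one, and the o/ou values are taken from the DNs. All function names are made up for illustration.

```python
from typing import Iterator

BASE = "o=Company"

def entry(dn: str, attrs: list[tuple[str, str]]) -> str:
    """Format one LDIF entry, terminated by the blank separator line."""
    lines = [f"dn: {dn}"] + [f"{name}: {value}" for name, value in attrs]
    return "\n".join(lines) + "\n\n"

def ou_entry(name: str, parent: str) -> str:
    """An organizationalUnit directly below `parent`."""
    return entry(f"ou={name}, {parent}",
                 [("ou", name), ("objectClass", "top"),
                  ("objectClass", "organizationalUnit")])

def person_entry(number: int, branch: str, people_dn: str) -> str:
    uid = f"Emp-{number}"
    posix_id = str(1000 + number)  # assumption: ids start at 1001
    return entry(f"uid={uid}, {people_dn}", [
        ("uid", uid), ("cn", f"{uid}-cn"), ("givenName", f"{uid}-gn"),
        ("sn", f"{uid}-sn"), ("mail", f"{uid}@{branch}.company.com"),
        ("uidNumber", posix_id), ("gidNumber", posix_id),
        ("homeDirectory", f"/home/{uid}"),
        ("objectClass", "person"), ("objectClass", "organizationalPerson"),
        ("objectClass", "inetOrgPerson"), ("objectClass", "top"),
        ("objectClass", "posixAccount"),
    ])

def generate(branches: int, people: int) -> Iterator[str]:
    """Yield entries parent-first, so slapadd always sees a parent before its children."""
    yield entry(BASE, [("o", "Company"), ("objectClass", "top"),
                       ("objectClass", "organization")])
    branches_dn = f"ou=Branches, {BASE}"
    yield ou_entry("Branches", BASE)
    number = 0  # assumption: employees are numbered across all branches
    for b in range(1, branches + 1):
        branch = f"Branch-{b}"
        branch_dn = f"ou={branch}, {branches_dn}"
        yield ou_entry(branch, branches_dn)
        people_dn = f"ou=People, {branch_dn}"
        yield ou_entry("People", branch_dn)
        for _ in range(people):
            number += 1
            yield person_entry(number, branch, people_dn)

def expected_entries(branches: int, people: int) -> int:
    """2 top-level entries, 2 OUs per branch, plus the people themselves."""
    return 2 + 2 * branches + branches * people

# Small demo: one branch with one person prints five entries.
print("".join(generate(branches=1, people=1)), end="")
print(expected_entries(900, 1200))  # 1081802, the count reported by grep
```

To build the full file, the output of generate(900, 1200) would be written to disk entry by entry rather than joined in memory.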