Index problem with 2.0.11
Hello everyone,
I am somewhat new to OpenLDAP and have been working on a
prototype directory for several weeks now. I am trying
to better understand the performance characteristics of
OpenLDAP. The problem is that I have a rather long-
running EQ query that does not appear to be using the
specified indexes.
First, the basics:
SERVER
Pentium 733
256 MB RAM
10 GB HD
Red Hat Linux 7.1
Berkeley DB 3.1.17 (from Red Hat RPMs)
OpenLDAP 2.0.11 (compiled from source)
CLIENT
Pentium 450
128 MB RAM
7 GB HD
Windows NT Server 4.0, SP 6
DirectoryMark 1.2
The LDIF file (and the directory) were generated by
DirectoryMark 1.2, made by Mindcraft. In case you're
not familiar with it, it's a Windows NT-based testing
tool for LDAP servers. It generates a namespace that
looks like this:
dc=company,dc=com
ou=Product Development,dc=company,dc=com
ou=Product Testing,dc=company,dc=com
ou=Accounting,dc=company,dc=com
ou=Human Resources,dc=company,dc=com
ou=Payroll,dc=company,dc=com
cn=Joelly Chang,ou=Product Testing,dc=safeco,dc=com
I generated an initial set of 1000 entries under the 5
organizational units, loaded them using ldapadd, and
then indexed them using slapindex (yes, I shut down
slapd while indexing). I then started testing. The
average was 51 operations per second, where each
operation was a CN, SN, or seeAlso search.
I next generated a set of 250,000 entries, and loaded
and indexed them the same way with ldapadd and
slapindex. I then started testing. The average was 0.1
operations per second, with 151 failures of 151
operations in 10 minutes. This was not acceptable. I
then ran the following test with ldapsearch:
ldapsearch -x -P 3 -s sub "(cn=Joelly Chang)"
This search returned one record (the entry is unique)
after one minute, and then kept running until it hit the
time limit of 6 minutes. I wanted to check my
slapd.conf, to make certain that CN was indexed.
SLAPD.CONF
(global)
LDAP_Version_3
loglevel 0
idletimeout 30
sizelimit 100
timeout 3600
defaultsearchbase "dc=company,dc=com"
(LDBM section)
index cn, sn eq,sub
index description eq
index seeAlso eq
dbnolocking
dbnosync
cachesize 1500
dbcachesize 150000
I then turned on loglevel 32, to see what was
happening. Once again, I used the same query.
SLAPD.LOG
Jun 28 11:04:40 ldap slapd[2065]: EQUALITY
Jun 28 11:04:40 ldap slapd[2065]: end get_filter 0
Jun 28 11:04:40 ldap slapd[2065]: ^IAND
Jun 28 11:04:40 ldap slapd[2065]: ^IDN SUBTREE
Jun 28 11:04:40 ldap slapd[2065]: ^IDR
Jun 28 11:04:40 ldap slapd[2065]: ^INEQUALITY
Jun 28 11:04:40 ldap slapd[2065]: ^INEQUALITY
Jun 28 11:04:40 ldap slapd[2065]: => test_filter
Jun 28 11:04:40 ldap slapd[2065]: EQUALITY
Jun 28 11:04:40 ldap slapd[2065]: <= test_filter 5
The last three lines (above) repeat all the way to the
end of the file. Somewhere in the middle I'm sure that
it actually finds the single record. What's interesting
here is that the file is 750,025 lines long. Assuming a
7-line header, and that each found record is 23 printed
lines (when displayed), plus two for the "test filter",
I can only conclude that the search looked at each
individual record in the directory. This makes me doubt
that the indexes are being used.
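A quick sanity check on that line count, as a rough sketch. It assumes (from the excerpt above, not from the slapd source) that the log opens with the 7 lines shown before the first "=> test_filter" and that each tested entry then adds exactly 3 lines. Under those assumptions the count matches a scan of the 250,000 generated entries plus the 6 container entries (the suffix and the 5 organizational units).

```python
# Assumptions read off the log excerpt, not confirmed against slapd itself:
HEADER_LINES = 7       # lines logged before the first "=> test_filter"
LINES_PER_ENTRY = 3    # "=> test_filter", "EQUALITY", "<= test_filter 5"


def entries_tested(total_log_lines):
    """Return how many entries were filter-tested, or None if the count does not fit."""
    body = total_log_lines - HEADER_LINES
    if body < 0 or body % LINES_PER_ENTRY != 0:
        return None
    return body // LINES_PER_ENTRY


tested = entries_tested(750_025)
print(tested)  # 250006 = 250,000 generated entries + 1 suffix + 5 OUs
```

If the equality index were being consulted, the filter would be expected to run against only a handful of candidate entries, not a number this close to the size of the whole directory.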
So, to try a different index structure, I changed
SLAPD.CONF to the following:
index cn eq,sub
index sn eq
index description eq
index seeAlso eq
(These indexes are based on the types of searches
performed by DirectoryMark. I'd change them in a
production system.)
I then shut down slapd, ran slapindex, and started up
slapd. Once again, I ran the same "(cn=Joelly Chang)"
query and checked the log file 6 minutes later. Again, it was
750,025 lines long, with the same byte size.
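Comparing total line counts works, but counting the filter evaluations directly is a bit more telling. The following is a minimal sketch that tallies "=> test_filter" calls and the return codes logged on "<= test_filter" lines. The function name is made up for illustration, and the line format is assumed to match the excerpt shown earlier.

```python
from collections import Counter


def summarize_filter_log(lines):
    """Count test_filter calls and tally their logged return codes."""
    calls = 0
    results = Counter()
    for line in lines:
        # Drop the syslog prefix ("Jun 28 ... slapd[2065]: ") when present.
        _, sep, message = line.partition("]: ")
        message = (message if sep else line).strip()
        if message.startswith("=> test_filter"):
            calls += 1
        elif message.startswith("<= test_filter"):
            # The last token on the line is the logged return code.
            results[message.split()[-1]] += 1
    return calls, results


sample = [
    "Jun 28 11:04:40 ldap slapd[2065]: => test_filter",
    "Jun 28 11:04:40 ldap slapd[2065]: EQUALITY",
    "Jun 28 11:04:40 ldap slapd[2065]: <= test_filter 5",
]
calls, results = summarize_filter_log(sample)
print(calls, dict(results))  # 1 {'5': 1}
```

To run it over a real log, pass an open file object in place of `sample`. Where syslog writes slapd output depends on the local syslog configuration, which is not given here.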
It appears as though the index on CN is not being used,
and that the system is in fact looking at each record
individually. While it is good to know that slapd can
manually review 250,000 rows in 6 minutes, I would
rather have it handle 250,000 queries in 6 minutes.
I then tested a database with 10,000 records. Using 7
client threads, DirectoryMark reported 817 operations in
10 minutes. The average response time was 5148 ms, which
works out to about 1.4 operations per second. There were
no failures and no timeouts.
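These figures can be cross-checked against each other. The sketch below assumes that each of the 7 client threads issues its next search as soon as the previous one returns. That is an assumption about how DirectoryMark drives load, not something stated in its output.

```python
def ops_per_second(operations, seconds):
    """Throughput measured directly from the operation count."""
    return operations / seconds


def closed_loop_rate(threads, avg_response_seconds):
    """Throughput expected if every thread always has one request in flight."""
    return threads / avg_response_seconds


measured = ops_per_second(817, 10 * 60)   # 817 operations in 10 minutes
predicted = closed_loop_rate(7, 5.148)    # 7 threads, 5148 ms average
print(round(measured, 2), round(predicted, 2))  # 1.36 1.36
```

The two rates agree, which suggests the clients spent essentially the whole run waiting on the server, so the roughly 5-second search time is what limits throughput.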
With one client search, and logging turned on, the log
file grew to 30,025 lines, although the data came back
faster. This still indicates that the CN index is not
being used.
Any help will be greatly appreciated.
Kayne McGladrey