2.1 & 2.2 statistics, and some odd behavior that needs to be examined.
As I'm sure most (if not all) of you are aware, I've been performing a
number of tests on OpenLDAP 2.1 and 2.2, to see how the products compare to
each other, and how different tuning options in 2.2 affect the outcome of
the tests.
They can be seen at:
<http://www.stanford.edu/~quanah/directories/statistics/>
The above page should always work, but I restructure the pages underneath
it periodically, so don't bookmark them. ;)
For the most part, 2.2 is a clear winner over 2.1.
Some general conclusions:
- Btree is a definite win when it comes to running slapadd and slapindex (I
  think this should be at least a configure option in 2.2.6).
- A memory cache is pretty much essential.
- HDB is generally better than BDB (but there are some odd issues with the
  idlcache that I've noted to Howard).
- syncRepl as it currently behaves is not ready for production use (although
  I am corresponding with Jong about this regularly, so this may change in
  the near future).
However, there is a serious threading issue in 2.2 when it is used with
SASL and a disk-based database cache.
You can see this by looking at Tests on Solaris Servers->Performance Tests on
Replica Servers. All of my servers have the same underlying software
packages, so OpenLDAP is the *only* variation on them. This lets me know
that the issue I am seeing must be in OpenLDAP 2.2, or in how OpenLDAP 2.2
interfaces with those packages as compared to how 2.1 interfaces with those
packages. The DB_CONFIG parameters are the same across all the systems.
In 2.1 (using 2.1.24) the system is set up with BDB and a disk-based cache.
The performance test shows an average rate of 74.856 answers/second. Each
query is a SASL/GSSAPI authenticated bind with a filter of (uid=<whatever>)
that returns sumaildrop. The test uses a mixed set of accounts (some uids
exist and have maildrop, some exist and don't have maildrop, and some don't
exist at all). At the worst, I see a 6 answers/second response rate, and at
the best I see a 94 answers/second response rate in the time this test runs.
The test has 30 hosts querying the server for this information. All of the
querying hosts keep querying throughout the test.
In 2.2 (using 2.2.5 with btree patch), the system is set up with BDB and a
disk-based cache (I see the same results using btree or hash indices). The
performance test shows an average rate of 28.5958 answers/second (btree) or
32.7432 answers/second (hash). This is less than half of the performance in
2.1 (roughly 38% to 44% of the 2.1 average)! At the worst, I see 0 answers a
second (22-90 instances) and at the best, I see 116 answers/second (1
instance). What these numbers don't show is that it is *impossible* to keep
all 30 hosts querying the server. They get GSSAPI errors or "Can't contact
LDAP server" errors, and drop off. Once about 6 hosts drop off, the rest
will stay querying the server, with only occasional dropoffs.
However, if I use this same setup with a memory-based cache instead of a
disk-based cache, the performance shoots up to an average of 126
answers/second (BDB) or 177 answers/second (HDB). No hosts die off, and
the range is from 17 answers/second (BDB low) to 189 answers/second (HDB
high).
I have a feeling that if whatever is causing the problem in the disk cache
scenario can be resolved, the memory cache numbers could shoot even
higher.
Another thing to note is that I ran the same disk cache test using simple
(anonymous) binds instead of SASL/GSSAPI binds. I had 1 host of 30 drop
("Can't contact LDAP server"). The remaining 29 ran the server at a whopping
222 answers/second (178 low, 266 high). That is why I finger the threading
and disk cache as being part of the issue.
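To put the averages quoted above on one scale, here is a minimal Python
sketch that divides each reported rate by the 2.1 baseline. The rates are
the ones given in this message; the labels, the dictionary layout, and the
helper name relative_to_baseline are made up for illustration and are not
part of the test harness.

```python
# Average answers/second as reported in this message.
# Baseline: OpenLDAP 2.1.24, BDB, disk-based cache, SASL/GSSAPI binds.
BASELINE = 74.856

# Labels are illustrative only; the numbers come from the text above.
results = {
    "2.2.5 BDB disk cache, SASL/GSSAPI (btree)": 28.5958,
    "2.2.5 BDB disk cache, SASL/GSSAPI (hash)": 32.7432,
    "2.2.5 BDB memory cache, SASL/GSSAPI": 126.0,
    "2.2.5 HDB memory cache, SASL/GSSAPI": 177.0,
    "2.2.5 disk cache, anonymous simple bind": 222.0,
}


def relative_to_baseline(rate, baseline=BASELINE):
    """Return a rate as a multiple of the 2.1 baseline rate."""
    return rate / baseline


ratios = {name: relative_to_baseline(rate) for name, rate in results.items()}

for name, ratio in ratios.items():
    print(f"{name:<45} {ratio:5.2f}x the 2.1 average")
```

Values below 1.0 are slower than 2.1 and values above 1.0 are faster. Only
the two SASL/GSSAPI runs with a disk-based cache fall below the baseline.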
Any ideas on where I can proceed from here to help identify where the
issue(s) are occurring?
--Quanah
--
Quanah Gibson-Mount
Principal Software Developer
ITSS/TSS/Computing Systems
ITSS/TSS/Infrastructure Operations
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html