Hi Howard,

Thank you very much! This explains a *lot* :D

For the moment, however, we have 370 facilities using this information system, and sadly a whole bunch of scripts which do their operations from the base. Is there anything else you can suggest we do to work around this?
Thanks,
Tim

On 11/11/10 22:45, Howard Chu wrote:
Tim Dyce wrote:
> Hi Howard,
>
> Thanks for the help :D We have been testing in ramdisk as well, to make
> sure that disk thrashing is not the root cause. If your searches are not
> running long enough to show up for profiling, increase the number of
> second-level entries until you get something you can profile.

Ah, there's a bug in your script: it's creating the 2nd-level entries with the wrong DN, so the DB never had more than 250 entries.

Now I've fixed that and run again, and I can see the behavior you're talking about. It's actually due to a particular design choice:

Ordinarily, at each level of the tree we keep an index tallying all of the children beneath that point. In back-bdb this index is used for subtree searches and for onelevel searches. (In back-hdb it's only used for onelevel.) However, as a write optimization, for the root entry of the DB we don't bother to maintain this index; it's simply set to "All entries". (Otherwise, in the back-bdb case, there is far too much write overhead to maintain this index.)

The problem is that "All entries" is actually a range 1 to N, where N is the ID of the last entry in the DB. (And 1 is the ID of the root entry.) As you add entries, N keeps increasing, but 1 stays constant.

When you do a subtree search, every entryID in the corresponding index slot is checked. In this case, with a subtree search starting at the root entry, you will always be iterating through every ID from 1 through N, even though many of those IDs have been deleted, and it takes time for BDB to return "no such object" for all the deleted IDs.

If you do all of your operations under a child entry instead of the database root entry, the performance will be constant. I've already verified this with a modified copy of your test.
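The effect Howard describes can be illustrated with a toy model (this is not OpenLDAP code, just a sketch of the arithmetic): a subtree search rooted at the suffix scans the whole "All entries" range 1..N, paying a failed lookup for every deleted ID, while a search under a child entry only visits the IDs its maintained index actually lists.

```python
# Toy model of the "All entries" range IDL: after add/delete churn the
# live entry count stays constant, but the last-used ID (N) keeps
# growing, so a root-based subtree scan keeps getting more expensive.

def subtree_search_from_root(db, last_id):
    """Scan the range 1..N; deleted IDs still cost a (failed) lookup."""
    lookups = 0
    hits = []
    for entry_id in range(1, last_id + 1):
        lookups += 1              # one BDB fetch attempt per ID
        if entry_id in db:        # "no such object" for deleted IDs
            hits.append(entry_id)
    return hits, lookups

def subtree_search_from_child(db, child_index):
    """A maintained per-entry index lists only the live children."""
    hits = [eid for eid in child_index if eid in db]
    return hits, len(child_index)

# Simulate the test workload: repeatedly delete about a third of the
# entries and re-add the same number, so IDs are never reused.
db = {}
child_index = set()
next_id = 1
for cycle in range(100):
    for eid in sorted(db)[: len(db) // 3]:
        del db[eid]
        child_index.discard(eid)
    for _ in range(max(250 - len(db), 0)):
        db[next_id] = "entry"
        child_index.add(next_id)
        next_id += 1

hits_root, cost_root = subtree_search_from_root(db, next_id - 1)
hits_child, cost_child = subtree_search_from_child(db, child_index)
print(f"live entries: {len(db)}, root-scan lookups: {cost_root}, "
      f"child-index lookups: {cost_child}")
```

With 250 live entries throughout, the child-index search always does 250 lookups, while the root scan grows by the churn amount every cycle, matching the per-cycle slowdown seen in the test.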
I can post it if you wish.

Thanks,
Tim

On 11/11/10 21:38, Howard Chu wrote:
Tim Dyce wrote:
> Hi Dieter,
>
> Thanks for the tips on tuning; sadly the problem is still haunting us :(
> Andrey Kiryanov at CERN has been doing a lot of work on this performance
> degradation problem as well. He has tried BDB 4.8.30 and OpenLDAP 2.4.23,
> but the problem is still apparent.

I've run the test setup you provided here http://www.openldap.org/lists/openldap-technical/201010/msg00237.html but so far I'm seeing constant (0:00.0 second) results from ldapsearch. Some differences: I used back-hdb, which is going to be superior for a heavy add/delete workload. Also, my test DB is running on a tmpfs (RAMdisk).

> The basic test we are running (sent earlier) creates 100 ou entries in
> the root, each with 250 child ou entries, then deletes 20-35% of these
> and re-adds them. For each deletion cycle the ldapsearch performance
> degrades, taking longer to complete the search each time.
>
> The performance is consistent across restarts of slapd, and tied to the
> current state of the database. I have tried rsyncing out the database
> and returning it later, and the performance is consistent with the
> number of deletion cycles the database has undergone.
>
> The only clue I have is that when dumping the databases with db_dump,
> it's clear that the ordering of the database becomes increasingly less
> aligned with the order of the output data when doing a full tree search
> as we are. Which suggests that the database is writing frequently
> accessed entries too often instead of holding them in cache?
>
> I have run cachegrind against the server at 2, 20 and 1000 deletion
> iterations and the results are very different -
> http://www.ph.unimelb.edu.au/~tjdyce/callgrind.tar.gz
> The number of fetches grows massively over time.
>
> Anything you guys can suggest would be much appreciated; it's started
> to affect quite a number of our grid sites.
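For reference, the entry generation for a test like the one described above can be sketched as follows. This is a hypothetical reconstruction, not the original script (the suffix and naming are assumptions); it also shows the point Howard's bug report turns on, namely that each 2nd-level DN must name its parent, not the suffix.

```python
# Hypothetical sketch of the test tree: 100 ou entries under the
# suffix, each with 250 child ou entries.  SUFFIX and the parentN/
# childN names are illustrative, not from the original test.

SUFFIX = "dc=example,dc=com"

def entry_ldif(dn):
    """Render a minimal organizationalUnit entry in LDIF."""
    ou = dn.split(",", 1)[0].split("=", 1)[1]
    return f"dn: {dn}\nobjectClass: organizationalUnit\nou: {ou}\n"

def build_tree(n_parents=100, n_children=250):
    entries = []
    for p in range(n_parents):
        parent_dn = f"ou=parent{p},{SUFFIX}"
        entries.append(entry_ldif(parent_dn))
        for c in range(n_children):
            # The child DN is relative to its parent; creating it
            # relative to the suffix would be the wrong-DN bug.
            entries.append(entry_ldif(f"ou=child{c},{parent_dn}"))
    return entries

entries = build_tree()
print(len(entries))   # 100 parents + 100*250 children = 25100
```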
Cheers,
Tim

On 04/11/10 02:56, Dieter Kluenter wrote:
> Hi Dieter,
>
> I've done some more testing with openldap 2.3 and 2.4, on Redhat and
> Ubuntu. I even went as far as placing the BDB database directory in a
> ramdisk. But the performance still seems to degrade over time as data
> is added then deleted repeatedly from the ldap server. It looks like
> the BDB database starts to fragment or lose structure over time? I've
> tried a few DB options that seem to have some impact. Any ideas on what
> I can do from here?

Quite frankly, I have no clue; all I can do is guess. First let's define the problem: you have measured the presentation of search results on the client side, and you observed an increase in the time required to present the results. Most likely it is either a caching problem, a disk problem or a network problem.

As far as OpenLDAP is concerned, there are four caches to watch:
1. the bdb/hdb database cache (DB_CONFIG, set_cachesize)
2. the DN cache (dncachesize)
3. the cache of searched and indexed attribute types (idlcachesize)
4. the frontside cache of search results (cachesize)

Please check slapd.conf to see whether appropriate sizes are configured; see man slapd-bdb(5) and slapd.conf(5) for more information. But I must admit, a misconfiguration of any of these caches would not lead to such a degradation in presenting search results.

Another approach would be to check the caching behaviour of the clients, the network cache and the disk cache.

-Dieter
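The four caches Dieter mentions live in two places: BDB's own cache in the DB_CONFIG file inside the database directory, and the slapd-side caches in the database section of slapd.conf. A sketch follows; the suffix and all sizes are illustrative only and must be tuned to the actual working set (see slapd-bdb(5)/slapd-hdb(5)).

```
# DB_CONFIG (in the database directory) -- BDB's internal cache.
set_cachesize 0 268435456 1     # 0 GB + 256 MB, in 1 segment

# slapd.conf, database section -- slapd-side caches.
database        hdb
suffix          "dc=example,dc=com"
cachesize       10000           # entry cache (number of entries)
dncachesize     30000           # DN cache (number of DNs)
idlcachesize    30000           # IDL (index slot) cache
```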
--
----------
Tim Dyce
Research Computing, EPP Group
The School of Physics
The University of Melbourne
+61 3 8344 5462
+61 431 485 166