
slapd stability problems with add/change operations



Hi all,

This post follows up on an earlier post from this June by Steffen Hansen:
<http://article.gmane.org/gmane.network.openldap.general/29440>

> We use OpenLDAP in the Kolab project, but after switching to the bdb
> backend there have been several reports about stability problems. Slapd
> sometimes seems to hang when someone tries to write to the database
> (for example with ldapadd).
[...]

I spent the last three days debugging exactly the same problem:

- Sunday night slapd hung during an add operation (according to the log)
from the meta-db. I love to start a week like this Monday morning :)

- killed slapd with -9, recovered the db with db_recover -c -v (-c was
obviously necessary)

- I also asked in the bdb group because I first thought the problem was
in bdb (I didn't know the -o option of db_verify until then), see my
thread here:
<http://groups.google.com/group/comp.databases.berkeley-db/browse_thread/thread/3d70acda54c7d3c6/fd31c234910588a1#fd31c234910588a1>

- slapd came back but it didn't take long until the next lockup.

So I started to debug a bit more. I tried this:

- exported the bdb files with db_dump and reimported them with db_load. This didn't work at all: half of the OUs were missing and I couldn't find a single user anymore, even though the bdb files themselves were about right in size (well, the biggest file was 4.6MB instead of 5.4MB). I am a bit confused that this doesn't work at all.

- slapcat the db to a file, slapadd it to a brand new db. This worked for
some time but locked up again quite fast

- same game, but I stripped all entry*, creat* and modif* attributes to be sure that we have a clean base. I almost thought this had fixed it because it was much more stable than before. We could do quite a few add operations from the meta database like this, but it still locks up from time to time. Overall it is definitely more stable, however.
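
For reference, the stripping of entry*, creat* and modif* in the last step can be sketched roughly like this. It is only a minimal Python sketch, not the exact procedure I used: the function names and the prefix list are made up for illustration, and it assumes that no regular (non-operational) attribute in the directory starts with one of these prefixes.

```python
# Prefixes of the attribute names to drop from the slapcat output.
# Assumption: nothing but operational attributes starts with these.
DROP_PREFIXES = ("entry", "creat", "modif")


def unfold(lines):
    """Join LDIF continuation lines (a leading space continues the previous line)."""
    out = []
    for line in lines:
        if line.startswith(" ") and out:
            out[-1] += line[1:]
        else:
            out.append(line)
    return out


def strip_operational(ldif_text):
    """Return the LDIF text without attribute lines matching DROP_PREFIXES."""
    kept = []
    for line in unfold(ldif_text.splitlines()):
        # The attribute name is everything before the first colon.
        name = line.split(":", 1)[0].lower()
        if ":" in line and name.startswith(DROP_PREFIXES):
            continue
        kept.append(line)  # dn lines, normal attributes, blank separators
    return "\n".join(kept) + "\n"
```

The idea is to run the file written by slapcat through strip_operational() and feed the result to slapadd on the empty database. Long values come out unfolded, which is still valid LDIF.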

BTW our database is not that big, we have around 3000 entries which
shouldn't be a real problem for OpenLDAP I suppose.

Then I discovered this thread :)

We started with:
- OpenLDAP 2.2.15
- BDB 4.2.52
- FreeBSD 5.3

and I upgraded to:
- OpenLDAP 2.2.27

but same game.

This is getting nasty, as our whole directory depends on OpenLDAP. So I am more than happy to help debug this stuff. But I'm not that skilled with gdb, so if I should try to trace something I need a bit more detail about how to do that (or a link with some samples). Can I check where it hangs without a debug version of slapd at all? Is it a good idea to start it with strace once (well, not performance-wise for sure :)?

Is there hope that this works better with OpenLDAP 2.3.x? Or should I
try a backend other than bdb? What would make sense to try?

thanks

Adrian

--
Adrian Gschwend
System Administrator
Berne University of Applied Sciences
Biel, Switzerland