On Fri, 11.06.2004 at 12:38, Ingo Steuwer wrote:
[..]
| From time to time the database seems to hang: ldapsearch gets no answers
| and even db4.2_stat hangs at some point (it then needs kill -9). In these
| cases I need to stop slapd and run db4.2_recover.
|
Looks like you might be exceeding 2GB of transaction logs which are in use.
[..]
| relevant configuration parts:
|
| slapd.conf:
| ---------------------------------------------------
| sizelimit unlimited
| modulepath /usr/lib/ldap
| moduleload back_bdb.so
|
| database bdb
|
| cachesize 500000
| index objectClass,uidNumber,gidNumber,memberUid,ou,uniqueMember pres,eq
| index uid,cn,sn,givenName,mail,description,displayName pres,eq,sub
| index sambaSID,sambaPrimaryGroupSID,sambaDomainName eq
| index default sub
| ---------------------------------------------------
|
You don't appear to have a checkpoint setting, which would mean that all
your transaction logs are open (or something to that effect, based on
what I've seen).
That's true, I added "checkpoint 1024 1" now in slapd.conf. It may take
some days to see if it is the reason.
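For reference, a minimal sketch of where that directive would sit, assuming the slapd.conf quoted above and the back-bdb syntax "checkpoint <kbyte> <min>" as described in slapd-bdb(5). The comments are my reading of the man page, not a tested recommendation:

```
database bdb

# checkpoint <kbyte> <min>
# Assumed semantics (per slapd-bdb(5)): write a checkpoint once 1024 KB
# of log data have been written or 1 minute has passed since the last
# checkpoint, whichever comes first.
checkpoint 1024 1

cachesize 500000
```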
But is it true that the log files, which contain only 10MB each, are all
held open? I thought this would be done only for the most recent (current)
one. I mean, for what reason should they all be open?
Well, as I have 211 log files now, together they are bigger than 2G.
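As a rough check of that figure: 211 files of 10MB each come to about 2110MB, which is just over 2G. A small sketch for measuring the real total on disk follows. It assumes the Berkeley DB transaction logs are the files named log.* in the database directory, and the default path /var/lib/ldap is only a guess, not taken from this setup:

```python
from pathlib import Path
import sys

TWO_GIB = 2 * 1024**3  # the 2GB threshold mentioned above, in bytes


def total_log_bytes(db_dir):
    """Sum the sizes of all regular files named log.* in db_dir."""
    return sum(
        p.stat().st_size
        for p in Path(db_dir).glob("log.*")
        if p.is_file()
    )


def exceeds_two_gib(db_dir):
    """Return True if the transaction logs together exceed 2 GiB."""
    return total_log_bytes(db_dir) > TWO_GIB


if __name__ == "__main__":
    # Assumed default location; pass the real database directory as argv[1].
    db_dir = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/ldap"
    total = total_log_bytes(db_dir)
    print(f"{db_dir}: {total} bytes in transaction logs")
    print("over 2 GiB" if total > TWO_GIB else "under 2 GiB")
```

This only reports sizes. It does not tell which log files are still in use by the environment.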
The checkpoint setting made it worse. As long as I had the checkpoint
option set (I also tried checkpoint 2048 2), I got a CPU-eating slapd for
_each_ ldapmodify. The modifications were done, but the corresponding
slapd process goes mad. strace shows me the already mentioned
sched_yield() calls, but the more mysterious thing is that such a process
finishes after I try "ltrace -p" on it.
Well, and in the end it corrupted my database again. I went back to the
defaults, and after making some changes there is one CPU-eating slapd which
will not finish with ltrace (it gives me countless
"ldap_pvt_thread_yield(0xbfffdf48, 0x40009e90, 0, 0x6044a510,
0x40117f60) = 0") and will also resist an
"/etc/init.d/slapd stop", so I have to do a "kill -9". After a restart
I get "Implementation Specific Error"s on an ldapdelete.
So db_recover is needed again.
I'd be glad for any other hint or correction.
Thanks
Ingo Steuwer