Re: Problem unexpected failing slapd
Sorry, I overlooked this info:
"The server
has no problems, plenty of memory and a fast diskarray (SAS->SATA).
Never technical problems with this server. And it worked without
problems for a long period."
That tells us your system is running on a physical (bare-metal) machine.
I am afraid you've got a hardware problem of some sort.
I advise you to start checking all hardware components (or just replace
the box).
Regards, Kuba
On Sun, 2011-02-27 at 12:57 +0100, Ruud Baart wrote:
> Problem:
> For a customer we have used LDAP for many years. Last year the slapd
> service suddenly stopped without any trace in the log files. After a restart
> of slapd everything worked fine again. But the problem remained: it was
> not a one-off incident. Now and then slapd just stops, always without any
> trace in the log files. Sometimes three times a day, sometimes a week
> without a failure. I can't find a pattern or any relation to any other
> service on the Linux server.
>
> Environment:
> - Several (Debian squeeze) servers, several Windows servers. We use the bdb
> database backend.
> - There is one master LDAP server, which provides syncprov, and two
> replica LDAP servers (syncrepl). The master server is the most heavily used
> (mainly Samba as primary domain controller: a few hundred user accounts,
> lots of group accounts, workstations, ACLs, etc.). One of the replicas
> is not very busy but handles the mail for all users (lookups: amavis,
> postfix, courier-imap, mail account settings, etc.). The other replica is
> not busy at all; it is at a remote location.
> - In total the directory holds 3700 DNs; slapcat produces a file of 7.3 MB.
> - It is only the master LDAP server which stops suddenly. I have never seen a
> failure of a replica LDAP server.
>
> Because I have no clear idea about the problem I have no idea which
> technical details are relevant:
> DB_CONFIG
> ===========
> set_cachesize 0 10485760 1
> set_lk_max_objects 10000
> set_lk_max_locks 10000
> set_lk_max_lockers 10000
> set_lg_dir /home/ldap-dbd
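
As a side note on the DB_CONFIG above: Berkeley DB's set_cachesize takes three values (gigabytes, additional bytes, number of cache segments), so the line shown configures a 10 MiB cache in one segment. The following is only a small sketch to make that arithmetic explicit; the helper name cachesize_bytes is made up for illustration and is not part of OpenLDAP or Berkeley DB.

```python
def cachesize_bytes(directive: str) -> tuple[int, int]:
    """Return (total_bytes, segments) for a Berkeley DB set_cachesize line.

    Hypothetical helper for illustration only.
    """
    keyword, gbytes, nbytes, ncache = directive.split()
    if keyword != "set_cachesize":
        raise ValueError(f"not a set_cachesize directive: {directive!r}")
    # Total cache = gbytes * 1 GiB + nbytes, split over ncache segments.
    total = int(gbytes) * 1024**3 + int(nbytes)
    return total, int(ncache)


total, segments = cachesize_bytes("set_cachesize 0 10485760 1")
print(f"{total / 1024**2:.0f} MiB in {segments} segment(s)")
```

For comparison, the slapcat export mentioned above is 7.3 MB; the on-disk database files including indexes may well be larger than that export, so this figure alone does not show whether the cache is adequate.
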
> The database is stored on an ext3 filesystem, kernel 2.6.32. The server
> has no problems, plenty of memory and a fast diskarray (SAS->SATA).
> Never technical problems with this server. And it worked without
> problems for a long period. Nothing has changed in the environment or
> the LDAP setup (except of course the upgrade to Debian squeeze, but
> the problem was already there before that).
>
> What we have tried:
> - Upgraded from OpenLDAP 2.4.17 (Debian lenny + backports) to OpenLDAP
> 2.4.23 (Debian squeeze). I saw in the release notes that problems
> related to syncrepl were solved. Therefore we waited for version 2.4.23
> to become available in Debian. This upgrade made no difference.
> - Reindexed and rebuilt the directory. When I rebuild the directory from a clean
> LDIF file with ldapadd, on the master LDAP server or on another machine, there is
> not a single error or warning.
>
> The workaround for the moment:
> I have written a process monitor (Perl daemon) which monitors the slapd
> daemon and restarts slapd if it suddenly stops. It is of course not
> a solution, but the 300 users can work. If slapd stops and is not restarted
> within 1 minute, a few hundred people can't work because Samba stops working.
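
The Perl monitor itself is not shown in this thread. Purely as an illustration of the workaround described above, a watchdog along these lines might look roughly like the sketch below. It is written in Python rather than Perl, and several things in it are assumptions: that pgrep is available, that /etc/init.d/slapd start is the right restart command on this Debian system, and the 10-second poll interval. All function names are invented for the example.

```python
import subprocess
import time
from typing import Callable


def slapd_running() -> bool:
    """True if a process named exactly 'slapd' exists (assumes pgrep is installed)."""
    result = subprocess.run(["pgrep", "-x", "slapd"], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def restart_slapd() -> None:
    """Assumed restart command for a Debian init script; adjust to the local setup."""
    subprocess.run(["/etc/init.d/slapd", "start"], check=False)


def check_once(is_running: Callable[[], bool],
               restart: Callable[[], None],
               log: Callable[[str], None] = print) -> bool:
    """Restart the service if it is down; return True if a restart was issued."""
    if is_running():
        return False
    # Timestamp every restart so the failure times can be compared later.
    log(time.strftime("%Y-%m-%d %H:%M:%S") + " slapd not running, restarting")
    restart()
    return True


def watch(interval_seconds: float = 10.0) -> None:
    """Poll forever; the interval is an assumption, chosen well under 1 minute."""
    while True:
        check_once(slapd_running, restart_slapd)
        time.sleep(interval_seconds)
```

Calling watch() starts the polling loop. Logging a timestamp on every restart at least gives a record of when slapd died, which could later be lined up against cron jobs, kernel messages or other system logs when looking for a pattern.
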
>
> I would like to receive suggestions on what we can do to find the problem.
> Because there is no pattern and nothing in the log files, I don't know where
> to start.
>