[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: File descriptor leak, slapd hangs and runs out of file descriptor



Sam Tran skrev, on 02-01-2008 23:14:
Dear All,

We are running OL 2.3.39 on Centos 5 i386 or x86_64. We have one
provider and three consumers (LDAP-sync repl).

Several applications perform LDAP write and read operations on the provider.

For the second time in 2 months, we had what it looked like a file
descriptor leak on the provider: file descriptors were not closed at
all or fast enough. At the same time, slapd was unresponsive. Here is
what the logs shows:

[snip]
Dec 31 10:00:02 info-ldap-003 slapd[29329]: conn=87886 fd=75 ACCEPT
from IP=127.0.0.1:47391 (IP=0.0.0.0:389)
Dec 31 10:00:02 info-ldap-003 slapd[29329]: conn=87887 fd=76 ACCEPT
from IP=172.16.8.39:55065 (IP=0.0.0.0:636)
Dec 31 10:00:02 info-ldap-003 slapd[29329]: conn=87887 fd=76 TLS
established tls_ssf=256 ssf=256
Dec 31 10:00:36 info-ldap-003 slapd[29329]: conn=87888 fd=77 ACCEPT
from IP=172.16.77.245:1285 (IP=0.0.0.0:389)
Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87889 fd=78 ACCEPT
from IP=172.16.8.39:55073 (IP=0.0.0.0:636)
Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87890 fd=79 ACCEPT
from IP=172.16.8.39:55074 (IP=0.0.0.0:636)
Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87891 fd=80 ACCEPT
from IP=172.16.8.39:55075 (IP=0.0.0.0:636)
Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87889 fd=78 TLS
established tls_ssf=256 ssf=256
Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87890 fd=79 TLS
established tls_ssf=256 ssf=256
Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87891 fd=80 TLS
established tls_ssf=256 ssf=256
Dec 31 10:01:28 info-ldap-003 slapd[29329]: conn=87892 fd=81 ACCEPT
from IP=10.0.0.23:47414 (IP=0.0.0.0:389)
Dec 31 10:01:28 info-ldap-003 slapd[29329]: conn=87892 fd=81 closed
(connection lost)
Dec 31 10:01:28 info-ldap-003 slapd[29329]: conn=87893 fd=81 ACCEPT
from IP=10.0.0.23:47415 (IP=0.0.0.0:389)
Dec 31 10:01:40 info-ldap-003 slapd[29329]: conn=87894 fd=82 ACCEPT
from IP=172.16.8.31:47692 (IP=0.0.0.0:636)
Dec 31 10:01:40 info-ldap-003 slapd[29329]: conn=87894 fd=82 closed
(TLS negotiation failure)
Dec 31 10:01:44 info-ldap-003 slapd[29329]: conn=87895 fd=82 ACCEPT
from IP=10.0.0.23:47741 (IP=0.0.0.0:636)
Dec 31 10:01:44 info-ldap-003 slapd[29329]: conn=87895 fd=82 closed
(TLS negotiation failure)
Dec 31 10:01:44 info-ldap-003 slapd[29329]: conn=87896 fd=82 ACCEPT
from IP=172.16.8.31:47757 (IP=0.0.0.0:636)
Dec 31 10:01:44 info-ldap-003 slapd[29329]: conn=87896 fd=82 closed
(TLS negotiation failure)
Dec 31 10:01:46 info-ldap-003 slapd[29329]: conn=87897 fd=82 ACCEPT
from IP=172.16.8.31:47760 (IP=0.0.0.0:389)
Dec 31 10:01:46 info-ldap-003 slapd[29329]: conn=87897 fd=82 closed
(connection lost)
Dec 31 10:01:46 info-ldap-003 slapd[29329]: conn=87898 fd=82 ACCEPT
from IP=172.16.8.31:47761 (IP=0.0.0.0:389)
Dec 31 10:01:49 info-ldap-003 slapd[29329]: conn=87899 fd=83 ACCEPT
from IP=172.16.8.31:47774 (IP=0.0.0.0:389)
Dec 31 10:01:54 info-ldap-003 slapd[29329]: conn=87900 fd=84 ACCEPT
from IP=172.16.8.31:47785 (IP=0.0.0.0:389)
Dec 31 10:01:54 info-ldap-003 slapd[29329]: conn=87900 fd=84 closed
(connection lost)
Dec 31 10:01:54 info-ldap-003 slapd[29329]: conn=87901 fd=84 ACCEPT
from IP=172.16.8.31:47786 (IP=0.0.0.0:389)
Dec 31 10:01:57 info-ldap-003 slapd[29329]: conn=87902 fd=85 ACCEPT
from IP=172.16.8.31:47788 (IP=0.0.0.0:389)
Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87903 fd=86 ACCEPT
from IP=172.16.8.39:55083 (IP=0.0.0.0:636)
Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87903 fd=86 TLS
established tls_ssf=256 ssf=256
Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87904 fd=87 ACCEPT
from IP=172.16.8.39:55086 (IP=0.0.0.0:636)
Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87904 fd=87 TLS
established tls_ssf=256 ssf=256
Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87905 fd=88 ACCEPT
from IP=172.16.8.39:55088 (IP=0.0.0.0:636)
Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87905 fd=88 TLS
established tls_ssf=256 ssf=256
Dec 31 10:02:23 info-ldap-003 slapd[29329]: conn=87906 fd=89 ACCEPT
from IP=172.16.8.31:47816 (IP=0.0.0.0:389)
Dec 31 10:02:23 info-ldap-003 slapd[29329]: conn=87906 fd=89 closed
(connection lost)
Dec 31 10:02:23 info-ldap-003 slapd[29329]: conn=87907 fd=89 ACCEPT
from IP=172.16.8.31:47817 (IP=0.0.0.0:389)
Dec 31 10:02:26 info-ldap-003 slapd[29329]: conn=87908 fd=90 ACCEPT
from IP=172.16.8.31:47819 (IP=0.0.0.0:389)
Dec 31 10:02:31 info-ldap-003 slapd[29329]: conn=87909 fd=91 ACCEPT
from IP=172.16.8.31:47822 (IP=0.0.0.0:389)
Dec 31 10:02:31 info-ldap-003 slapd[29329]: conn=87909 fd=91 closed
(connection lost)
Dec 31 10:02:31 info-ldap-003 slapd[29329]: conn=87910 fd=91 ACCEPT
from IP=172.16.8.31:47823 (IP=0.0.0.0:389)
Dec 31 10:02:34 info-ldap-003 slapd[29329]: conn=87911 fd=92 ACCEPT
from IP=172.16.8.31:47825 (IP=0.0.0.0:389)
Dec 31 10:02:36 info-ldap-003 slapd[29329]: conn=87912 fd=93 ACCEPT
from IP=172.16.8.38:4592 (IP=0.0.0.0:636)
Dec 31 10:02:36 info-ldap-003 slapd[29329]: conn=87912 fd=93 TLS
established tls_ssf=128 ssf=128
Dec 31 10:02:59 info-ldap-003 slapd[29329]: conn=87913 fd=94 ACCEPT
from IP=172.16.8.31:47834 (IP=0.0.0.0:389)
Dec 31 10:02:59 info-ldap-003 slapd[29329]: conn=87913 fd=94 closed
(connection lost)
Dec 31 10:02:59 info-ldap-003 slapd[29329]: conn=87914 fd=94 ACCEPT
from IP=172.16.8.31:47835 (IP=0.0.0.0:389)
Dec 31 10:03:01 info-ldap-003 slapd[29329]: conn=87915 fd=95 ACCEPT
from IP=172.16.8.39:39120 (IP=0.0.0.0:636)
Dec 31 10:03:01 info-ldap-003 slapd[29329]: conn=87916 fd=96 ACCEPT
from IP=172.16.8.39:39121 (IP=0.0.0.0:636)
Dec 31 10:03:01 info-ldap-003 slapd[29329]: conn=87915 fd=95 TLS
established tls_ssf=256 ssf=256
[snip]

I restarted slapd, which fixed the problem.

The first time that problem occurred, slapd ran out of file descriptor.
I don't know what triggered the problem. Prior to the problem there
was no increase in load, all LDAP operations were performed
successfully.

I would appreciate it if anyone could give me some pointers on how to
troubleshoot the problem.

We had this on heavily loaded OL 2.3 servers and solved it by setting a low(ish) idletimeout value (see 'man slapd.conf').


Best,

--Tonni

--
Tony Earnshaw
Email: tonni at hetnet dot nl