
Re: MOD attr=uniqueMember - Was: Re: slapd just silently dies



Sigh.

Anyway, I just found that we have a 2GB id2entry.gdbm:

...
-rw-------    1 ldap     ldap          12K Jan 16 02:42 gidNumber.gdbm
-rw-------    1 ldap     ldap         1.2M Apr 26 15:05 givenName.gdbm
-rw-------    1 ldap     ldap         2.0G Apr 26 16:19 id2entry.gdbm
-rw-------    1 ldap     ldap         1.5M Apr 26 16:19 mail.gdbm
...

This is on RHES3. I wonder if we have hit a 2GB file size limit: slapd may work while id2entry.gdbm stays below 2GB, but die once it reaches the boundary.
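If it really is the 2GB limit, the file size should be pinned at exactly 2^31 - 1 bytes. A quick check might look like this (the database path is an assumption; adjust it to your installation):

```shell
#!/bin/sh
# Check whether id2entry.gdbm has hit the 2GB (2^31 - 1 byte) boundary.
# The DB path below is illustrative; point it at your actual ldbm directory.
DB=/var/lib/ldap/id2entry.gdbm
LIMIT=2147483647                                # 2^31 - 1
SIZE=$(stat -c %s "$DB" 2>/dev/null || echo 0)  # 0 if file is missing
if [ "$SIZE" -ge "$LIMIT" ]; then
    echo "$DB is at the 2GB boundary ($SIZE bytes)"
else
    echo "$DB is $SIZE bytes, below the 2GB boundary"
fi
```

A size frozen at exactly 2147483647 would strongly suggest the build lacks large-file support rather than the data simply being that big.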

Yes, we are troubleshooting "blindly" right now. Not everyone runs gdb on a routine basis.

Why would id2entry get so large? All the other index files are rather small, usually under 10 MB.

Howard Chu wrote:

Repeating an earlier post - syslog output is not useful for debugging purposes.

If you want to track down what's going wrong here, run slapd under a debugger and get a backtrace from when it dies.
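A session along these lines would capture the backtrace (a sketch only; the binary and config paths are assumptions for a typical source-built install, and the gdb commands are shown as comments since they are typed at the gdb prompt):

```shell
# Run slapd in the foreground under gdb; adjust paths to your installation.
gdb --args /usr/local/libexec/slapd -d 1 \
    -f /usr/local/etc/openldap/slapd.conf
# Then, inside gdb:
#   (gdb) run          # start slapd and reproduce the failing MOD op
#   ... slapd crashes ...
#   (gdb) bt full      # print the full backtrace to post to the list
```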

There's no point to making blind guesses.

fuser9bb@hotpop.com wrote:

Okay, I made some progress. It appears that *sometimes*, when a 'MOD attr=uniqueMember' op is performed, slapd silently dies. There is definitely a pattern. Here is an example:

Apr 22 13:49:00 serv slapd[5631]: conn=36 op=0 BIND dn="cn=Manager,..." mech=SIMPLE ssf=0
Apr 22 13:49:00 serv slapd[5631]: conn=36 op=0 RESULT tag=97 err=0 text=
Apr 22 13:49:00 serv slapd[5631]: conn=36 op=1 SRCH base="dc=..." scope=0 deref=0 filter="(dc=....)"
Apr 22 13:49:00 serv slapd[5631]: conn=36 op=1 SEARCH RESULT tag=101 err=0 nentries=1 text=
Apr 22 13:49:00 serv slapd[5631]: conn=36 op=2 UNBIND
Apr 22 13:49:00 serv slapd[5631]: conn=36 fd=8 closed
Apr 22 13:49:16 serv slapd[5631]: conn=37 fd=8 ACCEPT from IP=.......:47567 (IP=0.0.0.0:389)
Apr 22 13:49:16 serv slapd[5631]: conn=37 op=0 BIND dn="cn=Manager,..." method=128
Apr 22 13:49:16 serv slapd[5631]: conn=37 op=0 BIND dn="cn=Manager,..." mech=SIMPLE ssf=0
Apr 22 13:49:16 serv slapd[5631]: conn=37 op=0 RESULT tag=97 err=0 text=
Apr 22 13:49:16 serv slapd[5631]: conn=37 op=1 MOD dn="cn=..."
Apr 22 13:49:16 serv slapd[5631]: conn=37 op=1 MOD attr=uniqueMember


(here is where slapd dies and our script notices this and restarts slapd)

Apr 22 13:50:09 serv slapd[5795]: slapd starting

So out of perhaps thirty MOD attr=uniqueMember ops, several may cause slapd to die. No other operations are causing this, nor are any other attrs with the MOD op that we have found. The MOD attr=uniqueMember ops are issued in rapid succession by an app that we use to manage our users and their group memberships in OpenLDAP.

Could this be caused by a corrupt ldbm file? If so, would the following procedure fix the problem?

1. stop slapd
2. slapcat > x
3. move the old database files aside and reload with slapadd -l x
4. start slapd
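The steps above might look like this in practice (a sketch only; the paths, the init script name, and the ldap user/group are assumptions for a typical RHES3 layout):

```shell
#!/bin/sh
# Sketch of an offline dump-and-reload for an ldbm backend.
service ldap stop                                          # 1. stop slapd
slapcat -f /etc/openldap/slapd.conf -l /root/backup.ldif   # 2. dump to LDIF
cd /var/lib/ldap
mkdir old && mv ./*.gdbm old/               # set the old files aside, just in case
slapadd -f /etc/openldap/slapd.conf -l /root/backup.ldif   # 3. rebuild the DB
chown ldap:ldap /var/lib/ldap/*.gdbm        # slapadd ran as root; fix ownership
service ldap start                                         # 4. restart slapd
```

Keeping the old .gdbm files around until the reloaded database has been verified lets you roll back if slapadd fails partway through.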

I want to get my ducks in a row before we move forward.

Or could this be something else entirely?

fuser9bb@hotpop.com wrote:

Hi. We are running openldap-2.2.15, built from source, on RHES3. I know that 2.2.24 is the current version, but I don't think we are too far behind the curve. We would prefer not to upgrade if possible, but we are open to it.

Anyway, here is our problem. We have a master openldap server running 2.2.15. The server runs both slapd and slurpd. In the past few weeks, we have seen slapd just die. We can't find any errors in /var/log/messages or /var/log/openldap.log (where we have slapd messages going via syslog). Frankly, we don't know why slapd is dying.

I would like some pointers on troubleshooting the situation. Currently, I have loglevel set to 256. I can increase the loglevel, but it's hard to leave the loglevel on a high value for long since the log file will grow VERY rapidly. Right now we can't pin down the exact cause or timing of slapd dying, so I would need to leave loglevel high for a while.
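For reference, loglevel is a bitmask, so specific categories can be combined by summing their values rather than turning everything on at once (values per the slapd.conf(5) man page; 8 is connection management, 256 is stats):

```
# slapd.conf fragment: loglevel is additive
# 256 (stats: connections/operations/results) + 8 (connection management)
loglevel 264
```

This keeps log growth far below a full trace while still adding connection-lifecycle detail around the crashes.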

Anyway, any known issues that would cause this for 2.2.15? Any suggestions for troubleshooting this problem?