Re: slapcat, ldapsearch, slapindex all do nothing
It happened to me as well :-( : no response from ldapsearch, slapd took
nearly 100% of the CPU (according to top), and slapcat did nothing
(though I didn't wait very long).
I ran db_recover in my database directory, and after that everything
worked correctly again :-)
[root@corbeau /var/lib/ldap/int]
$ /usr/local/jehan/db-4.1.25.NC/bin/db_recover -v
db_recover: unable to join the environment
db_recover: Finding last valid log LSN: file: 6 offset 1245916
db_recover: Recovery starting from [6][1242448]
db_recover: Recovery complete at Thu Jun 19 09:39:07 2003
db_recover: Maximum transaction ID 80000dbb Recovery checkpoint [6][1247665]
db_recover: Recovery complete at Thu Jun 19 09:39:07 2003
db_recover: Maximum transaction id 80000000 Recovery checkpoint [6][1247665]
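The recovery step above could be wrapped roughly as follows. This is only a sketch under assumptions: the function name recover_env and the backup location are made up for illustration, db_recover is assumed to be on the PATH (override DB_RECOVER otherwise), and slapd must be stopped first because recovery should not run while another process is using the environment. Only the -v (verbose) and -h (environment home) options of db_recover are used.

```shell
#!/bin/sh
# Sketch: copy a Berkeley DB environment aside, then run db_recover on it.
# DB_RECOVER can be overridden, e.g. to point at a private build.
DB_RECOVER="${DB_RECOVER:-db_recover}"

# recover_env DBDIR BACKUPDIR  (hypothetical helper, not part of OpenLDAP)
recover_env() {
    dbdir="$1"
    backup="$2"
    if [ ! -d "$dbdir" ]; then
        echo "no such directory: $dbdir" >&2
        return 1
    fi
    # Keep a copy of every file so a failed recovery can be retried.
    mkdir -p "$backup" || return 1
    cp -p "$dbdir"/* "$backup"/ || return 1
    # -v prints progress, -h names the environment home directory.
    "$DB_RECOVER" -v -h "$dbdir"
}

# Example, to be run only while slapd is stopped:
#   recover_env /var/lib/ldap/int /var/tmp/ldap-int-backup
```

Taking the copy first means the original log and .bdb files are still available if recovery makes things worse.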
However, I still don't know how to check which of the db files was
corrupted, or why. Maybe the db_* utilities can be used to locate the problem?
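One db_* utility that might help here is db_verify, which checks the structure of a database file. The loop below is a rough sketch, not something I have run against this environment: the helper name find_bad_dbs is invented, db_verify is assumed to be on the PATH (override DB_VERIFY otherwise), and it should only be run while slapd is stopped so the files are not being modified.

```shell
#!/bin/sh
# Sketch: report which *.bdb files in a directory fail db_verify.
DB_VERIFY="${DB_VERIFY:-db_verify}"

# find_bad_dbs DBDIR  (hypothetical helper)
# Prints the base name of every *.bdb file that does not verify cleanly.
find_bad_dbs() {
    dbdir="$1"
    for f in "$dbdir"/*.bdb; do
        # Skip the literal pattern when the directory has no .bdb files.
        [ -f "$f" ] || continue
        if ! "$DB_VERIFY" "$f" >/dev/null 2>&1; then
            basename "$f"
        fi
    done
}

# Example:
#   find_bad_dbs /var/lib/ldap/int
```

A clean result only says the file structure is sound. It does not prove that the indexes agree with id2entry.bdb.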
More diagnostics of what happened to my database:
Problem:
$ ldapsearch -x sn=procacc* -h localhost
# extended LDIF
#
# LDAPv3
# base <> with scope sub
# filter: sn=procacc*
# requesting: ALL
#
ZZZZZZZZZZZZZ
Logs:
Jun 19 08:57:02 corbeau slapd[2495]: bdb_initialize: Sleepycat Software: Berkeley DB 4.1.25: (December 19, 2002)
Jun 19 08:57:02 corbeau slapd[2495]: /etc/openldap/slapd.conf: line 50: unknown directive "defaultaccess" outside backend info and database definitions (ignored)
Jun 19 08:57:02 corbeau slapd[2495]: bdb_db_init: Initializing BDB database
Jun 19 08:57:03 corbeau slapd[2497]: slapd starting
Jun 19 08:57:18 corbeau slapd[2497]: conn=0 fd=10 ACCEPT from IP=127.0.0.1:32982 (IP=0.0.0.0:389)
Jun 19 08:57:18 corbeau slapd[2497]: conn=0 op=0 BIND dn="" method=128
Jun 19 08:57:18 corbeau slapd[2497]: conn=0 op=0 RESULT tag=97 err=0 text=
Jun 19 08:57:18 corbeau slapd[2497]: conn=0 op=1 SRCH base="dc=int-evry,dc=fr" scope=2 filter="(sn=procacc*)"
ZZZZZZZZZ
[root@corbeau ~]
$ top
08:58:19 up 18 min, 4 users, load average: 0.97, 0.47, 0.27
80 processes: 79 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 10.9% user 5.4% system 0.0% nice 0.0% iowait 83.6% idle
Mem: 255180k av, 222000k used, 33180k free, 0k shrd, 14720k buff
183120k actv, 8928k in_d, 1788k in_c
Swap: 281096k av, 196k used, 280900k free 95904k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
2497 ldap 25 0 3560 3556 2800 S 97.1 1.3 0:56 0 slapd
2505 root 19 0 1060 1060 852 R 1.9 0.4 0:00 0 top
Database directory:
[root@corbeau /var/lib/ldap/int]
$ ls -al
total 89044
drwxr-xr-x 2 root root 4096 Jun 19 08:54 .
drwx------ 8 ldap ldap 4096 Jun 19 08:55 ..
-rw------- 1 ldap ldap 2265088 Jun 17 11:10 cn.bdb
-rw------- 1 ldap ldap 16384 Jun 6 16:32 __db.001
-rw------- 1 ldap ldap 10248192 Jun 6 16:32 __db.002
-rw------- 1 ldap ldap 270336 Jun 6 16:32 __db.003
-rw------- 1 ldap ldap 458752 Jun 6 16:32 __db.004
-rw------- 1 ldap ldap 16384 Jun 6 16:32 __db.005
-rw-r--r-- 1 ldap ldap 304 Jun 6 13:30 DB_CONFIG
-rw------- 1 ldap ldap 831488 Jun 16 19:36 dn2id.bdb
-rw------- 1 ldap ldap 53248 Jun 17 09:20 gidNumber.bdb
-rw------- 1 ldap ldap 622592 Jun 6 19:23 givenName.bdb
-rw------- 1 ldap ldap 21479424 Jun 16 19:36 id2entry.bdb
-rw------- 1 ldap ldap 49152 Jun 6 19:23 IntEPersInetServ.bdb
-rw------- 1 ldap ldap 10485708 Jun 6 16:32 log.0000000001
-rw------- 1 ldap ldap 10485757 Jun 6 16:33 log.0000000002
-rw------- 1 ldap ldap 10485748 Jun 6 16:33 log.0000000003
-rw------- 1 ldap ldap 10483448 Jun 6 16:33 log.0000000004
-rw------- 1 ldap ldap 10485757 Jun 6 16:33 log.0000000005
-rw------- 1 ldap ldap 1245844 Jun 19 08:57 log.0000000006
-rw------- 1 ldap ldap 1503232 Jun 6 19:23 mail.bdb
-rw------- 1 ldap ldap 425984 Jun 17 09:17 objectClass.bdb
-rw------- 1 ldap ldap 614400 Jun 12 23:31 sn.bdb
-rw------- 1 ldap ldap 53248 Jun 17 09:17 uid.bdb
-rw------- 1 ldap ldap 53248 Jun 16 17:17 uidNumber.bdb
[root@corbeau /var/lib/ldap/int]
$ /usr/local/jehan/db-4.1.25.NC/bin/db_archive
db_archive: unable to join the environment
log.0000000001
log.0000000002
log.0000000003
log.0000000004
log.0000000005
[root@corbeau /var/lib/ldap/int]
$ /usr/local/jehan/db-4.1.25.NC/bin/db_stat -e -h /var/lib/ldap/int
db_stat: unable to join the environment
db_stat: DB_ENV->open: /var/lib/ldap/int: Resource temporarily unavailable
$ /usr/local/jehan/db-4.1.25.NC/bin/db_recover -v
db_recover: unable to join the environment
db_recover: Finding last valid log LSN: file: 6 offset 1245916
db_recover: Recovery starting from [6][1242448]
db_recover: Recovery complete at Thu Jun 19 09:39:07 2003
db_recover: Maximum transaction ID 80000dbb Recovery checkpoint [6][1247665]
db_recover: Recovery complete at Thu Jun 19 09:39:07 2003
db_recover: Maximum transaction id 80000000 Recovery checkpoint [6][1247665]
Brian K. Jones wrote:
Hi all.
I had this happen once before, and it caused me to rebuild the entire
server, and now it's happened again, and it's not funny anymore, since
this server will be production.
I'm running OpenLDAP 2.1.21 on Red Hat 9. Here are some other facts:
I'm using Berkeley DB 4.1.25.
My database directory contains databases, and they are all the same size
as they are on the test server that I migrated the data from. They're
not empty. I've also ensured that this *is* the directory slapd.conf
points at.
I was, earlier yesterday, able to search and browse the directory.
However, I did a 'modrdn' on an 'ou' to put it under another 'ou', and
that's the last change I remember making before things went haywire.
Currently, ldapsearch connects to the server just fine, but always
returns no results, no matter what I search for (including
'objectclass=*'). There's the debug output on the server below. There
are a few errors, but they're ones I've pretty much always gotten -
nothing here looks particularly interesting, just looks like an average
'your search didn't match anything' response.
I noticed, by the way, that doing these searches touches the 'dn2id.bdb'
and 'id2entry.bdb' files - along with one of the logs in my data
directory - but that's it. The rest are untouched since yesterday.
Running 'slapcat -l file.ldif' also produces absolutely no output
whatsoever - and exits with a '0' status. I even fed slapcat the proper
'-f' and '-b' arguments, to no avail.
Is there a way to recover the data I imported into the database files?
Is there a way to get slapd to reindex stuff (BTW - slapindex didn't do
anything either - just immediately exited) or otherwise get slapd to
'understand' the data again? How can I avoid this in the future?
Pointers to anything I might not have read are more than welcome (except
for the damned faq-o-matic. I have no patience for that 'thing'). I've
looked on the mailing lists, man pages, and the O'Reilly LDAP book, but
found nothing.
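Since slapcat exits with status 0 even when it writes nothing, a dump script probably should not trust the exit status alone. The following is a sketch only: the helper name dump_ldif is invented, slapcat is assumed to be on the PATH (override SLAPCAT otherwise), and only the -f (config file) and -l (output LDIF) options mentioned above are used.

```shell
#!/bin/sh
# Sketch: dump the directory to LDIF and treat an empty dump as a failure.
SLAPCAT="${SLAPCAT:-slapcat}"

# dump_ldif SLAPD_CONF OUTPUT_LDIF  (hypothetical helper)
# Prints the number of entries written, or fails if the dump is empty.
dump_ldif() {
    conf="$1"
    out="$2"
    "$SLAPCAT" -f "$conf" -l "$out" || return 1
    # An exit status of 0 alone does not prove that any data came out.
    if [ ! -s "$out" ]; then
        echo "slapcat produced no entries: $out" >&2
        return 1
    fi
    # Every LDIF entry starts with a "dn:" (or base64 "dn::") line.
    grep -c '^dn:' "$out"
}

# Example:
#   dump_ldif /etc/openldap/slapd.conf /var/tmp/dump.ldif
```

Once a dump like this succeeds after recovery, there is at least an LDIF copy to reload from if the database files go bad again.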
+++++++++++++++ SNIP ++++++++++
daemon: activity on 1 descriptors
daemon: new connection on 10
ldap_pvt_gethostbyname_a: host=ldap.CS.Princeton.EDU, r=0
str2filter "(objectclass=*)"
put_filter: "(objectclass=*)"
put_filter: simple
put_simple_filter: "objectclass=*"
begin get_filter
PRESENT
ber_scanf fmt (m) ber:
ber_dump: buf=0x081e2c60 ptr=0x081e2c60 end=0x081e2c6d len=13
0000: 87 0b 6f 62 6a 65 63 74 63 6c 61 73 73 ..objectclass
end get_filter 0
daemon: added 10r
daemon: activity on:
daemon: select: listen=6 active_threads=0 tvp=NULL
daemon: activity on 1 descriptors
daemon: activity on: 10r
daemon: read activity on 10
connection_get(10)
connection_get(10): got connid=0
connection_read(10): checking for input on id=0
ber_get_next
ldap_read: want=8, got=8
0000: 30 0c 02 01 01 60 07 02 0....`..
ldap_read: want=6, got=6
0000: 01 03 04 00 80 00 ......
ber_get_next: tag 0x30 len 12 contents:
ber_dump: buf=0x081e2950 ptr=0x081e2950 end=0x081e295c len=12
0000: 02 01 01 60 07 02 01 03 04 00 80 00 ...`........
do_bind
ber_get_next
ldap_read: want=8 error=Resource temporarily unavailable
ber_get_next on fd 10 failed errno=11 (Resource temporarily unavailable)
ber_scanf fmt ({imt) ber:
ber_dump: buf=0x081e2950 ptr=0x081e2953 end=0x081e295c len=9
0000: 60 07 02 01 03 04 00 80 00 `........
ber_scanf fmt (m}) ber:
ber_dump: buf=0x081e2950 ptr=0x081e295a end=0x081e295c len=2
0000: 00 00 ..
dnPrettyNormal: <>
<<< dnPrettyNormal: <>, <>
do_bind: version=3 dn="" method=128
send_ldap_result: conn=0 op=0 p=3
send_ldap_result: err=0 matched="" text=""
send_ldap_response: msgid=1 tag=97 err=0
ber_flush: 14 bytes to sd 10
0000: 30 0c 02 01 01 61 07 0a 01 00 04 00 04 00 0....a........
ldap_write: want=14, written=14
0000: 30 0c 02 01 01 61 07 0a 01 00 04 00 04 00 0....a........
do_bind: v3 anonymous bind
daemon: select: listen=6 active_threads=0 tvp=NULL
daemon: activity on 1 descriptors
daemon: activity on: 10r
daemon: read activity on 10
connection_get(10)
connection_get(10): got connid=0
connection_read(10): checking for input on id=0
ber_get_next
ldap_read: want=8, got=8
0000: 30 3e 02 01 02 63 39 04 0>...c9.
ldap_read: want=56, got=56
0000: 19 64 63 3d 63 73 2c 64 63 3d 70 72 69 6e 63 65 .dc=cs,dc=prince
0010: 74 6f 6e 2c 64 63 3d 65 64 75 0a 01 02 0a 01 00 ton,dc=edu......
0020: 02 01 00 02 01 00 01 01 00 87 0b 6f 62 6a 65 63 ...........objec
0030: 74 63 6c 61 73 73 30 00 tclass0.
ber_get_next: tag 0x30 len 62 contents:
ber_dump: buf=0x081e3c90 ptr=0x081e3c90 end=0x081e3cce len=62
0000: 02 01 02 63 39 04 19 64 63 3d 63 73 2c 64 63 3d ...c9..dc=cs,dc=
0010: 70 72 69 6e 63 65 74 6f 6e 2c 64 63 3d 65 64 75 princeton,dc=edu
0020: 0a 01 02 0a 01 00 02 01 00 02 01 00 01 01 00 87 ................
0030: 0b 6f 62 6a 65 63 74 63 6c 61 73 73 30 00 .objectclass0.
ber_get_next
ldap_read: want=8 error=Resource temporarily unavailable
ber_get_next on fd 10 failed errno=11 (Resource temporarily unavailable)
do_search
ber_scanf fmt ({miiiib) ber:
ber_dump: buf=0x081e3c90 ptr=0x081e3c93 end=0x081e3cce len=59
0000: 63 39 04 19 64 63 3d 63 73 2c 64 63 3d 70 72 69 c9..dc=cs,dc=pri
0010: 6e 63 65 74 6f 6e 2c 64 63 3d 65 64 75 0a 01 02 nceton,dc=edu...
0020: 0a 01 00 02 01 00 02 01 00 01 01 00 87 0b 6f 62 ..............ob
0030: 6a 65 63 74 63 6c 61 73 73 30 00 jectclass0.
dnPrettyNormal: <dc=cs,dc=princeton,dc=edu>
=> ldap_bv2dn(dc=cs,dc=princeton,dc=edu,0)
<= ldap_bv2dn(dc=cs,dc=princeton,dc=edu,0)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(dc=cs,dc=princeton,dc=edu,272)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(dc=cs,dc=princeton,dc=edu,272)=0
<<< dnPrettyNormal: <dc=cs,dc=princeton,dc=edu>, <dc=cs,dc=princeton,dc=edu>
SRCH "dc=cs,dc=princeton,dc=edu" 2 0 0 0 0
begin get_filter
PRESENT
ber_scanf fmt (m) ber:
ber_dump: buf=0x081e3c90 ptr=0x081e3cbf end=0x081e3cce len=15
0000: 87 0b 6f 62 6a 65 63 74 63 6c 61 73 73 30 00 ..objectclass0.
end get_filter 0
filter: (objectClass=*)
ber_scanf fmt ({M}}) ber:
ber_dump: buf=0x081e3c90 ptr=0x081e3ccc end=0x081e3cce len=2
0000: 00 00 ..
attrs:
=> bdb_back_search
bdb_dn2entry_rw("dc=cs,dc=princeton,dc=edu")
=> bdb_dn2id_matched( "dc=cs,dc=princeton,dc=edu" )
<= bdb_dn2id_matched: no match
send_ldap_result: conn=0 op=1 p=3
send_ldap_result: err=10 matched="" text=""
send_ldap_response: msgid=2 tag=101 err=32
ber_flush: 14 bytes to sd 10
0000: 30 0c 02 01 02 65 07 0a 01 20 04 00 04 00 0....e... ....
ldap_write: want=14, written=14
0000: 30 0c 02 01 02 65 07 0a 01 20 04 00 04 00 0....e... ....
daemon: select: listen=6 active_threads=0 tvp=NULL
daemon: activity on 1 descriptors
daemon: activity on: 10r
daemon: read activity on 10
connection_get(10)
connection_get(10): got connid=0
connection_read(10): checking for input on id=0
ber_get_next
ldap_read: want=8, got=7
0000: 30 05 02 01 03 42 00 0....B.
ber_get_next: tag 0x30 len 5 contents:
ber_dump: buf=0x081e3d68 ptr=0x081e3d68 end=0x081e3d6d len=5
0000: 02 01 03 42 00 ...B.
ber_get_next
ldap_read: want=8, got=0
ber_get_next on fd 10 failed errno=0 (Success)
connection_read(10): input error=-2 id=0, closing.
connection_closing: readying conn=0 sd=10 for close
connection_close: deferring conn=0 sd=10
do_unbind
connection_resched: attempting closing conn=0 sd=10
connection_close: conn=0 sd=10
daemon: removing 10
daemon: select: listen=6 active_threads=0 tvp=NULL
daemon: activity on 1 descriptors
daemon: select: listen=6 active_threads=0 tvp=NULL
+++++++++++++++ /SNIP +++++++++++++++
Thanks