
Re: make test hangs



Hi:

A week or so ago, I reported that "make test" hangs for openldap.2.1.8 on a Tru64 UX5.1 machine, which is a 64-bit machine. By running a debugger against the ldapsearch code, I found where the code hangs. The following is the call stack at the point where it hangs:

0 __read(0x3ff801becd0, 0xffffffffffffffe6, 0x334, 0x3ffc0087f58, 0x120070348) [0x3ff800cd848]
1 sb_stream_read(sbiod = 0x14002a120, buf = 0x140033000, len = 16384) ["sockbuf.c":490, 0x12006fe98]
2 sb_rdahead_read(sbiod = 0x14002a150, buf = 0x14002b037, len = 3) ["sockbuf.c":651, 0x12007039c]
3 sb_debug_read(sbiod = 0x14002a270, buf = 0x14002b037, len = 17) ["sockbuf.c":816, 0x120070ad4]
4 ber_int_sb_read(sb = 0x14002a0f0, buf = 0x14002b037, len = 17) ["sockbuf.c":405, 0x12006fc68]
5 ber_get_next(sb = 0x14002a0f0, len = 0x11fff9c18, ber = 0x14002b020) ["io.c":482, 0x12006bdcc]
6 try_read1msg(ld = 0x140028c00, msgid = -1, all = 1, sb = 0x14002a0f0, lc = 0x140031200, result = 0x11fff9e20) ["result.c":438, 0x1200547fc]
7 wait4msg(ld = 0x140028c00, msgid = -1, all = 1, timeout = (nil), result = 0x11fff9e20) ["result.c":304, 0x120054418]
8 ldap_result(ld = 0x140028c00, msgid = -1, all = 1, timeout = (nil), result = 0x11fff9e20) ["result.c":113, 0x120053f24]
9 dosearch(ld = 0x140028c00, base = 0x14002a000 = "o=University of Michigan, c=US", scope = 2, filtpatt = (nil), value = 0x1400005c8 = "(objectclass=*)", attrs = 0x11fffc078, attrsonly = 0, sctrls = (nil), cctrls = (nil), timeout = (nil), sizelimit = -1) ["ldapsearch.c":1154, 0x12003fd3c]
10 main(argc = 13, argv = 0x11fffc018) ["ldapsearch.c":1063, 0x12003f97c]


I also turned on debugging with -d -1. From the resulting debug output, I found that the hang occurs on the last response PDU for the search request (the last 14 bytes), which indicates that all the entries have been returned and the server has completed the search request. The client does not process this last message properly; the other 19 entries were processed correctly. The client was trying to read 17 bytes for the header (the first read) while only 14 bytes were available, and that is where it blocked.

I tested a possible fix: changing BER_TAG_T and BER_LEN_T from long to int. With that change, the first read (the header read) asks for 9 bytes on a 64-bit machine, the same as on a 32-bit machine, so the client no longer waits for 17 bytes when the last packet carries only 14. The hang is gone now. I also applied the fix for the segmentation fault that occurred on 64-bit machines, which Mike told me is available in the openldap.2.1.14 release. All 16 test cases now run OK. Thanks.

Regards,

Cindy Wang
KINETWORKS


Cindy Wang wrote:

Hi:

I am finally able to build openldap-2.1.8/cyrus-sasl-2.1.10/krb5-1.2.7 on Tru64 Unix V5.1. But when I test the build (make test), nothing succeeds the first time: it hangs in test000. The second time, test000 succeeds, but test001 hangs. The following is the output I got for the second and later runs:

cd tests; make test
ln: ./data and ./data are identical.
*** Exit 1 (ignored)
ln: ./schema and ../servers/slapd/schema are identical.
*** Exit 1 (ignored)
ucdata/liblunicode: File exists
*** Exit 1 (ignored)
Initiating LDAP tests for BDB...
>>>>> Executing all LDAP tests...
>>>>> Test Directory: .
>>>>> Backend: bdb
>>>>> Starting test000-rootdse ...
running defines.sh
Datadir is ./data
Cleaning up in ./test-db...
Starting slapd on TCP/IP port 9009...
../servers/slapd/slapd -s0 -f ./test-db/slapd.conf -h ldap://localhost:9009/ -d 5
Using ldapsearch to retrieve the root DSE...
dn:
objectClass: top
objectClass: OpenLDAProotDSE
structuralObjectClass: OpenLDAProotDSE


namingContexts: o=OpenLDAP Project,l=Internet
supportedControl: 2.16.840.1.113730.3.4.2
supportedControl: 1.3.6.1.4.1.4203.1.10.2
supportedControl: 1.2.826.0.1.334810.2.3
supportedExtension: 1.3.6.1.4.1.4203.1.11.3
supportedExtension: 1.3.6.1.4.1.4203.1.11.1
supportedExtension: 1.3.6.1.4.1.1466.20037
supportedFeatures: 1.3.6.1.4.1.4203.1.5.1
supportedFeatures: 1.3.6.1.4.1.4203.1.5.2
supportedFeatures: 1.3.6.1.4.1.4203.1.5.3
supportedFeatures: 1.3.6.1.4.1.4203.1.5.4
supportedFeatures: 1.3.6.1.4.1.4203.1.5.5
supportedLDAPVersion: 3
supportedSASLMechanisms: GSSAPI
supportedSASLMechanisms: OTP
supportedSASLMechanisms: DIGEST-MD5
supportedSASLMechanisms: CRAM-MD5
subschemaSubentry: cn=Subschema

>>>>> Test succeeded
>>>>> ./scripts/test000-rootdse completed OK.
>>>>> waiting 10 seconds for things to exit

>>>>> Starting test001-slapadd ...
running defines.sh
Datadir is ./data
Cleaning up in ./test-db...
Running slapadd to build slapd database...
Starting slapd on TCP/IP port 9009...
Using ldapsearch to retrieve all the entries...

Does anyone have any insight into what might be going on? I tried building both with and without the thread option (--with-threads=no/yes), and the result is the same. I also tried different build options (i.e. the -pthread option and linking with the -lpthread library), but that did not help. I am running out of ideas, so if anyone has any insight into this, please help.

Cindy Wang