Re: commit: ldap/tests/scripts defines.sh test035-meta test036-meta-concurrency
Michael Ströder wrote:
Pierangelo Masarati wrote:
test036 should now be ready to be enabled. I'd appreciate if anybody
can try it out and report;
[..]
cd tests
DB_CONFIG=../servers/slapd/DB_CONFIG SLAPD_DEBUG=256 TEST_META=yes ./run test036
What's mandatory is
TEST_META=yes ./run test036
It sometimes succeeds (as stated in the result message) but fails every
now and then:
------------------------- begin -------------------------
./scripts/test036-meta-concurrency: line 174: 23927 Segmentation fault $SLAPD -f $CONF3 -h $URI3 -d $LVL $TIMING >$LOG3 2>&1
Using ldapsearch to retrieve all the entries...
./scripts/test036-meta-concurrency: line 188: kill: (23927) - No such process
------------------------- end -------------------------
This needs investigating; there is probably an error in how the test
script records the PID of the process.
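
For reference, a minimal sketch of how a script can record the PID of a
background process and avoid signalling one that has already exited. This
is not the code of test036; the function name stop_if_running is made up
for illustration, and "sleep" stands in for the slapd instance.

```shell
# Hypothetical helper, not part of the test suite: stop a process only if
# it is still alive, and report what happened.
stop_if_running() {
    # kill -0 sends no signal; it only checks that the PID can be signalled.
    if kill -0 "$1" 2>/dev/null; then
        kill "$1"
        # wait only works for children of this shell, so ignore failures.
        wait "$1" 2>/dev/null || true
        echo "stopped"
    else
        echo "already gone"
    fi
}

# Stand-in for "$SLAPD ... &" in a real script.
sleep 30 &
PID=$!          # $! holds the PID of the most recent background command
stop_if_running "$PID"
```

If the process crashed earlier, as in the log above, the helper reports
"already gone" instead of letting kill print "No such process".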
Various glibc errors:
------------------------- begin -------------------------
*** glibc detected *** double free or corruption (fasttop): 0x0835a8c8 ***
*** glibc detected *** free(): invalid pointer: 0x082de000 ***
------------------------- end -------------------------
This is something I'd like to be able to trace; can you enable core
dumps and run with MALLOC_CHECK_=2 so that the test aborts immediately?
I've been running it many times and haven't seen any of these errors.
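
One possible way to do that, as a sketch: the wrapper name
run_with_heap_check is made up for illustration, and where the core file
ends up depends on the system configuration.

```shell
# Hypothetical wrapper: run a command with core dumps enabled and with
# glibc's MALLOC_CHECK_ set to 2, which asks glibc to abort on detected
# heap corruption.
run_with_heap_check() {
    (
        # Raise the core file size limit for this subshell only.
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise the core size limit" >&2
        MALLOC_CHECK_=2
        export MALLOC_CHECK_
        "$@"
    )
}

# Intended use, from the tests directory of the source tree:
#   run_with_heap_check env TEST_META=yes ./run test036
```

The subshell keeps the changed limit and the environment variable from
leaking into the calling shell.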
Even when the test succeeds, some messages look strange to me:
------------------------- begin -------------------------
ldap_search: No such object (32)
This is (sort of) OK; in some cases, you may get that error if the entry
cannot be fetched. I'm trying to turn it into LDAP_BUSY or so.
ldap_search: No such object (32)
ldap_read: Server is busy (51)
PID=24716 - Read done (51).
PID=24735 - Modify done (0).
ldap_search: No such object (32)
ldap_search: No such object (32)..many of these messages...
------------------------- end -------------------------
I have SuSE Linux 9.3:
- gcc version 3.3.5 20050117 (prerelease) (SUSE Linux)
- glibc-2.3.4-23
- kernel-default-2.6.11.4-20a
- db-4.3.27-3 (Berkeley-DB 4.3.27)
In general, all the LDAP_BUSY and asynchronous calls in the test suite
were added to track some problems with the internals of back-meta
hanging in some cases under heavy load (ITS#3464). When back-meta uses
an internal database as a target, it consumes one extra thread per
connection, so using the synchronous calls would lead to a deadlock.
There is therefore an internal fail-safe mechanism that, after a certain
number of retries, gives up and returns LDAP_BUSY to the client. This is
a behavior you won't see e.g. with test008. I want to improve that fix by
making it configurable; for instance, if one can accept an occasional
slow response, back-meta should retry "forever", as long as there are no
local targets and, as such, no deadlock is possible. In other cases, the
timing of the response may be essential; in these cases, an immediate
LDAP_BUSY would be the best solution.
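
To make the intended policy concrete, here is a rough shell sketch of a
bounded retry that gives up with the "busy" code. It only illustrates the
idea; it is not back-meta's code. The function name retry_or_busy and its
argument convention are assumptions; 51 is the result code shown as
"Server is busy (51)" in the log above.

```shell
LDAP_BUSY=51

# retry_or_busy MAX_RETRIES CMD [ARGS...]   (hypothetical helper)
#   MAX_RETRIES < 0  : retry "forever"
#   MAX_RETRIES = 0  : give up right after the first failed attempt
#   MAX_RETRIES = N  : allow up to N retries after the first attempt
# Note: "max" and "tries" are global variables, as in plain POSIX sh.
retry_or_busy() {
    max=$1
    shift
    tries=0
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$max" -ge 0 ] && [ "$tries" -ge "$max" ]; then
            return "$LDAP_BUSY"
        fi
        tries=$((tries + 1))
        # A real implementation would likely pause between attempts.
    done
}
```

With a negative limit the loop never gives up, which is only safe when no
deadlock is possible; with a limit of 0 the caller gets the busy code
immediately.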
Usually, under heavy load (test036 plus a few instances of "ls -R /" and
"ping -f") I can see slapd-read (the slapd-tester client) returning a few
LDAP_BUSY; I've never seen a failure of the write clients, and I've
never seen a failure of the slapd-search clients returning noSuchObject.
If you can provide further feedback, I'd be happy to fix these issues as
well.
Thanks, p.
--
Pierangelo Masarati
mailto:pierangelo.masarati@sys-net.it
SysNet - via Dossi,8 27100 Pavia Tel: +390382573859 Fax: +390382476497