[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
Re: RE24 testing call (2.4.17) round 2
Quanah Gibson-Mount wrote:
Please test RE24 and report any issues. All known regressions are now
believed fixed. Thanks!
All tests passed on OpenSUSE 10.3 x86-64.
test056 failed on Ubuntu 9.04 x86-64 on the connection output. Running again
succeeded. Comparing the slapd.1.log files from the bad run to the good run,
something very odd shows up:
In the good run, the conn=1 op=0 RESULT is logged earlier - at the time it
should be:
@@ -417,12 +417,14 @@
send_ldap_result: err=0 matched="" text=""
send_ldap_response: msgid=1 tag=97 err=0
ber_flush2: 14 bytes to sd 13
+conn=1 op=0 RESULT tag=97 err=0 text=
+do_bind: v3 anonymous bind
connection_get(13)
connection_get(13): got connid=1
connection_read(13): checking for input on id=1
ber_get_next
ber_get_next: tag 0x30 len 422 contents:
-op tag 99, time 1246147222
+op tag 99, time 1246147432
ber_get_next
conn=1 op=1 do_search
ber_scanf fmt ({miiiib) ber:
In the failed run, the op=0 RESULT is logged much later, after op=2 was
already handled:
@@ -470,15 +472,268 @@
connection_read(13): checking for input on id=1
ber_get_next
ber_get_next: tag 0x30 len 5 contents:
-op tag 66, time 1246147222
+op tag 66, time 1246147432
ber_get_next
ber_get_next on fd 13 failed errno=0 (Success)
conn=1 op=2 do_unbind
conn=1 op=2 UNBIND
-conn=1 op=0 RESULT tag=97 err=0 text=
-do_bind: v3 anonymous bind
connection_close: conn=1 sd=13
conn=1 fd=13 closed
+slap_listener_activate(7):
+>>> slap_listener(ldap://localhost:9011/)
+conn=2 fd=13 ACCEPT from IP=127.0.0.1:38898 (IP=127.0.0.1:9011)
+connection_get(13)
+connection_get(13): got connid=2
+connection_read(13): checking for input on id=2
+ber_get_next
+ber_get_next: tag 0x30 len 12 contents:
+op tag 96, time 1246147432
So, the monitor output disagrees:
--- testrun.X/ldapsearch.flt 2009-06-27 17:00:22.000000000 -0700
+++ testdata/monitor1.out 2009-04-27 15:50:11.000000000 -0700
@@ -2,9 +2,9 @@
structuralObjectClass: monitorConnection
monitorConnectionProtocol: 3
monitorConnectionOpsReceived: 2
-monitorConnectionOpsExecuting: 2
+monitorConnectionOpsExecuting: 1
monitorConnectionOpsPending: 0
-monitorConnectionOpsCompleted: 0
+monitorConnectionOpsCompleted: 1
monitorConnectionGet: 2
monitorConnectionRead: 2
monitorConnectionWrite: 0
because op=0 is still outstanding when the op=1 search runs. In both cases,
the actual result was sent to the client at the same time (see the ber_flush2
message), but after it succeeded something prevented it from making further
progress. Which is pretty odd, because after that it just does a cond_signal
and a mutex_unlock. After ber_free_buf() it just gets a mutex on its counters
so it can increment a couple items, but that should be a freebie now since
each operation has its per-thread counter structure. I guess it's possible
that the malloc library is locking us along the way, or even the stdio library.
Not sure if this is identical to ITS#5966, it doesn't seem like it. It also
doesn't seem like something we should hold the release for; the monitor output
is correct even if it's not what we expected to see.
Alternatively, we might consider moving the reset
if ( op->o_conn->c_conn_state == SLAP_C_BINDING )
op->o_conn->c_conn_state = SLAP_C_ACTIVE;
from connection_bind_cb() to connection_bind_cleanup_cb(), to prevent ops from
getting dequeued until we're really free and clear.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/