[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: RE24 testing call (2.4.17) round 2

To: Quanah Gibson-Mount <quanah@zimbra.com>
Subject: Re: RE24 testing call (2.4.17) round 2
From: Howard Chu <hyc@symas.com>
Date: Sat, 27 Jun 2009 17:53:05 -0700
Cc: openldap-devel@openldap.org
In-reply-to: <A169CBFA5D3942A2C20EDF02@[192.168.1.199]>
References: <A169CBFA5D3942A2C20EDF02@[192.168.1.199]>
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; rv:1.9.1b5pre) Gecko/20090611 SeaMonkey/2.0a1pre Firefox/3.0.3

Quanah Gibson-Mount wrote:

Please test RE24 and report any issues.  All known regressions are now
believed fixed.  Thanks!


All tests passed on OpenSUSE 10.3 x86-64.

test056 failed on Ubuntu 9.04 x86-64 on the connection output. Running againsucceeded. Comparing the slapd.1.log files from the bad run to the good run,something very odd shows up:

In the good run, the conn=1 op=0 RESULT is logged earlier - at the time itshould be:


@@ -417,12 +417,14 @@
 send_ldap_result: err=0 matched="" text=""
 send_ldap_response: msgid=1 tag=97 err=0
 ber_flush2: 14 bytes to sd 13
+conn=1 op=0 RESULT tag=97 err=0 text=
+do_bind: v3 anonymous bind
 connection_get(13)
 connection_get(13): got connid=1
 connection_read(13): checking for input on id=1
 ber_get_next
 ber_get_next: tag 0x30 len 422 contents:
-op tag 99, time 1246147222
+op tag 99, time 1246147432
 ber_get_next
 conn=1 op=1 do_search
 ber_scanf fmt ({miiiib) ber:

In the failed run, the op=0 RESULT is logged much later, after op=2 wasalready handled:


@@ -470,15 +472,268 @@
 connection_read(13): checking for input on id=1
 ber_get_next
 ber_get_next: tag 0x30 len 5 contents:
-op tag 66, time 1246147222
+op tag 66, time 1246147432
 ber_get_next
 ber_get_next on fd 13 failed errno=0 (Success)
 conn=1 op=2 do_unbind
 conn=1 op=2 UNBIND
-conn=1 op=0 RESULT tag=97 err=0 text=
-do_bind: v3 anonymous bind
 connection_close: conn=1 sd=13
 conn=1 fd=13 closed
+slap_listener_activate(7):
+>>> slap_listener(ldap://localhost:9011/)
+conn=2 fd=13 ACCEPT from IP=127.0.0.1:38898 (IP=127.0.0.1:9011)
+connection_get(13)
+connection_get(13): got connid=2
+connection_read(13): checking for input on id=2
+ber_get_next
+ber_get_next: tag 0x30 len 12 contents:
+op tag 96, time 1246147432

So, the monitor output disagrees:
--- testrun.X/ldapsearch.flt	2009-06-27 17:00:22.000000000 -0700
+++ testdata/monitor1.out	2009-04-27 15:50:11.000000000 -0700
@@ -2,9 +2,9 @@
 structuralObjectClass: monitorConnection
 monitorConnectionProtocol: 3
 monitorConnectionOpsReceived: 2
-monitorConnectionOpsExecuting: 2
+monitorConnectionOpsExecuting: 1
 monitorConnectionOpsPending: 0
-monitorConnectionOpsCompleted: 0
+monitorConnectionOpsCompleted: 1
 monitorConnectionGet: 2
 monitorConnectionRead: 2
 monitorConnectionWrite: 0

because op=0 is still outstanding when the op=1 search runs. In both cases,the actual result was sent to the client at the same time (see the ber_flush2message), but after it succeeded something prevented it from making furtherprogress. Which is pretty odd, because after that it just does a cond_signaland a mutex_unlock. After ber_free_buf() it just gets a mutex on its countersso it can increment a couple items, but that should be a freebie now sinceeach operation has its per-thread counter structure. I guess it's possiblethat the malloc library is locking us along the way, or even the stdio library.

Not sure if this is identical to ITS#5966, it doesn't seem like it. It alsodoesn't seem like something we should hold the release for; the monitor outputis correct even if it's not what we expected to see.


Alternatively, we might consider moving the reset

    if ( op->o_conn->c_conn_state == SLAP_C_BINDING )
        op->o_conn->c_conn_state = SLAP_C_ACTIVE;

from connection_bind_cb() to connection_bind_cleanup_cb(), to prevent ops fromgetting dequeued until we're really free and clear.


--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

References:
- RE24 testing call (2.4.17) round 2
  - From: Quanah Gibson-Mount <quanah@zimbra.com>

Prev by Date: RE24 testing call (2.4.17) round 2
Next by Date: Re: RE24 testing call (2.4.17) round 2
Index(es):
- Chronological
- Thread