[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
FW: A cache entry addition retry patch (ITS#1643)
ergh. This comment about many race conditions highlights something that's
been worrying me. I wonder if this kind of fix is sufficient. I believe the
right approach may be to change all of back-bdb's ldap_pvt_mutex/rdwr locks
into BDB locks, so that the BDB library can manage all locks and prevent any
race conditions from arising. I've done a very little experimentation in
this direction, but haven't had time to refit the entire back-bdb code. For
the scheme to work, all read operations would also need to be enclosed in
transactions. All operations would then have an associated transaction ID,
which would be used as the locker ID for the BDB locks. This arrangement
would completely eliminate the possibility of undetectable deadlocks, as
well as allowing a certain degree of laziness in error recovery - all locks
associated with an operation are atomically released when the transaction
ends. (I'm sure I raised this subject before, just don't remember if it was
in private or on the -devel list.) The only question in my mind was whether
it made any difference to commit or abort a successful read transaction,
i.e., if either choice had any less overhead. In my mind, they should be
totally equivalent when no changes are made to the data.
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
http://www.symas.com http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support
-----Original Message-----
From: owner-openldap-bugs@OpenLDAP.org
[mailto:owner-openldap-bugs@OpenLDAP.org]On Behalf Of
jongchoi@us.ibm.com
Sent: Tuesday, March 12, 2002 3:48 PM
To: openldap-its@OpenLDAP.org
Subject: A cache entry addition retry patch (ITS#1643)
Full_Name: Jong Hyuk Choi
Version: HEAD
OS: RedHat Linux 7.1
URL:
Submission from: (NULL) (198.81.209.16)
If the entry is found already present during the entry addition,
but the entry is deleted before the next retrieval attempt,
then it is reasonable to repeat the entire addition process.
The experiments showed that the number of repetition is very small
even with a very small number of cache entries.
Jong Hyuk Choi
IBM Thomas J. Watson Research Center - Enterprise Linux Group
P.O. Box 218, Yorktown Heights, NY 10598
email: jongchoi@us.ibm.com
(phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979
=======================================================================
diff -Naur ldap-head-Mar11/servers/slapd/back-bdb/id2entry.c
ldap-head-Mar11new/servers/slapd/back-bdb/id2entry.c
--- ldap-head-Mar11/servers/slapd/back-bdb/id2entry.c Mon Jan 28 23:02:27
2002+++ ldap-head-Mar11new/servers/slapd/back-bdb/id2entry.c Tue Mar
12
17:21:43 2002
@@ -121,13 +121,19 @@
ch_free( data.data );
}
- if (rc == 0 && bdb_cache_add_entry_rw(&bdb->bi_cache, *e, rw) != 0)
{
+ while (rc == 0 && bdb_cache_add_entry_rw(&bdb->bi_cache, *e, rw) !=
0)
{
+ Entry* ee;
+ int add_loop_cnt=0;
if ((*e)->e_private != NULL)
free ((*e)->e_private);
(*e)->e_private = NULL;
- bdb_entry_return (*e);
- if ((*e=bdb_cache_find_entry_id(&bdb->bi_cache,id,rw)) !=
NULL)
{
+ if ((ee=bdb_cache_find_entry_id(&bdb->bi_cache,id,rw)) !=
NULL)
{
+ bdb_entry_return (*e);
+ *e = ee;
return 0;
+ }
+ if (++add_loop_cnt == MAX_ADD_LOOP) {
+ return LDAP_BUSY;
}
}
diff -Naur ldap-head-Mar11/servers/slapd/back-ldbm/id2entry.c
ldap-head-Mar11new/servers/slapd/back-ldbm/id2entry.c
--- ldap-head-Mar11/servers/slapd/back-ldbm/id2entry.c Fri Jan 4 20:17:52
2002+++ ldap-head-Mar11new/servers/slapd/back-ldbm/id2entry.c Tue Mar
12
16:50:34 2002
@@ -256,14 +256,17 @@
e->e_id = id;
- if( cache_add_entry_rw( &li->li_cache, e, rw ) != 0 ) {
- entry_free( e );
+ while ( cache_add_entry_rw( &li->li_cache, e, rw ) != 0 ) {
+ Entry* ee;
+ int add_loop_cnt=0;
/* XXX this is a kludge.
* maybe the entry got added underneath us
* There are many underlying race condtions in the
cache/disk
code.
*/
- if ( (e = cache_find_entry_id( &li->li_cache, id, rw )) !=
NULL
) {
+ if ( (ee = cache_find_entry_id( &li->li_cache, id, rw )) !=
NULL
) {
+ entry_free( e );
+ e = ee;
#ifdef NEW_LOGGING
LDAP_LOG(( "backend", LDAP_LEVEL_DETAIL1,
"id2entry_rw: %s of %ld 0x%lx (cache)\n",
@@ -276,17 +279,20 @@
return( e );
}
+ if (++add_loop_cnt == MAX_ADD_LOOP) {
#ifdef NEW_LOGGING
- LDAP_LOG(( "backend", LDAP_LEVEL_ERR,
- "id2entry_rw: %s of %ld (cache add failed)\n",
- rw ? "write" : "read", id ));
+ LDAP_LOG(( "backend", LDAP_LEVEL_ERR,
+ "id2entry_rw: %s of %ld (DSA busy : cache
add
failed)\n",
+ rw ? "write" : "read", id ));
#else
- Debug( LDAP_DEBUG_TRACE, "<= id2entry_%s( %ld ) (cache addf
ailed)\n",
- rw ? "w" : "r", id, 0 );
+ Debug( LDAP_DEBUG_TRACE, "<= id2entry_%s( %ld ) (DSA
busy : cache add failed)\n",
+ rw ? "w" : "r", id, 0 );
#endif
-
- return NULL;
+ /* No other way of returning error in LDBM */
+ return ( NULL );
+ }
}
+
#ifdef NEW_LOGGING
LDAP_LOG(( "backend", LDAP_LEVEL_ENTRY,
diff -Naur ldap-head-Mar11/servers/slapd/slap.h
ldap-head-Mar11new/servers/slapd/slap.h
--- ldap-head-Mar11/servers/slapd/slap.h Mon Feb 18 19:06:50 2002
+++ ldap-head-Mar11new/servers/slapd/slap.h Mon Mar 11 21:38:16 2002
@@ -1576,6 +1576,11 @@
#define sl_addr sl_sa.sa_in_addr
} Listener;
+/*
+ * backend cache
+ */
+#define MAX_ADD_LOOP 30
+
LDAP_END_DECL
#include "proto-slap.h"