Re: (ITS#5664) Deadlocks when writing in parallell (two processes)
tom.bjorkholm@aastra.com wrote:
> Full_Name: Stelios Grigoriadis & Tom Björkholm
> Version: 2.3.39
> OS: Novell SLES 10
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (194.237.142.7)
>
>
> We get a lot of DB_LOCK_DEADLOCK errors when using client programs that, for a
> period of time, continuously write to OpenLDAP.
> Version is 2.3.39.
>
> The information added is of the form:
> ebcmdCustomer=0+ebcmdDir=220xx,ou=AuthCodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com
> where xx varies.
>
> Snippet of the output:
> Mar 27 13:03:21 ldapt1 slapd[7589]: => bdb_dn2id_add: subtree
> (ebcmdCustomer=0+ebcmdDir=22037,ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com)
> put failed: -30995
> Mar 27 13:03:26 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
> Mar 27 13:03:26 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
> failed: -30995
> Mar 27 13:03:28 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
> Mar 27 13:03:28 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
> failed: -30995
> Mar 27 13:03:36 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
> Mar 27 13:03:36 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
> failed: -30995
> Mar 27 13:03:38 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>
>
>
We've temporarily fixed the problem by introducing a static mutex that
is taken before any update operation (add, delete, modify or moddn).
This effectively serializes update operations within slapd. It is
intended only as a temporary solution; we hope the issue will be
resolved in a future release.
The patch:
# This patch file is derived from OpenLDAP Software. All of the modifications to OpenLDAP Software
# represented in the following patch(es) were developed by Stelios Grigoriadis stelios.xx.grigoriadis@ericsson.com.
# These modifications are not subject to any license of Ericsson AB.
# I, Stelios Grigoriadis, hereby place the following modifications to OpenLDAP Software (and only these modifications)
# into the public domain. Hence, these modifications may be freely used and/or redistributed for any purpose with or
# without attribution and/or other notice.
# Bug Fix - This patch fixes the bug ITS#5133.
# The fix works as follows. A periodic check in the runqueue (called do_mastercheck). The interval is determined by
# a slapd.conf parameter (mastercheckint) in the syncrepl section and is optional. If it's not specified, it's not
# inserted in the runqueue.
--- servers/slapd/connection.c 2007-06-15 01:49:38.000000000 +0200
+++ connection.c 2008-06-10 16:30:08.000000000 +0200
@@ -1052,19 +1052,24 @@
/* FIXME: returns 0 in case of failure */
ldap_pvt_mp_add_ulong(slap_counters.sc_ops_initiated, 1);
ldap_pvt_thread_mutex_unlock( &slap_counters.sc_ops_mutex );
+ static pthread_mutex_t op_upd_mutex = PTHREAD_MUTEX_INITIALIZER;
+ int upd_tag=0;
op->o_threadctx = ctx;
#ifdef LDAP_DEVEL
op->o_tid = ldap_pvt_thread_pool_tid( ctx );
#endif /* LDAP_DEVEL */
switch ( tag ) {
- case LDAP_REQ_BIND:
- case LDAP_REQ_UNBIND:
case LDAP_REQ_ADD:
case LDAP_REQ_DELETE:
case LDAP_REQ_MODDN:
case LDAP_REQ_MODIFY:
+ ldap_pvt_thread_mutex_lock( &op_upd_mutex );
+ upd_tag=1;
+ break;
+ case LDAP_REQ_BIND:
+ case LDAP_REQ_UNBIND:
case LDAP_REQ_COMPARE:
case LDAP_REQ_SEARCH:
case LDAP_REQ_ABANDON:
@@ -1178,6 +1183,10 @@
connection_resched( conn );
ldap_pvt_thread_mutex_unlock( &conn->c_mutex );
+ if (upd_tag) {
+ ldap_pvt_thread_mutex_unlock( &op_upd_mutex );
+ }
+
return NULL;
}
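
For readers who want to see the idea in isolation, here is a minimal
standalone sketch of the same serialization approach. It is not slapd
code: the op_tag enum, run_operation() and run_demo() are hypothetical
names made up for illustration, and plain pthreads are used instead of
the ldap_pvt_thread_* wrappers. Update-type operations take one static
mutex for their whole duration, while read-type operations bypass it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical operation tags; slapd itself switches on LDAP_REQ_*. */
enum op_tag { OP_ADD, OP_DELETE, OP_MODIFY, OP_MODDN, OP_SEARCH, OP_COMPARE };

/* One process-wide lock shared by all update operations. */
static pthread_mutex_t op_upd_mutex = PTHREAD_MUTEX_INITIALIZER;
static int writers_inside = 0;     /* protected by op_upd_mutex */
static int max_writers_inside = 0; /* protected by op_upd_mutex */

static int is_update(enum op_tag tag)
{
    switch (tag) {
    case OP_ADD:
    case OP_DELETE:
    case OP_MODIFY:
    case OP_MODDN:
        return 1;
    default:
        return 0;
    }
}

/* Stand-in for dispatching a single operation to the backend. */
static void run_operation(enum op_tag tag)
{
    int upd_tag = is_update(tag);

    if (upd_tag) {
        pthread_mutex_lock(&op_upd_mutex);
        writers_inside++;
        /* With the lock held, no other update can be in progress. */
        assert(writers_inside == 1);
        if (writers_inside > max_writers_inside)
            max_writers_inside = writers_inside;
    }

    /* ... the actual backend work would happen here ... */

    if (upd_tag) {
        writers_inside--;
        pthread_mutex_unlock(&op_upd_mutex);
    }
}

static void *worker(void *arg)
{
    enum op_tag tag = *(enum op_tag *)arg;
    int i;

    for (i = 0; i < 1000; i++)
        run_operation(tag);
    return NULL;
}

/* Runs three writers and one reader; returns the peak number of
 * concurrent writers observed, or -1 if a thread could not be started. */
int run_demo(void)
{
    pthread_t threads[4];
    enum op_tag tags[4] = { OP_ADD, OP_MODIFY, OP_SEARCH, OP_ADD };
    int i;

    writers_inside = 0;
    max_writers_inside = 0;
    for (i = 0; i < 4; i++) {
        if (pthread_create(&threads[i], NULL, worker, &tags[i]) != 0)
            return -1;
    }
    for (i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);
    return max_writers_inside;
}
```

Calling run_demo() from a small main() and linking with -pthread should
show that at most one update is ever in progress at a time. The price is
the one noted above: all writes are serialized, so write throughput is
given up in exchange for avoiding the deadlock aborts.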