RE: back-bdb deadlocks
At the moment I'm testing without any special ACLs.
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
http://www.symas.com http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support
> -----Original Message-----
> From: Kurt D. Zeilenga [mailto:Kurt@OpenLDAP.org]
> Sent: Wednesday, January 16, 2002 6:34 PM
> To: Howard Chu
> Cc: openldap-devel@OpenLDAP.org
> Subject: Re: back-bdb deadlocks
>
>
> I assume you aren't using ACLs which might access other entries?
>
> Looks like be_group() and be_attribute() are using the transaction
> id passed in via o_private....
>
> Kurt
>
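(For illustration, the general idea behind having internal lookups join the
caller's transaction rather than opening their own - the helper name below is
made up and the calls are the method-style Berkeley DB 4.x API, not actual
back-bdb code:)

    #include <db.h>

    /* Hypothetical helper: do an internal entry fetch, joining the
     * caller's transaction when one is active instead of starting a
     * separate one that could deadlock against the caller's own locks. */
    static int
    internal_fetch( DB_ENV *env, DB *db, DB_TXN *caller_txn, DBT *key, DBT *data )
    {
        DB_TXN *txn = caller_txn;
        int rc, own_txn = 0;

        if ( txn == NULL ) {
            /* No enclosing transaction: start (and own) a private one. */
            rc = env->txn_begin( env, NULL, &txn, 0 );
            if ( rc ) return rc;
            own_txn = 1;
        }

        rc = db->get( db, txn, key, data, 0 );

        if ( own_txn ) {
            if ( rc == 0 ) rc = txn->commit( txn, 0 );
            else txn->abort( txn );
        }
        return rc;
    }
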
> At 04:40 PM 2002-01-16, Howard Chu wrote:
> >I believe a similar solution to this is needed in back-bdb, but I haven't
> >yet spent much time thinking about how it can work. The current situation
> >is that back-bdb gets deadlocked very easily by more than 2 simultaneous
> >updaters. There doesn't seem to be any easy solution to this; my debug
> >session with three deadlocked threads showed that two of them were
> >deadlocked in txn_abort - i.e., deadlock detection already did its job,
> >but new locks were still needed to unravel the transaction. Pretty rude.
> >(I suspect this situation arose because I was testing bdb_add where the
> >parent entry was obtained outside of the transaction. With the parent
> >entry inside the transaction, the deadlock still occurs, but doesn't get
> >detected!)
> >
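(For reference, the textbook Berkeley DB response to a detected deadlock is to
abort the losing transaction - which drops its locks - and retry the whole
operation from the top. A rough sketch of that loop, using the method-style
DB 4.x calls; the function is illustrative only, not what back-bdb does:)

    #include <db.h>

    /* Minimal sketch: retry a single DB->put until it commits or fails
     * with something other than a deadlock.  env/db/key/data are assumed
     * to be set up by the caller. */
    static int
    put_with_retry( DB_ENV *env, DB *db, DBT *key, DBT *data )
    {
        DB_TXN *txn;
        int rc;

        for ( ;; ) {
            rc = env->txn_begin( env, NULL, &txn, 0 );
            if ( rc ) return rc;

            rc = db->put( db, txn, key, data, 0 );

            if ( rc == DB_LOCK_DEADLOCK ) {
                txn->abort( txn );      /* releases all locks */
                continue;               /* retry from the top */
            }
            if ( rc == 0 )
                return txn->commit( txn, 0 );

            txn->abort( txn );
            return rc;
        }
    }
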
> >Someone needs to take a careful look at what records we touch and in what
> >order. My suspicion is that we cannot guarantee that we can keep
> >transactions out of each other's way. Considering that the deeper a tree
> >gets, the more objects get locked per transaction, the number of locks
> >being held gets huge. I briefly experimented with nested transactions
> >before, with no better success. I guess this is because even after the
> >nested transaction is committed, none of its resources/locks are released
> >until the parent transaction commits. Perhaps I've gone in the wrong
> >direction; maybe all operations (even read-only) need to be wrapped in
> >transactions just to keep locks consistent.
> >
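(A sketch of what "even read-only operations go through a transaction" could
look like - a cursor lookup done under a txn handle so its read locks are
visible to the deadlock detector. Again illustrative only, with the same
DB 4.x API assumptions as above:)

    #include <db.h>

    /* Illustrative only: a read done through a cursor opened inside a
     * transaction, so the read locks participate in deadlock detection. */
    static int
    read_in_txn( DB_ENV *env, DB *db, DBT *key, DBT *data )
    {
        DB_TXN *txn;
        DBC *cursor;
        int rc;

        rc = env->txn_begin( env, NULL, &txn, 0 );
        if ( rc ) return rc;

        rc = db->cursor( db, txn, &cursor, 0 );
        if ( rc == 0 ) {
            rc = cursor->c_get( cursor, key, data, DB_SET );
            cursor->c_close( cursor );
        }

        if ( rc == 0 )
            return txn->commit( txn, 0 );

        txn->abort( txn );
        return rc;
    }
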
> > -- Howard Chu
> > Chief Architect, Symas Corp. Director, Highland Sun
> > http://www.symas.com http://highlandsun.com/hyc
> > Symas: Premier OpenSource Development and Support
> >
> >-----Original Message-----
> >From: owner-openldap-bugs@OpenLDAP.org
> >[mailto:owner-openldap-bugs@OpenLDAP.org] On Behalf Of
> >adamson@andrew.cmu.edu
> >Sent: Thursday, October 11, 2001 8:04 AM
> >To: openldap-its@OpenLDAP.org
> >Subject: Re: index corruption (1164) still present in 2.0.15 with db
> >3.1.17 (ITS#1359)
> >
> >
> >> I think you've hit on the problem. This likely affects
> >> DN parent/subtree indices as well. We can either try to
> >> add some fine grained r/w locks or one giant r/w lock.
> >> I suggest the latter and that it initially be quite
> >> giant. That is, every "database" have one giant lock
> >> which is acquired for read for LDAP read operations and
> >> acquired for write for LDAP write operations. This
> >> will serialize all writes within a given "database".
> >
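(In back-ldbm terms, that giant lock would presumably be something like the
read/write lock wrappers from libldap_r, taken for read around search/compare
and for write around add/modify/delete. Sketch only - the names and placement
below are illustrative, not actual code:)

    #include "ldap_pvt_thread.h"

    static ldap_pvt_thread_rdwr_t giant_lock;   /* one per "database" */

    static void db_open_hook( void )
    {
        ldap_pvt_thread_rdwr_init( &giant_lock );
    }

    static void ldap_read_op( void )
    {
        ldap_pvt_thread_rdwr_rlock( &giant_lock );
        /* ... search/compare against the database ... */
        ldap_pvt_thread_rdwr_runlock( &giant_lock );
    }

    static void ldap_write_op( void )
    {
        ldap_pvt_thread_rdwr_wlock( &giant_lock );
        /* ... add/modify/delete; all writes serialize here ... */
        ldap_pvt_thread_rdwr_wunlock( &giant_lock );
    }
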
> >I have a program that spawns a lot of threads and does a lot of
> >add & delete operations to several attributes in many entries. (My days at
> >Transarc IBM testing AFS are coming back to me.) The idea is to have many
> >threads in the slapd server accessing the same keys in the same index
> >files at the same time. It does a pretty efficient job of causing and
> >detecting index corruption.
> >
> >I did some experimental work on locking by adding a mutex to each dbcache
> >structure in back-ldbm. It is only invoked in key_change() in key.c,
> >around the two calls to change an index item:
> >
> > ldap_pvt_thread_mutex_lock( &db->dbc_write_mutex );
> > if (op == SLAP_INDEX_ADD_OP) {
> > /* Add values */
> > rc = idl_insert_key( be, db, key, id );
> >
> > } else {
> > /* Delete values */
> > rc = idl_delete_key( be, db, key, id );
> > }
> > ldap_pvt_thread_mutex_unlock( &db->dbc_write_mutex );
> >
> >If I place the lock higher in the stack, such as in indexer(), it also
> >works, but that means the lock is held while CPU-intensive indexing
> >functions for matching rules are being called, which is inefficient.
> >The lock cannot be placed any lower in the stack, for greater granularity,
> >since idl_insert_key and idl_delete_key are the ones doing the
> >modifications. A greater level of granularity could be achieved by
> >locking on a dbcache-key basis, but this would require a lot more
> >complexity (lists of keys that are being modified).
> >
> >I initialize the mutex when the dbcache is being initialized (e.g. the
> >cache name is being copied in) in ldbm_cache_open() and I destroy the
> >mutex in ldbm_cache_destroy().
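
(For concreteness, the init/teardown amounts to a couple of calls on the new
field - dbc_write_mutex as named above; placement is as described, not
verbatim code:)

    /* in ldbm_cache_open(), where the dbcache slot is being set up */
    ldap_pvt_thread_mutex_init( &dbc->dbc_write_mutex );

    /* in ldbm_cache_destroy() */
    ldap_pvt_thread_mutex_destroy( &dbc->dbc_write_mutex );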
> >
> >
> >My particular test program is no longer able to corrupt the index files
> >with this "medium grain" locking.
> >
> >
> >-Mark Adamson
> > Carnegie Mellon
>