
Threads and race conditions in slapd



I'm trying to track down a strange, intermittent bug with the CVS head (and 
beta1). The symptom is that slapd crashes after logging a number of 
"bind: ber_scanf failed" errors. 

Further investigation suggests that connection_operation is being 
called multiple times for the same operation. Two things hint at this: 
the c->ops queue is empty (it's the attempt to remove the 
operation from the queue that causes the crash), and c_n_ops_executing 
is negative. The problem only seems to arise when the server is 
under reasonably heavy load. We're running on a Linux SMP system, if that 
makes any difference.
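
To make the suspected failure concrete, here is a minimal sketch of how finishing the same operation twice would produce both symptoms, along with a guard that turns the crash into a detectable error. The types and function names (struct conn, conn_start, conn_finish) are hypothetical stand-ins that only mirror the field names above; this is not the slapd code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the fields only mirror the names in the report. */
struct op { struct op *next; };

struct conn {
    struct op *ops;        /* pending operations, like c->ops */
    int n_ops_executing;   /* like c_n_ops_executing */
};

/* Queue an operation on the connection and mark it as executing. */
void conn_start(struct conn *c, struct op *o) {
    o->next = c->ops;
    c->ops = o;
    c->n_ops_executing++;
}

/*
 * Finish an operation. Returns 0 on success, or -1 if the operation is
 * not in the queue (for example because it was already finished once).
 * Without the search, a second call would unlink from an empty queue
 * and push the counter below zero.
 */
int conn_finish(struct conn *c, struct op *o) {
    struct op **pp;
    for (pp = &c->ops; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == o) {
            *pp = o->next;
            o->next = NULL;
            assert(c->n_ops_executing > 0);
            c->n_ops_executing--;
            return 0;
        }
    }
    return -1;  /* not queued: likely a double execution */
}
```

Logging the -1 case under load would show whether an operation really is being finished twice. In a threaded server both functions would also need to run with the connection's mutex held.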

I've found one possible problem with the thread pool (additions to the free 
list weren't protected by a mutex; see ITS#1839), but I'm unsure how this could 
cause the problem I'm seeing, and even less sure where else to 
look. Has anyone else seen this, or have any thoughts on the best line of 
attack?
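
For reference, this is roughly what I mean by protecting the free list: a minimal sketch assuming POSIX threads, with hypothetical names (struct pool, pool_release, pool_acquire). It is not the actual thread pool code or the ITS#1839 patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical types and names; not the slapd thread pool. */
struct ctx { struct ctx *next; };

struct pool {
    pthread_mutex_t mutex;   /* guards free_list and free_count */
    struct ctx *free_list;
    int free_count;
};

void pool_init(struct pool *p) {
    pthread_mutex_init(&p->mutex, NULL);
    p->free_list = NULL;
    p->free_count = 0;
}

/* Return a context to the free list. The push happens under the mutex,
 * so two threads releasing at once cannot lose or duplicate an entry. */
void pool_release(struct pool *p, struct ctx *c) {
    pthread_mutex_lock(&p->mutex);
    c->next = p->free_list;
    p->free_list = c;
    p->free_count++;
    pthread_mutex_unlock(&p->mutex);
}

/* Take a context from the free list, or return NULL if it is empty. */
struct ctx *pool_acquire(struct pool *p) {
    struct ctx *c;
    pthread_mutex_lock(&p->mutex);
    c = p->free_list;
    if (c != NULL) {
        p->free_list = c->next;
        c->next = NULL;
        p->free_count--;
        assert(p->free_count >= 0);
    }
    pthread_mutex_unlock(&p->mutex);
    return c;
}
```

If the push were unlocked, two racing releases could leave the list head pointing at an entry that a third thread has already taken, so the same context could be handed out twice. Whether that is what happens here is exactly what I'm unsure about.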

Cheers,

Simon.
-- 
Simon Wilkinson            <simon@sxw.org.uk>          http://www.sxw.org.uk
"I assure you the thought never even crossed my mind, lord."
"Indeed?  Then if I were you I'd sue my face for slander."
 - Terry Pratchett, "The Colour of Magic"