Threads and race conditions in slapd
I'm trying to track down a strange, intermittent bug in the CVS head (and
beta1). The symptom is that slapd crashes after logging a number of
"bind: ber_scanf failed" errors.
Further investigation suggests that connection_operation is being
called multiple times for the same operation. Two things hint at this:
the c->ops queue is empty (it's the attempt to remove the
operation from the queue that causes the crash), and c_n_ops_executing
is negative. The problem only seems to arise when the server is
under reasonably heavy load. We're running on a Linux SMP system, if that
makes any difference.
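One way to confirm the double-dispatch theory would be a claim flag on each operation that trips the second time the operation is picked up. The following is only a sketch of that idea, not slapd code: demo_op, demo_op_init and demo_op_claim are hypothetical names, and it assumes a compiler with C11 atomics.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for an operation; not slapd's real structure. */
typedef struct {
    int id;
    atomic_flag claimed;   /* set once a worker has picked the op up */
} demo_op;

static void demo_op_init(demo_op *op, int id) {
    op->id = id;
    atomic_flag_clear(&op->claimed);
}

/* Returns 1 for the first caller, 0 for any later caller.
 * The test-and-set is atomic, so two threads racing on the same
 * operation cannot both see 1. */
static int demo_op_claim(demo_op *op) {
    if (atomic_flag_test_and_set(&op->claimed)) {
        fprintf(stderr, "operation %d dispatched twice\n", op->id);
        return 0;
    }
    return 1;
}
```

A worker would call demo_op_claim at the top of its handler and log (or abort) on a 0 return, which should catch the second dispatch before the queue removal crashes.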
I've found one possible problem with the thread pool (additions to the free
list weren't protected by a mutex - ITS#1839) but am unsure how this could
cause the problem I'm seeing, and I'm even more unsure about where else to
look. Has anyone else seen this, or have any thoughts on the best line of
attack?
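To illustrate the kind of problem I mean with the free list, here is a minimal sketch of a list whose additions and removals are both done under a mutex. It is not the thread pool code or the ITS#1839 patch: free_node, free_list and the free_list_* functions are made-up names, and it assumes POSIX threads.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical free-list node; not the real thread pool type. */
typedef struct free_node {
    struct free_node *next;
} free_node;

typedef struct {
    pthread_mutex_t lock;  /* guards head and count */
    free_node *head;
    int count;
} free_list;

static void free_list_init(free_list *fl) {
    pthread_mutex_init(&fl->lock, NULL);
    fl->head = NULL;
    fl->count = 0;
}

/* Add a node. Without the lock, two threads pushing at once can both
 * read the same head, and one of the two nodes is then lost. */
static void free_list_push(free_list *fl, free_node *n) {
    pthread_mutex_lock(&fl->lock);
    n->next = fl->head;
    fl->head = n;
    fl->count++;
    pthread_mutex_unlock(&fl->lock);
}

/* Remove and return the most recently added node, or NULL if empty. */
static free_node *free_list_pop(free_list *fl) {
    pthread_mutex_lock(&fl->lock);
    free_node *n = fl->head;
    if (n != NULL) {
        fl->head = n->next;
        fl->count--;
    }
    pthread_mutex_unlock(&fl->lock);
    return n;
}
```

An unprotected push like this could lose a node or corrupt the list on SMP, but that still doesn't obviously explain one operation being run twice.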
Cheers,
Simon.
--
Simon Wilkinson <simon@sxw.org.uk> http://www.sxw.org.uk
"I assure you the thought never even crossed my mind, lord."
"Indeed? Then if I were you I'd sue my face for slander."
- Terry Pratchett, "The Colour of Magic"