[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
Re: (ITS#5442) slapd_rq not locked before use bugfix
Rein Tollevik wrote:
> On Sun, 30 Mar 2008, hyc@symas.com wrote:
>
>> rein@tollevik.no wrote:
>>> On Sat, 29 Mar 2008, ando@sys-net.it wrote:
>>>> rein@basefarm.no wrote:
>
>>>>> I was seeing random failures of the test050-syncrepl-multimaster test. One of
>>>>> the failures was that it went into a tight loop traversing a circular runqueue
>>>>> it had managed to create in slapd_rq.task_list. It seems as this was caused by
>>>>> missing mutex locks around accesses to slapd_rq, which the patch uploaded to
>>>>> ftp://ftp.openldap.org/incoming/slapd_rq_lock.patch fixes.
>>>>>
>>>>> Before I applied this patch the test failed after being run a few times, with it
>>>>> it has now passed 100 times and is still counting.
>>>> locks in back-bdb/config.c should be pointless, as modifications to the
>>>> configuration should only occur while all threads are paused. The rest
>>>> makes sort of sense, but I'd leave it to Howard.
>> Ignoring the ITS#5403 changes, I don't see anything here that isn't
>> config-related, therefore it's all running single-threaded.
>
> Now that the configuration can be changed dynamically (as this test does)
> I find it a bit odd that the config stuff should always be running
> single-threaded. But there is obviously much I don't know about the
> internals of slapd.
This was discussed at the 2003 OpenLDAP Developers' Day.
http://www.openldap.org/conf/odd-wien-2003/proceedings.html
Page 7 of my slides touched on it.
It's also mentioned again here
http://www.openldap.org/lists/openldap-devel/200505/msg00062.html
The main point is that the original config stuff assumed that it was only
executing during slapd startup, and single-threaded. Putting locks around all
of the potential configurable value accesses would have been too much work, so
the decision was made to force slapd to be single-threaded whenever writing to
the config.
I'm guessing that what's happening here is just a broken assumption -
suspending the thread pool doesn't in fact freeze everything in slapd. In
particular, the slapd listener thread may still wake up for a select() timeout
to handle the slapd runqueue. Even though no tasks submitted as a result of
this will execute (because they'll just get queued into the suspended thread
pool) the runqueue itself is still being manipulated.
Given that explanation, your patches make sense.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/