
Re: (ITS#4938) hdb_db_close SEGVs



----- richton@nbcs.rutgers.edu wrote:
> I don't have #5 (sleepycat#14657) nor the unofficial
> http://www.stanford.edu/services/directory/openldap/configuration/patches/db/4252-region-fix.diff
> patch. As for the official one, I'm not sure about its relevance to the
> actual SEGV due to the "recovery...fail" comment. In other words, though
> it may be impacting the ability of alock/db_recover to do its thing,
> that's just a side effect of the unclean shutdown which is the real bug
> here to my view.




Patch #5 specifically deals with a race condition in which a checkpoint occurs while a cache buffer retrieval is also in progress, corrupting the database in a way that cannot later be recovered from.  At least, that's how I read Sleepycat's description:

5. Fix a bug where cache buffer retrieval could race with a checkpoint call, potentially causing database environment recovery to fail. [#14657]

Given that OpenLDAP checkpoints on shutdown, shutting down the server could be what is triggering the issue for you.  I'd suggest applying the patch and seeing if this resolves your problem.

> The region size patch is interesting, but I will tell you that the 
> database in question has
> 
> set_cachesize   0 200000000 0
> 
> and it (to a glance) looks like that only impacts the gig column, which I
> have as zero anyway.

Yeah, the patch may not be relevant for you (I have a 3.5GB cache, so it is for me).  It wouldn't harm anything to apply it, of course, in case you decide later that you need a larger BDB cache. ;)
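
For anyone else reading along who wants to check the same thing, here is a rough Python sketch of how I read a set_cachesize line from DB_CONFIG: gigabytes, then additional bytes, then the number of cache regions. The helper names are made up for illustration, the 3.5GB line is only a guess at how such a cache might be written (not a copy of my actual config), and I am assuming BDB counts a gigabyte as 2^30 bytes.

```python
GIGABYTE = 1024 ** 3  # assumption: BDB treats one gigabyte as 2^30 bytes


def parse_cachesize(line):
    """Split a DB_CONFIG 'set_cachesize <gbytes> <bytes> <ncache>' line
    into three integers. Hypothetical helper, not part of BDB or OpenLDAP."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "set_cachesize":
        raise ValueError("not a set_cachesize line: %r" % line)
    gbytes, nbytes, ncache = (int(f) for f in fields[1:])
    return gbytes, nbytes, ncache


def total_cache_bytes(gbytes, nbytes):
    """Total requested cache size: the gigabyte column plus the byte column."""
    return gbytes * GIGABYTE + nbytes


def uses_gig_column(line):
    """True when the gigabyte column is non-zero."""
    gbytes, _, _ = parse_cachesize(line)
    return gbytes > 0


# The setting quoted above: zero gigabytes, 200000000 bytes.
quoted = "set_cachesize   0 200000000 0"
# Illustrative only: one way a 3.5GB cache could be expressed.
larger = "set_cachesize 3 536870912 1"

for setting in (quoted, larger):
    g, b, n = parse_cachesize(setting)
    print(setting, "->", total_cache_bytes(g, b), "bytes,",
          "gig column in use" if uses_gig_column(setting) else "gig column is zero")
```

With the quoted setting the gig column comes out as zero, which matches your reading that the region patch probably doesn't matter for that database.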

--Quanah

-- 
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra ::  the leader in open source messaging and collaboration