Re: LMDB: issue with mdb_cursor_del
On Mon, 2017-10-16 at 13:58 +0200, Hallvard Breien Furuseth wrote:
> On 16. okt. 2017 12:51, Howard Chu wrote:
> > timur.kristof@gmail.com wrote:
> > > I have an app that uses LMDB, and I've experienced an interesting
> > > issue: when trying to delete a certain item with mdb_cursor_del, it
> > > crashed with the following backtrace: https://pastebin.com/7p9wtkj9
>
> Weird backtrace. It says mdb_page_dirty(), which is small, stretches
> over 300+ lines (frames #3-#4). And mdb_page_alloc() alone has no
> hex address as a prefix. Maybe miscompilation, two liblmdb libraries
> linked into the same executable, or something like that? Or some
> wild pointer write or whatever messed things up.
Not sure what was going on there, maybe -O3 messed it up. Still, the
issue does appear with -O0 too and here is a backtrace with -O0:
https://pastebin.com/SfeMMEPH
> > Most likely the dirty list is too big, which means you're trying to
> > do too much in a single transaction.
>
> Shouldn't happen though. The txn should have failed earlier with
> MDB_TXN_FULL.
>
> Which also shouldn't happen since LMDB should have spilled enough
> pages to make room - unless you have hundreds of cursors at modified
> pages so LMDB can't spill enough.
>
> But we should probably test LMDB with impractically tight dirty-list
> arrays (i.e. a very small MDB_IDL_UM_MAX), so LMDB keeps running into
> such cases.
I've taken a look at the value of rc (see my reply to Howard), and it
seems to me that Леонид Юрьев's assessment may be correct here. rc is
-1 which indicates that the page (even though newly allocated, maybe a
reused page?) is already on the txn's dirty pages list.
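
To illustrate what an rc of -1 would mean here, below is a minimal
sketch of a sorted, fixed-capacity dirty list whose insert refuses
duplicates. It is not LMDB's code: the names pgno_t, DIRTY_MAX,
struct dirty_list, dirty_search and dirty_insert are hypothetical, and
the -2 "list full" return code is only this sketch's convention. The
capacity is kept deliberately tiny, in the spirit of testing with a
very small MDB_IDL_UM_MAX.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not LMDB's types: a page number and a
 * fixed-capacity list kept sorted by page number. */
typedef size_t pgno_t;

#define DIRTY_MAX 4   /* deliberately tiny so the "full" case is easy to hit */

struct dirty_list {
    size_t count;
    pgno_t pages[DIRTY_MAX];
};

/* Binary search: index of the first entry >= pgno (count if none). */
size_t dirty_search(const struct dirty_list *dl, pgno_t pgno)
{
    size_t lo = 0, hi = dl->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (dl->pages[mid] < pgno)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Insert pgno, keeping the list sorted.
 * Returns 0 on success, -1 if pgno is already on the list,
 * -2 if the list is full (codes are this sketch's convention). */
int dirty_insert(struct dirty_list *dl, pgno_t pgno)
{
    size_t x, i;

    assert(dl != NULL);
    x = dirty_search(dl, pgno);

    if (x < dl->count && dl->pages[x] == pgno)
        return -1;              /* duplicate: page is already dirty */
    if (dl->count >= DIRTY_MAX)
        return -2;              /* no room left */

    /* Shift the tail up by one slot and place the new entry. */
    for (i = dl->count; i > x; i--)
        dl->pages[i] = dl->pages[i - 1];
    dl->pages[x] = pgno;
    dl->count++;
    return 0;
}
```

With a list like this, inserting the same page number twice in one
transaction yields -1 on the second attempt, which is the situation a
reused page would produce if it had already been marked dirty.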
- Timur