Re: ext3/ext4 fsync hack
- To: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>, openldap-commit2devel@OpenLDAP.org
- Subject: Re: ext3/ext4 fsync hack
- From: Howard Chu <hyc@symas.com>
- Date: Tue, 06 Jan 2015 14:18:39 +0000
- In-reply-to: <549AABEB.5020707@usit.uio.no>
- References: <E1Y1StJ-0006XK-8o@euler.openldap.org> <549AABEB.5020707@usit.uio.no>
- User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0 SeaMonkey/2.34a1
Hallvard Breien Furuseth wrote:
> On 18/12/14 05:40, openldap-commit2devel@OpenLDAP.org wrote:
>> commit 0018eeb2c3b2239c30def9d47c9d194a4ebf35fe
>> Author: Howard Chu <hyc@openldap.org>
>> Date: Thu Dec 18 04:38:53 2014 +0000
>>
>> Hack for potential ext3/ext4 corruption issue
>>
>> Use regular fsync() if we think this commit grew the DB file.
>
> This does not catch all cases:
> If the new pages below mt_next_pgno were freed instead of
> written, me_size becomes too big.
Huh? mt_next_pgno definitively tells how many pages have ever been used
in the DB file.
> Later when the file does
> grow, me_size may be >= actual filesize so it fdatasync()s.
> Similar to b09e46904c1c059bd5086243e3915b6be510e57d
> "ITS#7886 fix mdb_copy write size".
> We can fix me_size, grow the file anyway (ftruncate), or
> give the pages back to mt_next_pgno in mdb_freelist_save().
>
> Another issue: After an MDB_NOSYNC commit, mdb_env_sync()
> only fdatasync()s. It does not know when the file grew.
I suppose we can change the FORCE flag to also cause fsync() to be used.
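Roughly, the decision would look like the sketch below. This is only an illustration, not the actual LMDB code: pick_sync() and do_sync() are made-up names, and it assumes the caller remembers the file size as of the last full fsync().

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration only; none of this is LMDB API. */
enum sync_kind { SYNC_DATA_ONLY, SYNC_FULL };

/*
 * Decide which flush to issue.
 *   synced_size: file size as of the last full fsync() (assumed to be
 *                tracked by the caller)
 *   new_size:    file size implied by the pages written since then
 *   force:       nonzero when the caller asked for a forced sync
 * A grown file, or a forced sync, gets the full fsync(); everything
 * else gets the cheaper fdatasync().
 */
static enum sync_kind pick_sync(off_t synced_size, off_t new_size, int force)
{
    if (force || new_size > synced_size)
        return SYNC_FULL;
    return SYNC_DATA_ONLY;
}

/* Issue the chosen flush; returns 0 on success, -1 with errno set. */
static int do_sync(int fd, enum sync_kind kind)
{
    return kind == SYNC_FULL ? fsync(fd) : fdatasync(fd);
}
```

A caller would do something like do_sync(fd, pick_sync(synced_size, new_size, force)) and update synced_size only after a successful full fsync(). Treating "force" as "always fsync()" is the simple option; it costs an extra metadata flush when the file did not actually grow.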
> The planned "group commits" may get the same problem if
> the user checkpoints with mdb_env_sync().
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/