Re: Backup and bdb-logfile removal
On Fri, 11 Apr 2008, Peter Mogensen wrote:
Howard Chu wrote:
...
No, "db_recover -c" is for recovering from a catastrophic failure. It's not
for creating a backup.
? When reading the docs it seems to me like db_recover -c is an integral part
of making a hot backup??
"db_recover -c" says "perform recovery using all of the txn log files that
are present instead of only going back to the point named in the last
checkpoint". When making a hot backup, you need to do that in case a
checkpoint was taken between when you started the copy of the first
database file and when the copy of the last txn log completed. That
"catastrophic recovery" only needs to be performed on the txn log files
that were copied as part of the hot backup and not txn log files that were
archivable before the first database file was copied.
In theory, it would be possible to perform full catastrophic recovery of a
database from *just* the txn log files starting at log.0000000001 and _no_
database files...but that will probably take more time than you really are
interested in spending. The whole point of backing up the database files
is to make it unnecessary to save and process the txn log files whose
contents have been completely checkpointed to the database files.
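
To make the point about which txn log files a hot backup needs more concrete, here is a minimal sketch in Python. It assumes you have already captured two lists of names: every log file (what "db_archive -l" reports) and the archivable ones (what plain "db_archive" reported before the first database file was copied). The function name and the sample file names are hypothetical, chosen only for illustration.

```python
def logs_needed_for_backup(all_logs, archivable_logs):
    """Return the txn log files a hot backup must include, oldest first.

    all_logs:        names reported by "db_archive -l"
    archivable_logs: names reported by plain "db_archive", taken before
                     the first database file was copied
    """
    archivable = set(archivable_logs)
    # Zero-padded log names sort lexically in the same order as numerically.
    return sorted(name for name in all_logs if name not in archivable)


# Hypothetical file names, for illustration only.
all_logs = ["log.0000000007", "log.0000000008", "log.0000000009"]
archivable = ["log.0000000007"]
print(logs_needed_for_backup(all_logs, archivable))
```

Only the logs returned here have to be present when "db_recover -c" is run on the copy; the archivable ones are already reflected in the database files.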
...
What happens if your environment should crash after you have discarded
these log files, but before you begin your hot backup?
Their contents have been checkpointed to the database files, so normal
recovery is sufficient.
...
To perform a backup and prune unused logfiles from your active environment:
============= WARNING: Only my guess
1) Run "db_archive" on your active environment to identify unused log files.
Copy them somewhere to keep while doing the backup.
These files are not needed in the backup itself. Indeed, they're only
needed if any of the database files are lost or corrupted without also
losing the txn log files. In my experience, the situations where these
files are useful are better handled by recovering from a replica instead
of trying to perform database level recovery.
(I once helped a site where a backplane failure managed to make fsync()
lie such that a checkpoint completed without the data actually making it
to disk for the database files. The txn log files were fine, so
performing catastrophic recovery with the not-yet-archived txn logs was
sufficient to fix the problem, but that's the *only* time, in 7 years of
intensive commercial BDB usage, where I've seen a use for archivable txn
log files.)
2) Run "db_archive -s" to identify database files and copy them to your
backup location.
3) Run "db_archive -l" on your active environment to identify all log files
and copy them to your backup location.
Do be sure to follow the BDB documentation regarding copying of the files.
In particular, use dd instead of cp on Solaris (or write your own program
that uses read() and not mmap()).
4) Run "db_recover -c" on your backup to make it consistent.
5) Since the backup is offline you can safely delete the unused log files
from it. ("db_archive -d")
6) The log files copied in step 1) can now safely be discarded so they don't
exist anywhere - including the active environment.
Then it's my impression that in case the active environment should crash you
should be able to continue from the backup + the logfiles from the active
environment with minimal data loss ???
============== End guess =========
Other than my comments above, this procedure looks good to me. *Do* be
sure to test it, both by forcing failures of various parts of it (out of
disk space during a copy?) and by actually making sure you have a tested
procedure for restoring a backup.
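
For reference, the command order from steps 2 through 5 could be sketched roughly as below. This is only an outline under stated assumptions: it is written in Python, it builds and prints the command lines rather than running them, the function name is made up, and both directory paths are hypothetical placeholders. The copying itself is left out, since how to copy safely depends on the platform (see the note about dd above).

```python
import shlex


def hot_backup_plan(env_home, backup_dir):
    """Return the ordered steps as (description, argv) pairs.

    env_home is the active BDB environment; backup_dir is the copy.
    The files named by the first two commands must be copied to
    backup_dir in this order: database files first, then txn logs.
    """
    return [
        ("list database files to copy", ["db_archive", "-s", "-h", env_home]),
        ("list all txn log files to copy", ["db_archive", "-l", "-h", env_home]),
        ("catastrophic recovery on the copy", ["db_recover", "-c", "-h", backup_dir]),
        ("remove logs the copy no longer needs", ["db_archive", "-d", "-h", backup_dir]),
    ]


# Hypothetical paths, for illustration only.
for description, argv in hot_backup_plan("/var/lib/ldap", "/backup/ldap"):
    print(f"{description}: {shlex.join(argv)}")
```

Note that the last two commands point at the backup copy, never at the active environment.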
Philip Guenther