path: root/lib/tdb
Age | Commit message | Author | Files | Lines
2010-02-24 | tdb: handle processes dying during transaction commit. | Rusty Russell | 3 | -0/+86
tdb transactions were designed to be robust against the machine powering off, but interestingly were never designed to handle the case where an administrator kill -9's a process during commit. Because recovery is only done on tdb_open, processes with the tdb already mapped will simply use it despite it being corrupt and needing recovery.
The solution to this is to check for recovery every time we grab a data lock: we could have gained the lock because a process just died. This has no measurable cost: here is the time for tdbtorture -s 0 -n 1 -l 10000:
Before: 2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75
After: 2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | patch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patch | Rusty Russell | 1 | -16/+13
2010-02-24 | tdb: add -k option to tdbtorture | Rusty Russell | 1 | -57/+142
To test the case of death of a process during transaction commit, add a -k (kill random) option to tdbtorture. The easiest way to do this is to make every worker a child (unless there's only one child), which is why this patch is bigger than you might expect. Using -k without -t (always transactions) you expect corruption, though it doesn't happen every time. With -t, we currently get corruption but the next patch fixes that. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: don't truncate tdb on recovery | Rusty Russell | 1 | -10/+0
The current recovery code truncates the tdb file on recovery. This is fine if recovery is only done on first open, but is a really bad idea as we move to allowing recovery on "live" databases. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: remove lock ops | Rusty Russell | 4 | -40/+22
Now that the transaction code uses the standard allrecord lock, which stops us from trying to grab any per-record locks anyway, we don't need special no-op lock ops for transactions. This is a nice simplification: if you see brlock, you know it's really going to grab a lock. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks() | Rusty Russell | 3 | -13/+9
tdb_release_extra_locks() is too general: it carefully skips over the transaction lock, even though the only caller then drops it. Change this, and rename it to show it's clearly transaction-specific. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: cleanup: remove ltype argument from _tdb_transaction_cancel. | Rusty Russell | 1 | -17/+13
Now the transaction allrecord lock is the standard one, and thus is cleaned in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to know what type it is. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade | Rusty Russell | 3 | -29/+62
Centralize locking of all chains of the tdb; rename _tdb_lockall to tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and tdb_brlock_upgrade to tdb_allrecord_upgrade. Then we use this in the transaction code. Unfortunately, if the transaction code records that it has grabbed the allrecord lock read-only, write locks will fail, so we treat this upgradable lock as a write lock, and mark it as upgradable using the otherwise-unused offset field. One subtlety: now the transaction code is using the allrecord_lock, the tdb_release_extra_locks() function drops it for us, so we no longer need to do it manually in _tdb_transaction_cancel. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: suppress record write locks when allrecord lock is taken. | Rusty Russell | 1 | -0/+9
Records themselves get (read) locked by the traversal code against delete. Interestingly, this locking isn't done when the allrecord lock has been taken, though the allrecord lock until recently didn't cover the actual records (it now goes to end of file). The write record lock, grabbed by the delete code, is not suppressed by the allrecord lock. This is now bad: it causes us to punch a hole in the allrecord lock when we release the write record lock. Make this consistent: *no* record locks of any kind when the allrecord lock is taken. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: cleanup: always grab allrecord lock to infinity. | Rusty Russell | 1 | -7/+3
We were previously inconsistent with our "global" lock: the transaction code grabbed it from FREELIST_TOP to end of file, and the rest of the code grabbed it from FREELIST_TOP to end of the hash chains. Change it to always grab to end of file for simplicity and so we can merge the two. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: remove num_locks | Rusty Russell | 2 | -11/+2
This was redundant before this patch series: it mirrored num_lockrecs exactly. It still does. Also, skip useless branch when locks == 1: unconditional assignment is cheaper anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: use tdb_nest_lock() for seqnum lock. | Rusty Russell | 1 | -3/+3
This is pure overhead, but it centralizes the locking. Realloc (esp. as most implementations are lazy) is fast compared to the fcntl anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: use tdb_nest_lock() for active lock. | Rusty Russell | 2 | -5/+18
Use our newly-generic nested lock tracking for the active lock. Note that the tdb_have_extra_locks() and tdb_release_extra_locks() functions have to skip over this lock now it is tracked. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-22 | tdb: use tdb_nest_lock() for open lock. | Rusty Russell | 3 | -15/+10
This never nests, so it's overkill, but it centralizes the locking into lock.c and removes the ugly flag in the transaction code to track whether we have the lock or not. Note that we have a temporary hack so this places a real lock, despite the fact that we are in a transaction. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: use tdb_nest_lock() for transaction lock. | Rusty Russell | 2 | -32/+23
Rather than a boutique lock and a separate nest count, use our newly-generic nested lock tracking for the transaction lock. Note that the tdb_have_extra_locks() and tdb_release_extra_locks() functions have to skip over this lock now it is tracked. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: cleanup: find_nestlock() helper. | Rusty Russell | 1 | -28/+23
Factor out two loops which find locks; we are going to introduce a couple more so a helper makes sense. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: cleanup: tdb_release_extra_locks() helper | Rusty Russell | 3 | -17/+22
Move locking intelligence back into lock.c, rather than open-coding the lock release in transaction.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: cleanup: tdb_have_extra_locks() helper | Rusty Russell | 4 | -5/+17
In many places we check whether locks are held: add a helper to do this. The _tdb_lockall() case has already checked for the allrecord lock, so the extra work done by tdb_have_extra_locks() is merely redundant. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: don't suppress the transaction lock because of the allrecord lock. | Rusty Russell | 1 | -6/+0
tdb_transaction_lock() and tdb_transaction_unlock() do nothing if we hold the allrecord lock. However, the two locks don't overlap, so this is wrong. This simplification makes the transaction lock a straightforward nested lock. There are two callers for these functions:
1) The transaction code, which already makes sure the allrecord_lock isn't held.
2) The traverse code, which wants to stop transactions whether it has the allrecord lock or not.
There have been deadlocks here before; however, this should not bring them back (I hope!) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: cleanup: tdb_nest_lock/tdb_nest_unlock | Rusty Russell | 3 | -45/+67
Because fcntl locks don't nest, we track them in the tdb->lockrecs array and only place/release them when the count goes to 1/0. We only do this for record locks, so we simply place the list number (or -1 for the free list) in the structure. To generalize this:
1) Put the offset rather than list number in struct tdb_lock_type.
2) Rename _tdb_lock() to tdb_nest_lock(), make it non-static and move the allrecord check out to the callers (except the mark case which doesn't care).
3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and move the allrecord check out to the callers (except mark again).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
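The counting scheme described in the message above can be sketched roughly as follows. This is a simplified illustration, not tdb's actual code: `struct nest_table`, `nest_lock()`, `nest_unlock()`, `MAX_NEST_LOCKS` and the `kernel_calls` counter are hypothetical names, and the real fcntl() call is replaced by a counter so the behavior is easy to observe.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-ins; these are NOT tdb's real structures. */
#define MAX_NEST_LOCKS 8

struct nest_lock {
    off_t offset;   /* byte offset the lock covers */
    int count;      /* how many times the lock has been taken */
};

struct nest_table {
    struct nest_lock recs[MAX_NEST_LOCKS];
    size_t num;         /* number of entries in use */
    int kernel_calls;   /* stands in for real fcntl() lock/unlock calls */
};

/* Find the tracking entry for an offset, or NULL if it is not held. */
static struct nest_lock *find_nest(struct nest_table *t, off_t offset)
{
    for (size_t i = 0; i < t->num; i++)
        if (t->recs[i].offset == offset)
            return &t->recs[i];
    return NULL;
}

/* Take a lock; only the 0 -> 1 transition would reach the kernel. */
int nest_lock(struct nest_table *t, off_t offset)
{
    struct nest_lock *l = find_nest(t, offset);

    if (l) {
        l->count++;             /* already held: just bump the count */
        return 0;
    }
    if (t->num == MAX_NEST_LOCKS)
        return -1;              /* table full */
    t->kernel_calls++;          /* a real version would call fcntl() here */
    t->recs[t->num].offset = offset;
    t->recs[t->num].count = 1;
    t->num++;
    return 0;
}

/* Release a lock; only the 1 -> 0 transition would reach the kernel. */
int nest_unlock(struct nest_table *t, off_t offset)
{
    struct nest_lock *l = find_nest(t, offset);

    if (!l)
        return -1;              /* not held */
    assert(l->count > 0);
    if (--l->count > 0)
        return 0;               /* still nested: keep the kernel lock */
    t->kernel_calls++;          /* a real version would unlock via fcntl() */
    *l = t->recs[--t->num];     /* move the last entry into the freed slot */
    return 0;
}
```

Taking the same offset twice and releasing it twice touches the (simulated) kernel only on the first lock and the last unlock, which is the point of tracking by offset rather than by list number.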
2010-02-17 | tdb: cleanup: rename global_lock to allrecord_lock. | Rusty Russell | 5 | -29/+29
The word global is overloaded in tdb. The global_lock inside struct tdb_context is used to indicate we hold a lock across all the chains. Rename it to allrecord_lock. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK. | Rusty Russell | 3 | -17/+17
The word global is overloaded in tdb. The GLOBAL_LOCK offset is used at open time to serialize initialization (and by the transaction code to block open). Rename it to OPEN_LOCK. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 | tdb: make _tdb_transaction_cancel static. | Rusty Russell | 2 | -2/+1
Now that tdb_open() calls tdb_transaction_cancel() instead of _tdb_transaction_cancel, we can make it static. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 | tdb: cleanup: split brlock and brunlock methods. | Rusty Russell | 7 | -117/+235
This is taken from the CCAN code base: rather than using tdb_brlock for locking and unlocking, we split it into brlock and brunlock functions. For extra debugging information, brunlock says what kind of lock it is unlocking (even though fcntl locks don't need this). This requires an extra argument to tdb_transaction_unlock() so we know whether the lock was upgraded to a write lock or not. We also use a "flags" argument to tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
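A rough sketch of the split described above, assuming POSIX fcntl() byte-range locks. The names `lock_cmd()`, `br_lock()`, `br_unlock()` and the `LOCK_NOWAIT` value are hypothetical illustrations and do not reproduce tdb's actual signatures or flag values.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical flag value for illustration; not tdb's real constant. */
enum lock_flags {
    LOCK_NOWAIT = 1     /* fail immediately instead of blocking */
};

/* Map the flags argument to the fcntl() command, replacing a separate
 * "lck_type" parameter that carried F_SETLK or F_SETLKW directly. */
int lock_cmd(int flags)
{
    return (flags & LOCK_NOWAIT) ? F_SETLK : F_SETLKW;
}

/* Place a read (F_RDLCK) or write (F_WRLCK) lock on a byte range. */
int br_lock(int fd, short rw_type, off_t offset, off_t len, int flags)
{
    struct flock fl = {0};

    assert(fd >= 0);
    fl.l_type = rw_type;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = len;             /* 0 means "to end of file" */
    return fcntl(fd, lock_cmd(flags), &fl);
}

/* Release a byte range. rw_type is accepted only so a caller can report
 * which kind of lock is being dropped; fcntl() itself does not need it. */
int br_unlock(int fd, short rw_type, off_t offset, off_t len)
{
    struct flock fl = {0};

    (void)rw_type;
    assert(fd >= 0);
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = len;
    return fcntl(fd, F_SETLKW, &fl);
}
```

Keeping lock and unlock as separate entry points means a reader who sees `br_lock()` knows a lock is really being requested, and the non-blocking choice lives in one flag bit instead of a raw fcntl command.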
2010-02-22 | Spelling fixes for tdb. | Brad Hards | 2 | -2/+2
Signed-off-by: Matthias Dieter Wallnöfer <mwallnoefer@yahoo.de>
2010-02-13 | tdb: use fdatasync() instead of fsync() in transactions | Andrew Tridgell | 1 | -1/+1
This might help on some filesystems
2010-02-13 | tdb: Apply some const, just for clarity | Volker Lendecke | 1 | -1/+1
2010-02-10 | tdb: fix recovery reuse after crash | Rusty Russell | 1 | -4/+10
If a process (or the machine) dies just after writing the recovery head (pointing at the end of file), the recovery record will be filled with 0x42. This will not invoke a recovery on open, since rec.magic != TDB_RECOVERY_MAGIC. Unfortunately, the first transaction commit will happily reuse that area: tdb_recovery_allocate() doesn't check the magic. The recovery record has length 0x42424242, and it writes that back into the now-valid-looking transaction header for the next comer (which happens to be tdb_wipe_all in my tests). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-10 | tdb: give a name to the invalid recovery area constant (0) | Rusty Russell | 3 | -4/+5
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-08 | release-scripts: parametrize scripts | Simo Sorce | 2 | -48/+67
This should make it easier to keep all release scripts aligned, as it will reduce the difference between them to ideally a few variables. Also moves the tdb script into the scripts directory.
2010-02-06 | tdb: raise version to 1.2.1 | Simo Sorce | 1 | -1/+1
After recent fixes we need to raise the version to 1.2.1 so that we can also require the right patched version.
2010-02-01 | tdb: fix an early release of the global lock that can cause data corruption | Volker Lendecke | 1 | -5/+10
There was a bug in tdb where the tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1); (ending the transaction-"mutex") was done before the /* remove the recovery marker */ step. This means that when a transaction is committed there is a window where another opener of the file sees the transaction marker while the transaction committer is still fully functional and working on it. This led to the transaction being rolled back by that second opener of the file while transaction_commit() gave no error to the caller. This patch moves the F_UNLCK to after the recovery marker was removed, closing this window.
2010-01-06 | tdb: fix standalone 'make installdocs' | Stefan Metzmacher | 2 | -3/+4
metze
2010-01-06 | tdb: create symbol links to shared libraries see https://bugzilla.samba.org/show_bug.cgi?id=6991 for details | Brian Lu | 1 | -0/+4
Signed-off-by: Stefan Metzmacher <metze@samba.org>
2009-12-21 | tdb: Also build and install tdb manpages from standalone tdb. | Jelmer Vernooij | 7 | -3/+459
2009-12-21 | tdb: Fix formatting of API check file. | Jelmer Vernooij | 1 | -1/+1
2009-12-17 | tdbtool: avoid using c++ reserved words. | Günther Deschner | 1 | -2/+2
Guenther
2009-12-07 | Fix release script with newer versions of git | Simo Sorce | 1 | -1/+1
2009-11-20 | tdb tools: Mostly cosmetic adaptions | Matthias Dieter Wallnöfer | 2 | -8/+9
Signed-off-by: Stefan Metzmacher <metze@samba.org>
2009-11-20 | tdb: change version to 1.2.0 after adding TDB_*ALLOW_NESTING | Stefan Metzmacher | 1 | -1/+1
metze
2009-11-20 | tdb: add TDB_DISALLOW_NESTING and make TDB_ALLOW_NESTING the default behavior | Stefan Metzmacher | 5 | -3/+63
We need to keep TDB_ALLOW_NESTING as default behavior, so that existing code continues to work. However we may change the default together with a major version number change in future. metze
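The nesting behavior described above (nesting allowed by default, rejected when the disallow flag is set) can be sketched as below. This is a toy model under stated assumptions: `struct txn_ctx`, `txn_start()`, `txn_commit()`, `NEST_DISALLOW` and `ERR_NESTING` are hypothetical stand-ins for illustration, not tdb's real types, functions, or constant values.

```c
#include <assert.h>

/* Hypothetical stand-ins for TDB_DISALLOW_NESTING and TDB_ERR_NESTING. */
enum { NEST_DISALLOW = 1 };
enum { ERR_NESTING = -2 };

struct txn_ctx {
    int flags;      /* open-time flags */
    int nesting;    /* current transaction depth, 0 = none active */
};

/* Start a transaction. A nested start succeeds by default and fails
 * with ERR_NESTING only when the disallow flag was requested. */
int txn_start(struct txn_ctx *c)
{
    assert(c->nesting >= 0);
    if (c->nesting > 0 && (c->flags & NEST_DISALLOW))
        return ERR_NESTING;
    c->nesting++;
    return 0;
}

/* Commit a transaction. In this model an inner commit only pops one
 * nesting level; nothing would be written until the outermost commit. */
int txn_commit(struct txn_ctx *c)
{
    if (c->nesting == 0)
        return -1;  /* no transaction in progress */
    c->nesting--;
    return 0;
}
```

The model makes the "implicit semantics" visible: code that calls `txn_commit()` at an inner level has not made anything durable yet, which is why an application may prefer to set the disallow flag and get an explicit error instead.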
2009-11-20 | New attempt at TDB transaction nesting allow/disallow. | Ronnie Sahlberg | 2 | -1/+14
Make the default be that transaction nesting is not allowed and any attempt to create a nested transaction will fail with TDB_ERR_NESTING. If an application can cope with transaction nesting and the implicit semantics of tdb_transaction_commit(), it can enable transaction nesting by using the TDB_ALLOW_NESTING flag. (cherry picked from ctdb commit 3e49e41c21eb8c53084aa8cc7fd3557bdd8eb7b6) Signed-off-by: Stefan Metzmacher <metze@samba.org>
2009-11-20 | tdb: always set tdb->tracefd to -1 to be safe on goto fail | Stefan Metzmacher | 1 | -4/+3
metze
2009-11-08 | tdb: Fix a C++ warning | Volker Lendecke | 1 | -1/+2
2009-10-29 | tdb: update README a bit | Kirill Smelkov | 1 | -8/+1
While studying tdb, I've noticed a couple of mismatches between the README and the actual code:
- tdb_open_ex changed its log_fn argument to log_ctx
- there is now no tdb_update(), which it seems was transformed into non-exported tdb_update_hash()
There were other mismatches, but I don't remember them now, sorry. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-29 | tdb: add tests for double .close() in pytdb | Kirill Smelkov | 1 | -0/+9
The reason I do it is that when using older python-tdb as shipped in Debian Lenny, python interpreter crashes on this test:
(gdb) bt
#0 0xb7f8c424 in __kernel_vsyscall ()
#1 0xb7df5640 in raise () from /lib/i686/cmov/libc.so.6
#2 0xb7df7018 in abort () from /lib/i686/cmov/libc.so.6
#3 0xb7e3234d in __libc_message () from /lib/i686/cmov/libc.so.6
#4 0xb7e38624 in malloc_printerr () from /lib/i686/cmov/libc.so.6
#5 0xb7e3a826 in free () from /lib/i686/cmov/libc.so.6
#6 0xb7b39c84 in tdb_close () from /usr/lib/libtdb.so.1
#7 0xb7b43e14 in ?? () from /var/lib/python-support/python2.5/_tdb.so
#8 0x0a038d08 in ?? ()
#9 0x00000000 in ?? ()
master's pytdb does not (we have a check for self->closed in obj_close()), but still... Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-29 | tdb: reset tdb->fd to -1 in tdb_close() | Kirill Smelkov | 1 | -1/+3
So that erroneous double tdb_close() calls do not try to close() same fd again. This is like SAFE_FREE() but for fd. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
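The idea of resetting the descriptor after closing it can be sketched as follows. This is a minimal illustration, not tdb_close() itself: `struct db_handle` and `db_close()` are hypothetical names, and the sketch assumes the handle structure outlives the first close.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical handle type for illustration; not tdb's real context. */
struct db_handle {
    int fd;     /* open file descriptor, or -1 once closed */
};

/* Close the descriptor once and mark it invalid, so an erroneous second
 * call does not close() a number the process may have reused since. */
int db_close(struct db_handle *db)
{
    int ret = 0;

    assert(db != NULL);
    if (db->fd != -1) {
        ret = close(db->fd);
        db->fd = -1;    /* like SAFE_FREE(), but for a descriptor */
    }
    return ret;
}
```

Without the reset, a second close() could hit an unrelated descriptor that happened to receive the same number in the meantime, which is a much harder bug to trace than a no-op.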
2009-10-29 | tdb: fix typo in python's Tdb.get() docstring | Kirill Smelkov | 1 | -1/+1
It's Tdb.get(), not Tdb.fetch(). Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-29 | tdb: kill last bits from swig | Kirill Smelkov | 2 | -6/+1
We no longer use swig for pytdb, so there is no need for swig make rules. Also pytdb.c header should be updated. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-25 | tdb: detect tdb store of identical records and skip | Andrew Tridgell | 1 | -0/+20
This can help with ldb where we rewrite the index records
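A minimal sketch of the "skip identical stores" idea, assuming a single in-memory slot rather than tdb's on-disk hash chains. `struct slot`, `slot_store()` and the `writes` counter are hypothetical names used only to make the skipped write observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-record store for illustration; not tdb's layout. */
struct slot {
    unsigned char data[64];
    size_t len;
    int writes;     /* counts stores that actually modified the slot */
};

/* Store a value, skipping the write when the new bytes are identical
 * to what is already there. Returns 0 on success, -1 if too large. */
int slot_store(struct slot *s, const void *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    if (len > sizeof(s->data))
        return -1;
    if (len == s->len && memcmp(s->data, buf, len) == 0)
        return 0;   /* identical record: nothing to write */
    memcpy(s->data, buf, len);
    s->len = len;
    s->writes++;
    return 0;
}
```

For a caller that rewrites records which rarely change, such as the ldb index records mentioned above, the comparison costs one memcmp() and can save the write entirely.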