path: root/lib/tdb/common
Age  Commit message  Author  Files  Lines
2012-10-06tdb: Make tdb robust against improper CLEAR_IF_FIRST restartVolker Lendecke1-4/+28
When winbind is restarted, there is a potential crash in tdb. Following situation:
We are in a cluster with ctdb. A winbind child hangs in a request to the DC. Cluster monitoring decides the node has a problem. Cluster monitoring decides to kill ctdbd. winbind child still hangs in a RPC request. winbind parent figures that ctdb is dead and immediately commits suicide. winbind parent is restarted by cluster management, overwriting gencache.tdb with CLEAR_IF_FIRST.
The CLEAR_IF_FIRST logic as implemented now will not see that a child still has the tdb open, because for performance reasons only the parent holds the ACTIVE_LOCK. While the CLEAR_IF_FIRST logic runs, there is a very small window where we ftruncate(tfd, 0) the file and re-write a proper header without a lock. If the winbind child comes back during this small window, wanting to store something into gencache.tdb, that winbind child will crash with a SIGBUS.
Sounds unlikely? See:
[2012/09/29 07:02:31.871607, 0] lib/util.c:1183(smb_panic) PANIC (pid 1814517): internal error
[2012/09/29 07:02:31.877596, 0] lib/util.c:1287(log_stack_trace) BACKTRACE: 35 stack frames:
 #0 winbindd(log_stack_trace+0x1a) [0x7feb7d4ca18a]
 #1 winbindd(smb_panic+0x2b) [0x7feb7d4ca25b]
 #2 winbindd(+0x1a3cc4) [0x7feb7d4bacc4]
 #3 /lib64/libc.so.6(+0x32900) [0x7feb7a929900]
 #4 /lib64/libc.so.6(memcpy+0x35) [0x7feb7a97f355]
 #5 /usr/lib64/libtdb.so.1(+0x6e76) [0x7feb7b0b0e76]
 #6 /usr/lib64/libtdb.so.1(+0x3d37) [0x7feb7b0add37]
 #7 /usr/lib64/libtdb.so.1(+0x863d) [0x7feb7b0b263d]
 #8 /usr/lib64/libtdb.so.1(+0x8700) [0x7feb7b0b2700]
 #9 /usr/lib64/libtdb.so.1(+0x2505) [0x7feb7b0ac505]
 #10 /usr/lib64/libtdb.so.1(+0x25b7) [0x7feb7b0ac5b7]
 #11 /usr/lib64/libtdb.so.1(tdb_fetch+0x13) [0x7feb7b0ac633]
 #12 winbindd(gencache_set_data_blob+0x259) [0x7feb7d4d8449]
 #13 winbindd(gencache_set+0x53) [0x7feb7d4d85b3]
 #14 winbindd(gencache_del+0x5e) [0x7feb7d4d879e]
 #15 winbindd(saf_delete+0x93) [0x7feb7d54b693]
 #16 winbindd(+0xe507e) [0x7feb7d3fc07e]
 #17 winbindd(+0xe85e5) [0x7feb7d3ff5e5]
 #18 winbindd(+0xe65be) [0x7feb7d3fd5be]
 #19 winbindd(+0xe7562) [0x7feb7d3fe562]
 #20 winbindd(init_dc_connection+0x2e) [0x7feb7d3fe5be]
 #21 winbindd(+0xe75d9) [0x7feb7d3fe5d9]
 #22 winbindd(cm_connect_netlogon+0x58) [0x7feb7d3fe658]
 #23 winbindd(_wbint_PingDc+0x61) [0x7feb7d410991]
 #24 winbindd(+0x103175) [0x7feb7d41a175]
 #25 winbindd(winbindd_dual_ndrcmd+0xb7) [0x7feb7d4107d7]
 #26 winbindd(+0xf8609) [0x7feb7d40f609]
 #27 winbindd(+0xf9075) [0x7feb7d410075]
 #28 winbindd(tevent_common_loop_immediate+0xe8) [0x7feb7d4db198]
 #29 winbindd(run_events_poll+0x3c) [0x7feb7d4d93fc]
 #30 winbindd(+0x1c2b52) [0x7feb7d4d9b52]
 #31 winbindd(_tevent_loop_once+0x90) [0x7feb7d4d9f60]
 #32 winbindd(main+0x7b3) [0x7feb7d3e7aa3]
 #33 /lib64/libc.so.6(__libc_start_main+0xfd) [0x7feb7a915cdd]
 #34 winbindd(+0xce2a9) [0x7feb7d3e52a9]
This is in a winbind child; the surrounding logfiles indicate the parent was restarted.
This patch takes all chain locks around the tdb_new_database call introduced by CLEAR_IF_FIRST.
2012-10-06tdb: Make robust against shrinking tdbsRusty Russell1-12/+20
When probing for a size change (eg. just before tdb_expand, tdb_check, tdb_rescue) we call tdb_oob(tdb, tdb->map_size, 1, 1). Unfortunately this does nothing if the tdb has actually shrunk, which as Volker demonstrated, can actually happen if a "longlived" parent crashes. So move the map/update size/remap before the limit check. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
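The ordering change described above can be sketched roughly as follows. This is not the tdb source: `struct db_handle` and `check_bounds` are hypothetical stand-ins for the real context and for tdb_oob(), and the remap step is reduced to a comment. The point it illustrates is that the cached size is refreshed from fstat() before the limit check, so a shrunken file is noticed as well as a grown one.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical, simplified handle; not the real struct tdb_context. */
struct db_handle {
    int fd;            /* open file descriptor */
    uint64_t map_size; /* file size as of the last time we looked */
};

/*
 * Return 0 if [off, off+len) lies inside the file, -1 otherwise.
 * Probing with off == map_size and len == 1 always misses the fast
 * path, which forces a fresh look at the real file size.
 */
static int check_bounds(struct db_handle *db, uint64_t off, uint64_t len)
{
    struct stat st;

    /* Fast path: the range fits in the size we already know about. */
    if (len <= db->map_size && off <= db->map_size - len)
        return 0;

    if (fstat(db->fd, &st) != 0)
        return -1;

    /* Update the cached size first, even if the file has shrunk.
     * Real code would also unmap and remap at this point. */
    db->map_size = (uint64_t)st.st_size;

    /* Only now apply the limit check against the up-to-date size. */
    if (len > db->map_size || off > db->map_size - len)
        return -1;
    return 0;
}
```

A probe on a file that has shrunk still returns -1, but it leaves `map_size` matching the file, so later checks work against the true size.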
2012-10-04tdb: add tdb_rescue()Rusty Russell1-0/+349
This allows for an emergency best-effort dump. It's a little better than strings(1). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-02tdb: Fix a typoVolker Lendecke1-1/+1
Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Tue Oct 2 19:52:16 CEST 2012 on sn-devel-104
2012-06-22tdb: make TDB_NOSYNC merely disable sync.Rusty Russell1-9/+8
(As suggested by Stefan Metzmacher, based on the change to ntdb.) Since commit ec96ea690edbe3398d690b4a953d487ca1773f1c, we handle the case where a process dies during a transaction commit. Unfortunately, TDB_NOSYNC means this no longer works, as it disables the recovery area as well as the actual msync/fsync. We should do everything except the syncs. This also means we can do a complete test with $TDB_NO_FSYNC set; just to get more complete coverage, we disable it explicitly for one test (where we override the actual sync calls anyway). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
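A minimal sketch of the behaviour this message describes, assuming a much simplified commit path. `DB_NOSYNC`, `struct txn`, `txn_sync` and `txn_commit` are hypothetical names, and the recovery and apply steps are placeholders. What it shows is that the no-sync flag skips only the sync call, while the recovery step still runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <unistd.h>

#define DB_NOSYNC 0x1u /* hypothetical flag, standing in for TDB_NOSYNC */

/* Hypothetical transaction state, for illustration only. */
struct txn {
    int fd;
    unsigned flags;
    bool recovery_written; /* set once the recovery step has run */
    int syncs_done;        /* number of fsync() calls actually made */
};

/* Sync to disk unless the caller asked for no syncs. */
static int txn_sync(struct txn *t)
{
    if (t->flags & DB_NOSYNC)
        return 0; /* only the sync itself is skipped */
    if (fsync(t->fd) != 0)
        return -1;
    t->syncs_done++;
    return 0;
}

static int txn_commit(struct txn *t)
{
    /* Step 1: always prepare the recovery area (placeholder here),
     * so a process dying mid-commit can still be recovered from. */
    t->recovery_written = true;
    if (txn_sync(t) != 0)
        return -1;

    /* Step 2: apply the changes to the file (elided), then sync. */
    if (txn_sync(t) != 0)
        return -1;
    return 0;
}
```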
2012-03-29lib/tdb: Add/expose lock functions to support CTDBAmitay Isaacs1-2/+16
This patch adds two lock functions used by CTDB to perform asynchronous locking. These functions do not actually perform any fcntl operations, but only increment internal counters. - tdb_transaction_write_lock_mark() - tdb_transaction_write_lock_unmark() It also exposes two internal functions - tdb_lock_nonblock() - tdb_unlock() These functions are NOT exposed in include/tdb.h to prevent any further uses of these functions. If you ever need to use these functions, consider using tdb2. Signed-off-by: Amitay Isaacs <amitay@gmail.com>
2012-03-23lib/tdb: fix transaction issue for HAVE_INCOHERENT_MMAP.Rusty Russell1-11/+10
We unmap the tdb on expand, then remap. But when we have INCOHERENT_MMAP (ie. OpenBSD) and we're inside a transaction, doing the expand can mean we need to read from the database to partially fill a transaction block. This fails, because if mmap is incoherent we never allow accessing the database via read/write. The solution is not to unmap and remap until we've actually written the padding at the end of the file. Reported-by: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Fri Mar 23 02:53:15 CET 2012 on sn-devel-104
2012-03-23lib/tdb: fix missing return 0 code.Rusty Russell1-1/+1
fde694274e1e5a11d1473695e7ec7a97f95d39e4 made tdb_mmap return an int, but didn't put the return 0 on the "internal db" case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-03-22lib/tdb: fix OpenBSD incoherent mmap.Rusty Russell3-20/+35
This comment appears in two places in the code (commit 4c6a8273c6dd3e2aeda5a63c4a62aa55bc133099 from 2001): /* * We must ensure the file is unmapped before doing this * to ensure consistency with systems like OpenBSD where * writes and mmaps are not consistent. */ But this doesn't help, because if one process is using mmap and another using pwrite, we get incoherent results. As demonstrated by OpenBSD's failure on the tdb unit tests. Rather than disable mmap on OpenBSD, we test for this issue and force mmap to be enabled. This means that we will fail on very large TDBs on 32-bit systems, but it's better than the horrendous performance penalty on every OpenBSD system. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-02-14tdb: make tdb_private.h idempotent.Rusty Russell1-0/+3
The most convenient way to write unit tests in C is to directly #include the C files (CCAN uses this, for example). That works quite well, but it means that tdb_private.h now needs to be protected against multiple inclusions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-06Fix compile when TDB_TRACE is enabled.Ira Cooper1-1/+1
Autobuild-User: Jeremy Allison <jra@samba.org> Autobuild-Date: Fri Jan 6 04:16:41 CET 2012 on sn-devel-104
2011-12-25tdb: Use tdb_parse_record in tdb_update_hashVolker Lendecke1-12/+16
This avoids a tdb_fetch, thus a malloc/memcpy/free in the tdb_store path
2011-12-21tdb: don't free old recovery area when expanding if already at EOF.Rusty Russell1-17/+30
We allocate a new recovery area by expanding the file. But if the recovery area is already at the end of file (as shown in at least one client case), we can simply expand the record, rather than freeing it and creating a new one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Wed Dec 21 06:25:40 CET 2011 on sn-devel-104
2011-12-21tdb: use same expansion factor logic when expanding for new recovery area.Rusty Russell3-21/+34
If we're expanding because the current recovery area is too small, we expand only the amount we need. This can quickly lead to exponential growth when we have a slowly-expanding record (hence a slowly-expanding transaction size). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-12-19tdb: Avoid a malloc/memcpy in _tdb_storeVolker Lendecke1-17/+8
2011-12-19tdb: be more careful on 4G files.Rusty Russell6-23/+53
I came across a tdb which had wrapped to 4G + 4K, and the contents had been destroyed by processes which thought it only 4k long. Fix this by checking on open, and making tdb_oob() check for wrap itself. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Mon Dec 19 07:52:01 CET 2011 on sn-devel-104
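The wrap problem can be illustrated with a small sketch, assuming 32-bit offsets. `db_off_t`, `range_in_bounds` and `size_fits` are hypothetical names, not the tdb functions. A naive `off + len <= map_size` test can wrap past 2^32 and pass by accident, so the comparison below never adds the two values, and the open-time check rejects files too large for the offset type.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t db_off_t; /* hypothetical 32-bit offset type */

/* True if [off, off+len) lies within map_size, without ever computing
 * off + len (which could wrap around to a small number). */
static bool range_in_bounds(db_off_t off, db_off_t len, db_off_t map_size)
{
    if (len > map_size)
        return false;
    return off <= map_size - len;
}

/* Open-time check: a file whose size does not fit in the offset type
 * (for example 4G + 4K) must be refused rather than treated as 4K. */
static bool size_fits(uint64_t file_size)
{
    return file_size <= UINT32_MAX;
}
```

For example, with `off = 0xFFFFF000`, `len = 0x2000` and `map_size = 0x1000`, the 32-bit sum wraps to `0x1000` and a naive check would accept the range; `range_in_bounds` rejects it.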
2011-08-16tdb: increment sequence number in tdb_wipe_all().Rusty Russell1-0/+2
TDB2 testing revealed that tdb1 doesn't do this. It's minor, but fix it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Tue Aug 16 10:47:41 CEST 2011 on sn-devel-104
2011-06-08tdb: enable VALGRIND to remove valgrind noise.Rusty Russell1-35/+0
Andrew Bartlett complained that valgrind needs --partial-loads-ok=yes otherwise the Jenkins hash makes it complain. My benchmarking here revealed that at least with modern gcc (4.5) and CPU (Intel i5 32 bit) there's no measurable performance penalty for the "correct" code, so rip out the optimized one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Wed Jun 8 11:05:47 CEST 2011 on sn-devel-104
2011-04-19tdb: make sure we skip over recovery area correctly.Rusty Russell3-17/+44
If it's really the recovery area, we can trust the rec_len field, and don't have to go groping for bitpatterns. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Tue Apr 19 14:15:22 CEST 2011 on sn-devel-104
2011-04-18tdb_expand: limit the expansion with huge recordsSimo Sorce1-5/+20
ldb can create huge records when saving indexes. Limit the tdb expansion to avoid consuming a lot of memory for no good reason if the record being saved is huge.
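One way to picture such a limit is the sketch below. The cutoff and both growth factors are illustrative assumptions chosen for the example, not values taken from this commit, and `headroom_for` is a hypothetical helper. It reserves generous headroom for ordinary records but only a small multiple for very large ones.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cutoff above which a record counts as "huge". */
#define HUGE_RECORD_CUTOFF ((size_t)100 * 1024)

/* How much extra space to reserve when a record of record_size bytes
 * forces an expansion. Small records get room for many more of the
 * same size; huge records get only a small multiple. */
static size_t headroom_for(size_t record_size)
{
    size_t factor = record_size > HUGE_RECORD_CUTOFF ? 2 : 100;

    /* Saturate instead of overflowing. */
    if (record_size > SIZE_MAX / factor)
        return SIZE_MAX;
    return record_size * factor;
}
```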
2011-04-18tdb: tdb_repack() only when it's worthwhile.Rusty Russell1-6/+31
tdb_repack() is expensive and consumes memory, so we can spend some effort to see if it's worthwhile. In particular, tdbbackup doesn't need to repack: it started with an empty database! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-04-18tdb: fix transaction recovery area for converted tdbs.Rusty Russell1-2/+4
This is why macros are dangerous; these were converting the pointers, not the things pointed to! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-03-30tdb: Fix Coverity ID 2238: SECURE_CODINGVolker Lendecke1-24/+24
2011-03-27tdb: Fix Coverity ID 2192: NO_EFFECTVolker Lendecke1-1/+1
(ret < 0) can never be true
2011-02-12tdb: Fix a C++ warningVolker Lendecke1-1/+1
Autobuild-User: Volker Lendecke <vlendec@samba.org> Autobuild-Date: Sat Feb 12 19:50:55 CET 2011 on sn-devel-104
2010-12-29tdb: tdb_summary() support.Rusty Russell3-2/+195
Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Wed Dec 29 10:12:05 CET 2010 on sn-devel-104
2010-11-27tdb:common/open.c - use "discard_const_p" for certain "tdb->name" assignmentsMatthias Dieter Wallnöfer1-2/+2
In order to suppress compiler warnings.
2010-11-12tdb: set tdb->name early, as it's needed for tdb_name()Stefan Metzmacher1-6/+27
tdb_name() might be used within the given log function, which might be called from within tdb_open_ex(). metze Autobuild-User: Stefan Metzmacher <metze@samba.org> Autobuild-Date: Fri Nov 12 11:22:21 UTC 2010 on sn-devel-104
2010-10-21tdb: Set _PUBLIC_ in C file rather than header files (Debian bug 600898)Jelmer Vernooij11-66/+64
Autobuild-User: Jelmer Vernooij <jelmer@samba.org> Autobuild-Date: Thu Oct 21 11:47:22 UTC 2010 on sn-devel-104
2010-09-27tdb: TDB_INCOMPATIBLE_HASH, to allow safe changing of default hash.Rusty Russell3-4/+20
This flag to tdb_open/tdb_open_ex affects creation of a new database: 1) Uses the Jenkins lookup3 hash instead of the old gdbm hash if none is specified, 2) Places a non-zero field in header->rwlocks, so older versions of TDB will refuse to open it. This means that the caller (ie Samba) can set this flag to safely change the hash function. Versions of TDB from this one on will either use the correct hash or refuse to open (if a different hash is specified). Older TDB versions will see the nonzero rwlocks field and refuse to open it under any conditions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-09-27tdb: automatically identify Jenkins hash tdbsRusty Russell1-14/+27
If the caller to tdb_open_ex() doesn't specify a hash, and tdb_old_hash doesn't match, try tdb_jenkins_hash. This was Metze's idea: it makes life simpler, especially with the upcoming TDB_INCOMPATIBLE_HASH flag. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
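The fallback described here could look roughly like the sketch below. It assumes the file stores a check value computed by hashing a known probe with whichever hash created it. The two hash functions are toy stand-ins (not tdb_old_hash or tdb_jenkins_hash), and `probe` and `pick_hash` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*hash_fn)(const unsigned char *data, size_t len);

/* Toy stand-in for the "old" default hash. */
static uint32_t toy_hash_a(const unsigned char *data, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = h * 31u + data[i];
    return h;
}

/* Toy stand-in for the "new" hash (FNV-1a here, not Jenkins lookup3). */
static uint32_t toy_hash_b(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++)
        h = (h ^ data[i]) * 16777619u;
    return h;
}

/* Hypothetical probe whose hash is stored in the file at creation. */
static const unsigned char probe[] = "probe-key";

/* Choose the hash to use for an existing file. An explicit request
 * must match the stored value; otherwise try the old default first,
 * then the newer hash. NULL means "refuse to open". */
static hash_fn pick_hash(uint32_t stored_check, hash_fn requested)
{
    if (requested != NULL)
        return requested(probe, sizeof probe) == stored_check ? requested : NULL;
    if (toy_hash_a(probe, sizeof probe) == stored_check)
        return toy_hash_a;
    if (toy_hash_b(probe, sizeof probe) == stored_check)
        return toy_hash_b;
    return NULL;
}
```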
2010-09-27tdb: add Bob Jenkins lookup3 hash as helper hash.Rusty Russell3-15/+382
This is a better hash than the default: shipping it with tdb makes it easy for callers to use it as the hash by passing it to tdb_open_ex(). This version taken from CCAN and modified, which took it from http://www.burtleburtle.net/bob/c/lookup3.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-09-20lib/tdb: fix c++ build warning in tdb_header_hash().Günther Deschner1-1/+1
Guenther
2010-09-16tdb: added TDB_NO_FSYNC env variableAndrew Tridgell1-0/+4
this might help reduce test times and load on test machines
2010-09-13tdb: put example hashes into header, so we notice incorrect hash_fn.Rusty Russell3-2/+65
This is Stefan Metzmacher <metze@samba.org>'s patch with minor changes: 1) Use the TDB_MAGIC constant so both hashes aren't of strings. 2) Check the hash in tdb_check (paranoia, really). 3) Additional check in the (unlikely!) case where both examples hash to 0. 4) Cosmetic changes to var names and complaint message. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-09-13tdb: fix tdb_check() on other-endian tdbs.Rusty Russell1-1/+1
We must not endian-convert the magic string, just the rest. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-09-13tdb: fix tdb_check() on read-only TDBs to actually work.Rusty Russell1-5/+17
Commit bc1c82ea137 "Fix tdb_check() to work with read-only tdb databases." claimed to do this, but tdb_lockall_read() fails on read-only databases. Also make sure we can still do tdb_check() inside a transaction (weird, but we previously allowed it so don't break the API). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-09-13tdb: make check more robust against recovery failures.Rusty Russell1-5/+36
We can end up with dead areas when we die during transaction commit; tdb_check() fails on such a (valid) database. This is particularly noticeable now we no longer truncate on recovery; if the recovery area was at the end of the file we used to remove it that way. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-14tdb: workaround starvation problem in locking entire database.Rusty Russell1-17/+69
We saw tdb_lockall() take 71 seconds under heavy load; this is because Linux (at least) doesn't prevent new small locks being obtained while we're waiting for a big lock. The workaround is to do divide and conquer using non-blocking chainlocks: if we get down to a single chain we block. Using a simple test program where children did "hold lock for 100ms, sleep for 1 second" the time to do tdb_lockall() dropped significantly. There are ln(hashsize) locks taken in the contended case, but that's slow anyway. More analysis is given in my blog at http://rusty.ozlabs.org/?p=120 This may also help transactions, though in that case it's the initial read lock which uses this gradual locking routine; the update-to-write-lock code is separate and still tries to update in one go. Even though ABI doesn't change, minor version bumped so behavior change can be easily detected. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
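The divide-and-conquer idea can be sketched with plain fcntl() byte-range locks, assuming one lock byte per hash chain. `lock_range`, `unlock_range` and `lock_gradual` are hypothetical helpers, not the tdb functions. The sketch tries the whole range without blocking, splits it in half when that fails, and blocks only once the range is a single byte.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Write-lock bytes [start, start+len); block only if wait is true. */
static int lock_range(int fd, off_t start, off_t len, bool wait)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl);
}

static int unlock_range(int fd, off_t start, off_t len)
{
    struct flock fl = {0};
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_SETLK, &fl);
}

/* Lock [start, start+len) gradually instead of waiting for it all. */
static int lock_gradual(int fd, off_t start, off_t len)
{
    if (len <= 0)
        return 0;

    /* First try to grab the entire range without blocking. */
    if (lock_range(fd, start, len, false) == 0)
        return 0;
    if (errno != EAGAIN && errno != EACCES)
        return -1; /* a real error, not contention */

    /* Down to a single chain: now it is fine to block. */
    if (len == 1)
        return lock_range(fd, start, len, true);

    /* Otherwise split in two and lock each half in turn. */
    off_t half = len / 2;
    if (lock_gradual(fd, start, half) != 0)
        return -1;
    if (lock_gradual(fd, start + half, len - half) != 0) {
        unlock_range(fd, start, half); /* undo the first half */
        return -1;
    }
    return 0;
}
```

Because each blocking wait covers only one byte, a waiter is held up by a single competing chain lock at a time rather than by whichever small lock happens to be held anywhere in the range.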
2010-07-29Fix tdb_check() to work with read-only tdb databases.Jeremy Allison1-3/+3
The function tdb_lockall() uses F_WRLCK internally, which doesn't work on a fd opened with O_RDONLY. Use tdb_lockall_read() instead. Jeremy.
2010-07-01tdb: fix the build on mac os x 10.6.4.Günther Deschner1-0/+4
Guenther
2010-05-11tdb: remove unused variable in tdb_new_database().Günther Deschner1-1/+0
Guenther
2010-05-05tdb: fix short write logic in tdb_new_databaseRusty Russell3-17/+17
Commit 207a213c/24fed55d purported to fix the problem of signals during tdb_new_database (which could cause a spurious short write, hence a failure). However, the code is wrong: newdb+written is not correct. Fix this by introducing a general tdb_write_all() and using it here and in the tracing code. Cc: Stefan Metzmacher <metze@samba.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
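A general write-everything helper of the kind mentioned here might look like the sketch below. It is an illustration rather than the actual tdb_write_all(), and `write_all` is a hypothetical name. It advances a byte pointer by the number of bytes written; adding a byte count to a pointer of a larger struct type would skip too far, which is one plausible reading of why `newdb+written` was wrong.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes from buf to fd, retrying after short writes
 * and signal interruptions. Returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf; /* byte pointer, so p += n moves n bytes */

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0)
            return -1; /* no progress; avoid looping forever */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```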
2010-04-20tdb: update tdb ABI to use hide_symbols=TrueAndrew Tridgell1-1/+1
We now use -fvisibility=hidden to hide symbols from outside the tdb shared library. This also moved tdb_transaction_recover() into the tdb_private.h header, as it should never have been a public API. For that reason we are changing the version number. We're only doing a minor version increment as it is extremely unlikely that anyone was actually using tdb_transaction_recover() as its locking requirements were rather unusual. Pair-Programmed-With: Rusty Russell <rusty@samba.org>
2010-03-26tdb: Add a non-blocking version of tdb_transaction_startVolker Lendecke4-7/+22
2010-03-25tdb: Fix indentation in tdb_new_database()Volker Lendecke1-1/+1
2010-03-25Fix some nonempty blank linesVolker Lendecke10-45/+44
2010-02-28tdb: If tdb_parse_record does not find a record, return -1 instead of 0Volker Lendecke1-1/+4
2010-02-24tdb: handle processes dying during transaction commit.Rusty Russell3-0/+86
tdb transactions were designed to be robust against the machine powering off, but interestingly were never designed to handle the case where an administrator kill -9's a process during commit. Because recovery is only done on tdb_open, processes with the tdb already mapped will simply use it despite it being corrupt and needing recovery. The solution to this is to check for recovery every time we grab a data lock: we could have gained the lock because a process just died. This has no measurable cost: here is the time for tdbtorture -s 0 -n 1 -l 10000: Before: 2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75 After: 2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
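The check-on-lock idea can be sketched as below, under heavy simplification. The on-disk marker (`RECOVERY_PENDING` in the first four bytes), `needs_recovery`, `clear_marker` and `lock_chain_checked` are all hypothetical; real recovery replays saved data rather than clearing a flag. The sketch only shows where the check sits: immediately after a data lock is obtained, since the lock may have become available because its previous holder died mid-commit.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define RECOVERY_PENDING 0x1u /* hypothetical on-disk marker value */

/* Set or clear a blocking one-byte lock at offset off. */
static int set_lock(int fd, short type, off_t off)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = off;
    fl.l_len = 1;
    return fcntl(fd, F_SETLKW, &fl);
}

/* 1 if the marker says a commit was interrupted, 0 if not, -1 on error. */
static int needs_recovery(int fd)
{
    uint32_t state;
    if (pread(fd, &state, sizeof state, 0) != (ssize_t)sizeof state)
        return -1;
    return state == RECOVERY_PENDING;
}

/* Placeholder recovery routine: only clears the marker. */
static int clear_marker(int fd)
{
    uint32_t state = 0;
    return pwrite(fd, &state, sizeof state, 0) == (ssize_t)sizeof state ? 0 : -1;
}

/* Take a chain lock, then check whether recovery is needed before
 * letting the caller touch any data. */
static int lock_chain_checked(int fd, off_t chain_off, int (*recover)(int))
{
    if (set_lock(fd, F_WRLCK, chain_off) != 0)
        return -1;

    int pending = needs_recovery(fd);
    if (pending < 0 || (pending == 1 && recover(fd) != 0)) {
        set_lock(fd, F_UNLCK, chain_off);
        return -1;
    }
    return 0;
}
```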
2010-02-24patch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patchRusty Russell1-16/+13