Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
|
|
|
|
|
|
Also add torture test to check filter parsing.
|
|
to respond to a read or write."
This reverts commit a6ae7a552f851a399991262377cc0e062e40ac20.
This fixes bug #7222 (All users have full rigths on all shares) (CVE-2010-0728).
(cherry picked from commit 1c9494c76cc9686c61e0966f38528d3318f3176f)
|
|
|
|
|
|
|
|
In a cluster, this makes a large difference: For r/w traverse, we have to do a
fetch_locked on every record which for most users of connections_forall is just
overkill.
|
|
|
|
connections_forall is called from count_current_connections() which potentially
deletes dead records. This needs r/w access to connections.tdb.
connections_traverse says it does not provide this. Does not really matter in
the smbd case, because we have opened it before r/w, so this is "just" cleanup.
|
|
Guenther
|
|
|
|
|
|
|
|
|
|
|
|
Detected while showing this code to obnox :-)
|
|
There's no need to still hold the g_lock tdb-level lock while telling the
waiters to retry
|
|
In g_lock_unlock we have a little race between the process_exists and
messaging_send call: We only send to 5 waiters now, they all might have died
between us checking their existence and sending the message. This change makes
g_lock_lock retry at least once every minute.
|
|
Only notify the first 5 pending lock waiters. This avoids a thundering herd
problem that is really nasty in a cluster. It also makes acquiring a lock a bit
more FIFO, lock waiters are added to the end of the array.
|
|
Only check the existence of the lock owner in g_lock_parse, check the rest of
the records only when we got the lock successfully. This reduces the load on
process_exists which can involve a network roundtrip in the clustered case.
|
|
g_lock_parse might have thrown away entries from the locks array because the
processes were not around anymore. Don't store the orphaned entries.
|
|
|
|
A return code of 1 from initgroups() is OK since apparently it means
the gid has already been set. The man page doesn't mention this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
and 0 in the places where it does.
Jeremy
|
|
Jeremy.
|
|
This made smbd crash in g_lock_lock() when trying to start a
transaction on a db with an already started transaction,
e.g. in a tcon_and_X where the share_info.tdb was not yet
initialized but share_info.tdb was already locked by another
process or writing acces to the winreg rpc pipe where the
registry tdb was already locked by another process.
What we really _want_ to do here by design is to react to
MSG_DBWRAP_G_LOCK_RETRY messages that are either sent
by a client doing g_lock_unlock or by ourselves when
we receive a CTDB_SRVID_SAMBA_NOTIFY or
CTDB_SRVID_RECONFIGURE message from ctdbd, i.e. when
either a client holding a lock or a complete node
has died.
Doing this properly involves calling tevent_loop_once(),
but doing this here with the main ctdbd messaging context
creates a nested event loop when g_lock_lock() is called
from the main event loop.
So as a quick fix, we act a little corasely here: we do
a select on the ctdb connection fd and when it is readable
or we get EINTR, then we retry without actually parsing
any ctdb packages or dispatching messages. This means that
we retry more often than necessary and intended by design,
but this does not harm and it is unobtrusive. When we have
finished, the main loop will pick up all the messages and
ctdb packets. The only extra twist is that we cannot use
timed events here but have to handcode a timeout for select.
Michael
|
|
Michael
|
|
Michael
|
|
The key for reading and writing was inconsistent due to a
off by one data length.
Michael
|
|
This skips update of the __db_sequence_number__ record when nothing else has
been written. There are transactions that are just openend and then nothing
is written until transaction_commit is called. This is for instance the case
with registry initialization routines: They start a transaction and only
write somthing when the registry has not been initialized yet.
So this change will skip many db_seqnum bumps and TRANS3_COMMIT roundtrips.
Michael
|
|
I carefully prepared the return value only to "return 0;" at the bottom. :-(
This may well have hit us for instance in the nested cancel case
and produced random errors.
Michael
|
|
The logic bug was that if a record was found in the marshall buffer,
then always the ctdb header of tha last record in the marshall buffer
was returned, and not the ctdb header of the last occurrence of the
requested record.
This is fixed by introducing an additional temporary variable.
Michael
|
|
Michael
|
|
Michael
|
|
Don't treat this as an error but return seqnum 0 instead.
Michael
|
|
For persistent databases, 64bit integer is kept in a special record
__db_sequence_number__. This record is incremented with each completed
transaction.
The retry mechanism for failing TRANS3_COMMIT controls inside the
db_ctdb_transaction_commit() function now relies one a modified
behaviour of ctdbd's treatment of persistent databases in recoveries.
Recently, a special treatment for persistent databases had been
introduced in ctdb (1.0.108) to work around the problems with the
orinal design of persistent transactions.
Now with the rewrite we need to revert to the old behaviour that
ctdb always takes the newest copies of all records.
This change also paves the way for a next step, which will make
recovery use the db seqnum to tell which node has the newest copy
of a persistent db and use that node's copy. This will greatly
reduce the amount of data transferred with each recovery.
Michael
|
|
The return values calculated by the callers were wrong anyways since
the new marshalling code does not set the local tdbs tdb error code.
Michael
|
|
Michael
|
|
This simplifies the transaction code a lot:
* transaction_start essentially consists of acquiring a global lock.
* No write operations at all are performed on the local database
until the transaction is committed: Every store operation is just
going into the marshall buffer.
* The commit operation calls a new simplified TRANS3_COMMIT control
in ctdb which rolls out thae changes to all nodes including the
node that is performing the transaction.
Michael
|
|
|
|
This is the basis to implement global locks in ctdb without depending on a
shared file system. The initial goal is to make ctdb persistent transactions
deterministic without too many timeouts.
|