Age | Commit message | Author | Files | Lines |
|
(Cleaning "ensure we exit with non-zero status on EOF on socket"
after rebasing to v3-3-test which has no "make proto" anymore.)
Michael
(This used to be commit a958c6bf1e0394e98df286974d78d3b07498e0b4)
|
|
can trigger a brlock db cleanup
(This used to be commit bbd49f9e1c4b50c4a596fb991f3306e1e90c0177)
|
|
(This used to be commit 6fe27d296c389473c24e8c627a61bd56b364ad9f)
|
|
(This used to be commit 30b83245a22ebd5e4fa4739dd2aa1805373a7eb2)
|
|
(This used to be commit 9d3217bb28765e107c230fb90b578dcc6f5d4375)
|
|
allowed for tdb. This is needed for the registry db backend.
(This used to be commit 4b04ec29c76df837a7909725bbbf4c79d5abdb4d)
|
|
(This used to be commit a2f70fc175b748ef160a998d0563c28381ea3466)
|
|
out of sync
(This used to be commit 571ec7893c8b40959c005d510c039e3f231ffc67)
|
|
thinking it was a failure of a transaction cancel
(This used to be commit 22dbe158ed62ae47bbcb41bba3db345294f75437)
|
|
(This used to be commit ddf3022595fe8ca378c5f52107f42e296f852685)
|
|
(This used to be commit fe6a03e7b11cd859fddae5ba924ea5e071b8ccea)
|
|
1) when all nodes write the same value to the record, or when writing
a value that is already there, we can skip the write and save
ourselves a network transaction
2) when all remote nodes fail an update, and we then fail a replay, we
don't need to trigger a recovery. This solves a corner case where
we could get into a recovery loop
(This used to be commit 2481bfce4307274806584b0d8e295cc7f638e184)
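Point 1 above can be illustrated with a minimal sketch. It assumes a record value can be treated as a plain byte buffer; `struct demo_blob` and `demo_store_is_noop()` are hypothetical names made up for this illustration and are not taken from the Samba or ctdb sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a record value (not Samba's own data type). */
struct demo_blob {
	const unsigned char *data;
	size_t len;
};

/*
 * Return true when storing new_val over old_val would not change the
 * record, so the write (and the network transaction behind it) can be
 * skipped.
 */
static bool demo_store_is_noop(struct demo_blob old_val,
			       struct demo_blob new_val)
{
	if (old_val.len != new_val.len) {
		return false;
	}
	if (old_val.len == 0) {
		/* Two empty values are equal; avoid memcmp on NULL. */
		return true;
	}
	return memcmp(old_val.data, new_val.data, old_val.len) == 0;
}
```

A caller would check `demo_store_is_noop()` before issuing the store and return success immediately when it reports true.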
|
|
could lead to it blocking forever
(This used to be commit a633390d3a7cb04a7c4e14cba9c533621793287e)
|
|
(This used to be commit ba64a757f86fb60994e12e81416083ac0fa11c21)
|
|
(This used to be commit 30a697c82db53f9d801e220a7c6277f873ebce67)
|
|
(This used to be commit 76fbe56e827193d939676da23a580aa0f9394dd1)
|
|
(This used to be commit 32b8db27652a66a2ade547a6d27f34d0816f7296)
|
|
(This used to be commit 037516f1362c8d64da1d47a0cdaf83198d3eaeaf)
|
|
(This used to be commit 21729256a550509c3c038efa5acdd6ac39027dce)
|
|
(This used to be commit 2e85cbe88b3d1674b915f62e02be7d005fddaa39)
|
|
(This used to be commit f91a3e0f7b7737c1d0667cd961ea950e2b93e592)
|
|
(This used to be commit 126f4ac8e85458ee4693b89a184b99420f1b6bee)
|
|
fetch mapping.
Michael
(This used to be commit cb4c74c9c206e5a445ca636fa6562ce721ea5839)
|
|
Michael
(This used to be commit d776d8df262e1753fb428450140df94e63035af5)
|
|
1. use the return value that idmap_tdb2_open_perm_db() gives us
2. don't delete from the local db if deleting from the perm db failed.
3. fix wrong interpretation of return value of the local delete
Michael
(This used to be commit 147573d7f6faab0ad90258b6a28c4b9575ccb6ea)
|
|
1. use the return value that idmap_tdb2_open_perm_db() gives us
2. don't write to the local db if writing to the perm db failed.
3. fix wrong interpretation of return value of the local store
Michael
(This used to be commit be8c6b4f2f40014313899b5cbc1da9d390d94fee)
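The ordering described in points 2 and 3 above might look roughly like the sketch below. It is an illustration only: the callback type, `struct demo_db`, and the two example backends are hypothetical and do not reflect the actual idmap_tdb2 code.

```c
#include <stdbool.h>

/* Hypothetical store callback: returns true on success. */
typedef bool (*demo_store_fn)(void *handle, const char *key,
			      const char *value);

/* Hypothetical database wrapper used only for this sketch. */
struct demo_db {
	demo_store_fn store;
	void *handle;
};

/*
 * Write to the permanent db first. Only when that succeeded is the
 * local db touched, and the local result is passed back unchanged
 * instead of being reinterpreted.
 */
static bool demo_store_mapping(struct demo_db *perm, struct demo_db *local,
			       const char *key, const char *value)
{
	if (!perm->store(perm->handle, key, value)) {
		return false; /* local db is left untouched */
	}
	return local->store(local->handle, key, value);
}

/* Example backends: count calls in the int that handle points to. */
static bool demo_store_ok(void *handle, const char *key, const char *value)
{
	(void)key;
	(void)value;
	(*(int *)handle)++;
	return true;
}

static bool demo_store_fail(void *handle, const char *key, const char *value)
{
	(void)key;
	(void)value;
	(*(int *)handle)++;
	return false;
}
```

The delete path from the previous commit would follow the same shape, with the store callbacks replaced by delete callbacks.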
|
|
Only retry when ctdbd_persisten_update() failed.
Michael
(This used to be commit ff413a4614c8b272a34b2a9e56a329a8e8749a34)
|
|
store.
Michael
(This used to be commit eaf76c751f9bde2843174b400c109304831df83e)
|
|
as delete_rec operation from fetch_locked()
Michael
(This used to be commit f4aab595a0219305fbedf8890e787b690660a55a)
|
|
to reduce code duplication.
Michael
(This used to be commit 09a197e756459877cab7b4d09f534c6a41cfdd71)
|
|
This is because ctdbd can fail in performing the persistent_store
due to race conditions, and this does not mean it can't succeed
the next time.
To not loop infinitely, this makes use of a new parametric option:
"dbwrap ctdb:max store retries" (integer) which defaults to 5
and sets the upper limit for the number of repeats of the
fetch/store cycle.
Michael
(This used to be commit 2bcc9e6ecef876030e552a607d92597f60203db2)
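A bounded retry loop of the kind described above could be sketched as follows. The default of 5 comes from the commit message; everything else is an assumption for illustration, including the names, the callback interface, and the reading of the limit as the total number of fetch/store cycles rather than the number of retries after the first attempt.

```c
#include <stdbool.h>
#include <stddef.h>

/* Default of "dbwrap ctdb:max store retries" per the commit message. */
#define DEMO_DEFAULT_MAX_STORE_RETRIES 5

/* Hypothetical callback performing one fetch/store cycle. */
typedef bool (*demo_attempt_fn)(void *private_data);

/*
 * Repeat the cycle until it succeeds or max_retries cycles have been
 * made. The number of cycles actually run is reported through
 * attempts_made when that pointer is not NULL.
 */
static bool demo_store_with_retries(demo_attempt_fn attempt,
				    void *private_data,
				    int max_retries,
				    int *attempts_made)
{
	int count = 0;
	bool ok = false;

	while (count < max_retries) {
		count++;
		ok = attempt(private_data);
		if (ok) {
			break;
		}
	}
	if (attempts_made != NULL) {
		*attempts_made = count;
	}
	return ok;
}

/* Example attempt: fails as often as the int in private_data says. */
static bool demo_flaky_attempt(void *private_data)
{
	int *failures_left = private_data;

	if (*failures_left > 0) {
		(*failures_left)--;
		return false;
	}
	return true;
}
```

With two simulated failures the loop succeeds on the third cycle; with more failures than the limit it gives up after five cycles and reports failure to the caller.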
|
|
in the persistent db_ctdb_store operation.
This is to prevent deadlocks in db_ctdb_persistent_store().
There is a tradeoff: Usually, the record is still locked
after db->store operation. This lock is usually released
via the talloc destructor with the TALLOC_FREE to
the record. So we have two choices:
- Either re-lock the record after the call to persistent_store
or cancel_persistent update and this way not changing any
assumptions callers may have about the state, but possibly
introducing new race conditions.
- Or don't lock the record again but just remove the
talloc_destructor. This is less racy but assumes that
the lock is always released via TALLOC_FREE of the record.
I choose the first variant for now since it seems less racy.
We can't guarantee that we succeed in getting the lock
anyways. The only real danger here is that a caller
performs multiple store operations after a fetch_locked()
which is currently not the case.
Michael
(This used to be commit d004c9a7281d2577c3ba2012c8f790cc198ea700)
|
|
Michael
(This used to be commit c939c55e5182258092faceefa58a7f328f18619e)
|
|
database in an inconsistent state if we crash during the operation
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
(This used to be commit 09329f1f9114af44fc4e5e4f29a7315912313125)
|
|
This can not go upstream yet because it uses the non-GPL libgpfs. So it will
not be compiled by default and will not be included in the SOFS RPMs. But upon
Sven's request, we include it in the git tree and the source RPMs, so that it
can be built for in-house tests.
(This used to be commit fc9b30bed2d774dca6660b497cb50f982b23b885)
|
|
(This used to be commit 549db133df6782bcca7d033e8573e47716877cbd)
|
|
(This used to be commit 5fd51833a31b326d83ac2f76d06560920547f657)
|
|
(This used to be commit 2856d2e4a43fbcc6c8f8ac7b1613828170362861)
|
|
This is a band-aid for the rather convoluted offline/online mess in winbind
right now. Winbind re-uses the offline functionality that is targeted at domain
client installations on laptops to not overload dysfunctional DCs. It uses the
winbind cache timeout as the retry timeout after a DC reboot.
I am using a parametric option because when this mess is cleaned up, that
parameter needs to go away again.
I'd recommend using something like
winbind:online check timeout = 30
in typical LAN environments. This means a reconnect is attempted every 30
seconds.
Volker
(This used to be commit 9920473cc165e75ee9aa5cbb9e568eb5fb67e9e6)
|
|
With the ctdb checkin dde9f3f006 tdb optimized out write lock checks for
write-enabled transaction. Sadly, this also removed the possibility to ever
remove dead records left over from tdb_delete calls within a transaction.
Tridge, please check this! Did dde9f3f006 have any reason beyond performance
optimizations?
Thanks,
Volker
(This used to be commit 3f884c4ae36f3260e63626bdd4989d9258ae6497)
|
|
(cherry picked from commit 666bf8456ac44cbbbd5524af2bf4fd89e18ddf62)
(This used to be commit 8819c51809cabe6ad0843f3838de53e785a10b47)
|
|
(This used to be commit 8cb7ae011c8b8cb244e9b87a3ad51e27646411b6)
|
|
Here is a patch to allow many subsystems to be re-initialized. The only
functional change I made was to remove the null context tracking, as the memory
allocated here is designed to be left for the complete lifetime of the program.
Freeing this early (when all smb contexts are destroyed) could crash other
users of talloc.
Jeremy.
(This used to be commit 8c630efd25cf17aff59448ca05c1b44a41964b16)
|
|
(cherry picked from commit cee044bc42d955c535dbb6bb372af01089d37756)
(This used to be commit 2462562b5c90bc1c46237cd980810b0a69cd116d)
|
|
scan_directory).
Michael
(This used to be commit 15fc2427f91da697e0e91f7f34b0f0c6e230a9a5)
|
|
map_nt_error_from_unix() now assumes that it is called in
an error path and returns an error even for a given errno == 0.
The original behaviour of unix_convert() used the mapping
of errno == 0 ==> NT_STATUS_OK to return success through
an error path.
I think this must have been an oversight, and unix_convert() worked
only by coincidence (or by explicitly relying on the
conceptually wrong behaviour of map_nt_error_from_unix()).
This patch puts this straight by not interpreting errno == 0
as an error condition and proceeding in that case.
Jeremy - please check!
Michael
(This used to be commit ec5956ab0df1b3f567470b2481b73da9c3c67371)
|
|
one of our virtualised functions, such as db_open(), but errno is only
set when a system call fails, and it is not uncommon for us to fail a
function internally without ever making a system call. That led to us
passing back success when a function had in fact failed.
I found two places where we relied on map_nt_error_from_unix()
returning success when errno==0, but lots and lots of places where we
relied on the reverse, so I fixed those two places.
map_nt_error_from_unix() will now always return an error, returning
NT_STATUS_UNSUCCESSFUL if errno is 0.
(cherry picked from commit 69d40ca4c1af925d4b0e59ddc69ef8c26e6501d1)
(This used to be commit 834684a524a24bb4eb46b4af583d39947dc87d95)
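The behaviour change described above can be shown with a small sketch. This is not the Samba implementation: the enum values and `demo_map_error_from_unix()` are stand-ins invented for illustration, and the mapping table is cut down to two entries.

```c
#include <errno.h>

/* Stand-in status codes for this sketch (not real NTSTATUS values). */
enum demo_status {
	DEMO_STATUS_OK,
	DEMO_STATUS_UNSUCCESSFUL,
	DEMO_STATUS_ACCESS_DENIED,
	DEMO_STATUS_NOT_FOUND
};

/*
 * Map a unix errno to a status code. The function is assumed to be
 * called only from error paths, so it never returns DEMO_STATUS_OK:
 * errno == 0 (a failure without a failed system call) and any
 * unmapped errno both fall through to the generic error.
 */
static enum demo_status demo_map_error_from_unix(int unix_errno)
{
	switch (unix_errno) {
	case EACCES:
		return DEMO_STATUS_ACCESS_DENIED;
	case ENOENT:
		return DEMO_STATUS_NOT_FOUND;
	default:
		return DEMO_STATUS_UNSUCCESSFUL;
	}
}
```

A caller that can legitimately see errno == 0 on a success path, as unix_convert() did, has to test for that case itself before calling the mapping function.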
|
|
for one
(This used to be commit 469ba9b87103aa0053c371e481acc5acf0f98ac1)
|
|
When a request-key upcall exits without instantiating a key, the kernel
will negatively instantiate the key with a 60s timeout. Older kernels,
however, seem to also link that key into the session keyring. This
behavior can interfere with subsequent mount attempts until the
key times out. The next request_key() call will get this negative key
even if the upcall would have worked the second time.
Fix this by having cifs.upcall negatively instantiate the key itself
with a 1s timeout and not attaching it to the session keyring.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(This used to be commit f760dd3f3128c846cdeab16cc52bbb5189427955)
|
|
(This used to be commit 257b0401ee675b6b7eddf2b46a0f8115940e6640)
|