s3-brlock: add a minimum retry time for pending blocking locks

While we wait on a pending byte range lock, another smbd might exit uncleanly and therefore never notify us that its lock has been removed, so the pending lock is never retried. Until now we coped with this by calling message_send_all() in the SIGCHLD and cluster reconfigure handlers to send MSG_SMB_UNLOCK to all smbd processes. That generated O(N^2) work when a large number of clients disconnected at once (for example during a network outage), which could leave the whole system unusable for many minutes or longer. Adding a minimum re-check time for pending byte range locks avoids this problem: pending locks are re-checked at least every brl:recalctime seconds (default 5), whether or not an unlock notification arrives.
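
The capping step described above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the code in blocking.c: tv_min() and cap_next_timeout() are hypothetical stand-ins written for this example (Samba's own helpers are timeval_min() and timeval_current_ofs()), and the current time is passed in explicitly so the behaviour is easy to check.

```c
#include <sys/time.h>

/* Hypothetical stand-in for Samba's timeval_min(): return the earlier time. */
static struct timeval tv_min(const struct timeval *a, const struct timeval *b)
{
	if (a->tv_sec != b->tv_sec) {
		return (a->tv_sec < b->tv_sec) ? *a : *b;
	}
	return (a->tv_usec < b->tv_usec) ? *a : *b;
}

/*
 * Hypothetical helper: make sure the next re-check of pending locks
 * happens no later than now + recalc_secs. A value <= 0 disables the
 * cap, mirroring the "max_brl_timeout > 0" test in the patch.
 */
static struct timeval cap_next_timeout(struct timeval now,
				       struct timeval next_timeout,
				       int recalc_secs)
{
	if (recalc_secs > 0) {
		struct timeval cap = now;
		cap.tv_sec += recalc_secs;
		next_timeout = tv_min(&next_timeout, &cap);
	}
	return next_timeout;
}
```

With a 5 second cap, a lock whose own timeout is 60 seconds away is re-checked after 5 seconds, while a lock that expires in 2.5 seconds keeps its earlier deadline. Since the patch reads the value with lp_parm_int(-1, "brl", "recalctime", 5), the interval should be adjustable through a `brl:recalctime` parametric option in smb.conf, and a value of 0 or less skips the cap.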
author: Andrew Tridgell <tridge@samba.org> 2010-02-05 20:59:43 -0800
committer: Jeremy Allison <jra@samba.org> 2010-02-05 22:17:17 -0800
commit: 5b398edbee672392f2cea260ab17445ecca927d7 (patch)
tree: 8beebd8ccfb2770f550d8525d59e594a4daf4c42 /source3
parent: 5bb89bc47cbba73c732ea6873b72849e9f239503 (diff)
1 files changed, 20 insertions, 0 deletions
diff --git a/source3/smbd/blocking.c b/source3/smbd/blocking.c
index deb7f8f221..6c7c167ab5 100644
--- a/source3/smbd/blocking.c
+++ b/source3/smbd/blocking.c
@@ -72,6 +72,7 @@ static bool recalc_brl_timeout(void)
 {
 	struct blocking_lock_record *blr;
 	struct timeval next_timeout;
+	int max_brl_timeout = lp_parm_int(-1, "brl", "recalctime", 5);
 
 	TALLOC_FREE(brl_timeout);
 
@@ -100,6 +101,25 @@ static bool recalc_brl_timeout(void)
 		return True;
 	}
 
+	/* 
+	 to account for unclean shutdowns by clients we need a
+	 maximum timeout that we use for checking pending locks. If
+	 we have any pending locks at all, then check if the pending
+	 lock can continue at least every brl:recalctime seconds
+	 (default 5 seconds).
+
+	 This saves us needing to do a message_send_all() in the
+	 SIGCHLD handler in the parent daemon. That
+	 message_send_all() caused O(n^2) work to be done when IP
+	 failovers happened in clustered Samba, which could make the
+	 entire system unusable for many minutes.
+	*/
+
+	if (max_brl_timeout > 0) {
+		struct timeval min_to = timeval_current_ofs(max_brl_timeout, 0);
+		next_timeout = timeval_min(&next_timeout, &min_to);             
+	}
+
 	if (DEBUGLVL(10)) {
 		struct timeval cur, from_now;