Age | Commit message | Author | Files | Lines |
|
When a winbind child exits, we need to immediately close the socket. If not,
the next request to that child will be sent to a socket without a listener,
leading to a failed request. This failed request will then trigger a proper
re-init.
This patch avoids the one failed request.
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Wed May 4 13:32:16 CEST 2011 on sn-devel-104
|
|
needed).
Guenther
|
|
In the clustering case if ctdb is unhappy, winbindd_reinit_after_fork fails.
This can lead to an endless loop depending on the scheduling of the parent vs
child. Parent forks, child is immediately scheduled and exits. Parent gets
SIGCHLD, parent is then scheduled before it sends the request out to the child.
Parent tries to fork again immediately.
The code before this patch did not really take into account that
reinit_after_fork can fail. The code now sends the result of
winbindd_reinit_after_fork to the parent and the parent only considers the
child alive when it got NT_STATUS_OK.
This was seen in 3.4 winbind. winbind has changed significantly since then, so
this might no longer happen in exactly this way. But
passing up the status of reinit_after_fork and only considering the child alive
when that status is OK is the correct thing to do anyway.
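The idea of reporting the post-fork init status back to the parent can be sketched as follows. This is a minimal illustration, not the winbind code: `spawn_checked()`, `init_ok()` and `init_fail()` are hypothetical names, a plain pipe stands in for the parent/child socket, and an `int32_t` of 0 stands in for NT_STATUS_OK.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical init hooks standing in for a post-fork re-init step. */
static int init_ok(void)   { return 0; }
static int init_fail(void) { return -1; }

/*
 * Fork a child, run init() in it, and let the child report the result
 * over a pipe. Returns the child pid only if the child reported
 * success; returns -1 if the fork failed or the child reported failure.
 */
static pid_t spawn_checked(int (*init)(void))
{
    int fds[2];
    pid_t pid;
    int32_t status = -1;
    ssize_t n;

    assert(init != NULL);
    if (pipe(fds) != 0) {
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: run the init step and send its result to the parent. */
        close(fds[0]);
        status = init();
        n = write(fds[1], &status, sizeof(status));
        close(fds[1]);
        /* A real child would enter its request loop on success. */
        _exit((status == 0 && n == (ssize_t)sizeof(status)) ? 0 : 1);
    }
    /* Parent: the child counts as alive only after an explicit OK. */
    close(fds[1]);
    n = read(fds[0], &status, sizeof(status));
    close(fds[0]);
    if (n != (ssize_t)sizeof(status) || status != 0) {
        waitpid(pid, NULL, 0); /* reap the failed child */
        return -1;
    }
    return pid;
}
```

With this shape, a child that dies during init never looks alive to the parent, so the parent can back off instead of re-forking immediately.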
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Fri Apr 29 17:58:19 CEST 2011 on sn-devel-104
|
|
|
|
This should further reduce fd load in winbind children
|
|
Guenther
Autobuild-User: Günther Deschner <gd@samba.org>
Autobuild-Date: Fri Apr 29 14:00:30 CEST 2011 on sn-devel-104
|
|
Guenther
|
|
Guenther
|
|
Guenther
|
|
This is a real bug: tevent_req_set_endtime already calls tevent_req_nomem.
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Mon Mar 21 16:29:22 CET 2011 on sn-devel-104
|
|
|
|
Guenther
|
|
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Wed Feb 2 18:10:45 CET 2011 on sn-devel-104
|
|
main loop"
This reverts commit 455fccf86b6544cd17a2571c63a88f8aebff3f74.
I'll add a more generic fix for this problem.
metze
|
|
This makes us scale better with many simultaneous winbind requests,
some of which might be slow.
This implementation breaks offline logons, as the cached credentials are
maintained in a child (this needs fixing). So, if the offline logons are
active, only allow one DC connection.
Probably the offline logon and the scalable file server cases are
separate enough so that this patch is useful even with the restriction.
|
|
pass this in as the &now parameter. Push this call inside of
event_add_to_select_args() to the correct point so it doesn't
get called unless needed.
Jeremy.
Autobuild-User: Jeremy Allison <jra@samba.org>
Autobuild-Date: Thu Dec 23 01:08:11 CET 2010 on sn-devel-104
|
|
|
|
|
|
If a child dies, the parent process closes the socket right away.
This is wrong: with tevent we still have events pending. This works
fine for epoll, but not for at least the FreeBSD select variant.
Tevent sticks the closed socket into the select masks, and select then
returns EBADF. When this happens, the parent winbind dies
instead of forking a new child.
This moves the socket close from the SIGCHLD cleanup function to
the socket receiver. I could not reproduce the parent death anymore
and it did not create an obvious fd leak.
Autobuild-User: Jeremy Allison <jra@samba.org>
Autobuild-Date: Mon Dec 6 23:21:02 CET 2010 on sn-devel-104
|
|
|
|
This will reduce the noise from merges of the rest of the
libcli/security code, without this commit changing what code
is actually used.
This includes (along with other security headers) dom_sid.h and
security_token.h
Andrew Bartlett
Autobuild-User: Andrew Bartlett <abartlet@samba.org>
Autobuild-Date: Tue Oct 12 05:54:10 UTC 2010 on sn-devel-104
|
|
Previously, only one fd handler was being called per main message loop
in all smbd child processes.
In the case where multiple fds are available for reading, the handler for
the fd corresponding to the event closest to the beginning of the event list
would be run. This is arbitrary and could cause unfairness.
Usually, the first event fd is the network socket, meaning heavy load
of client requests can starve out other fd events such as oplock
or notify upcalls from the kernel.
In this patch, I have changed the behavior of run_events() to unset
any fd for which it has already called a handler function, and to
decrement the number of fds that were returned from select().
This allows the caller of run_events() to iterate it, until all
available fds have been handled.
I then changed the main loop in smbd child processes to iterate
run_events(). This way, all available fds are handled on each wake
of select, while still checking for timed or signalled events between
each handler function call. I also added an explicit check for
EINTR from select(), which previously was masked by the fact that
run_events() would handle any signal event before the return code
was checked.
This required a signature change to run_events() but all other callers
should have no change in their behavior. I also fixed a bug in
run_events() where it could be called with a selrtn value of -1,
looping unnecessarily through the fd_event list when no fds were
available.
Also, remove the temporary echo handler hack, as all fds should be
treated fairly now.
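A rough sketch of the "handle one fd, clear it, decrement the count, repeat" pattern described above. The names `run_one_event()` and `handle_all_ready()` are hypothetical and the "handler" only drains one byte; the real run_events() also deals with timed and signal events, which are omitted here.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Handle at most one ready fd: read one byte from it, clear it from
 * the ready set, and decrement the ready count. Returns 1 if an fd
 * was handled, 0 if nothing was left to do.
 */
static int run_one_event(const int *fds, int nfds,
                         fd_set *ready, int *num_ready)
{
    int i;

    assert(ready != NULL && num_ready != NULL);
    if (*num_ready <= 0) {
        return 0; /* also covers a select() result of -1 */
    }
    for (i = 0; i < nfds; i++) {
        if (FD_ISSET(fds[i], ready)) {
            char c;
            ssize_t r = read(fds[i], &c, 1); /* stand-in for a handler */
            (void)r;
            FD_CLR(fds[i], ready);
            (*num_ready)--;
            return 1;
        }
    }
    return 0;
}

/*
 * One wake of select(): poll the fds without blocking, then iterate
 * until every ready fd has been handled. Returns the number of fds
 * handled, or -1 if select() failed (e.g. EINTR; the caller retries).
 */
static int handle_all_ready(const int *fds, int nfds)
{
    fd_set ready;
    struct timeval tv = { 0, 0 };
    int maxfd = -1, handled = 0, n, i;

    FD_ZERO(&ready);
    for (i = 0; i < nfds; i++) {
        FD_SET(fds[i], &ready);
        if (fds[i] > maxfd) {
            maxfd = fds[i];
        }
    }
    n = select(maxfd + 1, &ready, NULL, NULL, &tv);
    if (n < 0) {
        return -1;
    }
    while (run_one_event(fds, nfds, &ready, &n)) {
        handled++; /* a real loop would check timed/signal events here */
    }
    return handled;
}
```

Because each call clears the fd it served, a busy first fd can no longer shadow the fds behind it within the same wake of select().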
|
|
Guenther
|
|
This is supposed to improve the winbind reconnect time after an IP address
has been moved away from a box. Any kind of HA scenario will benefit from
this, because winbindd does not have to wait for the TCP timeout to kick in
when a local IP address has been dropped and DC replies are not received
anymore.
|
|
Giving the parent pid to reinit_after_fork is not a good idea....
None of the other callers do this, checked it.
|
|
|
|
metze
|
|
metze
|
|
Guenther
|
|
|
|
Andreas, please check.
Guenther
|
|
If log file is set in the config file, we should create the log files of
the winbind child processes in the same directory.
|
|
|
|
|
|
This shrinks include/includes.h.gch by 7 MB and reduces build time
as follows:
ccache build w/o patch:   real 4m21.529s
ccache build with patch:  real 3m6.402s
pch build w/o patch:      real 4m26.318s
pch build with patch:     real 3m6.932s
Guenther
|
|
|
|
|
|
The main problem is that we call CatchChild() within the
parent winbindd, which overwrites the signal handler
that was registered by winbindd_setup_sig_chld_handler().
That means winbindd_sig_chld_handler() and winbind_child_died()
are never triggered when a winbindd domain child dies.
As a result, we get "broken pipe" for all requests to that domain.
To reduce the risk of similar bugs in the future, we now call
CatchChild() in winbindd_reinit_after_fork().
We also now use a full winbindd_reinit_after_fork() in the
cache validation child instead of just resetting
the SIGCHLD handler by hand. This will also fix possible
tdb problems on systems without pread/pwrite and with mmap disabled,
as we now correctly reopen the tdb handle for the child.
metze
|
|
metze
|
|
reported by valgrind
The timeval passed to event_add_to_select_args() must be initialized
as event_add_to_select_args() uses a timeval_min() on this and next_event.
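Why the initialization matters can be shown with a small sketch: taking the minimum of two timevals reads both of them, so the caller's value must hold something defined before the call. `tv_min()` and `select_timeout()` are hypothetical stand-ins, not the Samba functions, and the 9999-second default is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Return the earlier of two timevals; reads both arguments. */
static struct timeval tv_min(struct timeval a, struct timeval b)
{
    if (a.tv_sec != b.tv_sec) {
        return (a.tv_sec < b.tv_sec) ? a : b;
    }
    return (a.tv_usec <= b.tv_usec) ? a : b;
}

/*
 * Compute a select() timeout. The local timeout is given a defined
 * default BEFORE it is combined with the next event time; leaving it
 * uninitialized would make tv_min() read garbage.
 */
static struct timeval select_timeout(const struct timeval *next_event)
{
    struct timeval timeout = { .tv_sec = 9999, .tv_usec = 0 };

    if (next_event != NULL) {
        timeout = tv_min(timeout, *next_event);
    }
    assert(timeout.tv_sec <= 9999); /* never later than the default */
    return timeout;
}
```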
|
|
|
|
|
|
|
|
|
|
|
|
The machine password handler has code to deal with every node in the cluster
trying to change the machine password at the same time. However, it is not very
nice to the DC if everyone tries this simultaneously. This adds a random offset
of 0-255 seconds to our timed event. When the event fires a bit later than strictly
calculated, someone else might have stepped in and already changed the password. The
timed event handler handles this gracefully: it will not even try to change it
again.
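The jitter-plus-recheck idea can be sketched like this. All names here (`jittered_deadline()`, `change_still_due()`, `MAX_JITTER_SECS`) are hypothetical, and the random value is passed in by the caller rather than drawn from any particular Samba random source.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define MAX_JITTER_SECS 256 /* offsets fall in the range 0..255 */

/* Shift the nominal deadline by a per-node offset of 0..255 seconds. */
static time_t jittered_deadline(time_t base, unsigned int rnd)
{
    return base + (time_t)(rnd % MAX_JITTER_SECS);
}

/*
 * Re-check at fire time: if another node changed the password in the
 * meantime, last_change is recent and nothing is due anymore.
 */
static bool change_still_due(time_t last_change, time_t max_age, time_t now)
{
    assert(max_age >= 0);
    return now >= last_change + max_age;
}
```

A node whose timer fires after a peer has already done the change sees a fresh `last_change`, and `change_still_due()` returns false, so it skips the request to the DC.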
|
|
When there is a temporary problem changing passwords we flooded the DC with
pwchange requests. This gives the DC a 60-second break to recover.
|
|
Someone else might have come in between and changed the password since we
created that timed request.
|
|
|
|
|