Age | Commit message (Collapse) | Author | Files | Lines |
|
sys_poll() is only needed if the signal pipe is set up and used, but as
no signal handler ever writes to the pipe, this can all be removed.
signal based events are now handled via tevent.
Andrew Bartlett
Signed-off-by: Jeremy Allison <jra@samba.org>
|
|
All callers to messaging_[re]init only used procid_self()
|
|
This fixes a race condition that leads to the winbindd_children list becoming
corrupted. It happens when on a busy winbind SIGCHLD is a bit late.
Imagine a winbind with multiple requests in the queue for a single child. Child
dies, and before the SIGCHLD handler is called we find the socket to be dead.
wb_child_request_done is called, receiving an error from wb_simple_trans_recv.
It closes the socket. Then immediately the wb_child_request_trigger will do
another fork_domain_child before the signal handler is called. This means that
we do another fork_domain_child, we have child->sock==-1 at this point.
fork_domain_child will do a DLIST_ADD(winbindd_children, child) a second time
where the child is already part of that list. This corrupts the list. Then the
signal handler kicks in, spinning in
for (child = winbindd_children; child != NULL; child = child->next) {
forever. Not good. This patch makes sure that both conditions (sock==-1 and not
part of the list) for a winbindd_child struct match up.
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Fri Aug 26 18:51:24 CEST 2011 on sn-devel-104
|
|
Counterpart for last checkin. A lot less likely, but not impossible in a child.
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Fri Aug 26 13:14:27 CEST 2011 on sn-devel-104
|
|
I've seen
[2011/08/26 01:44:10.872057, 1] winbindd/winbindd_dual.c:1336(fork_domain_child)
fork_domain_child: Could not read child status: nread=-1, error=Interrupted system call
on a customer box. Not good.
|
|
This fixes a few Coverity errors
|
|
Using the standard macro makes it easier to move code into common, as
TALLOC_ZERO_P isn't standard talloc.
|
|
|
|
When a winbind child exits, we need to immediately close the socket. If not,
the next request to that child will be sent to a socket without a listener,
leading to a failed request. This failed request will then trigger a proper
re-init.
This patch avoids the one failed request.
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Wed May 4 13:32:16 CEST 2011 on sn-devel-104
|
|
needed).
Guenther
|
|
In the clustering case if ctdb is unhappy, winbindd_reinit_after_fork fails.
This can lead to an endless loop depending on the scheduling of the parent vs
child. Parent forks, child is immediately scheduled and exits. Parent gets
SIGCHLD, parent is then scheduled before it sends the request out to the child.
Parent tries to fork again immediately.
The code before this patch did not really take into account that
reinit_after_fork can fail. The code now sends the result of
winbindd_reinit_after_fork to the parent and the parent only considers the
child alive when it got NT_STATUS_OK.
This was seen in 3.4 winbind. winbind has changed significantly since then, so
it might be possible that this does not happen anymore in exactly this way. But
passing up the status of reinit_after_fork and only consider the child alive
when that's ok is the correct thing to do anyway.
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Fri Apr 29 17:58:19 CEST 2011 on sn-devel-104
|
|
|
|
This should further reduce fd load in winbind children
|
|
Guenther
Autobuild-User: Günther Deschner <gd@samba.org>
Autobuild-Date: Fri Apr 29 14:00:30 CEST 2011 on sn-devel-104
|
|
Guenther
|
|
Guenther
|
|
Guenther
|
|
This is a real bug: tevent_req_set_endtime already calls tevent_req_nomem.
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Mon Mar 21 16:29:22 CET 2011 on sn-devel-104
|
|
|
|
Guenther
|
|
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Wed Feb 2 18:10:45 CET 2011 on sn-devel-104
|
|
main loop"
This reverts commit 455fccf86b6544cd17a2571c63a88f8aebff3f74.
I'll add a more generic fix for this problem.
metze
|
|
This makes us scale better with many simultaneous winbind requests,
some of which might be slow.
This implementation breaks offline logons, as the cached credentials are
maintained in a child (this needs fixing). So, if the offline logons are
active, only allow one DC connection.
Probably the offline logon and the scalable file server cases are
separate enough so that this patch is useful even with the restriction.
|
|
pass this in as the &now parameter. Push this call inside of
event_add_to_select_args() to the correct point so it doesn't
get called unless needed.
Jeremy.
Autobuild-User: Jeremy Allison <jra@samba.org>
Autobuild-Date: Thu Dec 23 01:08:11 CET 2010 on sn-devel-104
|
|
|
|
|
|
If a child dies, the parent process right away closes the socket.
This is wrong, with tevent we still have events pending. This works
fine for epoll but does not for at least the FreeBSD select variant.
Tevent sticks a closed socket into the select masks. This then
returns an error EBADF. When this happens, the parent winbind dies
instead of forking a new child.
This moves the socket close from the SIGCHLD cleanup function to
the socket receiver. I could not reproduce the parent death anymore
and it did not create an obvious fd leak.
Autobuild-User: Jeremy Allison <jra@samba.org>
Autobuild-Date: Mon Dec 6 23:21:02 CET 2010 on sn-devel-104
|
|
|
|
This will reduce the noise from merges of the rest of the
libcli/security code, without this commit changing what code
is actually used.
This includes (along with other security headers) dom_sid.h and
security_token.h
Andrew Bartlett
Autobuild-User: Andrew Bartlett <abartlet@samba.org>
Autobuild-Date: Tue Oct 12 05:54:10 UTC 2010 on sn-devel-104
|
|
Previously, only one fd handler was being called per main message loop
in all smbd child processes.
In the case where multiple fds are available for reading the fd
corresponding to the event closest to the beginning of the event list
would be run. Obviously this is arbitrary and could cause unfairness.
Usually, the first event fd is the network socket, meaning heavy load
of client requests can starve out other fd events such as oplock
or notify upcalls from the kernel.
In this patch, I have changed the behavior of run_events() to unset
any fd that it has already called a handler function, as well
as decrement the number of fds that were returned from select().
This allows the caller of run_events() to iterate it, until all
available fds have been handled.
I then changed the main loop in smbd child processes to iterate
run_events(). This way, all available fds are handled on each wake
of select, while still checking for timed or signalled events between
each handler function call. I also added an explicit check for
EINTR from select(), which previously was masked by the fact that
run_events() would handle any signal event before the return code
was checked.
This required a signature change to run_events() but all other callers
should have no change in their behavior. I also fixed a bug in
run_events() where it could be called with a selrtn value of -1,
doing unecessary looping through the fd_event list when no fds were
available.
Also, remove the temporary echo handler hack, as all fds should be
treated fairly now.
|
|
Guenther
|
|
This is supposed to improve the winbind reconnect time after an ip address
has been moved away from a box. Any kind of HA scenario will benefit from
this, because winbindd does not have to wait for the TCP timeout to kick in
when a local IP address has been dropped and DC replies are not received
anymore.
|
|
Giving the parent pid to reinit_after_fork is not a good idea....
None of the other callers do this, checked it.
|
|
|
|
metze
|
|
metze
|
|
Guenther
|
|
|
|
Andreas, please check.
Guenther
|
|
If log file is set in the config file, we should create the log files of
the winbind child processes in the same directory.
|
|
|
|
|
|
This shrinks include/includes.h.gch by the size of 7 MB and reduces build time
as follows:
ccache build w/o patch
real 4m21.529s
ccache build with patch
real 3m6.402s
pch build w/o patch
real 4m26.318s
pch build with patch
real 3m6.932s
Guenther
|
|
|
|
|
|
The main problem is that we call CatchChild() within the
parent winbindd, which overwrites the signal handler
that was registered by winbindd_setup_sig_chld_handler().
That means winbindd_sig_chld_handler() and winbind_child_died()
are never triggered when a winbindd domain child dies.
As a result will get "broken pipe" for all requests to that domain.
To reduce the risk of similar bugs in future we call
CatchChild() in winbindd_reinit_after_fork() now.
We also use a full winbindd_reinit_after_fork() in the
cache validation child now instead instead of just resetting
the SIGCHLD handler by hand. This will also fix possible
tdb problems on systems without pread/pwrite and disabled mmap
as we now correctly reopen the tdb handle for the child.
metze
|
|
metze
|
|
reported by valgrind
The timeval passed to event_add_to_select_args() must be initialized
as event_add_to_select_args() uses a timeval_min() on this and next_event.
|
|
|
|
|