From 1782e2a1566971cb9ff2ef33c5ff32d43fbf35f1 Mon Sep 17 00:00:00 2001 From: Jelmer Vernooij Date: Sun, 15 Sep 2002 13:26:16 +0000 Subject: Add document on tracing system calls (This used to be commit 1e14815f4cd9e8a9ca5c0d41de902f9c0dd6410c) --- docs/docbook/devdoc/Tracing.sgml | 129 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 129 insertions(+) create mode 100644 docs/docbook/devdoc/Tracing.sgml (limited to 'docs/docbook/devdoc') diff --git a/docs/docbook/devdoc/Tracing.sgml b/docs/docbook/devdoc/Tracing.sgml new file mode 100644 index 0000000000..3a0e4ba1a9 --- /dev/null +++ b/docs/docbook/devdoc/Tracing.sgml @@ -0,0 +1,129 @@ + + + + AndrewTridgell + + Samba Team + + + + +Tracing samba system calls + + +This file describes how to do a system call trace on Samba to work out +what its doing wrong. This is not for the faint of heart, but if you +are reading this then you are probably desperate. + + + +Actually its not as bad as the the above makes it sound, just don't +expect the output to be very pretty :-) + + + +Ok, down to business. One of the big advantages of unix systems is +that they nearly all come with a system trace utility that allows you +to monitor all system calls that a program is making. This is +extremely using for debugging and also helps when trying to work out +why something is slower than you expect. You can use system tracing +without any special compilation options. + + + +The system trace utility is called different things on different +systems. On Linux systems its called strace. Under SunOS 4 its called +trace. Under SVR4 style systems (including solaris) its called +truss. Under many BSD systems its called ktrace. + + + +The first thing you should do is read the man page for your native +system call tracer. In the discussion below I'll assume its called +strace as strace is the only portable system tracer (its available for +free for many unix types) and its also got some of the nicest +features. + + + +Next, try using strace on some simple commands. For example, strace +ls or strace echo hello. + + + +You'll notice that it produces a LOT of output. It is showing you the +arguments to every system call that the program makes and the +result. Very little happens in a program without a system call so you +get lots of output. You'll also find that it produces a lot of +"preamble" stuff showing the loading of shared libraries etc. Ignore +this (unless its going wrong!) + + + +For example, the only line that really matters in the strace echo +hello output is: + + + +write(1, "hello\n", 6) = 6 + + +all the rest is just setting up to run the program. + + +Ok, now you're familiar with strace. To use it on Samba you need to +strace the running smbd daemon. The way I tend ot use it is to first +login from my Windows PC to the Samba server, then use smbstatus to +find which process ID that client is attached to, then as root I do +strace -p PID to attach to that process. I normally redirect the +stderr output from this command to a file for later perusal. For +example, if I'm using a csh style shell: + + +strace -f -p 3872 >& strace.out + +or with a sh style shell: + +strace -f -p 3872 > strace.out 2>&1 + + +Note the "-f" option. This is only available on some systems, and +allows you to trace not just the current process, but any children it +forks. This is great for finding printing problems caused by the +"print command" being wrong. + + + +Once you are attached you then can do whatever it is on the client +that is causing problems and you will capture all the system calls +that smbd makes. + + + +So how do you interpret the results? Generally I search through the +output for strings that I know will appear when the problem +happens. For example, if I am having touble with permissions on a file +I would search for that files name in the strace output and look at +the surrounding lines. Another trick is to match up file descriptor +numbers and "follow" what happens to an open file until it is closed. + + + +Beyond this you will have to use your initiative. To give you an idea +of what you are looking for here is a piece of strace output that +shows that /dev/null is not world writeable, which +causes printing to fail with Samba: + + + +[pid 28268] open("/dev/null", O_RDWR) = -1 EACCES (Permission denied) +[pid 28268] open("/dev/null", O_WRONLY) = -1 EACCES (Permission denied) + + + +The process is trying to first open /dev/null read-write +then read-only. Both fail. This means /dev/null has +incorrect permissions. + + + -- cgit