diff options
author | Jelmer Vernooij <jelmer@samba.org> | 2002-08-29 13:28:17 +0000 |
---|---|---|
committer | Jelmer Vernooij <jelmer@samba.org> | 2002-08-29 13:28:17 +0000 |
commit | 544919436a3fcc1a21d5ffce61a666d151273136 (patch) | |
tree | 6886611cf0f7955352a00b618f8bec9c88995d7e /docs/docbook | |
parent | e095e77a5c83261a67ec42f37cf2ec6e8f3d545e (diff) | |
download | samba-544919436a3fcc1a21d5ffce61a666d151273136.tar.gz samba-544919436a3fcc1a21d5ffce61a666d151273136.tar.bz2 samba-544919436a3fcc1a21d5ffce61a666d151273136.zip |
Add more documents to the developers guide
(This used to be commit e05bdd9eab760b5dc6a4442dc89752080ff1d2c1)
Diffstat (limited to 'docs/docbook')
-rw-r--r-- | docs/docbook/devdoc/CodingSuggestions.sgml | 237 | ||||
-rw-r--r-- | docs/docbook/devdoc/architecture.sgml | 184 | ||||
-rw-r--r-- | docs/docbook/devdoc/debug.sgml | 321 | ||||
-rw-r--r-- | docs/docbook/devdoc/dev-doc.sgml | 12 | ||||
-rw-r--r-- | docs/docbook/devdoc/internals.sgml | 439 | ||||
-rw-r--r-- | docs/docbook/devdoc/parsing.sgml | 239 | ||||
-rw-r--r-- | docs/docbook/devdoc/unix-smb.sgml | 311 |
7 files changed, 1743 insertions, 0 deletions
diff --git a/docs/docbook/devdoc/CodingSuggestions.sgml b/docs/docbook/devdoc/CodingSuggestions.sgml new file mode 100644 index 0000000000..bdf6d3d17d --- /dev/null +++ b/docs/docbook/devdoc/CodingSuggestions.sgml @@ -0,0 +1,237 @@ +<chapter id="CodingSuggestions"> +<chapterinfo> + <author> + <firstname>Steve</firstname><surname>French</surname> + </author> + <author> + <firstname>Simo</firstname><surname>Sorce</surname> + </author> + <author> + <firstname>Andrew</firstname><surname>Bartlett</surname> + </author> + <author> + <firstname>Tim</firstname><surname>Potter</surname> + </author> + <author> + <firstname>Martin</firstname><surname>Pool</surname> + </author> +</chapterinfo> + +<title>Coding Suggestions</title> + +<para> +So you want to add code to Samba ... +</para> + +<para> +One of the daunting tasks facing a programmer attempting to write code for +Samba is understanding the various coding conventions used by those most +active in the project. These conventions were mostly unwritten and helped +improve either the portability, stability or consistency of the code. This +document will attempt to document a few of the more important coding +practices used at this time on the Samba project. The coding practices are +expected to change slightly over time, and even to grow as more is learned +about obscure portability considerations. Two existing documents +<filename>samba/source/internals.doc</filename> and +<filename>samba/source/architecture.doc</filename> provide +additional information. +</para> + +<para> +The loosely related question of coding style is very personal and this +document does not attempt to address that subject, except to say that I +have observed that eight character tabs seem to be preferred in Samba +source. If you are interested in the topic of coding style, two oft-quoted +documents are: +</para> + +<para> +<ulink url="http://lxr.linux.no/source/Documentation/CodingStyle">http://lxr.linux.no/source/Documentation/CodingStyle</ulink> +</para> + +<para> +<ulink url="http://www.fsf.org/prep/standards_toc.html">http://www.fsf.org/prep/standards_toc.html</ulink> +</para> + +<para> +But note that coding style in Samba varies due to the many different +programmers who have contributed. +</para> + +<para> +Following are some considerations you should use when adding new code to +Samba. First and foremost remember that: +</para> + +<para> +Portability is a primary consideration in adding function, as is network +compatability with de facto, existing, real world CIFS/SMB implementations. +There are lots of platforms that Samba builds on so use caution when adding +a call to a library function that is not invoked in existing Samba code. +Also note that there are many quite different SMB/CIFS clients that Samba +tries to support, not all of which follow the SNIA CIFS Technical Reference +(or the earlier Microsoft reference documents or the X/Open book on the SMB +Standard) perfectly. +</para> + +<para> +Here are some other suggestions: +</para> + +<orderedlist> + +<listitem><para> + use d_printf instead of printf for display text + reason: enable auto-substitution of translated language text +</para></listitem> + +<listitem><para> + use SAFE_FREE instead of free + reason: reduce traps due to null pointers +</para></listitem> + +<listitem><para> + don't use bzero use memset, or ZERO_STRUCT and ZERO_STRUCTP macros + reason: not POSIX +</para></listitem> + +<listitem><para> + don't use strcpy and strlen (use safe_* equivalents) + reason: to avoid traps due to buffer overruns +</para></listitem> + +<listitem><para> + don't use getopt_long, use popt functions instead + reason: portability +</para></listitem> + +<listitem><para> + explicitly add const qualifiers on parm passing in functions where parm + is input only (somewhat controversial but const can be #defined away) +</para></listitem> + +<listitem><para> + when passing a va_list as an arg, or assigning one to another + please use the VA_COPY() macro + reason: on some platforms, va_list is a struct that must be + initialized in each function...can SEGV if you don't. +</para></listitem> + +<listitem><para> + discourage use of threads + reason: portability (also see architecture.doc) +</para></listitem> + +<listitem><para> + don't explicitly include new header files in C files - new h files + should be included by adding them once to includes.h + reason: consistency +</para></listitem> + +<listitem><para> + don't explicitly extern functions (they are autogenerated by + "make proto" into proto.h) + reason: consistency +</para></listitem> + +<listitem><para> + use endian safe macros when unpacking SMBs (see byteorder.h and + internals.doc) + reason: not everyone uses Intel +</para></listitem> + +<listitem><para> + Note Unicode implications of charset handling (see internals.doc). See + pull_* and push_* and convert_string functions. + reason: Internationalization +</para></listitem> + +<listitem><para> + Don't assume English only + reason: See above +</para></listitem> + +<listitem><para> + Try to avoid using in/out parameters (functions that return data which + overwrites input parameters) + reason: Can cause stability problems +</para></listitem> + +<listitem><para> + Ensure copyright notices are correct, don't append Tridge's name to code + that he didn't write. If you did not write the code, make sure that it + can coexist with the rest of the Samba GPLed code. +</para></listitem> + +<listitem><para> + Consider usage of DATA_BLOBs for length specified byte-data. + reason: stability +</para></listitem> + +<listitem><para> + Take advantage of tdbs for database like function + reason: consistency +</para></listitem> + +<listitem><para> + Don't access the SAM_ACCOUNT structure directly, they should be accessed + via pdb_get...() and pdb_set...() functions. + reason: stability, consistency +</para></listitem> + +<listitem><para> + Don't check a password directly against the passdb, always use the + check_password() interface. + reason: long term pluggability +</para></listitem> + +<listitem><para> + Try to use asprintf rather than pstrings and fstrings where possible +</para></listitem> + +<listitem><para> + Use normal C comments / * instead of C++ comments // like + this. Although the C++ comment format is part of the C99 + standard, some older vendor C compilers do not accept it. +</para></listitem> + +<listitem><para> + Try to write documentation for API functions and structures + explaining the point of the code, the way it should be used, and + any special conditions or results. Mark these with a double-star + comment start / ** so that they can be picked up by Doxygen, as in + this file. +</para></listitem> + +<listitem><para> + Keep the scope narrow. This means making functions/variables + static whenever possible. We don't want our namespace + polluted. Each module should have a minimal number of externally + visible functions or variables. +</para></listitem> + +<listitem><para> + Use function pointers to keep knowledge about particular pieces of + code isolated in one place. We don't want a particular piece of + functionality to be spread out across lots of places - that makes + for fragile, hand to maintain code. Instead, design an interface + and use tables containing function pointers to implement specific + functionality. This is particularly important for command + interpreters. +</para></listitem> + +<listitem><para> + Think carefully about what it will be like for someone else to add + to and maintain your code. If it would be hard for someone else to + maintain then do it another way. +</para></listitem> + +</orderedlist> + +<para> +The suggestions above are simply that, suggestions, but the information may +help in reducing the routine rework done on new code. The preceeding list +is expected to change routinely as new support routines and macros are +added. +</para> +</chapter> diff --git a/docs/docbook/devdoc/architecture.sgml b/docs/docbook/devdoc/architecture.sgml new file mode 100644 index 0000000000..312a63af97 --- /dev/null +++ b/docs/docbook/devdoc/architecture.sgml @@ -0,0 +1,184 @@ +<chapter id="architecture"> +<chapterinfo> + <author> + <firstname>Dan</firstname><surname>Shearer</surname> + </author> + <pubdate> November 1997</pubdate> +</chapterinfo> + +<title>Samba Architecture</title> + +<sect1> +<title>Introduction</title> + +<para> +This document gives a general overview of how Samba works +internally. The Samba Team has tried to come up with a model which is +the best possible compromise between elegance, portability, security +and the constraints imposed by the very messy SMB and CIFS +protocol. +</para> + +<para> +It also tries to answer some of the frequently asked questions such as: +</para> + +<orderedlist> +<listitem><para> + Is Samba secure when running on Unix? The xyz platform? + What about the root priveliges issue? +</para></listitem> + +<listitem><para>Pros and cons of multithreading in various parts of Samba</para></listitem> + +<listitem><para>Why not have a separate process for name resolution, WINS, and browsing?</para></listitem> + +</orderedlist> + +</sect1> + +<sect1> +<title>Multithreading and Samba</title> + +<para> +People sometimes tout threads as a uniformly good thing. They are very +nice in their place but are quite inappropriate for smbd. nmbd is +another matter, and multi-threading it would be very nice. +</para> + +<para> +The short version is that smbd is not multithreaded, and alternative +servers that take this approach under Unix (such as Syntax, at the +time of writing) suffer tremendous performance penalties and are less +robust. nmbd is not threaded either, but this is because it is not +possible to do it while keeping code consistent and portable across 35 +or more platforms. (This drawback also applies to threading smbd.) +</para> + +<para> +The longer versions is that there are very good reasons for not making +smbd multi-threaded. Multi-threading would actually make Samba much +slower, less scalable, less portable and much less robust. The fact +that we use a separate process for each connection is one of Samba's +biggest advantages. +</para> + +</sect1> + +<sect1> +<title>Threading smbd</title> + +<para> +A few problems that would arise from a threaded smbd are: +</para> + +<orderedlist> +<listitem><para> + It's not only to create threads instead of processes, but you + must care about all variables if they have to be thread specific + (currently they would be global). +</para></listitem> + +<listitem><para> + if one thread dies (eg. a seg fault) then all threads die. We can + immediately throw robustness out the window. +</para></listitem> + +<listitem><para> + many of the system calls we make are blocking. Non-blocking + equivalents of many calls are either not available or are awkward (and + slow) to use. So while we block in one thread all clients are + waiting. Imagine if one share is a slow NFS filesystem and the others + are fast, we will end up slowing all clients to the speed of NFS. +</para></listitem> + +<listitem><para> + you can't run as a different uid in different threads. This means + we would have to switch uid/gid on _every_ SMB packet. It would be + horrendously slow. +</para></listitem> + +<listitem><para> + the per process file descriptor limit would mean that we could only + support a limited number of clients. +</para></listitem> + +<listitem><para> + we couldn't use the system locking calls as the locking context of + fcntl() is a process, not a thread. +</para></listitem> + +</orderedlist> + +</sect1> + +<sect1> +<title>Threading nmbd</title> + +<para> +This would be ideal, but gets sunk by portability requirements. +</para> + +<para> +Andrew tried to write a test threads library for nmbd that used only +ansi-C constructs (using setjmp and longjmp). Unfortunately some OSes +defeat this by restricting longjmp to calling addresses that are +shallower than the current address on the stack (apparently AIX does +this). This makes a truly portable threads library impossible. So to +support all our current platforms we would have to code nmbd both with +and without threads, and as the real aim of threads is to make the +code clearer we would not have gained anything. (it is a myth that +threads make things faster. threading is like recursion, it can make +things clear but the same thing can always be done faster by some +other method) +</para> + +<para> +Chris tried to spec out a general design that would abstract threading +vs separate processes (vs other methods?) and make them accessible +through some general API. This doesn't work because of the data +sharing requirements of the protocol (packets in the future depending +on packets now, etc.) At least, the code would work but would be very +clumsy, and besides the fork() type model would never work on Unix. (Is there an OS that it would work on, for nmbd?) +</para> + +<para> +A fork() is cheap, but not nearly cheap enough to do on every UDP +packet that arrives. Having a pool of processes is possible but is +nasty to program cleanly due to the enormous amount of shared data (in +complex structures) between the processes. We can't rely on each +platform having a shared memory system. +</para> + +</sect1> + +<sect1> +<title>nbmd Design</title> + +<para> +Originally Andrew used recursion to simulate a multi-threaded +environment, which use the stack enormously and made for really +confusing debugging sessions. Luke Leighton rewrote it to use a +queuing system that keeps state information on each packet. The +first version used a single structure which was used by all the +pending states. As the initialisation of this structure was +done by adding arguments, as the functionality developed, it got +pretty messy. So, it was replaced with a higher-order function +and a pointer to a user-defined memory block. This suddenly +made things much simpler: large numbers of functions could be +made static, and modularised. This is the same principle as used +in NT's kernel, and achieves the same effect as threads, but in +a single process. +</para> + +<para> +Then Jeremy rewrote nmbd. The packet data in nmbd isn't what's on the +wire. It's a nice format that is very amenable to processing but still +keeps the idea of a distinct packet. See "struct packet_struct" in +nameserv.h. It has all the detail but none of the on-the-wire +mess. This makes it ideal for using in disk or memory-based databases +for browsing and WINS support. +</para> + +</sect1> +</chapter> diff --git a/docs/docbook/devdoc/debug.sgml b/docs/docbook/devdoc/debug.sgml new file mode 100644 index 0000000000..7e81cc825d --- /dev/null +++ b/docs/docbook/devdoc/debug.sgml @@ -0,0 +1,321 @@ +<chapter id="debug"> +<chapterinfo> + <author> + <firstname>Chris</firstname><surname>Hertel</surname> + </author> + <pubdate>July 1998</pubdate> +</chapterinfo> + +<title>The samba DEBUG system</title> + +<sect1> +<title>New Output Syntax</title> + +<para> + The syntax of a debugging log file is represented as: +</para> + +<para><programlisting> + >debugfile< :== { >debugmsg< } + + >debugmsg< :== >debughdr< '\n' >debugtext< + + >debughdr< :== '[' TIME ',' LEVEL ']' FILE ':' [FUNCTION] '(' LINE ')' + + >debugtext< :== { >debugline< } + + >debugline< :== TEXT '\n' +</programlisting></para> + +<para> +TEXT is a string of characters excluding the newline character. +</para> + +<para> +LEVEL is the DEBUG level of the message (an integer in the range + 0..10). +</para> + +<para> +TIME is a timestamp. +</para> + +<para> +FILE is the name of the file from which the debug message was +generated. +</para> + +<para> +FUNCTION is the function from which the debug message was generated. +</para> + +<para> +LINE is the line number of the debug statement that generated the +message. +</para> + +<para>Basically, what that all means is:</para> +<orderedlist> +<listitem><para> +A debugging log file is made up of debug messages. +</para></listitem> +<listitem><para> +Each debug message is made up of a header and text. The header is +separated from the text by a newline. +</para></listitem> +<listitem><para> +The header begins with the timestamp and debug level of the +message enclosed in brackets. The filename, function, and line +number at which the message was generated follow. The filename is +terminated by a colon, and the function name is terminated by the +parenthesis which contain the line number. Depending upon the +compiler, the function name may be missing (it is generated by the +__FUNCTION__ macro, which is not universally implemented, dangit). +</para></listitem> +<listitem><para> +The message text is made up of zero or more lines, each terminated +by a newline. +</para></listitem> +</orderedlist> + +<para>Here's some example output:</para> + +<para><programlisting> + [1998/08/03 12:55:25, 1] nmbd.c:(659) + Netbios nameserver version 1.9.19-prealpha started. + Copyright Andrew Tridgell 1994-1997 + [1998/08/03 12:55:25, 3] loadparm.c:(763) + Initializing global parameters +</programlisting></para> + +<para> +Note that in the above example the function names are not listed on +the header line. That's because the example above was generated on an +SGI Indy, and the SGI compiler doesn't support the __FUNCTION__ macro. +</para> + +</sect1> + +<sect1> +<title>The DEBUG() Macro</title> + +<para> +Use of the DEBUG() macro is unchanged. DEBUG() takes two parameters. +The first is the message level, the second is the body of a function +call to the Debug1() function. +</para> + +<para>That's confusing.</para> + +<para>Here's an example which may help a bit. If you would write</para> + +<para><programlisting> +printf( "This is a %s message.\n", "debug" ); +</programlisting></para> + +<para> +to send the output to stdout, then you would write +</para> + +<para><programlisting> +DEBUG( 0, ( "This is a %s message.\n", "debug" ) ); +</programlisting></para> + +<para> +to send the output to the debug file. All of the normal printf() +formatting escapes work. +</para> + +<para> +Note that in the above example the DEBUG message level is set to 0. +Messages at level 0 always print. Basically, if the message level is +less than or equal to the global value DEBUGLEVEL, then the DEBUG +statement is processed. +</para> + +<para> +The output of the above example would be something like: +</para> + +<para><programlisting> + [1998/07/30 16:00:51, 0] file.c:function(128) + This is a debug message. +</programlisting></para> + +<para> +Each call to DEBUG() creates a new header *unless* the output produced +by the previous call to DEBUG() did not end with a '\n'. Output to the +debug file is passed through a formatting buffer which is flushed +every time a newline is encountered. If the buffer is not empty when +DEBUG() is called, the new input is simply appended. +</para> + +<para> +...but that's really just a Kludge. It was put in place because +DEBUG() has been used to write partial lines. Here's a simple (dumb) +example of the kind of thing I'm talking about: +</para> + +<para><programlisting> + DEBUG( 0, ("The test returned " ) ); + if( test() ) + DEBUG(0, ("True") ); + else + DEBUG(0, ("False") ); + DEBUG(0, (".\n") ); +</programlisting></para> + +<para> +Without the format buffer, the output (assuming test() returned true) +would look like this: +</para> + +<para><programlisting> + [1998/07/30 16:00:51, 0] file.c:function(256) + The test returned + [1998/07/30 16:00:51, 0] file.c:function(258) + True + [1998/07/30 16:00:51, 0] file.c:function(261) + . +</programlisting></para> + +<para>Which isn't much use. The format buffer kludge fixes this problem. +</para> + +</sect1> + +<sect1> +<title>The DEBUGADD() Macro</title> + +<para> +In addition to the kludgey solution to the broken line problem +described above, there is a clean solution. The DEBUGADD() macro never +generates a header. It will append new text to the current debug +message even if the format buffer is empty. The syntax of the +DEBUGADD() macro is the same as that of the DEBUG() macro. +</para> + +<para><programlisting> + DEBUG( 0, ("This is the first line.\n" ) ); + DEBUGADD( 0, ("This is the second line.\nThis is the third line.\n" ) ); +</programlisting></para> + +<para>Produces</para> + +<para><programlisting> + [1998/07/30 16:00:51, 0] file.c:function(512) + This is the first line. + This is the second line. + This is the third line. +</programlisting></para> + +</sect1> + +<sect1> +<title>The DEBUGLVL() Macro</title> + +<para> +One of the problems with the DEBUG() macro was that DEBUG() lines +tended to get a bit long. Consider this example from +nmbd_sendannounce.c: +</para> + +<para><programlisting> + DEBUG(3,("send_local_master_announcement: type %x for name %s on subnet %s for workgroup %s\n", + type, global_myname, subrec->subnet_name, work->work_group)); +</programlisting></para> + +<para> +One solution to this is to break it down using DEBUG() and DEBUGADD(), +as follows: +</para> + +<para><programlisting> + DEBUG( 3, ( "send_local_master_announcement: " ) ); + DEBUGADD( 3, ( "type %x for name %s ", type, global_myname ) ); + DEBUGADD( 3, ( "on subnet %s ", subrec->subnet_name ) ); + DEBUGADD( 3, ( "for workgroup %s\n", work->work_group ) ); +</programlisting></para> + +<para> +A similar, but arguably nicer approach is to use the DEBUGLVL() macro. +This macro returns True if the message level is less than or equal to +the global DEBUGLEVEL value, so: +</para> + +<para><programlisting> + if( DEBUGLVL( 3 ) ) + { + dbgtext( "send_local_master_announcement: " ); + dbgtext( "type %x for name %s ", type, global_myname ); + dbgtext( "on subnet %s ", subrec->subnet_name ); + dbgtext( "for workgroup %s\n", work->work_group ); + } +</programlisting></para> + +<para>(The dbgtext() function is explained below.)</para> + +<para>There are a few advantages to this scheme:</para> +<orderedlist> +<listitem><para> +The test is performed only once. +</para></listitem> +<listitem><para> +You can allocate variables off of the stack that will only be used +within the DEBUGLVL() block. +</para></listitem> +<listitem><para> +Processing that is only relevant to debug output can be contained +within the DEBUGLVL() block. +</para></listitem> +</orderedlist> + +</sect1> + +<sect1> +<title>New Functions</title> + +<sect2> +<title>dbgtext()</title> +<para> +This function prints debug message text to the debug file (and +possibly to syslog) via the format buffer. The function uses a +variable argument list just like printf() or Debug1(). The +input is printed into a buffer using the vslprintf() function, +and then passed to format_debug_text(). + +If you use DEBUGLVL() you will probably print the body of the +message using dbgtext(). +</para> +</sect2> + +<sect2> +<title>dbghdr()</title> +<para> +This is the function that writes a debug message header. +Headers are not processed via the format buffer. Also note that +if the format buffer is not empty, a call to dbghdr() will not +produce any output. See the comments in dbghdr() for more info. +</para> + +<para> +It is not likely that this function will be called directly. It +is used by DEBUG() and DEBUGADD(). +</para> +</sect2> + +<sect2> +<title>format_debug_text()</title> +<para> +This is a static function in debug.c. It stores the output text +for the body of the message in a buffer until it encounters a +newline. When the newline character is found, the buffer is +written to the debug file via the Debug1() function, and the +buffer is reset. This allows us to add the indentation at the +beginning of each line of the message body, and also ensures +that the output is written a line at a time (which cleans up +syslog output). +</para> +</sect2> +</sect1> +</chapter> diff --git a/docs/docbook/devdoc/dev-doc.sgml b/docs/docbook/devdoc/dev-doc.sgml index f84c129f00..76ad512add 100644 --- a/docs/docbook/devdoc/dev-doc.sgml +++ b/docs/docbook/devdoc/dev-doc.sgml @@ -1,5 +1,11 @@ <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [ <!ENTITY NetBIOS SYSTEM "NetBIOS.sgml"> +<!ENTITY Architecture SYSTEM "architecture.sgml"> +<!ENTITY debug SYSTEM "debug.sgml"> +<!ENTITY internals SYSTEM "internals.sgml"> +<!ENTITY parsing SYSTEM "parsing.sgml"> +<!ENTITY unix-smb SYSTEM "unix-smb.sgml"> +<!ENTITY CodingSuggestions SYSTEM "CodingSuggestions.sgml"> ]> <book id="Samba-Developer-Documentation"> @@ -40,5 +46,11 @@ url="http://www.fsf.org/licenses/gpl.txt">http://www.fsf.org/licenses/gpl.txt</u <!-- Chapters --> &NetBIOS; +&Architecture; +&debug; +&CodingSuggestions; +&internals; +&parsing; +&unix-smb; </book> diff --git a/docs/docbook/devdoc/internals.sgml b/docs/docbook/devdoc/internals.sgml new file mode 100644 index 0000000000..79524347b6 --- /dev/null +++ b/docs/docbook/devdoc/internals.sgml @@ -0,0 +1,439 @@ +<chapter id="internals"> +<chapterinfo> + <author> + <firstname>David</firstname><surname>Chappell</surname> + <affiliation> + <address><email>David.Chappell@mail.trincoll.edu</email></address> + </affiliation> + </author> + <pubdate>8 May 1996</pubdate> +</chapterinfo> + +<title>Samba Internals</title> + +<sect1> +<title>Character Handling</title> +<para> +This section describes character set handling in Samba, as implemented in +Samba 3.0 and above +</para> + +<para> +In the past Samba had very ad-hoc character set handling. Scattered +throughout the code were numerous calls which converted particular +strings to/from DOS codepages. The problem is that there was no way of +telling if a particular char* is in dos codepage or unix +codepage. This led to a nightmare of code that tried to cope with +particular cases without handlingt the general case. +</para> + +<sect1> +<title>The new functions</title> + +<para> +The new system works like this: +</para> + +<orderedlist> +<listitem><para> + all char* strings inside Samba are "unix" strings. These are + multi-byte strings that are in the charset defined by the "unix + charset" option in smb.conf. +</para></listitem> + +<listitem><para> + there is no single fixed character set for unix strings, but any + character set that is used does need the following properties: + </para> + <orderedlist> + + <listitem><para> + must not contain NULLs except for termination + </para></listitem> + + <listitem><para> + must be 7-bit compatible with C strings, so that a constant + string or character in C will be byte-for-byte identical to the + equivalent string in the chosen character set. + </para></listitem> + + <listitem><para> + when you uppercase or lowercase a string it does not become + longer than the original string + </para></listitem> + + <listitem><para> + must be able to correctly hold all characters that your client + will throw at it + </para></listitem> + </orderedlist> + + <para> + For example, UTF-8 is fine, and most multi-byte asian character sets + are fine, but UCS2 could not be used for unix strings as they + contain nulls. + </para> +</listitem> + +<listitem><para> + when you need to put a string into a buffer that will be sent on the + wire, or you need a string in a character set format that is + compatible with the clients character set then you need to use a + pull_ or push_ function. The pull_ functions pull a string from a + wire buffer into a (multi-byte) unix string. The push_ functions + push a string out to a wire buffer. +</para></listitem> + +<listitem><para> + the two main pull_ and push_ functions you need to understand are + pull_string and push_string. These functions take a base pointer + that should point at the start of the SMB packet that the string is + in. The functions will check the flags field in this packet to + automatically determine if the packet is marked as a unicode packet, + and they will choose whether to use unicode for this string based on + that flag. You may also force this decision using the STR_UNICODE or + STR_ASCII flags. For use in smbd/ and libsmb/ there are wrapper + functions clistr_ and srvstr_ that call the pull_/push_ functions + with the appropriate first argument. + </para> + + <para> + You may also call the pull_ascii/pull_ucs2 or push_ascii/push_ucs2 + functions if you know that a particular string is ascii or + unicode. There are also a number of other convenience functions in + charcnv.c that call the pull_/push_ functions with particularly + common arguments, such as pull_ascii_pstring() + </para> +</listitem> + +<listitem><para> + The biggest thing to remember is that internal (unix) strings in Samba + may now contain multi-byte characters. This means you cannot assume + that characters are always 1 byte long. Often this means that you will + have to convert strings to ucs2 and back again in order to do some + (seemingly) simple task. For examples of how to do this see functions + like strchr_m(). I know this is very slow, and we will eventually + speed it up but right now we want this stuff correct not fast. +</para></listitem> + +<listitem><para> + all lp_ functions now return unix strings. The magic "DOS" flag on + parameters is gone. +</para></listitem> + +<listitem><para> + all vfs functions take unix strings. Don't convert when passing to them +</para></listitem> + +</orderedlist> + +</sect1> + +<sect1> +<title>Macros in byteorder.h</title> + +<para> +This section describes the macros defined in byteorder.h. These macros +are used extensively in the Samba code. +</para> + +<sect2> +<title>CVAL(buf,pos)</title> + +<para> +returns the byte at offset pos within buffer buf as an unsigned character. +</para> +</sect2> + +<sect2> +<title>PVAL(buf,pos)</title> +<para>returns the value of CVAL(buf,pos) cast to type unsigned integer.</para> +</sect2> + +<sect2> +<title>SCVAL(buf,pos,val)</title> +<para>sets the byte at offset pos within buffer buf to value val.</para> +</sect2> + +<sect2> +<title>SVAL(buf,pos)</title> +<para> + returns the value of the unsigned short (16 bit) little-endian integer at + offset pos within buffer buf. An integer of this type is sometimes + refered to as "USHORT". +</para> +</sect2> + +<sect2> +<title>IVAL(buf,pos)</title> +<para>returns the value of the unsigned 32 bit little-endian integer at offset +pos within buffer buf.</para> +</sect2> + +<sect2> +<title>SVALS(buf,pos)</title> +<para>returns the value of the signed short (16 bit) little-endian integer at +offset pos within buffer buf.</para> +</sect2> + +<sect2> +<title>IVALS(buf,pos)</title> +<para>returns the value of the signed 32 bit little-endian integer at offset pos +within buffer buf.</para> +</sect2> + +<sect2> +<title>SSVAL(buf,pos,val)</title> +<para>sets the unsigned short (16 bit) little-endian integer at offset pos within +buffer buf to value val.</para> +</sect2> + +<sect2> +<title>SIVAL(buf,pos,val)</title> +<para>sets the unsigned 32 bit little-endian integer at offset pos within buffer +buf to the value val.</para> +</sect2> + +<sect2> +<title>SSVALS(buf,pos,val)</title> +<para>sets the short (16 bit) signed little-endian integer at offset pos within +buffer buf to the value val.</para> +</sect2> + +<sect2> +<title>SIVALS(buf,pos,val)</title> +<para>sets the signed 32 bit little-endian integer at offset pos withing buffer +buf to the value val.</para> +</sect2> + +<sect2> +<title>RSVAL(buf,pos)</title> +<para>returns the value of the unsigned short (16 bit) big-endian integer at +offset pos within buffer buf.</para> +</sect2> + +<sect2> +<title>RIVAL(buf,pos)</title> +<para>returns the value of the unsigned 32 bit big-endian integer at offset +pos within buffer buf.</para> +</sect2> + +<sect2> +<title>RSSVAL(buf,pos,val)</title> +<para>sets the value of the unsigned short (16 bit) big-endian integer at +offset pos within buffer buf to value val. +refered to as "USHORT".</para> +</sect2> + +<sect2> +<title>RSIVAL(buf,pos,val)</title> +<para>sets the value of the unsigned 32 bit big-endian integer at offset +pos within buffer buf to value val.</para> +</sect2> + +</sect1> + + +<sect1> +<title>LAN Manager Samba API</title> + +<para> +This section describes the functions need to make a LAN Manager RPC call. +This information had been obtained by examining the Samba code and the LAN +Manager 2.0 API documentation. It should not be considered entirely +reliable. +</para> + +<para> +<programlisting> +call_api(int prcnt, int drcnt, int mprcnt, int mdrcnt, + char *param, char *data, char **rparam, char **rdata); +</programlisting> +</para> + +<para> +This function is defined in client.c. It uses an SMB transaction to call a +remote api. +</para> + +<sect2> +<title>Parameters</title> + +<para>The parameters are as follows:</para> + +<orderedlist> +<listitem><para> + prcnt: the number of bytes of parameters begin sent. +</para></listitem> +<listitem><para> + drcnt: the number of bytes of data begin sent. +</para></listitem> +<listitem><para> + mprcnt: the maximum number of bytes of parameters which should be returned +</para></listitem> +<listitem><para> + mdrcnt: the maximum number of bytes of data which should be returned +</para></listitem> +<listitem><para> + param: a pointer to the parameters to be sent. +</para></listitem> +<listitem><para> + data: a pointer to the data to be sent. +</para></listitem> +<listitem><para> + rparam: a pointer to a pointer which will be set to point to the returned + paramters. The caller of call_api() must deallocate this memory. +</para></listitem> +<listitem><para> + rdata: a pointer to a pointer which will be set to point to the returned + data. The caller of call_api() must deallocate this memory. +</para></listitem> +</orderedlist> + +<para> +These are the parameters which you ought to send, in the order of their +appearance in the parameter block: +</para> + +<orderedlist> + +<listitem><para> +An unsigned 16 bit integer API number. You should set this value with +SSVAL(). I do not know where these numbers are described. +</para></listitem> + +<listitem><para> +An ASCIIZ string describing the parameters to the API function as defined +in the LAN Manager documentation. The first parameter, which is the server +name, is ommited. This string is based uppon the API function as described +in the manual, not the data which is actually passed. +</para></listitem> + +<listitem><para> +An ASCIIZ string describing the data structure which ought to be returned. +</para></listitem> + +<listitem><para> +Any parameters which appear in the function call, as defined in the LAN +Manager API documentation, after the "Server" and up to and including the +"uLevel" parameters. +</para></listitem> + +<listitem><para> +An unsigned 16 bit integer which gives the size in bytes of the buffer we +will use to receive the returned array of data structures. Presumably this +should be the same as mdrcnt. This value should be set with SSVAL(). +</para></listitem> + +<listitem><para> +An ASCIIZ string describing substructures which should be returned. If no +substructures apply, this string is of zero length. +</para></listitem> + +</orderedlist> + +<para> +The code in client.c always calls call_api() with no data. It is unclear +when a non-zero length data buffer would be sent. +</para> + +</sect2> + +<sect2> +<title>Return value</title> + +<para> +The returned parameters (pointed to by rparam), in their order of appearance +are:</para> + +<orderedlist> + +<listitem><para> +An unsigned 16 bit integer which contains the API function's return code. +This value should be read with SVAL(). +</para></listitem> + +<listitem><para> +An adjustment which tells the amount by which pointers in the returned +data should be adjusted. This value should be read with SVAL(). Basically, +the address of the start of the returned data buffer should have the returned +pointer value added to it and then have this value subtracted from it in +order to obtain the currect offset into the returned data buffer. +</para></listitem> + +<listitem><para> +A count of the number of elements in the array of structures returned. +It is also possible that this may sometimes be the number of bytes returned. +</para></listitem> +</orderedlist> + +<para> +When call_api() returns, rparam points to the returned parameters. The +first if these is the result code. It will be zero if the API call +suceeded. This value by be read with "SVAL(rparam,0)". +</para> + +<para> +The second parameter may be read as "SVAL(rparam,2)". It is a 16 bit offset +which indicates what the base address of the returned data buffer was when +it was built on the server. It should be used to correct pointer before +use. +</para> + +<para> +The returned data buffer contains the array of returned data structures. +Note that all pointers must be adjusted before use. The function +fix_char_ptr() in client.c can be used for this purpose. +</para> + +<para> +The third parameter (which may be read as "SVAL(rparam,4)") has something to +do with indicating the amount of data returned or possibly the amount of +data which can be returned if enough buffer space is allowed. +</para> + +</sect2> +</sect1> + +<sect1> +<title>Code character table</title> +<para> +Certain data structures are described by means of ASCIIz strings containing +code characters. These are the code characters: +</para> + +<orderedlist> +<listitem><para> +W a type byte little-endian unsigned integer +</para></listitem> +<listitem><para> +N a count of substructures which follow +</para></listitem> +<listitem><para> +D a four byte little-endian unsigned integer +</para></listitem> +<listitem><para> +B a byte (with optional count expressed as trailing ASCII digits) +</para></listitem> +<listitem><para> +z a four byte offset to a NULL terminated string +</para></listitem> +<listitem><para> +l a four byte offset to non-string user data +</para></listitem> +<listitem><para> +b an offset to data (with count expressed as trailing ASCII digits) +</para></listitem> +<listitem><para> +r pointer to returned data buffer??? +</para></listitem> +<listitem><para> +L length in bytes of returned data buffer??? +</para></listitem> +<listitem><para> +h number of bytes of information available??? +</para></listitem> +</orderedlist> + +</sect1> +</chapter> diff --git a/docs/docbook/devdoc/parsing.sgml b/docs/docbook/devdoc/parsing.sgml new file mode 100644 index 0000000000..0121935d26 --- /dev/null +++ b/docs/docbook/devdoc/parsing.sgml @@ -0,0 +1,239 @@ +<chapter id="parsing"> +<chapterinfo> + <author> + <firstname>Chris</firstname><surname>Hertel</surname> + </author> + <pubdate>November 1997</pubdate> +</chapterinfo> + +<title>The smb.conf file</title> + +<sect1> +<title>Lexical Analysis</title> + +<para> +Basically, the file is processed on a line by line basis. There are +four types of lines that are recognized by the lexical analyzer +(params.c): +</para> + +<orderedlist> +<listitem><para> +Blank lines - Lines containing only whitespace. +</para></listitem> +<listitem><para> +Comment lines - Lines beginning with either a semi-colon or a +pound sign (';' or '#'). +</para></listitem> +<listitem><para> +Section header lines - Lines beginning with an open square bracket ('['). +</para></listitem> +<listitem><para> +Parameter lines - Lines beginning with any other character. +(The default line type.) +</para></listitem> +</orderedlist> + +<para> +The first two are handled exclusively by the lexical analyzer, which +ignores them. The latter two line types are scanned for +</para> + +<orderedlist> +<listitem><para> + - Section names +</para></listitem> +<listitem><para> + - Parameter names +</para></listitem> +<listitem><para> + - Parameter values +</para></listitem> +</orderedlist> + +<para> +These are the only tokens passed to the parameter loader +(loadparm.c). Parameter names and values are divided from one +another by an equal sign: '='. +</para> + +<sect2> +<title>Handling of Whitespace</title> + +<para> +Whitespace is defined as all characters recognized by the isspace() +function (see ctype(3C)) except for the newline character ('\n') +The newline is excluded because it identifies the end of the line. +</para> + +<orderedlist> +<listitem><para> +The lexical analyzer scans past white space at the beginning of a line. +</para></listitem> + +<listitem><para> +Section and parameter names may contain internal white space. All +whitespace within a name is compressed to a single space character. +</para></listitem> + +<listitem><para> +Internal whitespace within a parameter value is kept verbatim with +the exception of carriage return characters ('\r'), all of which +are removed. +</para></listitem> + +<listitem><para> +Leading and trailing whitespace is removed from names and values. +</para></listitem> + +</orderedlist> + +</sect2> + +<sect2> +<title>Handling of Line Continuation</title> + +<para> +Long section header and parameter lines may be extended across +multiple lines by use of the backslash character ('\\'). Line +continuation is ignored for blank and comment lines. +</para> + +<para> +If the last (non-whitespace) character within a section header or on +a parameter line is a backslash, then the next line will be +(logically) concatonated with the current line by the lexical +analyzer. For example: +</para> + +<para><programlisting> + param name = parameter value string \ + with line continuation. +</programlisting></para> + +<para>Would be read as</para> + +<para><programlisting> + param name = parameter value string with line continuation. +</programlisting></para> + +<para> +Note that there are five spaces following the word 'string', +representing the one space between 'string' and '\\' in the top +line, plus the four preceeding the word 'with' in the second line. +(Yes, I'm counting the indentation.) +</para> + +<para> +Line continuation characters are ignored on blank lines and at the end +of comments. They are *only* recognized within section and parameter +lines. +</para> + +</sect2> + +<sect2> +<title>Line Continuation Quirks</title> + +<para>Note the following example:</para> + +<para><programlisting> + param name = parameter value string \ + \ + with line continuation. +</programlisting></para> + +<para> +The middle line is *not* parsed as a blank line because it is first +concatonated with the top line. The result is +</para> + +<para><programlisting> +param name = parameter value string with line continuation. +</programlisting></para> + +<para>The same is true for comment lines.</para> + +<para><programlisting> + param name = parameter value string \ + ; comment \ + with a comment. +</programlisting></para> + +<para>This becomes:</para> + +<para><programlisting> +param name = parameter value string ; comment with a comment. +</programlisting></para> + +<para> +On a section header line, the closing bracket (']') is considered a +terminating character, and the rest of the line is ignored. The lines +</para> + +<para><programlisting> + [ section name ] garbage \ + param name = value +</programlisting></para> + +<para>are read as</para> + +<para><programlisting> + [section name] + param name = value +</programlisting></para> + +</sect2> +</sect1> + +<sect1> +<title>Syntax</title> + +<para>The syntax of the smb.conf file is as follows:</para> + +<para><programlisting> + <file> :== { <section> } EOF + <section> :== <section header> { <parameter line> } + <section header> :== '[' NAME ']' + <parameter line> :== NAME '=' VALUE NL +</programlisting><para> + +<para>Basically, this means that</para> + +<orderedlist> +<listitem><para> + a file is made up of zero or more sections, and is terminated by + an EOF (we knew that). +</para></listitem> + +<listitem><para> + A section is made up of a section header followed by zero or more + parameter lines. +</para></listitem> + +<listitem><para> + A section header is identified by an opening bracket and + terminated by the closing bracket. The enclosed NAME identifies + the section. +</para></listitem> + +<listitem><para> + A parameter line is divided into a NAME and a VALUE. The *first* + equal sign on the line separates the NAME from the VALUE. The + VALUE is terminated by a newline character (NL = '\n'). +</para></listitem> + +</orderedlist> + +<sect2> +<title>About params.c</title> + +<para> +The parsing of the config file is a bit unusual if you are used to +lex, yacc, bison, etc. Both lexical analysis (scanning) and parsing +are performed by params.c. Values are loaded via callbacks to +loadparm.c. +</para> +</sect2> +</sect1> +</chapter> diff --git a/docs/docbook/devdoc/unix-smb.sgml b/docs/docbook/devdoc/unix-smb.sgml new file mode 100644 index 0000000000..be79698857 --- /dev/null +++ b/docs/docbook/devdoc/unix-smb.sgml @@ -0,0 +1,311 @@ +<chapter id="unix-smb"> +<chapterinfo> + <author> + <firstname>Andrew</firstname><surname>Tridgell</surname> + </author> + <pubdate>April 1995</pubdate> +</chapterinfo> + +<title>NetBIOS in a Unix World</title> + +<sect1> +<title>Introduction</title> +<para> +This is a short document that describes some of the issues that +confront a SMB implementation on unix, and how Samba copes with +them. They may help people who are looking at unix<->PC +interoperability. +</para> + +<para> +It was written to help out a person who was writing a paper on unix to +PC connectivity. +</para> + +</sect1> + +<sect1> +<title>Usernames</title> +<para> +The SMB protocol has only a loose username concept. Early SMB +protocols (such as CORE and COREPLUS) have no username concept at +all. Even in later protocols clients often attempt operations +(particularly printer operations) without first validating a username +on the server. +</para> + +<para> +Unix security is based around username/password pairs. A unix box +should not allow clients to do any substantive operation without some +sort of validation. +</para> + +<para> +The problem mostly manifests itself when the unix server is in "share +level" security mode. This is the default mode as the alternative +"user level" security mode usually forces a client to connect to the +server as the same user for each connected share, which is +inconvenient in many sites. +</para> + +<para> +In "share level" security the client normally gives a username in the +"session setup" protocol, but does not supply an accompanying +password. The client then connects to resources using the "tree +connect" protocol, and supplies a password. The problem is that the +user on the PC types the username and the password in different +contexts, unaware that they need to go together to give access to the +server. The username is normally the one the user typed in when they +"logged onto" the PC (this assumes Windows for Workgroups). The +password is the one they chose when connecting to the disk or printer. +</para> + +<para> +The user often chooses a totally different username for their login as +for the drive connection. Often they also want to access different +drives as different usernames. The unix server needs some way of +divining the correct username to combine with each password. +</para> + +<para> +Samba tries to avoid this problem using several methods. These succeed +in the vast majority of cases. The methods include username maps, the +service%user syntax, the saving of session setup usernames for later +validation and the derivation of the username from the service name +(either directly or via the user= option). +</para> + +</sect1> + +<sect1> +<title>File Ownership</title> + +<para> +The commonly used SMB protocols have no way of saying "you can't do +that because you don't own the file". They have, in fact, no concept +of file ownership at all. +</para> + +<para> +This brings up all sorts of interesting problems. For example, when +you copy a file to a unix drive, and the file is world writeable but +owned by another user the file will transfer correctly but will +receive the wrong date. This is because the utime() call under unix +only succeeds for the owner of the file, or root, even if the file is +world writeable. For security reasons Samba does all file operations +as the validated user, not root, so the utime() fails. This can stuff +up shared development diectories as programs like "make" will not get +file time comparisons right. +</para> + +<para> +There are several possible solutions to this problem, including +username mapping, and forcing a specific username for particular +shares. +</para> + +</sect1> + +<sect1> +<title>Passwords</title> + +<para> +Many SMB clients uppercase passwords before sending them. I have no +idea why they do this. Interestingly WfWg uppercases the password only +if the server is running a protocol greater than COREPLUS, so +obviously it isn't just the data entry routines that are to blame. +</para> + +<para> +Unix passwords are case sensitive. So if users use mixed case +passwords they are in trouble. +</para> + +<para> +Samba can try to cope with this by either using the "password level" +option which causes Samba to try the offered password with up to the +specified number of case changes, or by using the "password server" +option which allows Samba to do its validation via another machine +(typically a WinNT server). +</para> + +<para> +Samba supports the password encryption method used by SMB +clients. Note that the use of password encryption in Microsoft +networking leads to password hashes that are "plain text equivalent". +This means that it is *VERY* important to ensure that the Samba +smbpasswd file containing these password hashes is only readable +by the root user. See the documentation ENCRYPTION.txt for more +details. +</para> + +</sect1> + +<sect1> +<title>Locking</title> +<para> +The locking calls available under a DOS/Windows environment are much +richer than those available in unix. This means a unix server (like +Samba) choosing to use the standard fcntl() based unix locking calls +to implement SMB locking has to improvise a bit. +</para> + +<para> +One major problem is that dos locks can be in a 32 bit (unsigned) +range. Unix locking calls are 32 bits, but are signed, giving only a 31 +bit range. Unfortunately OLE2 clients use the top bit to select a +locking range used for OLE semaphores. +</para> + +<para> +To work around this problem Samba compresses the 32 bit range into 31 +bits by appropriate bit shifting. This seems to work but is not +ideal. In a future version a separate SMB lockd may be added to cope +with the problem. +</para> + +<para> +It also doesn't help that many unix lockd daemons are very buggy and +crash at the slightest provocation. They normally go mostly unused in +a unix environment because few unix programs use byte range +locking. The stress of huge numbers of lock requests from dos/windows +clients can kill the daemon on some systems. +</para> + +<para> +The second major problem is the "opportunistic locking" requested by +some clients. If a client requests opportunistic locking then it is +asking the server to notify it if anyone else tries to do something on +the same file, at which time the client will say if it is willing to +give up its lock. Unix has no simple way of implementing +opportunistic locking, and currently Samba has no support for it. +</para> + +</sect1> + +<sect1> +<title>Deny Modes</title> + +<para> +When a SMB client opens a file it asks for a particular "deny mode" to +be placed on the file. These modes (DENY_NONE, DENY_READ, DENY_WRITE, +DENY_ALL, DENY_FCB and DENY_DOS) specify what actions should be +allowed by anyone else who tries to use the file at the same time. If +DENY_READ is placed on the file, for example, then any attempt to open +the file for reading should fail. +</para> + +<para> +Unix has no equivalent notion. To implement this Samba uses either lock +files based on the files inode and placed in a separate lock +directory or a shared memory implementation. The lock file method +is clumsy and consumes processing and file resources, +the shared memory implementation is vastly prefered and is turned on +by default for those systems that support it. +</para> + +</sect1> + +<sect1> +<title>Trapdoor UIDs</title> +<para> +A SMB session can run with several uids on the one socket. This +happens when a user connects to two shares with different +usernames. To cope with this the unix server needs to switch uids +within the one process. On some unixes (such as SCO) this is not +possible. This means that on those unixes the client is restricted to +a single uid. +</para> + +<para> +Note that you can also get the "trapdoor uid" message for other +reasons. Please see the FAQ for details. +</para> + +</sect1> + +<sect1> +<title>Port numbers</title> +<para> +There is a convention that clients on sockets use high "unprivilaged" +port numbers (>1000) and connect to servers on low "privilaged" port +numbers. This is enforced in Unix as non-root users can't open a +socket for listening on port numbers less than 1000. +</para> + +<para> +Most PC based SMB clients (such as WfWg and WinNT) don't follow this +convention completely. The main culprit is the netbios nameserving on +udp port 137. Name query requests come from a source port of 137. This +is a problem when you combine it with the common firewalling technique +of not allowing incoming packets on low port numbers. This means that +these clients can't query a netbios nameserver on the other side of a +low port based firewall. +</para> + +<para> +The problem is more severe with netbios node status queries. I've +found that WfWg, Win95 and WinNT3.5 all respond to netbios node status +queries on port 137 no matter what the source port was in the +request. This works between machines that are both using port 137, but +it means it's not possible for a unix user to do a node status request +to any of these OSes unless they are running as root. The answer comes +back, but it goes to port 137 which the unix user can't listen +on. Interestingly WinNT3.1 got this right - it sends node status +responses back to the source port in the request. +</para> + +</sect1> + +<sect1> +<title>Protocol Complexity</title> +<para> +There are many "protocol levels" in the SMB protocol. It seems that +each time new functionality was added to a Microsoft operating system, +they added the equivalent functions in a new protocol level of the SMB +protocol to "externalise" the new capabilities. +</para> + +<para> +This means the protocol is very "rich", offering many ways of doing +each file operation. This means SMB servers need to be complex and +large. It also means it is very difficult to make them bug free. It is +not just Samba that suffers from this problem, other servers such as +WinNT don't support every variation of every call and it has almost +certainly been a headache for MS developers to support the myriad of +SMB calls that are available. +</para> + +<para> +There are about 65 "top level" operations in the SMB protocol (things +like SMBread and SMBwrite). Some of these include hundreds of +sub-functions (SMBtrans has at least 120 sub-functions, like +DosPrintQAdd and NetSessionEnum). All of them take several options +that can change the way they work. Many take dozens of possible +"information levels" that change the structures that need to be +returned. Samba supports all but 2 of the "top level" functions. It +supports only 8 (so far) of the SMBtrans sub-functions. Even NT +doesn't support them all. +</para> + +<para> +Samba currently supports up to the "NT LM 0.12" protocol, which is the +one preferred by Win95 and WinNT3.5. Luckily this protocol level has a +"capabilities" field which specifies which super-duper new-fangled +options the server suports. This helps to make the implementation of +this protocol level much easier. +</para> + +<para> +There is also a problem with the SMB specications. SMB is a X/Open +spec, but the X/Open book is far from ideal, and fails to cover many +important issues, leaving much to the imagination. Microsoft recently +renamed the SMB protocol CIFS (Common Internet File System) and have +published new specifications. These are far superior to the old +X/Open documents but there are still undocumented calls and features. +This specification is actively being worked on by a CIFS developers +mailing list hosted by Microsft. +</para> +</sect1> +</chapter> + |