diff options
Diffstat (limited to 'docs/Samba3-Developers-Guide/parsing.xml')
-rw-r--r-- | docs/Samba3-Developers-Guide/parsing.xml | 241 |
1 files changed, 241 insertions, 0 deletions
diff --git a/docs/Samba3-Developers-Guide/parsing.xml b/docs/Samba3-Developers-Guide/parsing.xml new file mode 100644 index 0000000000..d695f9313c --- /dev/null +++ b/docs/Samba3-Developers-Guide/parsing.xml @@ -0,0 +1,241 @@ +<?xml version="1.0" encoding="iso-8859-1"?> +<!DOCTYPE chapter PUBLIC "-//Samba-Team//DTD DocBook V4.2-Based Variant V1.0//EN" "http://www.samba.org/samba/DTD/samba-doc"> +<chapter id="parsing"> +<chapterinfo> + <author> + <firstname>Chris</firstname><surname>Hertel</surname> + </author> + <pubdate>November 1997</pubdate> +</chapterinfo> + +<title>The smb.conf file</title> + +<sect1> +<title>Lexical Analysis</title> + +<para> +Basically, the file is processed on a line by line basis. There are +four types of lines that are recognized by the lexical analyzer +(params.c): +</para> + +<orderedlist> +<listitem><para> +Blank lines - Lines containing only whitespace. +</para></listitem> +<listitem><para> +Comment lines - Lines beginning with either a semi-colon or a +pound sign (';' or '#'). +</para></listitem> +<listitem><para> +Section header lines - Lines beginning with an open square bracket ('['). +</para></listitem> +<listitem><para> +Parameter lines - Lines beginning with any other character. +(The default line type.) +</para></listitem> +</orderedlist> + +<para> +The first two are handled exclusively by the lexical analyzer, which +ignores them. The latter two line types are scanned for +</para> + +<orderedlist> +<listitem><para> + - Section names +</para></listitem> +<listitem><para> + - Parameter names +</para></listitem> +<listitem><para> + - Parameter values +</para></listitem> +</orderedlist> + +<para> +These are the only tokens passed to the parameter loader +(loadparm.c). Parameter names and values are divided from one +another by an equal sign: '='. +</para> + +<sect2> +<title>Handling of Whitespace</title> + +<para> +Whitespace is defined as all characters recognized by the isspace() +function (see ctype(3C)) except for the newline character ('\n') +The newline is excluded because it identifies the end of the line. +</para> + +<orderedlist> +<listitem><para> +The lexical analyzer scans past white space at the beginning of a line. +</para></listitem> + +<listitem><para> +Section and parameter names may contain internal white space. All +whitespace within a name is compressed to a single space character. +</para></listitem> + +<listitem><para> +Internal whitespace within a parameter value is kept verbatim with +the exception of carriage return characters ('\r'), all of which +are removed. +</para></listitem> + +<listitem><para> +Leading and trailing whitespace is removed from names and values. +</para></listitem> + +</orderedlist> + +</sect2> + +<sect2> +<title>Handling of Line Continuation</title> + +<para> +Long section header and parameter lines may be extended across +multiple lines by use of the backslash character ('\\'). Line +continuation is ignored for blank and comment lines. +</para> + +<para> +If the last (non-whitespace) character within a section header or on +a parameter line is a backslash, then the next line will be +(logically) concatonated with the current line by the lexical +analyzer. For example: +</para> + +<para><programlisting> + param name = parameter value string \ + with line continuation. +</programlisting></para> + +<para>Would be read as</para> + +<para><programlisting> + param name = parameter value string with line continuation. +</programlisting></para> + +<para> +Note that there are five spaces following the word 'string', +representing the one space between 'string' and '\\' in the top +line, plus the four preceeding the word 'with' in the second line. +(Yes, I'm counting the indentation.) +</para> + +<para> +Line continuation characters are ignored on blank lines and at the end +of comments. They are *only* recognized within section and parameter +lines. +</para> + +</sect2> + +<sect2> +<title>Line Continuation Quirks</title> + +<para>Note the following example:</para> + +<para><programlisting> + param name = parameter value string \ + \ + with line continuation. +</programlisting></para> + +<para> +The middle line is *not* parsed as a blank line because it is first +concatonated with the top line. The result is +</para> + +<para><programlisting> +param name = parameter value string with line continuation. +</programlisting></para> + +<para>The same is true for comment lines.</para> + +<para><programlisting> + param name = parameter value string \ + ; comment \ + with a comment. +</programlisting></para> + +<para>This becomes:</para> + +<para><programlisting> +param name = parameter value string ; comment with a comment. +</programlisting></para> + +<para> +On a section header line, the closing bracket (']') is considered a +terminating character, and the rest of the line is ignored. The lines +</para> + +<para><programlisting> + [ section name ] garbage \ + param name = value +</programlisting></para> + +<para>are read as</para> + +<para><programlisting> + [section name] + param name = value +</programlisting></para> + +</sect2> +</sect1> + +<sect1> +<title>Syntax</title> + +<para>The syntax of the smb.conf file is as follows:</para> + +<para><programlisting> + <file> :== { <section> } EOF + <section> :== <section header> { <parameter line> } + <section header> :== '[' NAME ']' + <parameter line> :== NAME '=' VALUE NL +</programlisting></para> + +<para>Basically, this means that</para> + +<orderedlist> +<listitem><para> + a file is made up of zero or more sections, and is terminated by + an EOF (we knew that). +</para></listitem> + +<listitem><para> + A section is made up of a section header followed by zero or more + parameter lines. +</para></listitem> + +<listitem><para> + A section header is identified by an opening bracket and + terminated by the closing bracket. The enclosed NAME identifies + the section. +</para></listitem> + +<listitem><para> + A parameter line is divided into a NAME and a VALUE. The *first* + equal sign on the line separates the NAME from the VALUE. The + VALUE is terminated by a newline character (NL = '\n'). +</para></listitem> + +</orderedlist> + +<sect2> +<title>About params.c</title> + +<para> +The parsing of the config file is a bit unusual if you are used to +lex, yacc, bison, etc. Both lexical analysis (scanning) and parsing +are performed by params.c. Values are loaded via callbacks to +loadparm.c. +</para> +</sect2> +</sect1> +</chapter> |