summaryrefslogtreecommitdiff
path: root/docs/docbook/devdoc/parsing.sgml
diff options
context:
space:
mode:
Diffstat (limited to 'docs/docbook/devdoc/parsing.sgml')
-rw-r--r--docs/docbook/devdoc/parsing.sgml239
1 files changed, 239 insertions, 0 deletions
diff --git a/docs/docbook/devdoc/parsing.sgml b/docs/docbook/devdoc/parsing.sgml
new file mode 100644
index 0000000000..8d929617f5
--- /dev/null
+++ b/docs/docbook/devdoc/parsing.sgml
@@ -0,0 +1,239 @@
+<chapter id="parsing">
+<chapterinfo>
+ <author>
+ <firstname>Chris</firstname><surname>Hertel</surname>
+ </author>
+ <pubdate>November 1997</pubdate>
+</chapterinfo>
+
+<title>The smb.conf file</title>
+
+<sect1>
+<title>Lexical Analysis</title>
+
+<para>
+Basically, the file is processed on a line by line basis. There are
+four types of lines that are recognized by the lexical analyzer
+(params.c):
+</para>
+
+<orderedlist>
+<listitem><para>
+Blank lines - Lines containing only whitespace.
+</para></listitem>
+<listitem><para>
+Comment lines - Lines beginning with either a semi-colon or a
+pound sign (';' or '#').
+</para></listitem>
+<listitem><para>
+Section header lines - Lines beginning with an open square bracket ('[').
+</para></listitem>
+<listitem><para>
+Parameter lines - Lines beginning with any other character.
+(The default line type.)
+</para></listitem>
+</orderedlist>
+
+<para>
+The first two are handled exclusively by the lexical analyzer, which
+ignores them. The latter two line types are scanned for
+</para>
+
+<orderedlist>
+<listitem><para>
+ - Section names
+</para></listitem>
+<listitem><para>
+ - Parameter names
+</para></listitem>
+<listitem><para>
+ - Parameter values
+</para></listitem>
+</orderedlist>
+
+<para>
+These are the only tokens passed to the parameter loader
+(loadparm.c). Parameter names and values are divided from one
+another by an equal sign: '='.
+</para>
+
+<sect2>
+<title>Handling of Whitespace</title>
+
+<para>
+Whitespace is defined as all characters recognized by the isspace()
+function (see ctype(3C)) except for the newline character ('\n')
+The newline is excluded because it identifies the end of the line.
+</para>
+
+<orderedlist>
+<listitem><para>
+The lexical analyzer scans past white space at the beginning of a line.
+</para></listitem>
+
+<listitem><para>
+Section and parameter names may contain internal white space. All
+whitespace within a name is compressed to a single space character.
+</para></listitem>
+
+<listitem><para>
+Internal whitespace within a parameter value is kept verbatim with
+the exception of carriage return characters ('\r'), all of which
+are removed.
+</para></listitem>
+
+<listitem><para>
+Leading and trailing whitespace is removed from names and values.
+</para></listitem>
+
+</orderedlist>
+
+</sect2>
+
+<sect2>
+<title>Handling of Line Continuation</title>
+
+<para>
+Long section header and parameter lines may be extended across
+multiple lines by use of the backslash character ('\\'). Line
+continuation is ignored for blank and comment lines.
+</para>
+
+<para>
+If the last (non-whitespace) character within a section header or on
+a parameter line is a backslash, then the next line will be
+(logically) concatonated with the current line by the lexical
+analyzer. For example:
+</para>
+
+<para><programlisting>
+ param name = parameter value string \
+ with line continuation.
+</programlisting></para>
+
+<para>Would be read as</para>
+
+<para><programlisting>
+ param name = parameter value string with line continuation.
+</programlisting></para>
+
+<para>
+Note that there are five spaces following the word 'string',
+representing the one space between 'string' and '\\' in the top
+line, plus the four preceeding the word 'with' in the second line.
+(Yes, I'm counting the indentation.)
+</para>
+
+<para>
+Line continuation characters are ignored on blank lines and at the end
+of comments. They are *only* recognized within section and parameter
+lines.
+</para>
+
+</sect2>
+
+<sect2>
+<title>Line Continuation Quirks</title>
+
+<para>Note the following example:</para>
+
+<para><programlisting>
+ param name = parameter value string \
+ \
+ with line continuation.
+</programlisting></para>
+
+<para>
+The middle line is *not* parsed as a blank line because it is first
+concatonated with the top line. The result is
+</para>
+
+<para><programlisting>
+param name = parameter value string with line continuation.
+</programlisting></para>
+
+<para>The same is true for comment lines.</para>
+
+<para><programlisting>
+ param name = parameter value string \
+ ; comment \
+ with a comment.
+</programlisting></para>
+
+<para>This becomes:</para>
+
+<para><programlisting>
+param name = parameter value string ; comment with a comment.
+</programlisting></para>
+
+<para>
+On a section header line, the closing bracket (']') is considered a
+terminating character, and the rest of the line is ignored. The lines
+</para>
+
+<para><programlisting>
+ [ section name ] garbage \
+ param name = value
+</programlisting></para>
+
+<para>are read as</para>
+
+<para><programlisting>
+ [section name]
+ param name = value
+</programlisting></para>
+
+</sect2>
+</sect1>
+
+<sect1>
+<title>Syntax</title>
+
+<para>The syntax of the smb.conf file is as follows:</para>
+
+<para><programlisting>
+ &lt;file&gt; :== { &lt;section&gt; } EOF
+ &lt;section&gt; :== &lt;section header&gt; { &lt;parameter line&gt; }
+ &lt;section header&gt; :== '[' NAME ']'
+ &lt;parameter line&gt; :== NAME '=' VALUE NL
+</programlisting></para>
+
+<para>Basically, this means that</para>
+
+<orderedlist>
+<listitem><para>
+ a file is made up of zero or more sections, and is terminated by
+ an EOF (we knew that).
+</para></listitem>
+
+<listitem><para>
+ A section is made up of a section header followed by zero or more
+ parameter lines.
+</para></listitem>
+
+<listitem><para>
+ A section header is identified by an opening bracket and
+ terminated by the closing bracket. The enclosed NAME identifies
+ the section.
+</para></listitem>
+
+<listitem><para>
+ A parameter line is divided into a NAME and a VALUE. The *first*
+ equal sign on the line separates the NAME from the VALUE. The
+ VALUE is terminated by a newline character (NL = '\n').
+</para></listitem>
+
+</orderedlist>
+
+<sect2>
+<title>About params.c</title>
+
+<para>
+The parsing of the config file is a bit unusual if you are used to
+lex, yacc, bison, etc. Both lexical analysis (scanning) and parsing
+are performed by params.c. Values are loaded via callbacks to
+loadparm.c.
+</para>
+</sect2>
+</sect1>
+</chapter>