summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--prog_guide.txt569
1 files changed, 569 insertions, 0 deletions
diff --git a/prog_guide.txt b/prog_guide.txt
new file mode 100644
index 0000000000..c000f66226
--- /dev/null
+++ b/prog_guide.txt
@@ -0,0 +1,569 @@
+THIS IS INCOMPLETE! I'M ONLY COMMITING IT IN ORDER TO SOLICIT COMMENTS
+FROM A FEW PEOPLE. DON'T TAKE THIS AS THE FINAL VERSION YET.
+
+
+
+
+Samba4 Programming Guide
+------------------------
+
+The internals of Samba4 are quite different from previous versions of
+Samba, so even if you are an experienced Samba developer please take
+the time to read through this document.
+
+This document will explain both the broad structure of Samba4, and
+some of the common coding elements such as memory management and
+dealing with macros.
+
+
+Coding Style
+------------
+
+In past versions of Samba we have basically let each programmer choose
+their own programming style. Unfortunately the result has often been
+that code that other members of the team find difficult to read. For
+Samba version 4 I would like to standardise on a common coding style
+to make the whole tree more readable. For those of you who are
+horrified at the idea of having to learn a new style, I can assure you
+that it isn't as painful as you might think. I was forced to adopt a
+new style when I started working on the Linux kernel, and after some
+initial pain found it quite easy.
+
+That said, I don't want to invent a new style, instead I would like to
+adopt the style used by the Linux kernel. It is a widely used style
+with plenty of support tools available. See Documentation/CodingStyle
+in the Linux source tree. This is the style that I have used to write
+all of the core infrastructure for Samba4 and I think that we should
+continue with that style.
+
+I also think that we should most definately *not* adopt an automatic
+reformatting system in cvs (or whatever other source code system we
+end up using in the future). Such automatic formatters are, in my
+experience, incredibly error prone and don't understand the necessary
+exceptions. I don't mind if people use automated tools to reformat
+their own code before they commit it, but please do not run such
+automated tools on large slabs of existing code without being willing
+to spend a *lot* of time hand checking the results.
+
+Finally, I think that for code that is parsing or formatting protocol
+packets the code layout should strongly reflect the packet
+format. That means ordring the code so that it parses in the same
+order as the packet is stored on the while (where possible) and using
+white space to align packet offsets so that a reader can immediately
+map any line of the code to the corresponding place in the packet.
+
+
+Static and Global Data
+----------------------
+
+The basic rule is "avoid static and global data like the plague". What
+do I mean by static data? The way to tell if you have static data in a
+file is to use the "size" utility in Linux. For example if we run:
+
+ size libcli/raw/*.o
+
+in Samba4 then you get the following:
+
+ text data bss dec hex filename
+ 2015 0 0 2015 7df libcli/raw/clikrb5.o
+ 202 0 0 202 ca libcli/raw/clioplock.o
+ 35 0 0 35 23 libcli/raw/clirewrite.o
+ 3891 0 0 3891 f33 libcli/raw/clisession.o
+ 869 0 0 869 365 libcli/raw/clisocket.o
+ 4962 0 0 4962 1362 libcli/raw/clispnego.o
+ 1223 0 0 1223 4c7 libcli/raw/clitransport.o
+ 2294 0 0 2294 8f6 libcli/raw/clitree.o
+ 1081 0 0 1081 439 libcli/raw/raweas.o
+ 6765 0 0 6765 1a6d libcli/raw/rawfile.o
+ 6824 0 0 6824 1aa8 libcli/raw/rawfileinfo.o
+ 2944 0 0 2944 b80 libcli/raw/rawfsinfo.o
+ 541 0 0 541 21d libcli/raw/rawioctl.o
+ 1728 0 0 1728 6c0 libcli/raw/rawnegotiate.o
+ 723 0 0 723 2d3 libcli/raw/rawnotify.o
+ 3779 0 0 3779 ec3 libcli/raw/rawreadwrite.o
+ 6597 0 0 6597 19c5 libcli/raw/rawrequest.o
+ 5580 0 0 5580 15cc libcli/raw/rawsearch.o
+ 3034 0 0 3034 bda libcli/raw/rawsetfileinfo.o
+ 5187 0 0 5187 1443 libcli/raw/rawtrans.o
+ 2033 0 0 2033 7f1 libcli/raw/smb_signing.o
+
+notice that the "data" and "bss" columns are all zero? That is
+good. If there are any non-zero values in data or bss then that
+indicates static data and is bad (as a rule of thumb).
+
+Lets compare that result to the equivalent in Samba3:
+
+ text data bss dec hex filename
+ 3978 0 0 3978 f8a libsmb/asn1.o
+ 18963 0 288 19251 4b33 libsmb/cliconnect.o
+ 2815 0 1024 3839 eff libsmb/clidgram.o
+ 4038 0 0 4038 fc6 libsmb/clientgen.o
+ 3337 664 256 4257 10a1 libsmb/clierror.o
+ 10043 0 0 10043 273b libsmb/clifile.o
+ 332 0 0 332 14c libsmb/clifsinfo.o
+ 166 0 0 166 a6 libsmb/clikrb5.o
+ 5212 0 0 5212 145c libsmb/clilist.o
+ 1367 0 0 1367 557 libsmb/climessage.o
+ 259 0 0 259 103 libsmb/clioplock.o
+ 1584 0 0 1584 630 libsmb/cliprint.o
+ 7565 0 256 7821 1e8d libsmb/cliquota.o
+ 7694 0 0 7694 1e0e libsmb/clirap.o
+ 27440 0 0 27440 6b30 libsmb/clirap2.o
+ 2905 0 0 2905 b59 libsmb/clireadwrite.o
+ 1698 0 0 1698 6a2 libsmb/clisecdesc.o
+ 5517 0 0 5517 158d libsmb/clispnego.o
+ 485 0 0 485 1e5 libsmb/clistr.o
+ 8449 0 0 8449 2101 libsmb/clitrans.o
+ 2053 0 4 2057 809 libsmb/conncache.o
+ 3041 0 256 3297 ce1 libsmb/credentials.o
+ 1261 0 1024 2285 8ed libsmb/doserr.o
+ 14560 0 0 14560 38e0 libsmb/errormap.o
+ 3645 0 0 3645 e3d libsmb/namecache.o
+ 16815 0 8 16823 41b7 libsmb/namequery.o
+ 1626 0 0 1626 65a libsmb/namequery_dc.o
+ 14301 0 1076 15377 3c11 libsmb/nmblib.o
+ 24516 0 2048 26564 67c4 libsmb/nterr.o
+ 8661 0 8 8669 21dd libsmb/ntlmssp.o
+ 3188 0 0 3188 c74 libsmb/ntlmssp_parse.o
+ 4945 0 0 4945 1351 libsmb/ntlmssp_sign.o
+ 1303 0 0 1303 517 libsmb/passchange.o
+ 1221 0 0 1221 4c5 libsmb/pwd_cache.o
+ 2475 0 4 2479 9af libsmb/samlogon_cache.o
+ 10768 32 0 10800 2a30 libsmb/smb_signing.o
+ 4524 0 16 4540 11bc libsmb/smbdes.o
+ 5708 0 0 5708 164c libsmb/smbencrypt.o
+ 7049 0 3072 10121 2789 libsmb/smberr.o
+ 2995 0 0 2995 bb3 libsmb/spnego.o
+ 3186 0 0 3186 c72 libsmb/trustdom_cache.o
+ 1742 0 0 1742 6ce libsmb/trusts_util.o
+ 918 0 28 946 3b2 libsmb/unexpected.o
+
+notice all of the non-zero data and bss elements? Every bit of that
+data is a bug waiting to happen.
+
+Static data is evil as it has the following consequences:
+ - it makes code much less likely to be thread-safe
+ - it makes code much less likely to be recursion-safe
+ - it leads to subtle side effects when the same code is called from
+ multiple places
+
+Static data is particularly evil in library code (such as our internal
+smb and rpc libraries). If you can get rid of all static data in
+libraries then you can make some fairly strong guarantees about the
+behaviour of functions in that library, which really helps.
+
+Of course, it is possible to write code that uses static data and is
+safe, it's just much harder to do that than just avoid static data in
+the first place. We have been tripped up countless times by subtle
+bugs in Samba due to the use of static data, so I think it is time to
+start avoiding it in new code. Much of the core infrastructure of
+Samba4 was specifically written to avoid static data, so I'm going to
+be really annoyed if everyone starts adding lots of static data back
+in.
+
+So, how do we avoid static data? The basic method is to use context
+pointers. When reading the Samba4 code you will notice that just about
+every function takes a pointer to a context structure as its first
+argument. Any data that the function needs that isn't an explicit
+argument to the function can be found by traversing that context.
+
+Note that this includes all of the little caches that we have lying
+all over the code in Samba3. I'm referring to the ones that generally
+have a "static int initialised" and then some static string or integer
+that remembers the last return value of the function. Get rid of them!
+If you are *REALLY* absolutely completely certain that your personal
+favourite mini-cache is needed then you should do it properly by
+putting it into the appropriate context rather than doing it the lazy
+way by putting it inside the target function. I would suggest however
+that the vast majority of those little caches are useless - don't
+stick it in unless you have really firm benchmarking results that show
+that it is needed and helps by a significant amount.
+
+Note that Samba4 is not yet completely clean of static data like
+this. I've gotten the smbd/ directory down to 24 bytes of static data,
+and libcli/raw/ down to zero. I've also gotten the ntvfs layer and all
+backends down to just 8 bytes in ntvfs_base.c. The rest still needs
+some more work.
+
+Also note that truly constant data is OK, and will not in fact show up
+in the data and bss columns in "size" anyway (it will be included in
+"text"). So you can have constant tables of protocol data.
+
+
+Memory Contexts
+---------------
+
+We introduced the talloc() system for memory contexts during the 2.2
+development cycle and it has been a great success. It has greatly
+simplified a lot of our code, particularly with regard to error
+handling.
+
+In Samba4 we use talloc even more extensively to give us much finer
+grained memory management. The really important thing to remember
+about talloc in Samba4 is:
+
+ "don't just use the first talloc context that comes to hand - use
+ the RIGHT talloc context"
+
+Just using the first talloc context that comes to hand is probably the
+most common systematic bug I have seen so far from programmers that
+have worked on the Samba4 code base. The reason this is vital is that
+different talloc contexts have vastly different lifetimes, so if you
+use a talloc context that has a long lifetime (such as one associated
+with a tree connection) for data that is very short lived (such as
+parsing an individual packet) then you have just introduced a huge
+memory leak.
+
+In fact, it is quite common that the correct thing to do is to create
+a new talloc context for some little function and then destroy it when
+you are done. That will give you a memory context that has exactly the
+right lifetime.
+
+You should also go and look at a new talloc function in Samba4 called
+talloc_steal(). By using talloc_steal() you can move a lump of memory
+from one memory context to another without copying the data. This
+should be used when a backend function (such as a packet parser)
+produces a result as a lump of talloc memory and you need to keep it
+around for a longer lifetime than the talloc context it is in. You
+just "steal" the memory from the short-lived context, putting it into
+your long lived context.
+
+
+Interface Structures
+--------------------
+
+One of the biggest changes in Samba4 is the universal use of interface
+structures. Go take a look through include/smb_interfaces.h now to get
+an idea of what I am talking about.
+
+In Samba3 many of the core wire structures in the SMB protocol were
+never explicitly defined in Samba. Instead, our parse and generation
+functions just worked directly with wire buffers. The biggest problem
+with this is that is tied our parse code with out "business logic"
+much too closely, which meant the code got extremely confusing to
+read.
+
+In Samba4 we have explicitly defined interface structures for
+everything in the protocol. When we receive a buffer we always parse
+it completely into one of these structures, then we pass a pointer to
+that structure to a backend handler. What we must *not* do is make any
+decisions about the data inside the parse functions. That is critical
+as different backends will need different portions of the data. This
+leads to a golden rule for Samba4:
+
+ "don't design interfaces that lose information"
+
+In Samba3 our backends often received "condensed" versions of the
+information sent from clients, but this inevitably meant that some
+backends could not get at the data they needed to do what they wanted,
+so from now on we should expose the backends to all of the available
+information and let them choose which bits they want.
+
+Ok, so now some of you will be thinking "this sounds just like our
+msrpc code from Samba3", and while to some extent this is true there
+are extremely important differences in the approach that are worth
+pointing out.
+
+In the Samba3 msrpc code we used explicit parse strucrures for all
+msrpc functions. The problem is that we didn't just put all of the
+real variables in these structures, we also put in all the artifacts
+as well. A good example is the security descriptor strucrure that
+looks like this in Samba3:
+
+typedef struct security_descriptor_info
+{
+ uint16 revision;
+ uint16 type;
+
+ uint32 off_owner_sid;
+ uint32 off_grp_sid;
+ uint32 off_sacl;
+ uint32 off_dacl;
+
+ SEC_ACL *dacl;
+ SEC_ACL *sacl;
+ DOM_SID *owner_sid;
+ DOM_SID *grp_sid;
+} SEC_DESC;
+
+The problem with this structure is all the off_* variables. Those are
+not part of the interface, and do not appear in any real descriptions
+of Microsoft security descriptors. They are parsing artifacts
+generated by the IDL compiler that Microsoft use. That doesn't mean
+they aren't needed on the wire - indeed they are as they tell the
+parser where to find the following four variables, but they should
+*NOT* be in the interface structure.
+
+In Samba3 there were unwritten rules about which variables in a
+strucrure a high level caller has to fill in and which ones are filled
+in by the marshalling code. In Samba4 those rules are gone, because
+the redundent artifact variables are gone. The high level caller just
+sets up the real variables and the marshalling code worries about
+generating the right offsets.
+
+The same rule applies to strings. In many places in the SMB and MSRPC
+protocols complex strings are used on the wire, with complex rules
+about padding, format, alighment, termination etc. None of that
+information is useful to a high level calling routine or to a backend
+- its all just so much wire fluff. So, in Samba4 these strings are
+just "char *" and are always in our internal multi-byte format (which
+is usually UTF8). It is up to the parse functions to worry about
+translating the format and getting the padding right.
+
+The one exception to this is the use of the WIRE_STRING type, but that
+has a very good justification in terms of regression testing. Go and
+read the comment in smb_interfaces.h about that now.
+
+So, here is another rule to code by. When writing an interface
+structure think carefully about what variables in the structure can be
+left out as they are redundent. If some length is effectively defined
+twice on the wire then only put it once in the packet. If a length can
+be inferred from a null termination then do that and leave the length
+out of the structure completely. Don't put redundent stuff in
+structures!
+
+
+Async Design
+------------
+
+Samba4 has an asynchronous design. That affects *lots* of the code,
+and the implications of the asynchronous design needs to be considered
+just about everywhere.
+
+The first aspect of the async design to look at is the SMB client
+library. Lets take a look at the following three functions in
+libcli/raw/rawfile.c:
+
+struct cli_request *smb_raw_seek_send(struct cli_tree *tree, struct smb_seek *parms);
+NTSTATUS smb_raw_seek_recv(struct cli_request *req, struct smb_seek *parms);
+NTSTATUS smb_raw_seek(struct cli_tree *tree, struct smb_seek *parms);
+
+Go and read them now then come back.
+
+Ok, first notice there there are 3 separate functions, whereas the
+equivalent code in Samba3 had just one. Also note that the 3rd
+function is extremely simple - its just a wrapper around calling the
+first two in order.
+
+The three separate functions are needed because we need to be able to
+generate SMB calls asynchronously. The first call, which for smb calls
+is always called smb_raw_XXXX_send(), constructs and sends a SMB
+request and returns a "struct cli_request" which acts as a handle for
+the request. The caller is then free to do lots of other calls if it
+wants to, then when it is ready it can call the smb_raw_XXX_recv()
+function to receive the reply.
+
+If all you want is a synchronous call then call the 3rd interface, the
+one called smb_raw_XXXX(). That just calls the first two in order, and
+blocks waiting for the reply.
+
+But what if you want to be called when the reply comes in? Yes, thats
+possible. You can do things like this:
+
+ struct cli_request *req;
+
+ req = smb_raw_XXX_send(tree, params);
+
+ req->async.fn = my_callback;
+ req->async.private = my_private_data;
+
+then in your callback function you can call the smb_raw_XXXX_recv()
+function to receive the reply. Your callback will receive the "req"
+pointer, which you can use to retrieve your private data.
+
+Then all you need to do is ensure that the main loop in the client
+library gets called. You can either do that by polling the connection
+using cli_transport_pending() and cli_request_receive_next() or you
+can use transport->idle.func to setup an idle function handler to call
+back to your main code. Either way, you can build a fully async
+application.
+
+In order to support all of this we have to make sure that when we
+write a piece of library code (SMB, MSRPC etc) that we build the
+separate _send() and _recv() functions. It really is worth the effort.
+
+Now about async in smbd, a much more complex topic.
+
+The SMB protocol is inherently async. Some functions (such as change
+notify) often don't return for hours, while hundreds of other
+functions pass through the socket. Take a look at the RAW-MUX test in
+the Samba4 smbtorture to see some really extreme examples of the sort
+of async operations that Windows supports.
+
+In Samba3 we handled this stuff very badly. We had awful "pending
+request" queues that allocated full 128k packet buffers, and even with
+all that crap we got the semantics wrong. In Samba4 I intend to make
+sure we get this stuff right.
+
+So, how do we do this? We now an async interface between smbd and the
+NTVFS backends. Whenever smbd calls into a backend the backend has an
+option of answer the request in a synchronous fashion if it wants to
+just like in Samba3, but it also has the option of answering the
+request asynchronously. The only backend that currently does this is
+the CIFS backend, but I hope the other backends will soon do this to.
+
+To make this work you need to do things like this in the backend:
+
+ req->control_flags |= REQ_CONTROL_ASYNC;
+
+that tells smbd that the backend has elected to reply later rather
+than replying immediately. The backend must *only* do this if
+req->async.send_fn is not NULL. If send_fn is NULL then it means that
+smbd cannot handle this function being replied to in an async fashion.
+
+If the backend does this then it is up to the backend to call
+req->async.send_fn() when it is ready to reply. It the meantime smbd
+puts the call on hold and goes back to answering other requests on the
+socket.
+
+Inside smbd you will find that there is code to support this. The most
+obvious change is that smbd splits each SMB reply function into two
+parts - just like the client library has a _send() and _recv()
+function, so smbd has a _send() function and the parse function for
+each SMB.
+
+Go and have a look at reply_getatr_send() and reply_getatr() in
+smbd/reply.c. Read them? Good.
+
+Notice that reply_getatr() sets up the req->async structure to contain
+the send function. Thats how the backend gets to do an async
+reply. Also notice that reply_getatr() only does the parsing of the
+request, and does not to the reply generation. That is done by the
+_send() function. Nice and simple really.
+
+
+
+MSRPC
+-----
+
+
+
+ - ntvfs
+ - testing
+ - command line handling
+ - libcli structure
+ - posix reliance
+ - uid/gid handling
+ - process models
+ - static data
+ - msrpc
+
+
+
+
+GMT vs TZ in printout of QFILEINFO timezones
+
+put in full UNC path in tconx
+
+test timezone handling by using a server in different zone from client
+
+don't just use any old TALLOC_CTX, use the right one!
+
+do {} while (0) system
+
+NT_STATUS_IS_OK() is NOT the opposite of NT_STATUS_IS_ERR()
+
+need to implement secondary parts of trans2 and nttrans in server and
+client
+
+add talloc_steal() to move a talloc ptr from one pool to another
+
+document access_mask in openx reply
+
+check all capabilities and flag1, flag2 fields (eg. EAs)
+
+large files -> pass thru levels
+
+setpathinfo is very fussy about null termination of the file name
+
+the overwrite flag doesn't seem to work on setpathinfo RENAME_INFORMATION
+
+END_OF_FILE_INFORMATION and ALLOCATION_INFORMATION don't seem to work
+via setpathinfo
+
+on w2k3 setpathinfo DISPOSITION_INFORMATION fails, but does have an
+effect. It leaves the file with SHARING_VIOLATION.
+
+on w2k3 trans2 setpathinfo with any invalid low numbered level causes
+the file to get into a state where DELETE_PENDING is reported, and the
+file cannot be deleted until you reboot
+
+trans2 qpathinfo doesn't see the delete_pending flag correctly, but
+qfileinfo does!
+
+get rid of pstring, fstring, strtok
+
+add programming documentation note about lp_set_cmdline()
+
+need to add a wct checking function in all client parsing code,
+similar to REQ_CHECK_WCT()
+
+need to make sure that NTTIME is a round number of seconds when
+converted from time_t
+
+not using a zero next offset in SMB_FILE_STREAM_INFORMATION for last
+entry causes explorer exception under win2000
+
+
+if the server sets the session key the same for a second SMB socket as
+an initial socket then the client will not re-authenticate, it will go
+straight to a tconx, skipping session setup and will use all the
+existing parameters! This allows two sockets with the same keys!?
+
+
+removed blocking lock code, we now queue the whole request the same as
+we queue any other pending request. This allows for things like a
+close() while a pending blocking lock is being processed to operate
+sanely.
+
+disabled change notify code
+
+disabled oplock code
+
+
+
+MILESTONES
+==========
+
+
+client library and test code
+----------------------------
+
+ convert client library to new structure
+ get smbtorture working
+ get smbclient working
+ expand client library for all requests
+ write per-request test suite
+ gentest randomised test suite
+ separate client code as a library for non-Samba use
+
+server code
+-----------
+ add remaining core SMB requests
+ add IPC layer
+ add nttrans layer
+ add rpc layer
+ fix auth models (share, server, rpc)
+ get net command working
+ connect CIFS backend to server level auth
+ get nmbd working
+ get winbindd working
+ reconnect printing code
+ restore removed smbd options
+ add smb.conf macro substitution code
+ add async backend notification
+ add generic timer event mechanism
+
+clustering code
+---------------
+
+ write CIFS backend
+ new server models (break 1-1)
+ test clustered models
+ add fulcrum statistics gathering
+
+docs
+----
+
+ conference paper
+ developer docs