<chapter id="internals">
<chapterinfo>
	<author>
		<firstname>David</firstname><surname>Chappell</surname>
		<affiliation>
			<address><email>David.Chappell@mail.trincoll.edu</email></address>
		</affiliation>
	</author>
	<pubdate>8 May 1996</pubdate>
</chapterinfo>

<title>Samba Internals</title>

<sect1>
<title>Character Handling</title>
<para>
This section describes character set handling in Samba, as implemented in
Samba 3.0 and above.
</para>

<para>
In the past Samba had very ad-hoc character set handling. Scattered
throughout the code were numerous calls which converted particular
strings to/from DOS codepages. The problem was that there was no way of
telling whether a particular char* was in the DOS codepage or the unix
codepage. This led to a nightmare of code that tried to cope with
particular cases without handling the general case.
</para>
</sect1>

<sect1>
<title>The new functions</title>

<para>
The new system works like this:
</para>

<orderedlist>
<listitem><para>
	all char* strings inside Samba are "unix" strings. These are
	multi-byte strings that are in the charset defined by the "unix
	charset" option in smb.conf. 
</para></listitem>

<listitem><para>
	there is no single fixed character set for unix strings, but any
	character set that is used does need the following properties:
	</para>
	<orderedlist>
	
	<listitem><para>
		must not contain NUL bytes except for termination
	</para></listitem>

	<listitem><para>
		must be 7-bit compatible with C strings, so that a constant
		string or character in C will be byte-for-byte identical to the
		equivalent string in the chosen character set. 
	</para></listitem>
	
	<listitem><para>
		when you uppercase or lowercase a string it does not become
		longer than the original string
	</para></listitem>

	<listitem><para>
		must be able to correctly hold all characters that your client
		will throw at it
	</para></listitem>
	</orderedlist>
	
	<para>
	For example, UTF-8 is fine, and most multi-byte Asian character sets
	are fine, but UCS2 could not be used for unix strings because UCS2
	strings contain NUL bytes.
	</para>
</listitem>

<listitem><para>
	when you need to put a string into a buffer that will be sent on the
	wire, or you need a string in a character set format that is
	compatible with the client's character set, then you need to use a
	pull_ or push_ function. The pull_ functions pull a string from a
	wire buffer into a (multi-byte) unix string. The push_ functions
	push a string out to a wire buffer. 
</para></listitem>

<listitem><para>
	the two main pull_ and push_ functions you need to understand are
	pull_string and push_string. These functions take a base pointer
	that should point at the start of the SMB packet that the string is
	in. The functions will check the flags field in this packet to
	automatically determine if the packet is marked as a unicode packet,
	and they will choose whether to use unicode for this string based on
	that flag. You may also force this decision using the STR_UNICODE or
	STR_ASCII flags. For use in smbd/ and libsmb/ there are wrapper
	functions clistr_ and srvstr_ that call the pull_/push_ functions
	with the appropriate first argument.
	</para>
	
	<para>
	You may also call the pull_ascii/pull_ucs2 or push_ascii/push_ucs2
	functions if you know that a particular string is ASCII or
	unicode. There are also a number of other convenience functions in
	charcnv.c that call the pull_/push_ functions with particularly
	common arguments, such as pull_ascii_pstring().
	</para>
</listitem>

<listitem><para>
	The biggest thing to remember is that internal (unix) strings in Samba
	may now contain multi-byte characters. This means you cannot assume
	that characters are always 1 byte long. Often this means that you will
	have to convert strings to ucs2 and back again in order to do some
	(seemingly) simple task. For examples of how to do this see functions
	like strchr_m(). I know this is very slow, and we will eventually
	speed it up but right now we want this stuff correct not fast.
</para></listitem>

<listitem><para>
	all lp_ functions now return unix strings. The magic "DOS" flag on
	parameters is gone.
</para></listitem>

<listitem><para>
	all vfs functions take unix strings. Don't convert when passing to them.
</para></listitem>

</orderedlist>
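<para>
To see why byte-wise string handling breaks on multi-byte unix strings,
consider Shift-JIS, where the second byte of a two-byte character can have
the same value as an ASCII backslash.  The sketch below is not how Samba
solves the problem (as noted above, functions like strchr_m() convert to
ucs2 and back); it is a simpler, character-set-specific illustration.
sjis_is_lead_byte() and sjis_strchr() are hypothetical names, and the lead
byte ranges are those commonly used for Shift-JIS.
</para>

<para>
<programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical helper: true if c is the first byte of a two-byte
 * Shift-JIS character (ranges 0x81-0x9F and 0xE0-0xFC). */
static int sjis_is_lead_byte(unsigned char c)
{
	return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical multi-byte aware search for the single-byte character c
 * in the NUL-terminated Shift-JIS string s.  The second byte of every
 * two-byte character is skipped, so it can never produce a false
 * match.  Returns NULL if c is not found. */
static const char *sjis_strchr(const char *s, char c)
{
	while (*s != '\0') {
		if (sjis_is_lead_byte((unsigned char)*s)) {
			if (s[1] == '\0')
				break;	/* truncated character */
			s += 2;
			continue;
		}
		if (*s == c)
			return s;
		s++;
	}
	return NULL;
}
```
]]></programlisting>
</para>

<para>
A plain strchr() searching for a backslash would stop at the second byte
of a character encoded as 0x95 0x5C, which is how path separators end up
being found in the middle of a file name.
</para>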

</sect1>

<sect1>
<title>Macros in byteorder.h</title>

<para>
This section describes the macros defined in byteorder.h.  These macros 
are used extensively in the Samba code.
</para>

<sect2>
<title>CVAL(buf,pos)</title>

<para>
returns the byte at offset pos within buffer buf as an unsigned character.
</para>
</sect2>

<sect2>
<title>PVAL(buf,pos)</title>
<para>returns the value of CVAL(buf,pos) cast to type unsigned integer.</para>
</sect2>

<sect2>
<title>SCVAL(buf,pos,val)</title>
<para>sets the byte at offset pos within buffer buf to value val.</para>
</sect2>

<sect2>
<title>SVAL(buf,pos)</title>
<para>
	returns the value of the unsigned short (16 bit) little-endian integer at 
	offset pos within buffer buf.  An integer of this type is sometimes
	referred to as "USHORT".
</para>
</sect2>

<sect2>
<title>IVAL(buf,pos)</title>
<para>returns the value of the unsigned 32 bit little-endian integer at offset 
pos within buffer buf.</para>
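<para>
The read macros described so far can be sketched as follows.  The
definitions below are simplified stand-ins written for this example and are
not copied from byteorder.h; example_buf and the two example_ functions are
hypothetical names, and unsigned int is assumed to be at least 32 bits wide.
</para>

<para>
<programlisting><![CDATA[
```c
/* Simplified stand-ins written for this example only.  The real
 * definitions live in byteorder.h and may differ. */
#define CVAL(buf,pos) (((const unsigned char *)(buf))[pos])
#define PVAL(buf,pos) ((unsigned int)CVAL(buf,pos))
#define SVAL(buf,pos) (PVAL(buf,pos) | PVAL(buf,(pos)+1) << 8)
#define IVAL(buf,pos) (SVAL(buf,pos) | SVAL(buf,(pos)+2) << 16)

/* Six bytes as they might appear in a packet: a 16 bit value followed
 * by a 32 bit value, both stored least significant byte first. */
static const unsigned char example_buf[6] = {
	0x34, 0x12,		/* 0x1234 */
	0x78, 0x56, 0x34, 0x12	/* 0x12345678 */
};

/* Reads the 16 bit field at offset 0. */
static unsigned int example_word(void)
{
	return SVAL(example_buf, 0);
}

/* Reads the 32 bit field at offset 2. */
static unsigned int example_dword(void)
{
	return IVAL(example_buf, 2);
}
```
]]></programlisting>
</para>

<para>
Because the value is assembled one byte at a time, the result does not
depend on the byte order or alignment rules of the host.
</para>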
</sect2>

<sect2>
<title>SVALS(buf,pos)</title>
<para>returns the value of the signed short (16 bit) little-endian integer at 
offset pos within buffer buf.</para>
</sect2>

<sect2>
<title>IVALS(buf,pos)</title>
<para>returns the value of the signed 32 bit little-endian integer at offset pos
within buffer buf.</para>
</sect2>

<sect2>
<title>SSVAL(buf,pos,val)</title>
<para>sets the unsigned short (16 bit) little-endian integer at offset pos within 
buffer buf to value val.</para>
</sect2>

<sect2>
<title>SIVAL(buf,pos,val)</title>
<para>sets the unsigned 32 bit little-endian integer at offset pos within buffer 
buf to the value val.</para>
</sect2>

<sect2>
<title>SSVALS(buf,pos,val)</title>
<para>sets the short (16 bit) signed little-endian integer at offset pos within 
buffer buf to the value val.</para>
</sect2>

<sect2>
<title>SIVALS(buf,pos,val)</title>
<para>sets the signed 32 bit little-endian integer at offset pos within buffer
buf to the value val.</para>
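<para>
A round trip through the set and get macros might look like the sketch
below.  As before, the macro bodies are simplified stand-ins written for
this example rather than the definitions in byteorder.h, and
roundtrip_ok() is a hypothetical helper.  The signed variants assume the
usual two's complement representation.
</para>

<para>
<programlisting><![CDATA[
```c
#include <stdint.h>

/* Simplified stand-ins written for this example only.  The real
 * definitions live in byteorder.h and may differ. */
#define CVAL(buf,pos) (((const unsigned char *)(buf))[pos])
#define PVAL(buf,pos) ((unsigned int)CVAL(buf,pos))
#define SCVAL(buf,pos,val) \
	(((unsigned char *)(buf))[pos] = (unsigned char)(val))
#define SVAL(buf,pos) (PVAL(buf,pos) | PVAL(buf,(pos)+1) << 8)
#define IVAL(buf,pos) (SVAL(buf,pos) | SVAL(buf,(pos)+2) << 16)
#define SVALS(buf,pos) ((int16_t)SVAL(buf,pos))
#define IVALS(buf,pos) ((int32_t)IVAL(buf,pos))
#define SSVAL(buf,pos,val) \
	(SCVAL(buf,pos,(unsigned int)(val) & 0xFF), \
	 SCVAL(buf,(pos)+1,((unsigned int)(val) >> 8) & 0xFF))
#define SIVAL(buf,pos,val) \
	(SSVAL(buf,pos,(unsigned int)(val) & 0xFFFF), \
	 SSVAL(buf,(pos)+2,(unsigned int)(val) >> 16))
#define SSVALS(buf,pos,val) SSVAL(buf,pos,val)
#define SIVALS(buf,pos,val) SIVAL(buf,pos,val)

/* Stores a signed 16 bit value and an unsigned 32 bit value, then
 * checks that both the raw bytes and the values read back match. */
static int roundtrip_ok(void)
{
	unsigned char buf[6];

	SSVALS(buf, 0, -2);	/* bytes 0xFE 0xFF */
	SIVAL(buf, 2, 100000);	/* bytes 0xA0 0x86 0x01 0x00 */

	return buf[0] == 0xFE && buf[1] == 0xFF &&
	       buf[2] == 0xA0 && buf[3] == 0x86 &&
	       SVALS(buf, 0) == -2 &&
	       IVAL(buf, 2) == 100000u;
}
```
]]></programlisting>
</para>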
</sect2>

<sect2>
<title>RSVAL(buf,pos)</title>
<para>returns the value of the unsigned short (16 bit) big-endian integer at 
offset pos within buffer buf.</para>
</sect2>

<sect2>
<title>RIVAL(buf,pos)</title>
<para>returns the value of the unsigned 32 bit big-endian integer at offset 
pos within buffer buf.</para>
</sect2>

<sect2>
<title>RSSVAL(buf,pos,val)</title>
<para>sets the value of the unsigned short (16 bit) big-endian integer at 
offset pos within buffer buf to value val.</para>
</sect2>

<sect2>
<title>RSIVAL(buf,pos,val)</title>
<para>sets the value of the unsigned 32 bit big-endian integer at offset 
pos within buffer buf to value val.</para>
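<para>
The big-endian variants differ from the little-endian ones only in the
order in which the bytes are combined.  The sketch below uses simplified
stand-in definitions written for this example; the definitions in
byteorder.h may be built differently.  be_bytes is a hypothetical buffer.
</para>

<para>
<programlisting><![CDATA[
```c
/* Simplified stand-ins written for this example only.  The real
 * definitions live in byteorder.h and may differ. */
#define CVAL(buf,pos) (((const unsigned char *)(buf))[pos])
#define PVAL(buf,pos) ((unsigned int)CVAL(buf,pos))
#define SCVAL(buf,pos,val) \
	(((unsigned char *)(buf))[pos] = (unsigned char)(val))
#define SVAL(buf,pos) (PVAL(buf,pos) | PVAL(buf,(pos)+1) << 8)
#define RSVAL(buf,pos) (PVAL(buf,pos) << 8 | PVAL(buf,(pos)+1))
#define RIVAL(buf,pos) (RSVAL(buf,pos) << 16 | RSVAL(buf,(pos)+2))
#define RSSVAL(buf,pos,val) \
	(SCVAL(buf,pos,((unsigned int)(val) >> 8) & 0xFF), \
	 SCVAL(buf,(pos)+1,(unsigned int)(val) & 0xFF))
#define RSIVAL(buf,pos,val) \
	(RSSVAL(buf,pos,(unsigned int)(val) >> 16), \
	 RSSVAL(buf,(pos)+2,(unsigned int)(val) & 0xFFFF))

/* The same four bytes give different values depending on which
 * family of macros reads them. */
static const unsigned char be_bytes[4] = { 0x12, 0x34, 0x56, 0x78 };

/* Big-endian read: most significant byte first. */
static unsigned int read_be16(void)
{
	return RSVAL(be_bytes, 0);	/* 0x1234 */
}

/* Little-endian read of the same two bytes. */
static unsigned int read_le16(void)
{
	return SVAL(be_bytes, 0);	/* 0x3412 */
}
```
]]></programlisting>
</para>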
</sect2>

</sect1>


<sect1>
<title>LAN Manager Samba API</title>

<para>
This section describes the functions needed to make a LAN Manager RPC call.
This information has been obtained by examining the Samba code and the LAN
Manager 2.0 API documentation.  It should not be considered entirely
reliable.
</para>

<para>
<programlisting>
call_api(int prcnt, int drcnt, int mprcnt, int mdrcnt, 
	char *param, char *data, char **rparam, char **rdata);
</programlisting>
</para>

<para>
This function is defined in client.c.  It uses an SMB transaction to call a
remote api.
</para>

<sect2>
<title>Parameters</title>

<para>The parameters are as follows:</para>

<orderedlist>
<listitem><para>
	prcnt:   the number of bytes of parameters being sent.
</para></listitem>
<listitem><para>
	drcnt:   the number of bytes of data being sent.
</para></listitem>
<listitem><para>
	mprcnt:  the maximum number of bytes of parameters which should be returned.
</para></listitem>
<listitem><para>
	mdrcnt:  the maximum number of bytes of data which should be returned.
</para></listitem>
<listitem><para>
	param:   a pointer to the parameters to be sent.
</para></listitem>
<listitem><para>
	data:    a pointer to the data to be sent.
</para></listitem>
<listitem><para>
	rparam:  a pointer to a pointer which will be set to point to the returned
	parameters.  The caller of call_api() must deallocate this memory.
</para></listitem>
<listitem><para>
	rdata:   a pointer to a pointer which will be set to point to the returned 
	data.  The caller of call_api() must deallocate this memory.
</para></listitem>
</orderedlist>

<para>
These are the parameters which you ought to send, in the order of their
appearance in the parameter block:
</para>

<orderedlist>

<listitem><para>
An unsigned 16 bit integer API number.  You should set this value with
SSVAL().  I do not know where these numbers are described.
</para></listitem>

<listitem><para>
An ASCIIZ string describing the parameters to the API function as defined
in the LAN Manager documentation.  The first parameter, which is the server
name, is omitted.  This string is based upon the API function as described
in the manual, not the data which is actually passed.
</para></listitem>

<listitem><para>
An ASCIIZ string describing the data structure which ought to be returned.
</para></listitem>

<listitem><para>
Any parameters which appear in the function call, as defined in the LAN
Manager API documentation, after the "Server" and up to and including the
"uLevel" parameters.
</para></listitem>

<listitem><para>
An unsigned 16 bit integer which gives the size in bytes of the buffer we
will use to receive the returned array of data structures.  Presumably this
should be the same as mdrcnt.  This value should be set with SSVAL().
</para></listitem>

<listitem><para>
An ASCIIZ string describing substructures which should be returned.  If no 
substructures apply, this string is of zero length.
</para></listitem>

</orderedlist>
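<para>
The layout above can be sketched as a helper that fills a parameter block
for an API taking only a "uLevel" after the server name.  This is an
illustration under stated assumptions, not code from client.c:
build_param_block() is a hypothetical helper, SSVAL() is a simplified
stand-in for the macro in byteorder.h, and the API number and descriptor
strings passed by a caller have to be taken from the LAN Manager
documentation.
</para>

<para>
<programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the byteorder.h macro of the same name. */
#define SSVAL(buf,pos,val) \
	(((unsigned char *)(buf))[pos] = (unsigned char)((val) & 0xFF), \
	 ((unsigned char *)(buf))[(pos)+1] = (unsigned char)(((val) >> 8) & 0xFF))

/* Hypothetical helper: writes the six fields in the order listed above.
 * Returns the number of bytes written (the value to pass as prcnt),
 * or 0 if the buffer is too small. */
static size_t build_param_block(unsigned char *param, size_t size,
				unsigned int api_number,
				const char *param_desc,
				const char *data_desc,
				unsigned int ulevel,
				unsigned int recv_buf_len,
				const char *aux_desc)
{
	size_t plen = strlen(param_desc) + 1;	/* include the NUL */
	size_t dlen = strlen(data_desc) + 1;
	size_t alen = strlen(aux_desc) + 1;
	unsigned char *p = param;

	if (2 + plen + dlen + 2 + 2 + alen > size)
		return 0;

	SSVAL(p, 0, api_number);		/* 1: API number */
	p += 2;
	memcpy(p, param_desc, plen);		/* 2: parameter descriptor */
	p += plen;
	memcpy(p, data_desc, dlen);		/* 3: data descriptor */
	p += dlen;
	SSVAL(p, 0, ulevel);			/* 4: uLevel */
	SSVAL(p, 2, recv_buf_len);		/* 5: receive buffer size */
	p += 4;
	memcpy(p, aux_desc, alen);		/* 6: substructure descriptor */
	p += alen;

	return (size_t)(p - param);
}
```
]]></programlisting>
</para>

<para>
A caller would pass the returned length as prcnt and, following the
description above, use the same receive buffer size for mdrcnt.
</para>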

<para>
The code in client.c always calls call_api() with no data.  It is unclear
when a non-zero length data buffer would be sent.
</para>

</sect2>

<sect2>
<title>Return value</title>

<para>
The returned parameters (pointed to by rparam), in their order of appearance
are:</para>

<orderedlist>

<listitem><para>
An unsigned 16 bit integer which contains the API function's return code. 
This value should be read with SVAL().
</para></listitem>

<listitem><para>
An adjustment which tells the amount by which pointers in the returned
data should be adjusted.  This value should be read with SVAL().  Basically, 
a usable address is obtained by taking the address of the start of the
returned data buffer, adding the returned pointer value to it, and then
subtracting this adjustment from the result.
</para></listitem>

<listitem><para>
A count of the number of elements in the array of structures returned. 
It is also possible that this may sometimes be the number of bytes returned.
</para></listitem>
</orderedlist>

<para>
When call_api() returns, rparam points to the returned parameters.  The
first of these is the result code.  It will be zero if the API call
succeeded.  This value may be read with "SVAL(rparam,0)".
</para>

<para>
The second parameter may be read as "SVAL(rparam,2)".  It is a 16 bit offset
which indicates what the base address of the returned data buffer was when
it was built on the server.  It should be used to correct pointers before
use.
</para>

<para>
The returned data buffer contains the array of returned data structures. 
Note that all pointers must be adjusted before use.  The function
fix_char_ptr() in client.c can be used for this purpose.
</para>

<para>
The third parameter (which may be read as "SVAL(rparam,4)") has something to
do with indicating the amount of data returned or possibly the amount of
data which can be returned if enough buffer space is allowed.
</para>
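<para>
Reading the three returned parameters and correcting a pointer could look
roughly like the sketch below.  It is not the fix_char_ptr() code from
client.c: struct api_result, read_rparam() and adjust_ptr() are
hypothetical names, SVAL() is a simplified stand-in for the byteorder.h
macro, and using only the low 16 bits of the returned pointer value is an
assumption made for this example.
</para>

<para>
<programlisting><![CDATA[
```c
#include <stddef.h>

/* Simplified stand-ins for the byteorder.h macros of the same name. */
#define CVAL(buf,pos) (((const unsigned char *)(buf))[pos])
#define PVAL(buf,pos) ((unsigned int)CVAL(buf,pos))
#define SVAL(buf,pos) (PVAL(buf,pos) | PVAL(buf,(pos)+1) << 8)

/* Hypothetical container for the three returned parameters. */
struct api_result {
	unsigned int status;	/* zero if the API call succeeded */
	unsigned int converter;	/* adjustment to apply to pointers */
	unsigned int count;	/* element count (or possibly byte count) */
};

static struct api_result read_rparam(const unsigned char *rparam)
{
	struct api_result r;

	r.status = SVAL(rparam, 0);
	r.converter = SVAL(rparam, 2);
	r.count = SVAL(rparam, 4);
	return r;
}

/* Turns a pointer value found in the returned data into an address
 * inside rdata.  Returns NULL for a null pointer value or for an
 * offset that falls outside the rdrcnt bytes of returned data. */
static const char *adjust_ptr(unsigned int datap, unsigned int converter,
			      const char *rdata, size_t rdrcnt)
{
	unsigned int low = datap & 0xFFFF;	/* assumption: low 16 bits */

	if (datap == 0 || low < converter)
		return NULL;
	if (low - converter >= rdrcnt)
		return NULL;
	return rdata + (low - converter);
}
```
]]></programlisting>
</para>

<para>
Checking the offset against the amount of data actually returned keeps a
malformed reply from producing a pointer outside the buffer.
</para>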

</sect2>
</sect1>

<sect1>
<title>Code character table</title>
<para>
Certain data structures are described by means of ASCIIZ strings containing
code characters.  These are the code characters:
</para>

<orderedlist>
<listitem><para>
W	a two byte little-endian unsigned integer
</para></listitem>
<listitem><para>
N	a count of substructures which follow
</para></listitem>
<listitem><para>
D	a four byte little-endian unsigned integer
</para></listitem>
<listitem><para>
B	a byte (with optional count expressed as trailing ASCII digits)
</para></listitem>
<listitem><para>
z	a four byte offset to a NUL-terminated string
</para></listitem>
<listitem><para>
l	a four byte offset to non-string user data
</para></listitem>
<listitem><para>
b	an offset to data (with count expressed as trailing ASCII digits)
</para></listitem>
<listitem><para>
r	pointer to returned data buffer???
</para></listitem>
<listitem><para>
L	length in bytes of returned data buffer???
</para></listitem>
<listitem><para>
h	number of bytes of information available???
</para></listitem>
</orderedlist>
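<para>
As an illustration of how such a descriptor string maps onto a structure,
the sketch below adds up the size of the fixed-length part of a structure.
descriptor_size() is a hypothetical helper, not a Samba function.  It only
understands the code characters whose width is stated in the table above
(W as a two byte integer, D, B, z and l) and gives up on anything else.
</para>

<para>
<programlisting><![CDATA[
```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: returns the size in bytes of the fixed-length
 * structure described by desc, or -1 if desc contains a code character
 * this sketch does not handle. */
static long descriptor_size(const char *desc)
{
	long total = 0;

	while (*desc != '\0') {
		char code = *desc++;
		long count = 1;

		switch (code) {
		case 'W':		/* two byte integer */
			total += 2;
			break;
		case 'D':		/* four byte integer */
		case 'z':		/* four byte offset to a string */
		case 'l':		/* four byte offset to user data */
			total += 4;
			break;
		case 'B':		/* byte, optional trailing count */
			if (isdigit((unsigned char)*desc)) {
				char *end;

				count = strtol(desc, &end, 10);
				desc = end;
			}
			total += count;
			break;
		default:
			return -1;
		}
	}
	return total;
}
```
]]></programlisting>
</para>

<para>
For a descriptor such as "B13BWz" this counts 13 bytes, one more byte, a
two byte integer and a four byte string offset.
</para>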

</sect1>
</chapter>