<chapter id="speed">

<chapterinfo>
	<author>
		<affiliation>
			<orgname>Samba Team</orgname>
			<address><email>samba@samba.org</email></address>
		</affiliation>
	</author>
	<author>
		<firstname>Paul</firstname><surname>Cochrane</surname>
		<affiliation>
			<orgname>Dundee Limb Fitting Centre</orgname>
			<address><email>paulc@dth.scot.nhs.uk</email></address>
		</affiliation>
	</author>
</chapterinfo>

<title>Samba performance issues</title>

<sect1>
<title>Comparisons</title>

<para>
The Samba server uses TCP to talk to the client. Thus if you are
trying to see if it performs well you should really compare it to
programs that use the same protocol. The most readily available
programs for file transfer that use TCP are ftp or another TCP based
SMB server.
</para>

<para>
If you want to test against something like an NT or WfWg server then
you will have to disable all but TCP on either the client or
server. Otherwise you may well be using a totally different protocol
(such as NetBEUI) and comparisons may not be valid.
</para>

<para>
Generally you should find that Samba performs similarly to ftp at raw
transfer speed. It should perform quite a bit faster than NFS,
although this very much depends on your system.
</para>

<para>
Several people have done comparisons between Samba and Novell, NFS or
WinNT. In some cases Samba performed the best, in others the worst. I
suspect the biggest factor is not Samba vs some other system but the
hardware and drivers used on the various systems. Given similar
hardware Samba should certainly be competitive in speed with other
systems.
</para>

</sect1>

<sect1>
<title>Oplocks</title>

<sect2>
<title>Overview</title>

<para>
Oplocks are the way that SMB clients get permission from a server to
locally cache file operations. If a server grants an oplock
(opportunistic lock) then the client is free to assume that it is the
only one accessing the file and it will aggressively cache file
data. With some oplock types the client may even cache file open/close
operations. This can give enormous performance benefits.
</para>

<para>
With the release of Samba 1.9.18 we now correctly support opportunistic
locks. This is turned on by default, and can be turned off on a
share-by-share basis by setting the parameter:
</para>

<para>
<command>oplocks = False</command>
</para>

<para>
We recommend that you leave oplocks on, however, as current benchmark
tests with NetBench seem to give approximately a 30% improvement in
speed with them on. This is only an average, and the actual
improvement seen can be orders of magnitude greater, depending on
what the client redirector is doing.
</para>
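
<para>
As a minimal sketch of a share-by-share setting, the fragment below turns
oplocks off for a single share and leaves the default in place everywhere
else. The share name and path are hypothetical and only there for
illustration.
</para>

<programlisting>
# Hypothetical share; the name and path are assumptions
[accounts]
    path = /export/accounts
    oplocks = False
</programlisting>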

<para>
Prior to Samba 1.9.18 there was a 'fake oplocks' option. This
option has been left in the code for backwards compatibility reasons
but its use is now deprecated. A short summary of what the old
code did is given in the section on the 'fake oplocks' option below.
</para>

</sect2>

<sect2>
<title>Level2 Oplocks</title>

<para>
Samba 2.0.5 added support for a new capability, level2 (read only)
oplocks, although the option is off by default (see the smb.conf
man page for details). Turning on level2 oplocks (on a share-by-share basis)
by setting the parameter:
</para>

<para>
<command>level2 oplocks = true</command>
</para>

<para>
should speed concurrent access to files that are not commonly written
to, such as application serving shares (i.e. shares that contain common
.EXE files, such as a Microsoft Office share) as it allows clients to
read-ahead cache copies of these files.
</para>
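
<para>
A sketch of how this might look for an application serving share follows.
The share name and path are hypothetical, and marking the share read only
is an assumption made here to match the "not commonly written to" case.
</para>

<programlisting>
# Hypothetical application share; the name and path are assumptions
[apps]
    path = /export/apps
    read only = yes
    level2 oplocks = true
</programlisting>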

</sect2>

<sect2>
<title>Old 'fake oplocks' option - deprecated</title>

<para>
Samba can also fake oplocks, by granting an oplock whenever a client
asks for one. This is controlled using the smb.conf option "fake
oplocks". If you set "fake oplocks = yes" then you are telling the
client that it may aggressively cache the file data for all opens.
</para>

<para>
If you enable 'fake oplocks' on all read-only shares, or on shares that
you know will only be accessed from one client at a time, you will see a
big performance improvement on many operations. If you enable this option
on shares where multiple clients may be accessing the files read-write
at the same time you can get data corruption.
</para>
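
<para>
For completeness, the deprecated option would be set along the following
lines on a share that is only ever read. This is a sketch only; the share
name and path are hypothetical, and real oplocks are the better choice on
current versions.
</para>

<programlisting>
# Hypothetical read-only share; the name and path are assumptions
[cdrom]
    path = /mnt/cdrom
    read only = yes
# Deprecated: only safe because nothing on this share is written
    fake oplocks = yes
</programlisting>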

</sect2>
</sect1>

<sect1>
<title>Socket options</title>

<para>
There are a number of socket options that can greatly affect the
performance of a TCP based server like Samba.
</para>

<para>
The socket options that Samba uses are settable both on the command
line with the -O option and in the smb.conf file.
</para>

<para>
The "socket options" section of the smb.conf manual page describes how
to set these and gives recommendations.
</para>

<para>
Getting the socket options right can make a big difference to your
performance, but getting them wrong can degrade it by just as
much. The correct settings are very dependent on your local network.
</para>

<para>
The socket option TCP_NODELAY is the one that seems to make the
biggest single difference for most networks. Many people report that
adding "socket options = TCP_NODELAY" doubles the read performance of
a Samba drive. The best explanation I have seen for this is that the
Microsoft TCP/IP stack is slow in sending TCP ACKs.
</para>
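
<para>
In smb.conf the setting reported above would look like this. Whether it
helps on your network is something only testing can show.
</para>

<programlisting>
[global]
    socket options = TCP_NODELAY
</programlisting>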

</sect1>

<sect1>
<title>Read size</title>

<para>
The option "read size" affects the overlap of disk reads/writes with
network reads/writes. If the amount of data being transferred in
several of the SMB commands (currently SMBwrite, SMBwriteX and
SMBreadbraw) is larger than this value then the server begins writing
the data before it has received the whole packet from the network, or
in the case of SMBreadbraw, it begins writing to the network before
all the data has been read from disk.
</para>

<para>
This overlapping works best when the speeds of disk and network access
are similar, having very little effect when the speed of one is much
greater than the other.
</para>

<para>
The default value is 16384, but very little experimentation has been
done yet to determine the optimal value, and it is likely that the best
value will vary greatly between systems anyway. A value over 65536 is
pointless and will cause you to allocate memory unnecessarily.
</para>
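
<para>
If you want to experiment, a sketch of the setting follows. The value 8192
is an arbitrary trial value chosen for illustration, not a recommendation.
</para>

<programlisting>
[global]
# Default is 16384; values over 65536 only waste memory
    read size = 8192
</programlisting>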

</sect1>

<sect1>
<title>Max xmit</title>

<para>
At startup the client and server negotiate a "maximum transmit" size,
which limits the size of nearly all SMB commands. You can set the
maximum size that Samba will negotiate using the "max xmit = " option
in smb.conf. Note that this is the maximum size of SMB request that 
Samba will accept, but not the maximum size that the *client* will accept.
The client maximum receive size is sent to Samba by the client and Samba
honours this limit.
</para>

<para>
It defaults to 65536 bytes (the maximum), but it is possible that some
clients may perform better with a smaller transmit unit. Trying values
of less than 2048 is likely to cause severe problems.
</para>

<para>
In most cases the default is the best option.
</para>
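
<para>
Should you want to try a smaller transmit unit for a particular client,
the setting would look something like the following. The value 16384 is
an arbitrary trial value for illustration.
</para>

<programlisting>
[global]
# Default is 65536; do not go below 2048
    max xmit = 16384
</programlisting>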

</sect1>

<sect1>
<title>Locking</title>

<para>
By default Samba does not implement strict locking on each read/write
call (although it did in previous versions). If you enable strict
locking (using "strict locking = yes") then you may find that you
suffer a severe performance hit on some systems.
</para>

<para>
The performance hit will probably be greater on NFS mounted
filesystems, but could be quite high even on local disks.
</para>
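
<para>
If you do need strict locking, one way to limit the cost is to enable it
only on the shares that require it. The sketch below assumes a hypothetical
share name and path.
</para>

<programlisting>
# Hypothetical share; the name and path are assumptions
[database]
    path = /export/database
    strict locking = yes
</programlisting>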

</sect1>

<sect1>
<title>Share modes</title>

<para>
Some people find that opening files is very slow. This is often
because of the "share modes" code needed to fully implement DOS
share modes. You can disable this code using "share modes =
no". This will gain you a lot in opening and closing files but will
mean that (in some cases) the system won't force a second user of a
file to open the file read-only if the first has it open
read-write. For many applications that do their own locking this
doesn't matter, but for some it may. Most Windows applications
depend heavily on "share modes" working correctly and it is
recommended that the Samba share mode support be left at the
default of "on".
</para>

<para>
The share mode code in Samba has been re-written in the 1.9.17
release following tests with the Ziff-Davis NetBench PC Benchmarking
tool. It is now believed that Samba 1.9.17 implements share modes
similarly to Windows NT.
</para>

<para>
NOTE: In the most recent versions of Samba there is an option to use
shared memory via mmap() to implement the share modes. This makes
things much faster. See the Makefile for how to enable this.
</para>

</sect1>

<sect1>
<title>Log level</title>

<para>
If you set the log level (also known as "debug level") higher than 2
then you may suffer a large drop in performance. This is because the
server flushes the log file after each operation, which can be very
expensive. 
</para>
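
<para>
For normal running, a setting along these lines keeps logging cheap. The
value 1 is an illustrative choice; anything up to 2 stays below the
threshold described above.
</para>

<programlisting>
[global]
    log level = 1
</programlisting>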
</sect1>

<sect1>
<title>Wide links</title>

<para>
The "wide links" option is now enabled by default, but if you disable
it (for better security) then you may suffer a performance hit in
resolving filenames. The performance loss is lessened if you have
"getwd cache = yes", which is now the default.
</para>
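
<para>
A sketch of disabling wide links on one share while keeping the getwd
cache on is shown below. The share name and path are hypothetical.
</para>

<programlisting>
[global]
# Already the default; shown here for clarity
    getwd cache = yes

# Hypothetical share; the name and path are assumptions
[public]
    path = /export/public
    wide links = no
</programlisting>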

</sect1>

<sect1>
<title>Read raw</title>

<para>
The "read raw" operation is designed to be an optimised, low-latency
file read operation. A server may choose to not support it,
however, and Samba makes support for "read raw" optional, with it
being enabled by default.
</para>

<para>
In some cases clients don't handle "read raw" very well and actually
get lower performance using it than they get using the conventional
read operations. 
</para>

<para>
So you might like to try "read raw = no" and see what happens on your
network. It might lower, raise or not affect your performance. Only
testing can really tell.
</para>
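
<para>
The trial setting suggested above would be written as follows.
</para>

<programlisting>
[global]
# Enabled by default; turn off to compare on your own network
    read raw = no
</programlisting>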

</sect1>

<sect1>
<title>Write raw</title>

<para>
The "write raw" operation is designed to be an optimised, low-latency
file write operation. A server may choose to not support it,
however, and Samba makes support for "write raw" optional, with it
being enabled by default.
</para>

<para>
Some machines may find "write raw" slower than normal write, in which
case you may wish to change this option.
</para>
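
<para>
Changing the option means turning it off from its default, as in this
sketch.
</para>

<programlisting>
[global]
# Enabled by default; turn off if normal writes prove faster
    write raw = no
</programlisting>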

</sect1>

<sect1>
<title>Read prediction</title>

<para>
Samba can do read prediction on some of the SMB commands. Read
prediction means that Samba reads some extra data on the last file it
read while waiting for the next SMB command to arrive. It can then
respond more quickly when the next read request arrives.
</para>

<para>
This is disabled by default. You can enable it by using "read
prediction = yes".
</para>

<para>
Note that read prediction is only used on files that were opened read
only.
</para>

<para>
Read prediction should particularly help for those silly clients (such
as "Write" under NT) which do lots of very small reads on a file.
</para>

<para>
Samba will not read ahead more data than the amount specified in the
"read size" option. It always reads ahead on 1k block boundaries.
</para>
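
<para>
Putting the two options together, a sketch of a configuration that enables
read prediction might look like this. The "read size" line simply restates
the default from the "read size" section to show where the read-ahead
limit comes from.
</para>

<programlisting>
[global]
    read prediction = yes
# Upper bound on how much data is read ahead
    read size = 16384
</programlisting>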

</sect1>

<sect1>
<title>Memory mapping</title>

<para>
Samba supports reading files via memory mapping them. On some
machines this can give a large boost to performance, on others it
makes no difference at all, and on some it may reduce performance.
</para>

<para>
To enable it you have to recompile Samba with the -DUSE_MMAP option
on the FLAGS line of the Makefile.
</para>

<para>
Note that memory mapping is only used on files opened read only, and
is not used by the "read raw" operation. Thus you may find memory
mapping is more effective if you disable "read raw" using "read raw =
no".
</para>

</sect1>

<sect1>
<title>Slow Clients</title>

<para>
One person has reported that setting the protocol to COREPLUS rather
than LANMAN2 gave a dramatic speed improvement (from 10k/s to 150k/s).
</para>

<para>
I suspect that his PCs (386sx16 based) were asking for more data than
they could chew. I suspect a similar speed could be had by setting
"read raw = no" and "max xmit = 2048", instead of changing the
protocol. Lowering the "read size" might also help.
</para>
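
<para>
The two approaches can be sketched as follows. Neither has been verified
here; the second is only the guess made in the paragraph above, so treat
both as starting points for your own testing.
</para>

<programlisting>
[global]
# Approach 1: what was reported to work (commented out here)
;   protocol = COREPLUS

# Approach 2: keep the protocol, reduce how much data is sent at once
    read raw = no
    max xmit = 2048
</programlisting>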

</sect1>

<sect1>
<title>Slow Logins</title>

<para>
Slow logins are almost always due to the password checking time. Using
the lowest practical "password level" will improve things a lot. You
could also enable the "UFC crypt" option in the Makefile.
</para>

</sect1>

<sect1>
<title>Client tuning</title>

<para>
Often a speed problem can be traced to the client. The client (for
example Windows for Workgroups) can often be tuned for better TCP
performance.
</para>

<para>
See your client docs for details. In particular, I have heard rumours
that the WfWg options TCPWINDOWSIZE and TCPSEGMENTSIZE can have a
large impact on performance.
</para>

<para>
Also note that some people have found that setting DefaultRcvWindow in
the [MSTCP] section of the SYSTEM.INI file under WfWg to 3072 gives a
big improvement. I don't know why.
</para>

<para>
My own experience with DefaultRcvWindow is that I get much better
performance with a large value (16384 or larger). Other people have
reported that anything over 3072 slows things down enormously. One
person even reported a speed drop of a factor of 30 when he went from
3072 to 8192. I don't know why.
</para>

<para>
It probably depends a lot on your hardware, and the type of unix box
you have at the other end of the link.
</para>
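
<para>
For reference, the WfWg setting discussed above lives in SYSTEM.INI and
would look something like this. The value shown is one of the two figures
reported above; try both on your own hardware.
</para>

<programlisting>
; SYSTEM.INI on the WfWg client
[MSTCP]
; 16384 worked well in one report, 3072 in others
DefaultRcvWindow=16384
</programlisting>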

<para>
Paul Cochrane has done some testing on client side tuning and come 
to the following conclusions:
</para>

<para>
Install the W2setup.exe file from www.microsoft.com. This is an 
update for the winsock stack and utilities which improve performance.
</para>

<para>
Configure the Win95 TCP/IP registry settings to give better
performance. I use a program called MTUSPEED.exe which I got off the
net. There are various other utilities of this type freely available.
The settings which give the best performance for me are:
</para>

<orderedlist>
<listitem><para>
MaxMTU                  Remove
</para></listitem>
<listitem><para>
RWIN                    Remove
</para></listitem>
<listitem><para>
MTUAutoDiscover         Disable
</para></listitem>
<listitem><para>
MTUBlackHoleDetect      Disable
</para></listitem>
<listitem><para>
Time To Live            Enabled
</para></listitem>
<listitem><para>
Time To Live - HOPS     32
</para></listitem>
<listitem><para>
NDI Cache Size          0
</para></listitem>
</orderedlist>

<para>
I tried virtually all of the items mentioned in the document and 
the only one which made a difference to me was the socket options. It 
turned out I was better off without any!!!!!
</para>

<para>
In terms of overall speed of transfer between various Win95 clients
and a DX2-66 20MB server with a crappy NE2000 compatible card and an old
IDE drive (kernel 2.0.30), the transfer rate was reasonable for 10baseT.
</para>

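<para>
The figures are:
</para>

<informaltable frame="all">
<tgroup cols="3">
<thead>
<row><entry>Client</entry><entry>Put</entry><entry>Get</entry></row>
</thead>
<tbody>
<row><entry>P166 client, 3Com card</entry><entry>420-440kB/s</entry><entry>500-520kB/s</entry></row>
<row><entry>P100 client, 3Com card</entry><entry>390-410kB/s</entry><entry>490-510kB/s</entry></row>
<row><entry>DX4-75 client, NE2000</entry><entry>370-380kB/s</entry><entry>330-350kB/s</entry></row>
</tbody>
</tgroup>
</informaltable>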

<para>
I based these tests on transferring two files, a 4.5MB text file and a
15MB text file. The results aren't bad considering the hardware Samba is
running on. It's a crap machine!!!!
</para>

<para>
The two updates mentioned above (the Winsock update and the registry
settings) brought up the transfer rates from just over 100kB/s in some
clients.
</para>

<para>
A new client is a P333 connected via a 100Mb/s card and hub. The
transfer rates from this were good: 450-500kB/s on put and 600+kB/s 
on get.
</para>

<para>
Looking at standard FTP throughput, Samba is a bit slower (100kB/s 
upwards). I suppose there is more going on in the samba protocol, but 
if it could get up to the rate of FTP the performance would be quite
staggering.
</para>

</sect1>

<sect1>
<title>My Results</title>

<para>
Some people want to see real numbers in a document like this, so here
they are. I have a 486sx33 client running WfWg 3.11 with the 3.11b
TCP/IP stack. It has a slow IDE drive and 20MB of RAM. It has an SMC
Elite-16 ISA bus ethernet card. The only WfWg tuning I've done is to
set DefaultRcvWindow in the [MSTCP] section of system.ini to 16384. My
server is a 486dx3-66 running Linux. It also has 20MB of RAM and an SMC
Elite-16 card. You can see my server config in the examples/tridge/
subdirectory of the distribution.
</para>

<para>
I get 490k/s on reading an 8MB file with copy.
I get 441k/s writing the same file to the Samba server.
</para>

<para>
Of course, there's a lot more to benchmarks than 2 raw throughput
figures, but it gives you a ballpark figure.
</para>

<para>
I've also tested Win95 and WinNT, and found WinNT gave me the best
speed as a Samba client. The fastest client of all (for me) is
smbclient running on another Linux box. Maybe I'll add those results
here someday ...
</para>

</sect1>
</chapter>