<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE chapter PUBLIC "-//Samba-Team//DTD DocBook V4.2-Based Variant V1.0//EN" "http://www.samba.org/samba/DTD/samba-doc">
<chapter id="largefile">
<chapterinfo>
	&author.jeremy;
	&author.jht;
	<pubdate>March 5, 2005</pubdate>
</chapterinfo>
<title>Handling Large Directories</title>

<para>
Samba-3.0.12 implements a solution for sites that have experienced performance degradation due to
running Samba-3 with applications that need large numbers of files (100,000 or more) per directory.
</para>

<para>
The key was changing the directory handling to read only the entries currently requested, instead of the old
(up to Samba-3.0.11) behaviour of reading the entire directory into memory before doling out names.
Normally this would have broken OS/2 applications, which have very strange delete semantics, but by
stealing logic from Samba4 (thanks tridge) the current code in 3.0.12 handles this correctly.
</para>

<para>
To set up an application that needs a large number of files per directory in a way that does not
damage performance unduly, follow these steps:
</para>

<para>
First, canonicalize the names of all the files in the directory to a single case, upper or lower; take your
pick (I chose upper because all my file names were already upper case). Then set up a new custom share for the
application as follows:
<screen>
[bigshare]
        path = /home/jeremy/tmp/manyfilesdir
        read only = no
        case sensitive = True
        default case = upper
        preserve case = no
        short preserve case = no
</screen>
</para>

<para>
Of course, use your own path and settings, but set the case options to match the case of all the files in your
directory. The path should point at the large directory needed for the application. Any new files created in
there and in any paths under it will be forced by smbd into upper case, but smbd will no longer have to scan
the directory for names: it knows that if a file does not exist in upper case then it does not exist at all.
</para>
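
<para>
The renaming in the first step can be scripted. The following is a minimal Python sketch, not part of Samba:
the function name <literal>canonicalize_case</literal> and its parameters are hypothetical, and it assumes it
is run directly on the server against a case-sensitive local filesystem while no client is using the directory.
It defaults to a dry run and never overwrites an existing name.
<programlisting>
import os


def canonicalize_case(root, to_upper=True, dry_run=True):
    """Rename every entry below root to one case.

    Returns (renamed, skipped): two lists of (source, target) path pairs.
    With dry_run=True nothing is renamed; the lists show what would happen.
    """
    convert = str.upper if to_upper else str.lower
    renamed, skipped = [], []
    claimed = set()  # targets already taken by an earlier rename in this run
    # Walk bottom-up so a directory's contents are handled before the
    # directory itself is renamed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            target = convert(name)
            if target == name:
                continue  # already in the wanted case
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, target)
            # Never overwrite: two names differing only by case would collide.
            if dst in claimed or os.path.lexists(dst):
                skipped.append((src, dst))
                continue
            claimed.add(dst)
            if not dry_run:
                os.rename(src, dst)
            renamed.append((src, dst))
    return renamed, skipped
</programlisting>
Calling <literal>canonicalize_case("/home/jeremy/tmp/manyfilesdir")</literal> only reports the planned renames.
Pass <literal>dry_run=False</literal> to apply them, and <literal>to_upper=False</literal> if the share uses
<literal>default case = lower</literal>. Any pair returned in the skipped list is a name clash that has to be
resolved by hand.
</para>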

<para>
The secret to this is really in the <smbconfoption name="case sensitive">True</smbconfoption>
line. This tells smbd never to scan for case-insensitive versions of names. So if an application asks for a file
called <filename>FOO</filename>, and it cannot be found by a simple stat call, then smbd will return file not
found immediately without scanning the containing directory for a version of a different case. The other
<filename>xxx case xxx</filename> lines make this work by forcing a consistent case on all files created by smbd.
</para>
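
<para>
The difference between the two lookup strategies can be modelled in a few lines of Python. This is an
illustrative sketch only, not smbd's actual code: the function <literal>lookup</literal> and its
<literal>case_sensitive</literal> argument are hypothetical names, and it assumes a case-sensitive local filesystem.
<programlisting>
import os


def lookup(directory, name, case_sensitive=True):
    """Return the on-disk path for name, or None if it is not found."""
    candidate = os.path.join(directory, name)
    # Fast path: a single stat-style check on the exact name.
    if os.path.lexists(candidate):
        return candidate
    if case_sensitive:
        # No fallback scan, so the cost does not grow with directory size.
        return None
    # Slow path: compare the name against every entry, ignoring case.
    wanted = name.lower()
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name.lower() == wanted:
                return entry.path
    return None
</programlisting>
With <literal>case_sensitive=True</literal> a miss costs one check regardless of how many entries the directory
holds. With <literal>case_sensitive=False</literal> every miss walks the whole directory, which is the expensive
part when it contains 100,000 or more files.
</para>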

<para>
Remember, all files and directories under the <parameter>path</parameter> directory must be in upper case
with this &smb.conf; stanza, as smbd will not be able to find lower case filenames with these settings. Also
note that this is done on a per-share basis, so it can be set only for a share servicing an application with
this problematic behaviour (using large numbers of entries in a directory). The rest of your smbd shares
do not need to be affected.
</para>

<para>
This makes smbd much faster when dealing with large directories.  My test case has over 100,000 files and
smbd now deals with this very efficiently.
</para>

</chapter>