Using Samba
Robert Eckstein, David Collier-Brown, Peter Kelly
1st Edition November 1999
1-56592-449-5, Order Number: 4495
416 pages, $34.95
8.3 Internationalization
Samba has a limited ability to speak foreign tongues: if you need to deal with characters that aren't in standard ASCII, some options that can help you are shown in Table 8.3. Otherwise, you can skip over this section.
Table 8.3: Networking Configuration Options

| Option | Parameters | Function | Default | Scope |
|---|---|---|---|---|
| client code page | Described in this section | Sets a code page to expect from clients | 850 | Global |
| character set | Described in this section | Translates code pages into alternate UNIX character sets | None | Global |
| coding system | Described in this section | Translates code page 932 into an Asian character set | None | Global |
| valid chars | string (set of characters) | Obsolete: formerly added individual characters to a code page, and had to be used after setting client code page | None | Global |
8.3.1 client code page
The character sets on Windows platforms hark back to the original concept of a code page. These code pages are used by DOS and Windows clients to determine rules for mapping lowercase letters to uppercase letters. Samba can be instructed to use a variety of code pages through the use of the global client code page option in order to match the corresponding code page in use on the client. This option loads a code-page definition file, and can take the values specified in Table 8.4.
Table 8.4: Valid Code Pages with Samba 2.0

| Code Page | Definition |
|---|---|
| 437 | MS-DOS Latin (United States) |
| 737 | Windows 95 Greek |
| 850 | MS-DOS Latin 1 (Western European) |
| 852 | MS-DOS Latin 2 (Eastern European) |
| 861 | MS-DOS Icelandic |
| 866 | MS-DOS Cyrillic (Russian) |
| 932 | MS-DOS Japanese Shift-JIS |
| 936 | MS-DOS Simplified Chinese |
| 949 | MS-DOS Korean Hangul |
| 950 | MS-DOS Traditional Chinese |
You can set the client code page as follows:
    [global]
        client code page = 852

The default value of this option is 850. You can use the make_smbcodepage tool that comes with Samba (by default in /usr/local/samba/bin) to create your own SMB code pages, in the event that those listed earlier are not sufficient.
8.3.2 character set
The global character set option can be used to convert filenames offered through a DOS code page (see the previous section, Section 8.3.1, client code page) to equivalents that can be represented by Unix character sets other than those in the United States. For example, if you want to convert the Western European MS-DOS character set on the client to a Western European Unix character set on the server, you can use the following in your configuration file:

    [global]
        client code page = 850
        character set = ISO8859-1

Note that you must include a client code page option to specify the character set from which you are converting. The valid character sets (and their matching code pages) that Samba 2.0 accepts are listed in Table 8.5:
Table 8.5: Valid Character Sets with Samba 2.0

| Character Set | Matching Code Page | Definition |
|---|---|---|
| ISO8859-1 | 850 | Western European Unix |
| ISO8859-2 | 852 | Eastern European Unix |
| ISO8859-5 | 866 | Russian Cyrillic Unix |
| KOI8-R | 866 | Alternate Russian Cyrillic Unix |
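As a further illustration of the pairings in Table 8.5, a site with Russian-language clients might combine code page 866 with the KOI8-R character set. This is only a minimal sketch built from the values in the table; whether KOI8-R or ISO8859-5 is the right choice depends on the encoding already in use on the Unix server, which is assumed here rather than known.

```ini
[global]
    # Code page the DOS/Windows clients are assumed to use (Cyrillic)
    client code page = 866
    # Unix-side character set that Table 8.5 pairs with code page 866
    character set = KOI8-R
```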
Normally, the character set option is disabled completely.

8.3.3 coding system
The coding system option is similar to the character set option. However, its purpose is to determine how to convert a Japanese Shift JIS code page into an appropriate Unix character set. In order to use this option, the client code page option described previously must be set to page 932. The valid coding systems that Samba 2.0 accepts are listed in Table 8.6.
Table 8.6: Valid Coding System Parameters with Samba 2.0

| Coding System | Definition |
|---|---|
| SJIS | Standard Shift JIS |
| JIS8 | Eight-bit JIS codes |
| J8BB | Eight-bit JIS codes |
| J8BH | Eight-bit JIS codes |
| J8@B | Eight-bit JIS codes |
| J8@J | Eight-bit JIS codes |
| J8@H | Eight-bit JIS codes |
| JIS7 | Seven-bit JIS codes |
| J7BB | Seven-bit JIS codes |
| J7BH | Seven-bit JIS codes |
| J7@B | Seven-bit JIS codes |
| J7@J | Seven-bit JIS codes |
| J7@H | Seven-bit JIS codes |
| JUNET | JUNET codes |
| JUBB | JUNET codes |
| JUBH | JUNET codes |
| JU@B | JUNET codes |
| JU@J | JUNET codes |
| JU@H | JUNET codes |
| EUC | EUC codes |
| HEX | Three-byte hexadecimal code |
| CAP | Three-byte hexadecimal code (Columbia AppleTalk Program) |
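Putting the two requirements of this section together, a configuration for Japanese clients might look like the following. Treat it as a sketch rather than a recommended setup: EUC is picked here purely as an example, and any other value from Table 8.6 could take its place depending on what the Unix side expects.

```ini
[global]
    # coding system only applies when the client code page is 932 (Shift JIS)
    client code page = 932
    # Example choice: represent filenames on the Unix side as EUC codes
    coding system = EUC
```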
8.3.4 valid chars
The valid chars option is an older Samba feature that will add individual characters to a code page. However, this option is being phased out in favor of more modern coding systems. You can use this option as follows:

    valid chars = Î
    valid chars = 0450:0420 0x0A20:0x0A00
    valid chars = A:a

Each of the characters in the list specified should be separated by spaces. If there is a colon between two characters or their numerical equivalents, the data to the left of the colon is considered an uppercase character, while the data to the right is considered the lowercase character. You can represent characters both by literals (if you can type them) and by octal, hexadecimal, or decimal Unicode equivalents.
We recommend against using this option. Instead, go with one of the standard code pages listed earlier in this section. If you do use this option, however, it must be listed after the client code page to which you wish to add the character. Otherwise, the characters will not be added.
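If you decide to use valid chars despite this recommendation, the ordering rule could be sketched as follows. The code page value (the default, 850) is chosen only for illustration, and the character pair is simply reused from the earlier example in this section.

```ini
[global]
    # Set the code page first...
    client code page = 850
    # ...and only then add the uppercase:lowercase pair to it
    valid chars = 0450:0420
```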
© 1999, O'Reilly & Associates, Inc.