summaryrefslogtreecommitdiff
path: root/source4/include/charset.h
AgeCommit message (Collapse)AuthorFilesLines
2007-10-10r3494: got rid of include/rewrite.h, and split out the dynconfig.h headerAndrew Tridgell1-0/+13
(This used to be commit 558de54ec6432a4ae90aa14a585f32c6cd03ced2)
2007-10-10r2857: this commit gets rid of smb_ucs2_t, wpstring and fpstring, plus lots ↵Andrew Tridgell1-0/+4
of associated functions. The motivation for this change was to avoid having to convert to/from ucs2 strings for so many operations. Doing that was slow, used many static buffers, and was also incorrect as it didn't cope properly with unicode codepoints above 65536 (which could not be represented correctly as smb_ucs2_t chars) The two core functions that allowed this change are next_codepoint() and push_codepoint(). These functions allow you to correctly walk a arbitrary multi-byte string a character at a time without converting the whole string to ucs2. While doing this cleanup I also fixed several ucs2 string handling bugs. See the commit for details. The following code (which counts the number of occuraces of 'c' in a string) shows how to use the new interface: size_t count_chars(const char *s, char c) { size_t count = 0; while (*s) { size_t size; codepoint_t c2 = next_codepoint(s, &size); if (c2 == c) count++; s += size; } return count; } (This used to be commit 814881f0e50019196b3aa9fbe4aeadbb98172040)
2007-10-10r2159: converted samba4 over to UTF-16.Andrew Tridgell1-1/+1
I had previously thought this was unnecessary, as windows doesn't use standards compliant UTF-16, and for filesystem operations treats bytes as UCS-2, but Bjoern Jacke has pointed out to me that this means we don't correctly store extended UTF-16 characters as UTF-8 on disk. This can be seen with (for example) the gothic characters with codepoints above 64k. This commit also adds a LOCAL-ICONV torture test that tests the first 1 million codepoints against the system iconv library, and tests 5 million random UTF-16LE buffers for identical error handling to the system iconv library. the lib/iconv.c changes need backporting to samba3 (This used to be commit 756f28ac95feaa84b42402723d5f7286865c78db)
2003-12-16added support for big-endian ucs2 strings (as used by big-endianAndrew Tridgell1-2/+2
msrpc). this was easier than I expected! (This used to be commit a0a51af6b746b1f82faaa49d33c17fea9d708fb0)
2003-08-13first public release of samba4 codeAndrew Tridgell1-0/+40
(This used to be commit b0510b5428b3461aeb9bbe3cc95f62fc73e2b97f)