path: root/source4/lib/util_unistr.c
Age | Commit message | Author | Files | Lines
2007-10-10 | r3449: more include file reduction | Andrew Tridgell | 1 | -0/+1
the ldb part isn't ideal, I will have to think of a better solution (This used to be commit 6b1f86aea8427a8e957b1aeb0ec2f507297f07cb)
2007-10-10 | r2902: make toupper_w() and tolower_w() slightly faster by putting the most common conditions first | Andrew Tridgell | 1 | -8/+8
(This used to be commit 878f6b565f4e80eefbb08f44551b3b4f647d7aa7)
2007-10-10 | r2901: if we can't load upcase.dat or lowcase.dat then don't waste 256k | Andrew Tridgell | 1 | -27/+8
making fake tables, instead just do the approximate upper/lower inline with toupper() and tolower(). (This used to be commit 994392d085e87046212191b8f41eba628467c778)
2007-10-10 | r2871: - got rid of the last bits of non-threadsafe data in util_str.o | Andrew Tridgell | 1 | -8/+4
- switch the fallback case tables to use talloc - moved the used-once octal_string() inline in loadparm.c (This used to be commit b04202eaacc87d264d463f75673ee0e68cd54f94)
2007-10-10 | r2857: this commit gets rid of smb_ucs2_t, wpstring and fpstring, plus lots of associated functions. | Andrew Tridgell | 1 | -169/+56

The motivation for this change was to avoid having to convert to/from ucs2 strings for so many operations. Doing that was slow, used many static buffers, and was also incorrect as it didn't cope properly with unicode codepoints above 65536 (which could not be represented correctly as smb_ucs2_t chars).

The two core functions that allowed this change are next_codepoint() and push_codepoint(). These functions allow you to correctly walk an arbitrary multi-byte string a character at a time without converting the whole string to ucs2.

While doing this cleanup I also fixed several ucs2 string handling bugs. See the commit for details.

The following code (which counts the number of occurrences of 'c' in a string) shows how to use the new interface:

    size_t count_chars(const char *s, char c)
    {
        size_t count = 0;
        while (*s) {
            size_t size;
            codepoint_t c2 = next_codepoint(s, &size);
            if (c2 == c) count++;
            s += size;
        }
        return count;
    }

(This used to be commit 814881f0e50019196b3aa9fbe4aeadbb98172040)
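To make the walking idea above concrete, here is a minimal, self-contained sketch. It is not Samba's implementation: the commit does not show how next_codepoint() works internally, so the names sketch_codepoint_t, SKETCH_INVALID_CODEPOINT, sketch_next_codepoint() and sketch_count_codepoint() are hypothetical, and the decoder assumes the input is UTF-8 rather than whatever character set is configured. Only the calling convention (return a codepoint, report the byte length through a size_t pointer) is taken from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for codepoint_t; assumed here to be 32 bits wide. */
typedef uint32_t sketch_codepoint_t;

/* Value returned for malformed input (an assumption of this sketch). */
#define SKETCH_INVALID_CODEPOINT ((sketch_codepoint_t)-1)

/*
 * Decode one UTF-8 sequence starting at s and store its byte length in *size.
 * Malformed lead or continuation bytes are consumed one byte at a time.
 * Overlong encodings are not rejected, to keep the sketch short.
 */
static sketch_codepoint_t sketch_next_codepoint(const char *s, size_t *size)
{
	const unsigned char *p = (const unsigned char *)s;
	sketch_codepoint_t c;
	size_t len, i;

	if (p[0] < 0x80) {
		*size = 1;
		return p[0];
	}
	if ((p[0] & 0xE0) == 0xC0) {
		len = 2; c = p[0] & 0x1F;
	} else if ((p[0] & 0xF0) == 0xE0) {
		len = 3; c = p[0] & 0x0F;
	} else if ((p[0] & 0xF8) == 0xF0) {
		len = 4; c = p[0] & 0x07;
	} else {
		*size = 1;
		return SKETCH_INVALID_CODEPOINT;
	}
	for (i = 1; i < len; i++) {
		/* A terminating NUL also fails this check, so we never read past it. */
		if ((p[i] & 0xC0) != 0x80) {
			*size = 1;
			return SKETCH_INVALID_CODEPOINT;
		}
		c = (c << 6) | (p[i] & 0x3F);
	}
	*size = len;
	return c;
}

/* Count occurrences of codepoint c by walking the string one character at a time. */
static size_t sketch_count_codepoint(const char *s, sketch_codepoint_t c)
{
	size_t count = 0;
	while (*s) {
		size_t size;
		if (sketch_next_codepoint(s, &size) == c) count++;
		s += size;
	}
	return count;
}
```

Because the search target is a full codepoint rather than a char, this variant can also look for characters outside ASCII, including those above 65536 that a 16-bit character type could not hold.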
2007-10-10 | r2644: removed an unused function | Andrew Tridgell | 1 | -18/+0
(This used to be commit bc779cb2ce6bc13157f9d046400ce99d107ccd52)
2007-10-10 | r2639: we don't need the valid_table code, so get rid of it | Andrew Tridgell | 1 | -56/+0
(This used to be commit 480636ebbca102172621609496bdab682d4bda8a)
2007-10-10 | r2634: use discard_const_p() in a few places | Andrew Tridgell | 1 | -3/+3
(This used to be commit 56ecda2178e33508c55c6195ccec41c06e099d6f)
2007-10-10 | r2631: the strchr family of functions should not return const strings. | Andrew Tridgell | 1 | -3/+3
(This used to be commit 2a7e5f07086ef4aebbb2be35acbf9c7c39b13c75)
2007-10-10 | r2552: Character set conversion and string handling updates. | Andrew Bartlett | 1 | -0/+7

The initial motivation for this commit was to merge in some of the bugfixes present in Samba3's chrcnv and string handling code into Samba4. However, along the way I found a lot of unused functions, and decided to do a bit more...

The strlen_m code now does not use a fixed buffer, but more work is needed to finish off other functions in str_util.c. These fixed length buffers have caused very nasty, hard to chase down bugs at some sites.

The strupper_m() function has a strupper_talloc() to replace it (we need to go around and fix more uses, but it's a start). Use of these new functions will avoid bugs where the upper or lowercase version of a string is a different length.

I have removed the push_*_allocate functions, which are replaced by calls to push_*_talloc. Likewise, pstring and other 'fixed length' wrappers are removed, where possible.

I have removed the first ('base pointer') argument, used by push_ucs2, as the Samba4 way of doing things ensures that this is always on an even boundary anyway. (It was used in only one place, in any case).

(This used to be commit dfecb0150627b500cb026b8a4932fe87902ca392)
2007-10-10 | r2402: to make ms_fnmatch() case-insensitive we need toupper_w() exposed | Andrew Tridgell | 1 | -1/+1
(This used to be commit 69413bdcfcf40e9ae2e5bcb00863cc7ef0ee8da1)
2007-10-10 | r2159: converted samba4 over to UTF-16. | Andrew Tridgell | 1 | -2/+2
I had previously thought this was unnecessary, as windows doesn't use standards compliant UTF-16, and for filesystem operations treats bytes as UCS-2, but Bjoern Jacke has pointed out to me that this means we don't correctly store extended UTF-16 characters as UTF-8 on disk. This can be seen with (for example) the gothic characters with codepoints above 64k. This commit also adds a LOCAL-ICONV torture test that tests the first 1 million codepoints against the system iconv library, and tests 5 million random UTF-16LE buffers for identical error handling to the system iconv library. the lib/iconv.c changes need backporting to samba3 (This used to be commit 756f28ac95feaa84b42402723d5f7286865c78db)
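The difference between UCS-2 and UTF-16 mentioned above comes down to surrogate pairs: a codepoint above 64k needs two 16-bit units in UTF-16 and cannot be expressed in a single UCS-2 unit. The following is a small illustrative sketch of the standard UTF-16 encoding rule, not code from lib/iconv.c; the function name sketch_utf16_encode() is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode codepoint as UTF-16 code units (host byte order).
 * Returns the number of units written to out (1 or 2), or 0 if cp is
 * not encodable (a lone surrogate value, or beyond U+10FFFF).
 */
static size_t sketch_utf16_encode(uint32_t cp, uint16_t out[2])
{
	if (cp >= 0xD800 && cp <= 0xDFFF) {
		return 0; /* reserved for surrogates, not a valid scalar value */
	}
	if (cp > 0x10FFFF) {
		return 0; /* outside the Unicode range */
	}
	if (cp < 0x10000) {
		out[0] = (uint16_t)cp; /* fits in one unit, same as UCS-2 */
		return 1;
	}
	cp -= 0x10000; /* 20 bits remain: split into two 10-bit halves */
	out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate */
	out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate */
	return 2;
}
```

For example, the Gothic letter U+10348 encodes as the pair 0xD800 0xDF48, which is why code that treats each 16-bit unit as a complete character mishandles it.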
2007-10-10 | r1983: a completely new implementation of talloc | Andrew Tridgell | 1 | -8/+0

This version does the following:

1) talloc_free(), talloc_realloc() and talloc_steal() lose their (redundant) first arguments

2) you can use _any_ talloc pointer as a talloc context to allocate more memory. This allows you to create complex data structures where the top level structure is the logical parent of the next level down, and those are the parents of the level below that. Then destroy either the lot with a single talloc_free() or destroy any sub-part with a talloc_free() of that part

3) you can name any pointer. Use talloc_named() which is just like talloc() but takes the printf style name argument as well as the parent context and the size.

The whole thing ends up being a very simple piece of code, although some of the pointer walking gets hairy.

So far, I'm just using the new talloc() like the old one. The next step is to actually take advantage of the new interface properly. Expect some new commits soon that simplify some common coding styles in samba4 by using the new talloc().

(This used to be commit e35bb094c52e550b3105dd1638d8d90de71d854f)
2007-10-10 | r890: convert samba4 to use [u]int8_t instead of [u]int8 | Stefan Metzmacher | 1 | -2/+2
metze (This used to be commit 2986c5f08c8f0c26a2ea7b6ce20aae025183109f)
2007-10-10 | r827: remove a few more unused functions that we are unlikely to use again | Andrew Tridgell | 1 | -37/+0
(This used to be commit 121dd9ba0038f6e076c464cddad0b788fe6076fa)
2003-12-01 | * got rid of UNISTR2 and everything that depends on it | Andrew Tridgell | 1 | -40/+0
* removed a bunch of code that needs to be rewritten using the new interfaces (This used to be commit 9b02b486ef5906516f8cad79dbff5e3dd54cde66)
2003-11-30 | * removed a bunch of unused code | Andrew Tridgell | 1 | -379/+11
* made some functions static (This used to be commit 829b87f30d5f4cc7174b716f3354982d84af4818)
2003-11-23 | reduced the number of magic types we need in mkproto.pl | Andrew Tridgell | 1 | -39/+0
In general I prefer "struct foo" to just "foo" for most structures. There are exceptions. (This used to be commit 04eb12b56c653f98801ab29411f47564ab32fa58)
2003-08-15 | more fixes from the IRIX compiler (thanks herb!) | Andrew Tridgell | 1 | -4/+2
(This used to be commit 02d068ba7d81d6db25122144981c63f74ad44025)
2003-08-13 | first public release of samba4 code | Andrew Tridgell | 1 | -0/+838
(This used to be commit b0510b5428b3461aeb9bbe3cc95f62fc73e2b97f)