Re: Problems with case folding of UTF-8
On Sat, Dec 22, 2001 at 07:07:42PM +0100, Pierangelo Masarati wrote:
> Can you, Stig and Michael, provide a set of strings that do not
> work, so that I can try to see what's going on? I didn't have
> any problems with few selected accented letters, while I found
> that strange folding with others so the code is not completely
> broken but there might be different subtleties here and there.
Okay, here is one that fails for me:
adding new entry "cn=Stig Venås, dc=my-domain,dc=com"
ldapadd: update failed: cn=Stig Venås, dc=my-domain,dc=com
ldap_add: Invalid DN syntax (34)
additional info: invalid DN
The DN in base64 is Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20
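To check what that base64 string actually contains, it can be decoded outside of slapd. This is only a sketch in Python. It assumes the string above has merely lost its trailing "=" padding, and the helper name decode_dn is made up for illustration:

```python
import base64

# The DN exactly as quoted above (47 characters, so one '=' of padding is missing).
DN_B64 = "Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20"

def decode_dn(b64: str) -> bytes:
    """Decode a base64 DN, restoring any '=' padding that was dropped."""
    padded = b64 + "=" * (-len(b64) % 4)
    return base64.b64decode(padded)

raw = decode_dn(DN_B64)      # the raw bytes that would go over the wire
dn = raw.decode("utf-8")     # raises UnicodeDecodeError if the bytes are not valid UTF-8

print(dn)
print(len(raw), "bytes,", len(dn), "characters")
```

The decode succeeding means the DN is well-formed UTF-8, and the byte count is one higher than the character count because of the single two-byte character.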
Ã¥ is å (a with ring above) with its two UTF-8 bytes shown as Latin-1. It
should still be one character when normalized (still 2 bytes in UTF-8).
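The relation between the one character, its two bytes, and the "Ã¥" seen on screen can be illustrated with a few lines of Python. This is a sketch only. NFC is used here as an example of a composing normalization form, which is an assumption and not a statement about which form slapd applies:

```python
import unicodedata

a_ring = "\u00e5"                      # å, LATIN SMALL LETTER A WITH RING ABOVE
utf8 = a_ring.encode("utf-8")          # two bytes: 0xC3 0xA5

# Reading those two bytes as Latin-1 gives the two characters seen in the output.
as_latin1 = utf8.decode("latin-1")

# A decomposed spelling (a + combining ring above) composes back to one character
# under NFC, which is still two bytes in UTF-8.
composed = unicodedata.normalize("NFC", "a\u030a")

print(len(a_ring), len(utf8), as_latin1)
print(composed == a_ring, len(composed.encode("utf-8")))
```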
I can see that UTF8normalize does the right thing:
Breakpoint 1, UTF8normalize (bv=0x80fbf00, casefold=1 '\001') at ucstr.c:9
(gdb) ins *bv
$1 = {bv_len = 11, bv_val = 0x80fbf10 "Stig Venås"}
and at the end:
(gdb) ins out
$2 = 0x80fbf20 "STIG VENÃ\205S"
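The "Ã\205" in that output is the byte 0xC3 followed by octal 205 (0x85), which is the UTF-8 encoding of Å. A small Python sketch of the same check follows. It uses str.upper() only to mirror the uppercase result seen in gdb, and is not meant to describe how UTF8normalize is implemented:

```python
# Bytes as printed by gdb: 'Ã' is 0xC3 when shown as Latin-1, '\205' is octal for 0x85.
gdb_bytes = bytes([0xC3, 0o205])
folded_char = gdb_bytes.decode("utf-8")

original = "Stig Ven\u00e5s"
expected = original.upper().encode("utf-8")   # what an uppercasing fold should produce

print(folded_char)
print(expected)
print(len(original.encode("utf-8")), len(expected))
```

Both the input and the folded output are 11 bytes long, matching bv_len above.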
backtrace:
(gdb) bt
#0 UTF8normalize (bv=0x80fbf00, casefold=1 '\001') at ucstr.c:214
#1 0x806989b in LDAPDN_rewrite (dn=0x80fbed0, flags=0) at schema_init.c:483
#2 0x80699c9 in dnNormalize (syntax=0x0, val=0x40f9d904,
normalized=0x40f9d900) at schema_init.c:533
#3 0x805bf67 in dn_normalize (
dn=0x80fbd40 "cn=Stig Venås, dc=my-domain,dc=com") at dn.c:261
#4 0x8052952 in do_add (conn=0x403659a8, op=0x80fb880) at add.c:83
Stig