Re: Problems with case folding of UTF-8
At 10:35 AM 2001-12-22, Michael Ströder wrote:
>Stig Venaas wrote:
>>
>> adding new entry "cn=Stig Venås, dc=my-domain,dc=com"
>
>Well, you have to tell us that this string is improperly interpreted
>as ISO-8859-1 by your xterm. Otherwise it's meaningless. ;-)
>
>> The DN in base64 is Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20
>
>Are you sure about that being properly base64-encoded?
echo -n 'Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20' \
| b64d | hexdump -C
00000000 63 6e 3d 53 74 69 67 20 56 65 6e c3 a5 73 2c 20 |cn=Stig Ven..s, |
00000010 64 63 3d 6d 79 2d 64 6f 6d 61 69 6e 2c 64 63 3d |dc=my-domain,dc=|
00000020 63 6f 6d |com|
00000023
(b64d is an alias which uses perl to decode the base64)
>Python 2.1.1 (#5, Nov 18 2001, 17:07:23)
>[GCC 2.95.2 19991024 (release)] on linux2
>>>> import base64
>>>> base64.decodestring('Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20')
>Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> File "/usr/lib/python2.1/base64.py", line 47, in decodestring
> decode(f, g)
> File "/usr/lib/python2.1/base64.py", line 31, in decode
> s = binascii.a2b_base64(line)
>binascii.Error: Incorrect padding
>>>>
>
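The "Incorrect padding" error is most likely about the trailing "=" only: the string is 47 characters long, and base64 input is expected to come in multiples of four. The decoder behind b64d evidently tolerates that, Python's does not. A minimal sketch in current Python 3 syntax (not the 2.1 session quoted above) that restores the padding before decoding; decode_unpadded is a hypothetical helper name, not part of the base64 module:

```python
import base64


def decode_unpadded(b64_text):
    """Decode base64 text whose trailing '=' padding was stripped."""
    # Append as many '=' as needed to reach a multiple of four characters.
    padded = b64_text + "=" * (-len(b64_text) % 4)
    return base64.b64decode(padded)


dn_b64 = "Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20"
raw = decode_unpadded(dn_b64)

print(len(dn_b64))          # 47, one short of a multiple of four
print(raw)                  # the raw DN bytes, including c3 a5
print(raw.decode("utf-8"))  # the DN as text
```

With the one missing "=" appended, the result is the same 35 bytes as in the hexdump.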
>> Ã¥ is å (a with circle above), and should still be one character
>> when normalized (still 2 characters in UTF-8).
>
>For your records Python's UTF-8 encoding:
>
>>>> unicode('Venås','iso-8859-1').encode('utf-8')
>'Ven\xc3\xa5s'
>>>>
Same bytes as Stig provided (c3 a5 in the hexdump above).
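For anyone who wants to check the byte values themselves, here is a small sketch, again in Python 3 syntax rather than the 2.1 used above. It shows that å is two bytes in UTF-8, that those two bytes read as ISO-8859-1 give the "Ã¥" seen in the xterm, and that the NFC-normalized form is a single character. The variable names are only illustrative.

```python
import unicodedata

name = "Ven\u00e5s"  # "Venås", with å written as U+00E5

# UTF-8 encodes å as the two bytes c3 a5.
utf8_bytes = name.encode("utf-8")
print(utf8_bytes)

# Reading those same bytes as ISO-8859-1 turns å into two characters.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)

# "a" followed by a combining ring above normalizes (NFC) to one character,
# which is still two bytes once encoded as UTF-8.
composed = unicodedata.normalize("NFC", "a\u030a")
print(len(composed), len(composed.encode("utf-8")))
```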