Re: strange (swedish) characters
Quoting Markus Jardemalm <markus.jardemalm@enea.se>:
> Since UTF-8 encoded ISO 10646-1 is Unicode my characters is in the
> "Latin-1 Supplement". How do I tell openldap (ldapadd) about what Code
> Chart I'm using so it won't complain about my characters?
> >$ldapadd -x -D "cn=Manager,o=myorg,c=SE" -W -f my_entry.ldif
> >Enter LDAP Password:
> >adding new entry "cn=my_name,o=myorg,c=SE"
> >ldap_add: Invalid syntax
> > additional info: value contains invalid data
>
> Any ideas about getting these characters in the ldap database?
Assuming you're loading from an LDIF file: I first process the LDIF file with a
Perl script I wrote that converts from ISO 8859-1 (Latin-1) to UTF-8, using this
snippet:
use MIME::Base64;
use Unicode::String;
sub utfencode {
    my ($att, $val) = @_;
    if ($val =~ /[\x80-\xFF]/) {
        # read the value as ISO 8859-1 so that ->utf8 returns UTF-8 bytes
        my $u = Unicode::String::latin1($val);
        # the empty second argument stops encode_base64 from adding newlines
        $val = encode_base64($u->utf8, "");
        # a double colon marks a Base64-encoded value in LDIF
        return $att . ":: " . $val;
    } else {
        return $att . ": " . $val;
    }
}
This takes the attribute (like CN) as the first parameter and the value as the
second, then tests whether the value contains a non-ASCII character (with hex
value > 0x7f). If so, we assume the value is ISO 8859-1, so I convert it to UTF-8
(Unicode) and then encode the result as Base64. Note that any Base64-encoded
value uses two colons after the attribute name.
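To make the effect concrete, here is a minimal, self-contained sketch of the same
idea. It is not the script described above: the name ldif_line is hypothetical,
and it swaps in the core Encode module in place of Unicode::String, on the
assumption that a reasonably recent Perl is available. It also assumes the input
bytes really are ISO 8859-1.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);
use Encode qw(decode encode);

# Hypothetical helper: build one LDIF line from an attribute name and a value.
sub ldif_line {
    my ($att, $val) = @_;

    # Pure ASCII values can go into the LDIF file unchanged.
    return $att . ': ' . $val unless $val =~ /[\x80-\xFF]/;

    # Assumption: the input bytes are ISO 8859-1; re-encode them as UTF-8.
    my $utf8 = encode('UTF-8', decode('ISO-8859-1', $val));

    # The empty second argument suppresses line breaks in the Base64 output.
    return $att . ':: ' . encode_base64($utf8, '');
}

print ldif_line('cn', "\xC5ke"), "\n";   # Latin-1 bytes for "Ake" with a ring on the A
print ldif_line('o',  'myorg'),  "\n";
```

With those two sample values the script should print "cn:: w4VrZQ==" followed by
"o: myorg", which is the form ldapadd expects for a UTF-8 value.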
Hope this helps!
*********************************
Paul Gillingwater
Managing Director
CSO Lanifex Unternehmensberatung
& Softwareentwicklung G.m.b.H.
NEW BUSINESS CONCEPTS
E-mail: paul@lanifex.com
Mobile: +43/699/1922 3085
Webhome: http://www.lanifex.com
Address: Praterstrasse 60/1/2
A-1020 Vienna, Austria
*********************************