Re: migrating from 2.0 to 2.1
Anders Bruun Olsen writes:
>> Messy. The DN is latin-1, but the CN is UTF-8.
>> You need to convert the DN to utf-8 (after base64-decoding it).
>
> That would be quite a good solution.. but how do I do that?
Depends on what OS and which programs you have installed.
This Perl script should do the trick, if you have a recent enough
Perl (or have installed MIME::Base64 "by hand").
As written, the script only converts latin-1 DNs to UTF-8.
If you have other latin-1 data as well which should be UTF-8,
there are two ways to extend it:
If you replace 'if (0 || ...)' with 'if (1 || ...)', it will
convert everything which looks like latin-1 to UTF-8.
If you have both binary and textual attributes, that would
mangle the binary values, so you'll instead need to use
'if ($line =~ /^(dn|attrname1|attrname2):/is)'
which names the specific textual attributes which may contain
latin-1.
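To see what the attribute-name variant selects, here is a small sketch
that can be run on its own. The names 'cn' and 'description' are only
stand-ins for attrname1/attrname2, and wants_conversion() is a helper
made up for this illustration; it is not part of the script.

```perl
use strict;
use warnings;

# Hypothetical attribute names: 'cn' and 'description' stand in for
# attrname1 and attrname2; list the textual attributes of your own data.
my $textual = qr/^(dn|cn|description):/is;

# True if the (unfolded) LDIF line belongs to one of the listed attributes.
sub wants_conversion {
    my ($ldif_line) = @_;
    return $ldif_line =~ $textual ? 1 : 0;
}

for my $sample ("dn:: Y249SvhyZ2Vu\n", "CN: J\xF8rgen\n", "jpegPhoto:: /9j/4AAQ\n") {
    my ($name) = $sample =~ /^([^:]+)/;
    printf "%-10s %s\n", $name, wants_conversion($sample) ? "convert" : "leave alone";
}
```

The match is case-insensitive and also covers the base64 form ('dn::'),
since that line still starts with 'dn:'. An attribute such as
dnQualifier is not selected, because the colon must follow the name
directly.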
#!/usr/bin/perl -w
use strict;
use MIME::Base64;

my $line = "";
while (<>) {
    # A line which does not start with a space begins a new LDIF
    # line, so output the one collected so far.
    print_line() unless /^ /s;
    $line .= $_;
}
print_line();

sub print_line {
    if (0 || $line =~ /^(dn):/is) {
        # Remove continuation line separators
        $line =~ s/\n //g;
        # Decode base64
        my $was_b64 = '';
        $line =~ s/^([-.\;\w]+:): *(.*)(?=\n\z)/$1." ".decode_base64($2)/es
            and $was_b64 = '\s.*|';
        # Convert non-UTF-8 - assumed to be Latin-1 - to UTF-8.
        # Non-UTF-8 means: a lead byte not followed by a continuation
        # byte, or a continuation byte not preceded by a high byte
        # (negative lookbehind).
        if ($line =~ /[\300-\377](?![\200-\277])|(?<![\200-\377])[\200-\277]/) {
            $line =~ s/([\200-\377])/
                pack('CC', 0xC0 + (ord($1) >> 6), (ord($1) & 0xBF)) /ge;
        }
        # Convert back to base64
        $line =~ s/^([^\n:]+:) ($was_b64.*?[\0\n\r].*)(?=\n\z)/
            $1 . ": " . Base64($2) /e;
    }
    print $line;
    $line = "";
}

# Base64-encode a value as a single line, without line breaks
sub Base64 {
    my $b64 = encode_base64($_[0]);
    $b64 =~ tr/ \n\r//d;
    $b64;
}