Re: UTF8 case insensitive matching
At 04:31 PM 10/25/00 +0200, Stig Venås wrote:
>On Tue, Oct 24, 2000 at 01:11:25PM -0700, Kurt D. Zeilenga wrote:
>> The DN normalization and matching?
>
>I'm looking at this. I have some questions.
>
>I'm writing UTF8str2upper and perhaps some other UTF8 functions
>that need liblunicode to work. I think they belong in utf8.c in
>libldap, but I don't think it is good if applications that use
>libldap also must link with liblunicode. Where should I put it?
Put them in ldap_pvt_uc.h/-llunicode.
>I'm not sure, but I think that the width of a character in UTF8
>might change when you change casing. Does anyone know for sure
>if it might?
Yes.
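A small sketch of the point, in C, assuming only the standard UTF-8
encoding rules. One concrete case: U+0131 (dotless i) takes two bytes,
while its uppercase form U+0049 ('I') takes one. The helper name
utf8_encode_len is made up for illustration and is not part of libldap
or liblunicode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: encode code point cp as UTF-8 into out (which
 * must hold at least 4 bytes) and return the number of bytes written,
 * or 0 if cp is beyond U+10FFFF. Surrogates are not rejected here. */
static size_t utf8_encode_len(unsigned long cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Calling it with 0x0131 and then with 0x0049 gives different lengths,
so an uppercased string cannot always be written back over the
original bytes.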
>If it can change, dn_normalize will have to malloc
>space for a new string and return a pointer to that.
This is needed anyway for quoting/escape normalization...
>A lot of
>code would have to be changed then. An easy but incorrect way
>out could be to simply not change casing for a character if
>the size is different. It would still be better than today's
>situation.
We can certainly cheat in the short term....
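As a rough illustration of that short-term cheat, here is a C sketch
that uppercases a UTF-8 string in place and leaves a character alone
whenever its uppercase form would need a different number of bytes.
Everything in it is an assumption for the example: toy_toupper() is a
tiny stand-in for a real liblunicode lookup and knows only ASCII, most
of the Latin-1 lowercase letters, and U+0131; only one- and two-byte
sequences are case-mapped.

```c
#include <assert.h>
#include <stddef.h>

/* Toy case mapping (hypothetical; a real version would consult the
 * Unicode tables). */
static unsigned long toy_toupper(unsigned long cp)
{
    if (cp >= 'a' && cp <= 'z')
        return cp - 0x20;
    if (cp >= 0xE0 && cp <= 0xFE && cp != 0xF7)
        return cp - 0x20;          /* Latin-1 lowercase to uppercase */
    if (cp == 0x0131)
        return 0x0049;             /* dotless i to 'I': width changes */
    return cp;
}

/* Uppercase s in place; skip any character whose width would change. */
static void utf8_upper_inplace_cheat(unsigned char *s)
{
    assert(s != NULL);
    while (*s) {
        if (*s < 0x80) {
            *s = (unsigned char)toy_toupper(*s);
            s++;
        } else if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
            unsigned long cp = ((unsigned long)(s[0] & 0x1F) << 6)
                             | (unsigned long)(s[1] & 0x3F);
            unsigned long up = toy_toupper(cp);
            if (up >= 0x80 && up < 0x800) {
                /* Still two bytes: safe to overwrite. */
                s[0] = (unsigned char)(0xC0 | (up >> 6));
                s[1] = (unsigned char)(0x80 | (up & 0x3F));
            }
            /* Otherwise the width differs: leave the bytes as they are. */
            s += 2;
        } else {
            s++;  /* longer or malformed sequence: pass through untouched */
        }
    }
}
```

The result is incorrect for the skipped characters, which is exactly
the trade-off described in the quoted text.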
Long term, we need to use the dnValidate()/dnNormalize()
semantics instead of the dn_validate()/dn_normalize() semantics.
In the mid term, to avoid the ripple effect of the
dn_validate()/dn_normalize() change, I suggest that temporary
versions of dn_validate()/dn_normalize() be implemented which
use dnValidate()/dnNormalize() to do the work but provide old
semantics otherwise.
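A minimal sketch of what such a temporary wrapper might look like, in
C. The signatures are assumptions, not the actual OpenLDAP ones:
new_style_normalize() stands in for a dnNormalize()-like function that
returns a malloc'd result, and old_style_normalize() stands in for a
dn_normalize()-like function that rewrites the caller's buffer in
place and returns it (or NULL on failure). The stand-in normalizer
only uppercases ASCII so that the example is self-contained.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed new-style interface: result is allocated and returned in
 * *out; returns 0 on success, -1 on failure. */
static int new_style_normalize(const char *in, char **out)
{
    size_t i, n = strlen(in);
    char *buf = malloc(n + 1);

    if (buf == NULL)
        return -1;
    for (i = 0; i < n; i++)
        buf[i] = (char)toupper((unsigned char)in[i]);
    buf[n] = '\0';
    *out = buf;
    return 0;
}

/* Assumed old-style interface: normalize dn in place and return dn,
 * or NULL on failure. The new code does the work; the wrapper copies
 * the result back, and fails if it would not fit in the old buffer. */
static char *old_style_normalize(char *dn)
{
    char *tmp = NULL;

    if (dn == NULL || new_style_normalize(dn, &tmp) != 0)
        return NULL;
    if (strlen(tmp) > strlen(dn)) {
        free(tmp);                 /* grew: cannot keep old semantics */
        return NULL;
    }
    strcpy(dn, tmp);
    free(tmp);
    return dn;
}
```

Existing callers keep passing a writable buffer and checking for NULL,
while the real work moves behind the new interface. Failing when the
result grows is one possible policy for the sketch, not a
recommendation.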
The dnValidate()/dnNormalize() functions, besides dealing with
lower/upper case length changes, can:
  - validate/normalize the attribute type
  - unescape/unquote, validate/normalize the value (*), and re-escape
  (*) extra credit: per value syntax
Kurt