Unicode combining marks

To: ietf-ldapbis@OpenLDAP.org
Subject: Unicode combining marks
From: Hallvard B Furuseth <h.b.furuseth@usit.uio.no>
Date: Mon, 8 Nov 2004 03:26:49 +0100

[models] 4.1.2. Attribute Types refers to "minimum upper bound on the
number of characters in a value with a string-based syntax...".

Is "a<Combining Acute Accent = U+0301>" 1 or 2 characters?  The Unicode
FAQ at <http://www.unicode.org/faq/char_combmark.html#2> seems to say
'that depends'.
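
A minimal sketch of why the count depends on what is being counted,
assuming Python 3 and its standard unicodedata module.  The helper name
count_non_marks is hypothetical, and treating "code points that are not
combining marks" as the character count is only a rough stand-in for
user-perceived characters, not something [models] specifies.

```python
import unicodedata

s = "a\u0301"  # 'a' followed by U+0301 COMBINING ACUTE ACCENT

# Three plausible "lengths" of the same value.
code_points = len(s)                                     # 2 code points
nfc_code_points = len(unicodedata.normalize("NFC", s))   # 1: composes to U+00E1
utf8_octets = len(s.encode("utf-8"))                     # 3: 1 octet + 2 octets

def count_non_marks(text):
    """Hypothetical helper: count code points whose canonical combining
    class is 0, i.e. skip combining marks such as U+0301."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

print(code_points, nfc_code_points, utf8_octets, count_non_marks(s))
```

An upper bound stated in "characters" gives a different limit under
each of these counts, which is the ambiguity asked about above.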


I'm also wondering when character combinations that are illegal
according to Unicode (those exist, yes?) will cause a failure and when
they will slip through.

E.g. what do these return - as extensibleMatch filters, and as
equality/substring filters:

- distinguishedNameMatch:
  assertion value "cn=\3a<Combining Acute Accent>", attrval "cn=foo".
  (I presume ":<Combining Acute Accent>" is invalid in Unicode while
  "a<Combining Acute Accent>" is valid.)

- caseIgnoreListSubstringsMatch:
  assertion value "de*<Combining Acute Accent>fg", attrval "foo".

  assertion value "x\5c<combining mark>y" where "c" can be followed by
  the combining mark but "\" cannot, or vice versa if possible.
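
To make the cases above concrete, here is a rough sketch (Python 3,
standard library only) of where the combining mark ends up once the
hex-pair escapes are decoded and the substring assertion is split at
"*".  The helpers unescape_hex_pairs and starts_with_combining_mark are
hypothetical and heavily simplified: they assume every escaped octet is
ASCII and ignore the other escape forms of the DN and filter string
representations.  The sketch does not decide whether any of these
sequences is valid in Unicode, nor what a server should return.

```python
import re
import unicodedata

ACUTE = "\u0301"  # U+0301 COMBINING ACUTE ACCENT

def unescape_hex_pairs(value):
    """Hypothetical helper: decode backslash-hex-pair escapes like \\3a.
    Simplified: assumes each escaped octet is a single ASCII character."""
    return re.sub(r"\\([0-9a-fA-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), value)

def starts_with_combining_mark(text):
    """True if the first code point has a non-zero combining class."""
    return bool(text) and unicodedata.combining(text[0]) != 0

# "\3a<mark>": after unescaping, the mark follows ':'.
rdn_value = unescape_hex_pairs("\\3a" + ACUTE)

# "de*<mark>fg": splitting at '*' leaves a substring that begins
# with the combining mark and has no base character before it.
substrings = ("de*" + ACUTE + "fg").split("*")

# "x\5c<mark>y": as written the mark follows 'c'; after unescaping
# it follows the backslash instead.
backslash_case = unescape_hex_pairs("x\\5c" + ACUTE + "y")

print([hex(ord(ch)) for ch in rdn_value])
print([starts_with_combining_mark(part) for part in substrings])
print([hex(ord(ch)) for ch in backslash_case])
```

The point of the sketch is that the character preceding the mark
changes depending on whether a check runs before or after unescaping
and splitting, so the order of those steps matters for the answer.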

Can an entry with DN "cn=\3a<Combining Acute Accent>" be added, or a
seeAlso attribute with that value?

-- 
Hallvard
