Re: string value encoding and escaping question
Jeff,
T.61 is available from http://www.itu.ch. Note that it is copyrighted.
The '*' character is a special character in some cases. See section 4.5.1
of RFC 2251.
Cheers, ....Erik.
---------------------------------
Erik Skovgaard
GeoTrain Corp.
LDAP and X.500 Training and Consulting
http://www.geotrain.com
At 13:20 99/02/11 -0800, Jeff Hodges wrote:
>Mark Smith wrote:
>>
>> The origin of the '$' separator is the Quipu X.500 implementation (it
>> used '$' inside various string syntaxes because '$' is not a valid
>> character in the T.61 character set which was used for some string
>> syntaxes in the olden days).
>
>Ah! ok, I hadn't realized T.61 was the culprit. I'd had a hunch it was because
>X.500 was largely Euro-originated and suspected they chose $ cuz it wasn't
>their (whoever "they" exactly were) currency symbol. I hadn't looked at T.61
>closely enuff to figure out that it doesn't contain '$'. So my hunch wasn't
>that far off actually.
>
>Do you or anyone else have a URL handy that points to a reference for T.61?
>I'd like to stick it in the LDAP Roadmap.
>
>> Use of '$' has been carried over to some
>> of the LDAPv3 syntaxes, so we are stuck with it now.
>
>right.
>
>> In general, you should pick a separator character that makes sense to
>> you. Backslash is clearly an inconvenient choice ;-)
>
>Well, of course. (my 10yr old would say: DUH. ;)
>
>So, are there any chars other than '\' that're treated specially in the
>protocol docs (aka RFCs [2251..2256] + relevant near-RFC I-Ds) that you know
>of? My search hasn't turned up any, but I might've left a stone unturned. It
>looks to me like the protocol docs ~don't~ treat '$' specially.
>
>Also, I'd appreciate getting explicit confirmation from LDAP/X.500 mavens on
>these other questions I had...
>
>> Jeff.Hodges@Stanford.edu scribbled in netscape.dev.directory newsgroup:
>>
>> What I'm trying to figure out (sorta outta morbid curiosity) is whether it is
>> the libldap (aka "the ldap sdk", "the ldap stub") or the NS DS that is
>> recognizing the '\' char and interpreting it as a hex escape. Anyone know?
>>
>> The below RFC 2252 excerpts imply to me that the client side (aka the LDAP
>> stub, lib, or whatever) needs to know about this stuff in order to understand
>> and properly handle this value syntax. Is this correct? Or not and why?
>>
>> Also, I'm curious as to whether there's anything to gain by following X.500's
>> lead and using '$' as a separator char? I don't believe that any of the RFCs
>> or I-Ds specify treating it specially, so I doubt it will be inadvertently
>> specially treated as backslash apparently is.
>
>If '$' isn't treated specially protocol-wise, then the only value of using it
>as a separator is consistency with "tradition" and thus perhaps reuse of some
>amount of attribute value parsing code out there, tho we don't really have a
>large body of that ourselves.
>
>thanks,
>
>Jeff
>
>ps: thanks to Mark Wilcox for experimenting with duplicating our attr value
>issues.
>
>
>> -------------------------------------------------------------------------------
>> http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2252.txt
>> .
>> .
>>
>> 4.1. Common Encoding Aspects
>>
>> For the purposes of defining the encoding rules for attribute
>> syntaxes, the following BNF definitions will be used. They are based
>> on the BNF styles of RFC 822 [13].
>>
>> a = "a" / "b" / "c" / "d" / "e" / "f" / "g" / "h" / "i" /
>>     "j" / "k" / "l" / "m" / "n" / "o" / "p" / "q" / "r" /
>>     "s" / "t" / "u" / "v" / "w" / "x" / "y" / "z" / "A" /
>>     "B" / "C" / "D" / "E" / "F" / "G" / "H" / "I" / "J" /
>>     "K" / "L" / "M" / "N" / "O" / "P" / "Q" / "R" / "S" /
>>     "T" / "U" / "V" / "W" / "X" / "Y" / "Z"
>>
>> d = "0" / "1" / "2" / "3" / "4" /
>>     "5" / "6" / "7" / "8" / "9"
>>
>> hex-digit = d / "a" / "b" / "c" / "d" / "e" / "f" /
>>             "A" / "B" / "C" / "D" / "E" / "F"
>>
>> k = a / d / "-" / ";"
>>
>> p = a / d / """ / "(" / ")" / "+" / "," /
>>     "-" / "." / "/" / ":" / "?" / " "
>>
>> letterstring = 1*a
>>
>> numericstring = 1*d
>>
>> anhstring = 1*k
>>
>> keystring = a [ anhstring ]
>>
>> printablestring = 1*p
>>
>> space = 1*" "
>>
>> whsp = [ space ]
>>
>> utf8 = <any sequence of octets formed from the UTF-8 [9]
>> transformation of a character from ISO10646 [10]>
>>
>> dstring = 1*utf8
>>
>> qdstring = whsp "'" dstring "'" whsp
>>
>> qdstringlist = [ qdstring *( qdstring ) ]
>>
>> qdstrings = qdstring / ( whsp "(" qdstringlist ")" whsp )
>>
>> .
>> .
>> 4.3. Syntaxes
>> .
>> .
>> In encodings where an arbitrary string, not a Distinguished Name, is
>> used as part of a larger production, and other than as part of a
>> Distinguished Name, a backslash quoting mechanism is used to escape
>> the following separator symbol character (such as "'", "$" or "#") if
>> it should occur in that string. The backslash is followed by a pair
>> of hexadecimal digits representing the next character. A backslash
>> itself in the string which forms part of a larger syntax is always
>> transmitted as '\5C' or '\5c'. An example is given in section 6.27.
>> .
>> .
>>
>> 6.27. Postal Address
>>
>> ( 1.3.6.1.4.1.1466.115.121.1.41 DESC 'Postal Address' )
>>
>> Values in this syntax are encoded according to the following BNF:
>>
>> postal-address = dstring *( "$" dstring )
>>
>> In the above, each dstring component of a postal address value is
>> encoded as a value of type Directory String syntax. Backslashes and
>> dollar characters, if they occur in the component, are quoted as
>> described in section 4.3. Many servers limit the postal address to
>> six lines of up to thirty characters.
>>
>> Example:
>>
>> 1234 Main St.$Anytown, CA 12345$USA
>> \241,000,000 Sweepstakes$PO Box 1000000$Anytown, CA 12345$USA
>>
>> .
>> .
>>
>> [ note that "\241,000,000" is intended to resolve to "$1,000,000" once the value
>> string shown above is parsed out into its constituent components, which're
>> delineated by the "$" chars. This implies to me that the client needs to know
>> about this in order to understand and properly handle this value syntax. I
>> don't know if that assertion is exactly true. ]
>>
>> -------------------------------------------------------------------------------
>
>
>
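
For anyone following the thread, here is a rough Python sketch of how a client might handle the Postal Address syntax quoted above (RFC 2252 sections 4.3 and 6.27). It is only an illustration under some assumptions: the function names are made up for this example and do not come from any LDAP SDK, and it assumes that only single-octet ASCII characters such as '$' and '\' are ever hex-escaped. Since every literal '$' inside a component is sent as \24, any bare '$' left in the value can be treated as a separator.

```python
import re

# A backslash followed by exactly two hex digits, per RFC 2252 section 4.3.
_HEX_ESCAPE = re.compile(r"\\([0-9A-Fa-f]{2})")


def encode_component(text):
    """Escape one address line (hypothetical helper, not an SDK call).

    The backslash is escaped first so the backslash introduced by the
    '$' escape is not escaped a second time.
    """
    return text.replace("\\", "\\5C").replace("$", "\\24")


def decode_component(text):
    """Turn each \\XX escape back into the character it stands for.

    Assumes the escaped characters are ASCII (e.g. '$' or '\\').
    """
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def parse_postal_address(value):
    """Split an encoded value on bare '$', then decode each line."""
    return [decode_component(part) for part in value.split("$")]


def format_postal_address(lines):
    """Encode each line and join the results with the '$' separator."""
    return "$".join(encode_component(line) for line in lines)


if __name__ == "__main__":
    encoded = r"\241,000,000 Sweepstakes$PO Box 1000000$Anytown, CA 12345$USA"
    for line in parse_postal_address(encoded):
        print(line)
```

Splitting before decoding is the point of the sketch: decoding first would turn \24 back into '$' and make it indistinguishable from a separator. Whether this work belongs in the client library or in the application is exactly the question asked above, and the sketch does not settle it.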