Re: (c.harding 44382) RE: (a.josey 14931) Re: (c.harding 44333) Re: regexMatch (Was: substring filters using DN attributes ?)
Hi, Ron -
There are several variants on the definition of regular expression, for
historical reasons. There is one UNIX(TM) standard, though, which says that
all of the versions must be supported by a UNIX system, each being applied
in the appropriate contexts as defined by the standard.
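
To make the difference between the variants concrete, here is a rough sketch of how the same pattern text reads differently as a POSIX basic regular expression (BRE) and as an extended-style expression. It is only an illustration: bre_to_python is a hypothetical helper of mine, it uses Python's re module as a stand-in for the extended syntax, and it assumes the pattern contains no bracket expressions and no leading "*".

```python
import re

# Ordinary characters in a POSIX BRE that are operators in Python's re syntax.
BRE_LITERALS = set("+?|(){}")
# Characters that act as operators in a BRE only when backslash-escaped.
BRE_ESCAPED_OPERATORS = set("(){}")


def bre_to_python(pattern: str) -> str:
    """Rewrite a simple BRE so Python's re gives it the BRE meaning.

    Hypothetical helper: bracket expressions and a leading '*' are not handled.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in BRE_ESCAPED_OPERATORS:
                out.append(nxt)        # BRE \( \) \{ \} are grouping/interval operators
            else:
                out.append(ch + nxt)   # other escapes are passed through unchanged
            i += 2
        elif ch in BRE_LITERALS:
            out.append("\\" + ch)      # plain + ? | ( ) { } are literals in a BRE
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# As a BRE, "a+" means the two characters 'a' and '+'.
as_bre = re.compile(bre_to_python("a+"))
# Read with extended-style syntax, "a+" means one or more 'a'.
as_extended = re.compile("a+")
```

With this sketch, as_bre matches the string "a+" but not "aaa", while as_extended matches "aaa". A matching rule that just says "regular expression" leaves that choice open.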
Internationalization is a very tricky - but extremely important - area. The
standards (even the UNIX ones) should not be followed blindly; you need to
look carefully at what implications they would have before referencing them
in a matching rule RFC or draft.
When dealing with multiple character sets and languages, collating
sequences and regular expressions clearly become more difficult. A great
deal of very good work went into the definition of internationalized
regular expressions. However, that work pre-dates the deployment of
UNICODE, which removes some but by no means all of the problems. So far as
I know, it has not been re-evaluated in the light of UNICODE, but it
certainly should be.
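
As a small illustration of the kind of problem that remains even with UNICODE, the sketch below matches the same visible word in two encodings of "é". It uses Python's re and unicodedata modules purely as an example; normalized_fullmatch is a hypothetical helper, and normalizing to NFC before matching is my assumption about one possible approach, not something any of the standards mentioned here prescribe.

```python
import re
import unicodedata

word = "caf\u00e9"  # "café" with a precomposed é (NFC form)
decomposed = unicodedata.normalize("NFD", word)  # "e" followed by U+0301

# An ASCII range misses the accented letter; a Unicode-aware class does not.
ascii_range = re.fullmatch(r"[a-z]+", word)
unicode_word = re.fullmatch(r"\w+", word)

# "." matches one code point, so the two encodings behave differently.
four_chars_nfc = re.fullmatch(r"caf.", word)
four_chars_nfd = re.fullmatch(r"caf.", decomposed)


def normalized_fullmatch(pattern: str, text: str):
    """Hypothetical helper: normalize the text to NFC before matching."""
    return re.fullmatch(pattern, unicodedata.normalize("NFC", text))
```

Here ascii_range and four_chars_nfd are None while the other two match, and normalized_fullmatch(r"caf.", decomposed) matches again. A regexMatch rule would need to say which of these behaviours it intends.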
Should there be locale attributes and, if so, where should they go? People
must be able to put entries using different languages and character sets in
the same directory. This is in fact supported by LDAP language tagging (RFC
2596), and the first question has to be whether any mechanism beyond RFC
2596 is needed. For the sake of simplicity, I would hope not.
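
For anyone who has not looked at language tagging, the sketch below shows the general idea of attribute descriptions carrying a "lang-" option, so that one entry holds values in several languages. It is not an implementation of RFC 2596: the entry data is made up, split_description and values_for are hypothetical helpers, and falling back to untagged values when no tagged value exists is an assumption of this sketch rather than behaviour required by the RFC.

```python
def split_description(description: str):
    """Split e.g. 'cn;lang-fr' into ('cn', 'fr'); language is None if untagged."""
    base, *options = description.lower().split(";")
    lang = None
    for opt in options:
        if opt.startswith("lang-"):
            lang = opt[len("lang-"):]
    return base, lang


def values_for(entry: dict, attr: str, lang: str) -> list:
    """Return values of attr tagged with lang, else the untagged values.

    The fallback to untagged values is an assumption of this sketch.
    """
    tagged, untagged = [], []
    for description, values in entry.items():
        base, tag = split_description(description)
        if base != attr.lower():
            continue
        if tag == lang.lower():
            tagged.extend(values)
        elif tag is None:
            untagged.extend(values)
    return tagged or untagged


# Made-up entry mixing languages in one place.
entry = {
    "cn": ["Jean Dupont"],
    "description": ["CTO"],
    "description;lang-fr": ["Directeur technique"],
    "description;lang-en": ["Technical director"],
}
```

With this data, values_for(entry, "description", "fr") returns the French value and values_for(entry, "description", "de") falls back to the untagged one.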
>Chris,
>
>Interesting. On reading re.html below I found no less than three 'standards'
>in the first paragraph. Mention of locales later completed the picture for
>me.
>
>I guess in the conformance statement for the directory we say which RE
>'standard' we are following. We also need an attribute in the root DSE
>which specifies what our locale is.
>
>Ron.
>
>-----Original Message-----
>From: Chris Harding [mailto:c.harding@opengroup.org]
>Sent: Wednesday, 26 July 2000 21:37
>To: Rob Byrne - Sun Microsystems; Kurt D. Zeilenga
>Cc: ldapext; a.josey@opengroup.org
>Subject: (a.josey 14931) Re: (c.harding 44333) Re: regexMatch (Was:
>substring filters using DN attributes ?)
>
>
>>Is there a standard definition of what a regular expression actually is ?
>>
>There certainly is.
>
>It is part of the standard definition of the UNIX(TM) operating system
>which is available (foc) from The Open Group, see
>http://www.opengroup.org/publications/catalog/t912.htm#medium2
>
>The definition of regular expressions is at
>http://www.opengroup.org/onlinepubs/007908799/xbd/re.html
>
>>I ask this because if you work on Solaris for example, there are n different
>>libraries and functions for doing regular expression matching so the meaning
>>of "regular expression" is not so obvious.
>>
>>Rob.
>>
>>"Kurt D. Zeilenga" wrote:
>>
>>> At 09:35 AM 7/25/00 -0700, Mark C Smith wrote:
>>> >"Kurt D. Zeilenga" wrote:
>>> >>
>>> >> I've been meaning to publish a regexMatch rule I-D which would allow
>>> >> matching of an asserted regular expression against the string
>>> >> representation of attribute values. Of course, to be useful with
>>> >> DNs, we'd have to define a canonical string representation
>>> >> of DNs. Given such, you would be able to do DN matching like:
>>> >>
>>> >> (member:regexMatch:=.*,dc=example,dc=com$)
>>> >>
>>> >> Such a matching rule, I believe, would be generally useful in
>>> >> a number of applications. Of course, user applications may
>>> >> not want to expose regular expressions to average Joe.
>>> >>
>>> >> If others concur that this would be generally useful, I'll put
>>> >> up a straw man proposal after IETF#48.
>>> >
>>> >It would be interesting to see examples of the kinds of LDAP application
>>> >problems that would be more easily addressed if such a matching rule was
>>> >available.
>>>
>>> I agree. In fact, I wouldn't attempt to write such an I-D
>>> without decent examples. In general, such a rule would be useful
>>> to applications which required very specific, complex matching
>>> which cannot easily be decomposed into a substrings assertion.
>>> I'll try to come up with some examples, hopefully ones which
>>> are not too contrived.
>>>
>>> >If all we really need is a way to anchor the start and end
>>> >of strings (i.e., ^ and $ from regex), I'd rather see a more narrow
>>> >proposal. Why? Because general regular expression matching will be
>>> >quite difficult to support using indexes, etc.
>>>
>>> I concur that general regular expressions are quite difficult to
>>> support using indexing. I also concur that applications wanting
>>> to make an assertion should use an appropriate matching rule. I
>>> fully agree that applications wanting to simply assert start/end
>>> text should use a substrings matching rule.
>>>
>>> Kurt
>>
>>
>>
>
>Regards,
>
>Chris
>+++++
>
>========================================================================
> Chris Harding
> T H E Directory Program Manager
> O P E N Apex Plaza, Forbury Road, Reading RG1 1AX, UK
>G R O U P Mailto:c.harding@opengroup.org Phone: +44 118 950 8311 x2262
> WWW: http://www.opengroup.org Mobile: +44 771 8588820
>========================================================================
>
>
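On the regexMatch filter quoted above, a quick sketch of why a canonical string representation of DNs matters for it. The member values are invented, and canonical is a hypothetical and deliberately naive normalizer (lowercase, no spaces around "," and "=") that ignores escaping and multi-valued RDNs; it is not a proposed canonical form.

```python
import re

# The pattern from the quoted filter example.
SUFFIX_RE = re.compile(r".*,dc=example,dc=com$")
SUFFIX = ",dc=example,dc=com"


def canonical(dn: str) -> str:
    """Hypothetical, naive canonical form: ignores escaping and multi-valued RDNs."""
    rdns = []
    for rdn in dn.split(","):
        attr, _, value = rdn.partition("=")
        rdns.append(f"{attr.strip().lower()}={value.strip().lower()}")
    return ",".join(rdns)


# Invented member values; the first two name entries under the same suffix.
members = [
    "cn=Alice,ou=People,dc=example,dc=com",
    "CN=Bob, OU=People, DC=Example, DC=Com",
    "cn=Carol,dc=example,dc=org",
]

raw_matches = [bool(SUFFIX_RE.match(m)) for m in members]
canonical_matches = [bool(SUFFIX_RE.match(canonical(m))) for m in members]
# For these strings the anchored pattern is just a suffix test.
suffix_matches = [canonical(m).endswith(SUFFIX) for m in members]
```

Without normalization the second value is missed, although it names an entry under the same suffix. After normalization the regular expression and the plain suffix test agree on all three values, which is the kind of start/end assertion a substrings rule already expresses.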
Regards,
Chris