Re: String conversions UTF8 <-> ISO-8859-1

To: Hallvard B Furuseth <h.b.furuseth@usit.uio.no>
Subject: Re: String conversions UTF8 <-> ISO-8859-1
From: "Kurt D. Zeilenga" <Kurt@OpenLDAP.org>
Date: Sun, 01 Jun 2003 12:55:36 -0700
Cc: openldap-devel@OpenLDAP.org
In-reply-to: <HBF.20030601vj1h@bombur.uio.no>
References: <5.2.0.9.0.20030530090953.02873840@127.0.0.1> <5.2.0.9.0.20030523065531.028af638@127.0.0.1> <5.2.0.9.0.20030523031228.02890ad0@127.0.0.1> <5.2.0.9.0.20030521172025.02a5fb38@127.0.0.1> <HBF.20030430vsve@bombur.uio.no> <HBF.20030429tx53@bombur.uio.no> <006d01c30e8b$7c98af50$0e01a8c0@CELLO> <HBF.20030515ync0@bombur.uio.no> <HBF.20030523tepg@bombur.uio.no> <HBF.20030523xiiq@bombur.uio.no> <HBF.20030530plw@bombur.uio.no> <5.2.0.9.0.20030530090953.02873840@127.0.0.1>

At 10:58 AM 6/1/2003, Hallvard B Furuseth wrote:
>> let me try to restart the discussion a bit.
>I'm not quite sure how to reply to this one - shall I just repost some
>of what I said before?  Well...

Not necessary.  I wanted to take a step back, primarily to clarify my
previous statements (which suffered from missing "not"s), but also to try
to describe "the problem" as a whole.

>First of all, I think you are trying to bite over too much. 

Actually, I'm trying to bite off very little of "the problem".  In fact,
I'd prefer just to provide a few helper routines that let application
developers avoid some tedious coding.
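
As a rough sketch of the kind of helper routine meant here (the name
utf8_to_latin1 and its signature are hypothetical, not part of the LDAP
API), a strict UTF-8 to ISO-8859-1 converter could look like this.  It
assumes the caller supplies the output buffer and wants a hard failure
for anything that has no ISO-8859-1 equivalent:

```c
#include <stddef.h>

/*
 * Hypothetical helper, not an existing LDAP API call.
 * Converts inlen bytes of UTF-8 to ISO-8859-1.
 * Returns the number of bytes written to out, or -1 if the input is
 * malformed, contains a character above U+00FF, or out is too small.
 */
long utf8_to_latin1(const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t outlen)
{
    size_t i = 0, o = 0;

    while (i < inlen) {
        unsigned char c = in[i];
        unsigned char ch;

        if (c < 0x80) {
            /* ASCII maps to itself */
            ch = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < inlen
                   && (in[i + 1] & 0xC0) == 0x80) {
            /* two-byte sequence covering U+0080..U+00FF */
            ch = (unsigned char)(((c & 0x03) << 6) | (in[i + 1] & 0x3F));
            i += 2;
        } else {
            /* overlong, truncated, or outside ISO-8859-1 */
            return -1;
        }
        if (o >= outlen)
            return -1;
        out[o++] = ch;
    }
    return (long)o;
}
```

What to do with unrepresentable characters (fail, substitute, or
something else) is exactly the sort of choice that differs between
applications; this sketch simply fails.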

>I think a character set conversion API which tries to be everything
>to everyone is bound to be too clumsy.

I'd rather not try to solve the complete problem.  I don't think we can.

>Which is why I'm personally only interested in getting it to be useful
>in _most_ cases.  I think people can do their own conversion in the remaining cases.

I think _most_ people will have remaining cases.  That is, _most_
people will still need to do some conversion without the assistance
of the LDAP API.

>> There are applications which use different character sets and encodings
>> when interacting with the user than when interacting with the
>> directory.  Those applications will need access to an appropriate
>> conversion routine.  Personally, I think applications should
>> deal with conversion issues at the user interface, not at the LDAP
>> interface.  (...)
>
>Meaning, the application should think UTF-8 internally?

Unless they have a good reason not to use Unicode/UTF-8 internally, yes.
Otherwise they'll have to deal not only with user<->internal conversions
but also with internal<->Internet conversions.
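
To illustrate conversion at the user interface, here is a minimal
sketch of turning ISO-8859-1 user input into UTF-8 once, on the way
in, so that everything behind it can stay UTF-8.  The name
latin1_to_utf8 and its signature are hypothetical; the only assumption
is that the caller provides an output buffer of up to twice the input
length:

```c
#include <stddef.h>

/*
 * Hypothetical helper, not an existing LDAP API call.
 * Converts inlen bytes of ISO-8859-1 to UTF-8.
 * Returns the number of bytes written to out, or -1 if out is too
 * small.  The result never needs more than 2 * inlen bytes.
 */
long latin1_to_utf8(const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t outlen)
{
    size_t i, o = 0;

    for (i = 0; i < inlen; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            /* ASCII is unchanged */
            if (o + 1 > outlen)
                return -1;
            out[o++] = c;
        } else {
            /* U+0080..U+00FF become a two-byte sequence */
            if (o + 2 > outlen)
                return -1;
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return (long)o;
}
```

This direction cannot fail on content, since every ISO-8859-1 byte has
a Unicode code point; only the buffer size can be a problem.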

>The choice of internal character set in the application is up to the
>application developer, not to us.

Of course, as it is they, not we, who live with that choice.

>> Now, you suggested some sort of callback mechanism.  (...)
>
>I'll mostly skip this part for now, since we both have strong opinions
>about it.  Let's see what else we can agree on in general before we dive
>too far into the API choice.

I think we actually agree that we should provide some "help" to application
developers who need to do "conversions".  I think we just disagree over
the choice of the API mechanism to use to provide that "help".

The problem with callbacks is coming up with a reasonable way to
provide enough context so that the application can make the right
conversion.

>> For example, maybe provide a "foreach entry" routine which calls
>> an application-specified function on each entry in a message
>> chain (previously provided by the API).  And then a "foreach
>> attribute" routine... etc..
>
>This sounds very slow.  Seems to me it would entail a lot of unpacking
>and repacking of Ber elements in the LDAPMessages.

If we go with callbacks, un/repacking of BER is exactly what we'll be
doing.  If we just provide helpers, the application can do the conversion
where it normally does value extraction, and hence avoid repacking.
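
A sketch of what "conversion where they do value extraction" might
look like follows.  Everything in it is an assumption for illustration:
struct value is a stand-in for a length-plus-pointer value such as
libldap's struct berval (so the sketch compiles without ldap.h), the
values arrive as a NULL-terminated array as ldap_get_values_len()
returns them, and value_to_latin1, extract_latin1 and SLOT are
hypothetical names:

```c
#include <stddef.h>

/* Stand-in for a length-plus-pointer value such as struct berval. */
struct value {
    size_t len;
    const unsigned char *val;
};

#define SLOT 64 /* arbitrary per-value buffer size for this sketch */

/*
 * Hypothetical helper: UTF-8 value -> NUL-terminated ISO-8859-1.
 * Returns the converted length, or -1 on malformed input, a character
 * above U+00FF, or insufficient space.
 */
static long value_to_latin1(const struct value *v,
                            unsigned char *out, size_t outlen)
{
    size_t i = 0, o = 0;

    if (outlen == 0)
        return -1;
    while (i < v->len) {
        unsigned char c = v->val[i];

        if (o + 1 >= outlen)        /* keep room for the terminator */
            return -1;
        if (c < 0x80) {
            out[o++] = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < v->len
                   && (v->val[i + 1] & 0xC0) == 0x80) {
            out[o++] = (unsigned char)(((c & 0x03) << 6)
                                       | (v->val[i + 1] & 0x3F));
            i += 2;
        } else {
            return -1;
        }
    }
    out[o] = '\0';
    return (long)o;
}

/*
 * The application's own extraction loop: each value is converted as
 * it is copied out, and nothing is written back into the message.
 * Returns the number of values converted, or -1 on failure.
 */
int extract_latin1(const struct value *const *vals,
                   unsigned char (*out)[SLOT], int max)
{
    int n = 0;

    while (vals[n] != NULL) {
        if (n >= max || value_to_latin1(vals[n], out[n], SLOT) < 0)
            return -1;
        n++;
    }
    return n;
}
```

The loop belongs to the application, so it already knows which
attribute it is extracting and whether conversion makes sense for it;
the helper only needs the bytes.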

Kurt