[ruby-core:19103] Re: Encoding.default_internal
From: Martin Duerst <duerst <at> it.aoyama.ac.jp>
Subject: [ruby-core:19103] Re: Encoding.default_internal
Newsgroups: gmane.comp.lang.ruby.core
Date: 2008-10-02 04:36:36 GMT
At 07:59 08/10/02, Michael Selig wrote:
>On Thu, 02 Oct 2008 00:15:01 +1000, James Gray <james <at> grayproductions.net>
>wrote:
>> To be honest, I doubt I would have made the effort if I had known this
>> change was coming. It was challenging and I'm a wimp. ;)
>Someone had to be the trailblazer, James, even if it was only to find out
>that it wasn't the best path
Yes indeed. I think your experience helped Matz quite a bit
for his decision.
>But I agree with you: if a library can be confident that its inputs are at
>least ASCII-compatible, quite a bit of your efforts could be saved.
>If, on top of that, it can be reasonably sure that all its inputs are
>encoding compatible, then it's even better.
I think this is not about confidence. In the software world,
there is no confidence about input. It's much more about what
expectation a library sets and documents. I think there are
quite a few possibilities:
a) The library accepts and produces only UTF-8. Best used with -U.
b) The library accepts, in one run, a single arbitrary encoding,
and returns the same encoding, if that encoding is ASCII-compatible.
c) Same as before, but extended for non-ASCII-compatible.
(what James has done with the CSV library, as far as I understand it)
d) The library accepts multiple encodings and handles all the
conversions internally.
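To make b) a bit more concrete, here is a minimal sketch of how a library entry point might enforce that expectation. The names join_fields and EncodingPolicyError are hypothetical and made up for illustration only; the sketch relies on nothing beyond String#encoding, Encoding#ascii_compatible? and String#encode from Ruby 1.9 and later.

```ruby
# Hypothetical error class for this sketch; not part of any real library.
class EncodingPolicyError < ArgumentError; end

# Policy b): all inputs must share one ASCII-compatible encoding,
# and the result comes back in that same encoding.
# An empty field list is rejected too, since there is no encoding to return.
def join_fields(fields, separator = ",")
  encodings = fields.map(&:encoding).uniq
  unless encodings.size == 1
    names = encodings.map(&:name).join(", ")
    raise EncodingPolicyError, "expected exactly one encoding, got: #{names}"
  end

  encoding = encodings.first
  unless encoding.ascii_compatible?
    raise EncodingPolicyError, "#{encoding.name} is not ASCII-compatible"
  end

  # Transcode the separator into the callers' encoding before using it.
  sep = separator.encode(encoding)

  # Start from an empty string that is already labelled with that encoding.
  result = String.new.force_encoding(encoding)
  fields.each_with_index do |field, i|
    result << sep unless i.zero?
    result << field
  end
  result
end
```

The point of the sketch is that the library checks and states its expectation up front rather than hoping the input is fine: mixed encodings and non-ASCII-compatible ones such as UTF-16LE are refused with a clear error instead of failing somewhere in the middle of processing.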
There are of course other cases, such as a library only accepting
some specific encoding different from UTF-8, for some special processing.
From an overall Ruby standpoint, b) should be the 'default', but
in all cases, things should be clearly documented.
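For contrast, d) could look roughly like the following. This is only a sketch under the assumption that the library picks UTF-8 as its internal encoding; concat_any and INTERNAL_ENCODING are hypothetical names, and a real library would also have to decide what to do with input that cannot be converted.

```ruby
# Assumed internal encoding for this sketch; a real library would document its choice.
INTERNAL_ENCODING = Encoding::UTF_8

# Policy d): accept strings in differing encodings, convert each one
# to the internal encoding, and return the result in that encoding.
# String#encode raises (e.g. Encoding::UndefinedConversionError) when a
# string cannot be converted; this sketch simply lets that propagate.
def concat_any(*strings)
  buffer = String.new.force_encoding(INTERNAL_ENCODING)
  strings.each_with_object(buffer) do |s, out|
    out << s.encode(INTERNAL_ENCODING)
  end
end
```

Here the caller no longer gets back the encoding that was passed in, which is exactly the kind of behaviour that has to be written down in the library's documentation.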
Regards, Martin.
#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst <at> it.aoyama.ac.jp