verdy_p | 27 Nov 19:54

Re: [OT] Re: Support of ISO 639 (was: Survey Tool pre-alpha)

"Doug Ewell" wrote:
> Warning: this is completely OT for the Unicode list.  Future discussion 
> should be on the LTRU list (ltru <at> ietf.org) or CLDR list 
> (cldr-users <at> unicode.org) as appropriate.

You have just replied to the Unicode list yourself (even though I was replying to you with a CC to the CLDR list...)

> "verdy underscore p" <verdy underscore p at wanadoo dot fr> wrote:
> 
> > If only we could have some access to ISO 639-5 data (for managing the 
> > language families instead of using the historic and badly designed 
> > language collections of ISO 639-1 (code [bi] only) and ISO 639-2...
> 
> I wish the ISO 639-5 Registration Authority, which is the same as that 
> for ISO 639-2 (Library of Congress), would set up an official 639-5 Web 
> site.  It has been a long time coming.

Well, still waiting (sorry, my interest in the subject is mostly personal;
I could make use of it professionally, but I can't afford to buy a copy of
the published standard myself; it's too expensive for me).

> I don't agree with characterizing 639-1 and 639-2 as "badly designed." 
> They were designed for different purposes.

Apparently not. Your description just indicates that 639-5 effectively
continues the 639-2 model (and the 639-1 model for Bihari), and does not
create what was expected (a comprehensive hierarchy similar to the
Ethnologue's). In addition, 639-5 is now incompatible with 639-2 and 639-1,
making it mostly unusable within the RFC 4645/4646bis framework. For me,
this means that 639-5 is already a dead standard before its publication,
unless the 639-2 and 639-1 collections are completely removed, because of
the changes that occurred in 639-5.

> > Also I'm still waiting to see how ISO 639-5 can be integrated with the 
> > RFC 4645bis and RFC 4646bis rules.
> 
> This is clearly laid out in the two LTRU drafts:
> 
> http://www.ietf.org/internet-drafts/draft-ietf-ltru-4646bis-18.txt
> http://www.ietf.org/internet-drafts/draft-ietf-ltru-4645bis-07.txt
> http://www.ewellic.org/rfc4645bis.html

There's absolutely no integration. Or, more exactly, it does not create the
encoding framework that would allow the effective creation of a
comprehensive hierarchy of language families.

It just says that they are added as possible subtags, usable as prefixes;
but the included list of tags immediately makes these combinations of a
collection subtag plus a language subtag non-preferred aliases for the
preferred tag consisting of the language subtag only. (This is fine for me,
as it effectively starts organizing them as a hierarchy; however, the list
is not complete enough to organize the 639-3 list of macrolanguages and
individual languages.)

Also, I don't understand the need to create a prefix subtag for the special
language scopes of 639-3 (except as a categorization in a more complete
hierarchy). Well, this is not critical and does not affect my own projects
with them.

> In brief, 639-5 code elements are simply added as more language subtags 
> that represent language collections, just like existing subtags such as 
> 'alg'.  This is very straightforward.

If you say so... For me, this changes absolutely nothing in what was already
in use, or attempted, without 639-5. The 639-5 part solves no additional
problem; it just creates more confusion (due to its incompatibilities with
639-1/2).

I would have really hoped that those unstable collections of 639-1/2 had
been deprecated (grandfathered with no indication of a preferred new code,
due to the ambiguities, just as was done for "cel-gaulish" in RFC 4645bis),
and that new codes had been assigned to the collections that were changed to
be inclusive and built according to serious definitions (e.g., the exclusive
[ine] collection of 639-2 would have been grandfathered, and a new code
added in 639-5 for the inclusive Indo-European family).

> You can check out the proposed replacement Language Subtag Registry, 
> embedded in draft-ietf-ltru-4645bis-07.txt, to see how this works.

> I don't see how one can be pro-639-5 and anti-639-2.  639-2 includes a 
> modest number of collection codes among its repertoire.  639-5 comprises 
> a more comprehensive list of collections, but for some reason omits two 
> of the collections included in 639-2 (Bihari and Himachali), and 
> incorrectly lists 'car' as a collection called "Carib languages" when 
> other parts of 639 had already classified this as an individual 
> language, "Galibi Carib."

You forget that I had hoped for 639-5 to be made interoperable with 639-1/2
and so with the RFC 4645/4646 framework. For me, it has completely failed to
do this, and I predict that the integration of 639-5 into RFC 4645/4646
will fail.

> Except for these few inconsistencies, 639-2 is simply a subset of "639-3 
> plus 639-5."  You can't simultaneously "integrate ISO 639-5" and "drop 
> ISO 639-2 collections."

Yes you can!

In fact, you can't integrate both 639-2 and 639-5 simultaneously (in RFC
4645/4646bis), unless the RFC 4645/4646bis drafts are updated (once again)
to resolve the conflicts and inconsistencies that the early,
semi-confidential and costly publication of 639-5 has created! Thankfully,
639-5 is not freely published, so it is still mostly ignored by everyone.

But I hope that 639-5 will be rapidly corrected to fix what I consider
severe bugs (or a lack of analysis and decisions about its policies,
stability and compatibility with the rest of the 639 standard).

(And I can understand now why the Ethnologue has chosen NOT to use or
display the 639-5 "collection" codes for its language families!)

