27 Nov 19:54
Re: [OT] Re: Support of ISO 639 (was: Survey Tool pre-alpha)
verdy_p <verdy_p <at> wanadoo.fr>
2008-11-27 18:54:41 GMT
"Doug Ewell" wrote:
> Warning: this is completely OT for the Unicode list. Future discussion
> should be on the LTRU list (ltru <at> ietf.org) or CLDR list
> (cldr-users <at> unicode.org) as appropriate.

You have just replied to the Unicode list yourself (even though I was replying to you with a CC to the CLDR list...)

> "verdy underscore p" <verdy underscore p at wanadoo dot fr> wrote:
>
> > If only we could have some access to ISO 639-5 data (for managing the
> > language families instead of using the historic and badly designed
> > language collections of ISO 639-1 (code [bh] only) and ISO 639-2...
>
> I wish the ISO 639-5 Registration Authority, which is the same as that
> for ISO 639-2 (Library of Congress), would set up an official 639-5 Web
> site. It has been a long time coming.

Well, still waiting (sorry, my interest in the subject is mostly personal; I could make professional use of it, but I can't pay out of my own pocket for a copy of the published standard; it's too expensive for me).

> I don't agree with characterizing 639-1 and 639-2 as "badly designed."
> They were designed for different purposes.

Apparently not. Your description just indicates that 639-5 is effectively continuing the 639-2 (and, for Bihari, 639-1) model, and does not create what was expected (a comprehensive hierarchy similar to the Ethnologue's). In addition, 639-5 is now incompatible with 639-2 and 639-1, making it mostly unusable within the RFC 4645bis/4646bis framework. For me, this means that 639-5 is already a dead standard before its publication, unless the 639-2 and 639-1 collections are completely removed, because of the changes that occurred in 639-5.

> > Also I'm still waiting to see how ISO 639-5 can be integrated with the
> > RFC 4645bis and RFC 4646bis rules.
>
> This is clearly laid out in the two LTRU drafts:
>
> http://www.ietf.org/internet-drafts/draft-ietf-ltru-4646bis-18.txt
> http://www.ietf.org/internet-drafts/draft-ietf-ltru-4645bis-07.txt
> http://www.ewellic.org/rfc4645bis.html

There's absolutely no integration. Or more exactly, it does not create the encoding framework that would allow the effective creation of a comprehensive hierarchy of language families. It just says that they are added as possible subtags, usable as prefixes; but immediately, the included list of tags makes these combinations of a collection subtag plus a language subtag accepted as non-preferred aliases for the preferred tag consisting of the language tag only. (This is fine for me, as it effectively initiates the organization as a hierarchy; however, the list is not complete enough to organize the 639-3 list of macrolanguages and individual languages.)

Also I don't understand the need to create a prefix subtag for the special language scopes of 639-3 (except as a categorization in a more complete hierarchy). Well, this is not critical and does not affect my own projects with them.

> In brief, 639-5 code elements are simply added as more language subtags
> that represent language collections, just like existing subtags such as
> 'alg'. This is very straightforward.

If you say so... For me, this absolutely does not change what was already in use, or attempted, without 639-5. The 639-5 part solves absolutely no additional problem, but just creates more confusion (due to its incompatibilities with 639-1/2).
I would really have hoped that those unstable collections of 639-1/2 had been deprecated (grandfathered with no indication of a preferred new code, because of the ambiguities, just as was done for "cel-gaulish" in RFC 4645bis), and that new codes had been assigned wherever a code was changed to be inclusive and built on serious definitions (e.g., the exclusive [ine] collection of 639-2 would have been grandfathered, and a new code added in 639-5 for the inclusive Indo-European family).

> You can check out the proposed replacement Language Subtag Registry,
> embedded in draft-ietf-ltru-4645bis-07.txt, to see how this works.
>
> I don't see how one can be pro-639-5 and anti-639-2. 639-2 includes a
> modest number of collection codes among its repertoire. 639-5 comprises
> a more comprehensive list of collections, but for some reason omits two
> of the collections included in 639-2 (Bihari and Himachali), and
> incorrectly lists 'car' as a collection called "Carib languages" when
> other parts of 639 had already classified this as an individual
> language, "Galibi Carib."

You forget that I had hoped for 639-5 to be made interoperable with 639-1/2 and so with the RFC 4645/4646 framework. For me it has completely failed at this, and I predict that the integration of 639-5 into RFC 4645/4646 will fail.

> Except for these few inconsistencies, 639-2 is simply a subset of "639-3
> plus 639-5." You can't simultaneously "integrate ISO 639-5" and "drop
> ISO 639-2 collections."

Yes you can! In fact you can't integrate both 639-2 and 639-5 simultaneously (in RFC 4645bis/4646bis), unless the RFC 4645bis/4646bis text is updated (once again) to resolve the conflicts and inconsistencies that the early, semi-confidential and costly publication of 639-5 has created! Thankfully, 639-5 is not published freely, so it is still mostly ignored by everyone.
But I hope that 639-5 will be rapidly corrected to fix what I consider severe bugs (or a lack of analysis and decisions about its policies, stability and compatibility with the rest of the 639 standard). (And I can understand now why the Ethnologue has chosen NOT to use or display the 639-5 "collection" codes for its language families!)
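
For readers who want to see what "collection" subtags look like in practice, here is a minimal Python sketch. It assumes the record-jar layout used by the Language Subtag Registry (fields such as Type, Subtag and Scope, records separated by "%%"). The records in SAMPLE_REGISTRY are illustrative sample data written for this sketch, not an extract of the draft registry, and the function names are hypothetical. The sketch only lists collections; it does not build the family hierarchy discussed above, which the registry format by itself does not encode.

```python
# Sample records in the record-jar style of the Language Subtag Registry.
# These entries are illustrative data for this sketch only.
SAMPLE_REGISTRY = """\
Type: language
Subtag: alg
Description: Algonquian languages
Scope: collection
%%
Type: language
Subtag: ine
Description: Indo-European languages
Scope: collection
%%
Type: language
Subtag: fr
Description: French
%%
Type: language
Subtag: zh
Description: Chinese
Scope: macrolanguage
"""


def parse_records(text):
    """Split the text on '%%' and turn each record into a dict.

    Simplification: folded (continued) lines and repeated fields,
    which a real registry file can contain, are not handled here.
    """
    records = []
    for chunk in text.split("%%"):
        record = {}
        for line in chunk.strip().splitlines():
            name, _, value = line.partition(":")
            record[name.strip()] = value.strip()
        if record:
            records.append(record)
    return records


def collection_subtags(records):
    """Return the sorted language subtags whose Scope is 'collection'."""
    return sorted(
        r["Subtag"]
        for r in records
        if r.get("Type") == "language" and r.get("Scope") == "collection"
    )


if __name__ == "__main__":
    print(collection_subtags(parse_records(SAMPLE_REGISTRY)))
```

Individual languages such as 'fr' carry no Scope field in this sample, so the filter relies on the field being present and equal to "collection" rather than on the subtag's length or origin.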