1 Apr 2008 19:12
Re: [MarkLogic Dev General] Sorting pinyin text?
On Tue, 01 Apr 2008 08:31:12 -0700, Marc Moskowitz <mmoskowitz@...> wrote:

> The standard zh collation sorts Chinese characters correctly, but I'm
> trying to sort the pinyin transliterations. For example, this XQuery:
>
> default collation="http://marklogic.com/collation/zh"
> let $words := ('fù-bèi shòu dí','fùdi','fùgǎo','fūzi','fùtòng','fùxiè',
> 'fù-mu')
> for $x in $words
> order by $x
> return $x
>
> returns
>
> fù-bèi shòu dí
> fù-mu
> fùdi
> fùgǎo
> fùtòng
> fùxiè
> fūzi
>
> which is in codepoint order, instead of the correct order:
>
> fūzi (1st tone comes before 4th)
> fù-bèi shòu dí
> fùdi
> fùgǎo
> fù-mu (hyphens should be ignored)
> fùtòng
> fùxiè
>
> Am I correct that the supported way to sort this text is to create a
> sortable form for each of these strings at document load time?
> -Marc

Ah, right. I missed the key word "transliteration". Yes, I think what
you want to do is create some kind of sort key at document load time.

//Mary
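
For illustration, here is one way such a sort key might be built. This is only a sketch under assumptions, not a method confirmed in this thread. The function name local:pinyin-sort-key is hypothetical, and the scheme is a guess at what "sortable form" could mean: decompose each string, turn the tone marks into the digits 1 to 4 in place, and drop hyphens and whitespace. It uses only standard XQuery 1.0 functions and has not been checked against any particular MarkLogic release. A ü keeps its combining diaeresis in the key, and neutral-tone syllables get no digit, so real data may need more rules.

```xquery
xquery version "1.0";

(: Hypothetical helper: builds a codepoint-sortable key from a pinyin string. :)
declare function local:pinyin-sort-key($s as xs:string) as xs:string
{
  (: Split accented letters into base letter + combining mark. :)
  let $decomposed := fn:normalize-unicode(fn:lower-case($s), "NFD")
  (: Map macron, acute, caron, grave to tone digits 1, 2, 3, 4. :)
  let $toned := fn:translate($decomposed,
                             "&#x304;&#x301;&#x30C;&#x300;", "1234")
  (: Ignore hyphens and whitespace when sorting. :)
  return fn:replace($toned, "[\s\-]+", "")
};

let $words := ('fù-bèi shòu dí','fùdi','fùgǎo','fūzi','fùtòng','fùxiè',
               'fù-mu')
for $x in $words
(: Compare the keys by codepoint, independent of the default collation. :)
order by local:pinyin-sort-key($x)
  collation "http://www.w3.org/2005/xpath-functions/collation/codepoint"
return $x
```

With these keys (fu1zi, fu4be4isho4udi2, fu4di, fu4ga3o, fu4mu, fu4to4ng, fu4xie4) the seven sample words come out in the order Marc lists as correct. To follow the advice above, the key would be computed once at document load time and stored alongside the text, rather than recomputed in every query.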