29 Nov 22:29
Re: RSS and diacritics
Hi Bob

Bob Rasmussen wrote:
> On Thu, 29 Nov 2007, Thomas Dowling wrote:
>
>> The more adept browsers out there figured this out quite a while ago. If the font they're using doesn't have a glyph for the character requested, they pull the correct glyph from a font that does have it. Awkwardly, there's a less adept browser that fails to do this, that has about 80% market share...
>>
>> CSS2 requires that browsers work their way down the list of specified fonts to find the right glyph, not just find a matching font name. IIRC, Gecko-based browsers and Opera go beyond that to find any system font with the right glyph.
>
> As an aside, that is precisely the approach taken by Anzio, our terminal emulation package, and Print Wizard, our printing utility. These programs also take many steps to handle combining diacritics well, including raising the "above" diacritics where necessary to avoid collision with the base character.
>
> My perception of the most common issues in regards to library systems displaying (and printing) diacritics and non-Latin characters:
>
> 1) Very few fonts have the combining double tilde and combining double ligature marks, used mostly with transliterated Russian.
>

Try the SIL fonts. I have had those diacritics displaying correctly with Charis SIL and Doulos SIL. Hopefully Gentium Book, when it's finally released, will also support these diacritics. It also depends on the font rendering technology in use: either the latest Uniscribe, or Graphite, on Windows. My gut reaction, though, is that the core limitation is not the fonts or the font rendering system but the web pages generated by the vendors. Well-structured content that follows web internationalization and accessibility best practice would be a breeze to tweak to get all languages to display fine.

> 2) Software does not correctly combine combining diacritics.

This is simply poor software internationalization.
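To make points 1 and 2 concrete: the double diacritics used in transliterated Russian have no precomposed equivalents, so normalization cannot fold them away and the font and renderer have to position them. A minimal Python sketch using only the standard unicodedata module; the helper name describe and the sample strings are illustrative assumptions, not taken from any library system:

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name, combining class) for each character.

    Hypothetical helper, for illustration only.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


# "ts" joined by U+0361, the double ligature mark used in Russian romanization
ts = "t\u0361s"
# "e" followed by a combining acute, which does have a precomposed form
e_acute = "e\u0301"

# NFC composes wherever a precomposed character exists.
ts_nfc = unicodedata.normalize("NFC", ts)            # stays three code points
e_acute_nfc = unicodedata.normalize("NFC", e_acute)  # becomes one code point

for label, text in (("ts with double mark", ts_nfc), ("e acute", e_acute_nfc)):
    print(label, "has", len(text), "code point(s)")
    for row in describe(text):
        print("   ", *row)
```

The nonzero combining class on U+0361 marks it as a combining character; because the sequence survives NFC unchanged, an application cannot sidestep the problem by normalizing to precomposed forms.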
On the right operating system there is no excuse for diacritics not displaying properly. If the default rendering of the operating system supports it, there is no excuse for an application that is well internationalised to not support it. Personally, I think vendors are let off too lightly. Generally, they say they support Unicode, but never spell out what parts they support and what parts they don't. From the perspective of my workplace, all our web interfaces should support our state government's web standards. I doubt there is a single vendor solution in use in our state that does meet those standards.

> 3) Fonts are inconsistent in the way they specify the X-location of combining diacritics.

A font should use the mark and mkmk features in the GPOS table to indicate the placement of a diacritic relative to a specific base character or relative to another diacritic. But currently few do. And my current complaint about the Vista core fonts is that they consistently position combining diacritics at a different height from the diacritics of precomposed glyphs, which makes for ugly text when using a mix of precomposed and composed forms, as may be necessary in some languages.

> 4) Library software I have worked with does not give the browsers information about the language contained in a particular section of text. Thus the browser can not take advantage of the user's language-specific font preferences. This is especially a problem in rendering Han characters, which could be part of a Japanese, Korean, Simplified Chinese, or Traditional Chinese title, for instance. With IE, this seems to force the user to use one super-font, which inevitably has shortcomings.
>

Yes, and in this scenario different browsers will have different responses. Richard Ishida (W3C) put together a test of HTML CJK data that wasn't language tagged.
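For comparison, the language tagging that point 4 asks for can be sketched as follows: each title is wrapped in an element whose lang attribute carries a language tag. This is a minimal Python sketch, assuming the record already carries a suitable BCP 47 tag; the function tag_language and the sample data are hypothetical, not part of any vendor system:

```python
from html import escape


def tag_language(text, lang):
    """Wrap text in a span whose lang attribute carries a BCP 47 tag.

    Hypothetical helper: assumes the caller already knows the language.
    """
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


# The same Han character, tagged for four different languages, so the
# browser can apply the user's font preference for each one.
samples = [("直", "ja"), ("直", "ko"), ("直", "zh-Hans"), ("直", "zh-Hant")]
for text, lang in samples:
    print(tag_language(text, lang))
```

Without a tag like this, the browser has to guess which font to use.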
Some browsers will default to displaying CJK data with a Japanese font, others will use a Simplified Chinese font, and in at least one case an older version of Opera defaulted to a Korean font.

> Finally, Andrew Cunningham mentioned Font Linking. According to MS's documentation, this should make it possible to define a large virtual font by linking together multiple fonts, without physically combining the files. So theoretically I could create a font with the missing ligature marks (see 1 above), and link it to Arial Unicode, for instance. However, I have never succeeded in this in regards to IE. Has anyone succeeded in doing this?

Not quite that simple. To support the missing ligature marks, you'd be better off with a whole new OpenType font. To properly handle combining diacritics, especially the double diacritics, you need to treat the Latin script as a complex script, which on Windows means dealing with Uniscribe. And a lot of the font linking smarts Microsoft uses in its applications are script dependent and built into Uniscribe. Often this is a fallback.

If you are on Win XP SP2 or Vista, download the Charis SIL font at http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=CharisSIL_download (it's released under the OFL, so it can be redistributed or modified under that license).

Then have a look at http://www.openroad.net.au/test/sample.html

Andrew

-- 
Andrew Cunningham
Research and Development Coordinator (Vicnet)
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000
Australia

Email: andrewc@vicnet.net.au
Alt. email: lang.support@gmail.com

Ph: +613-8664-7430
Fax: +613-9639-2175
Mob: 0421-450-816

http://www.slv.vic.gov.au/
http://www.vicnet.net.au/
http://www.openroad.net.au/
http://www.mylanguage.gov.au/
http://home.vicnet.net.au/~andrewc/