Dochula Pass, Bhutan

Multiple scripts in XMetal’s tags-on view.

I received a query from someone asking:

I try to edit lao and thai text with XMetal 5.0, but nothing is displayed but squares. In fact, Unicode characters seems to be correctly saved in the XML file and displayed in Firefox (for example), but i can’t get a correct display in XMetal. Is it a font problem ?

There are two places this needs to be addressed:

  1. in the plain text view
  2. in the tags-on view

For the plain text view, it is a question of setting a font that shows Lao and Thai (or whatever other language/script you need) in Tools>Options>Plain Text View>Font. You can only set one font at a time, so a wide-ranging Unicode font such as Arial Unicode MS or Code2000 may be useful for Windows users.

For the tags-on view (which is the view I use most of the time) you need to edit the CSS file that sets the editor’s styling for the DOCTYPE you are working with. This may be in one of a number of places. The one I edit is C:\Program Files\Blast Radius\XMetaL 4.6\Author\Display\xhtml1-transitional.css.

I added the following to mine. I chose fonts I have on my PC and set font sizes relative to the size I set for my body element. You should, of course, choose your own fonts and sizes.

[lang="am"] { font-family: "Code2000", serif; font-size: 120%; }
[lang="ar"] { font-family: "Traditional Arabic", sans-serif; font-size: 200%; }
[lang="bn"] { font-family: SolaimanLipi, sans-serif; font-size: 200%; }
[lang="dz"] { font-family: "Tibetan Machine Uni", serif; font-size: 140%; }
[lang="he"] { font-family: "Arial Unicode MS", sans-serif; font-size: 120%; }
[lang="hi"] { font-family: Mangal, sans-serif; font-size: 120%; }
[lang="kk"] { font-family: "Arial Unicode MS", sans-serif; }
[lang="iu"] { font-family: Pigiarniq, Uqammaq, sans-serif; font-size: 120%; }
[lang="ko"] { font-family: Batang, sans-serif; font-size: 120%; }
[lang="ne"] { font-family: Mangal, sans-serif; font-size: 120%; }
[lang="pa"] { font-family: Raavi, sans-serif; font-size: 120%; }
[lang="te"] { font-family: Gautami, sans-serif; font-size: 140%; }
[lang="my"] { font-family: Myanmar1, sans-serif; font-size: 200%; }
[lang="th"] { font-family: "Cordia New", sans-serif; font-size: 200%; }
[lang="ur"] { font-family: "Nafees Nastaleeq", serif; font-size: 130%; }
[lang="ve"] { font-family: "Arial Unicode MS", sans-serif; }
[lang="zh-Hans"] { font-family: "Simsun", sans-serif; font-size: 140%; }
[lang="zh-Hant"] { font-family: "Mingliu", sans-serif; font-size: 140%; }

Note that I would have preferred to write :lang(am) { font-family… } etc., but XMetal 4.6 seems to require you to specify the attribute value as shown above. (You also have to specify class selectors as [class="myclass"] {…} rather than .myclass {…}.)
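
The original question was about Lao and Thai, and the list above has a rule for Thai but none for Lao. A Lao rule would follow the same attribute-selector pattern. This is only a sketch: the font names (DokChampa, Saysettha OT) and the size are assumptions, so substitute whatever Lao font is actually installed on your machine.

```css
/* Hypothetical rule for Lao text. The font names and the size are
   assumptions; replace them with a Lao font you have installed. */
[lang="lo"] { font-family: "DokChampa", "Saysettha OT", sans-serif; font-size: 140%; }
```

As with the other rules, this only takes effect on elements that carry lang="lo" in the document being edited.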

I see from a recent Bugzilla report and some cursory testing that a (very) long-standing bug in Mozilla related to complex scripts has now been fixed.

Complex scripts include many non-Latin scripts that use combining characters or ligatures, or that apply shaping to adjacent characters, as Arabic script does.

It used to be that, when you highlighted text in a complex script, as you extended the edges of the highlighted area you would break apart combining characters from their base character, split ligatures and disrupt the joining behaviour of Arabic script characters.

The good news is that this no longer happens – it was fixed by the new text frame code. The bad news is that the highlighting still happens character by character, rather than at grapheme boundaries – which can make it tricky to know whether you got the combining characters or not.

UPDATE: I hear from Kevin Brosnan that the following will be fixed in Firefox 3. Hurrah! And thank you Mozilla team.

What doesn’t appear to be fixed is the behaviour of Asian scripts when the CSS text-align:justify is applied. 🙁

I raised a bug report about this. I was amazed, after hearing about this from Indians and Pakistanis too, that there didn’t seem to be a bug report already. Come on users, don’t leave this up to the W3C!

Basically, the issue is that if you apply text-align: justify to some text in an Indian or Tibetan script, the combining characters all get rendered alongside their base characters, i.e. you go from this (showing, respectively, Tibetan, Devanagari (Hindi and Nepali), Punjabi, Telugu and Thai text):

Picture of text with no alignment.

to this:

Picture of text with justify alignment.

Strangely, the effect doesn’t seem to apply to the Thai text, nor to other text with combining characters that I’ve tried.

That’s a pretty big bug for people in the affected region because it effectively means that text-align:justify can’t be used.
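
One way to live with this in the meantime would be to keep justification for most text and switch it off for the affected languages. The following is an untested sketch, not a confirmed fix. It assumes the content is marked up with lang attributes, and the language tags listed (bo for Tibetan, plus dz, hi, ne, pa and te) are an assumption based on the scripts named above.

```css
/* Justify paragraphs by default. */
p { text-align: justify; }

/* Assumed set of affected languages: fall back to left alignment
   so that combining characters are not pushed away from their bases. */
p:lang(bo), p:lang(dz), p:lang(hi), p:lang(ne), p:lang(pa), p:lang(te) {
  text-align: left;
}
```

The :lang() selector is used here because it also matches paragraphs that inherit their language from an ancestor element, which the attribute selector would miss.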

>> Use it!

Picture of the page in action.

This tool allows you to see what is assigned to event.keyCode and event.charCode in the DOM after the events keydown, keypress, and keyup are detected by the browser. Use it across different browsers with different keyboard mappings to see how things differ.

It’s a bit esoteric, but it may be of interest to someone. I wanted to play with this a bit to help me understand the background to the DOM Level 3 Events Specification.