Picture of the page in action.

>> Use it

In 1992 the Chinese government recognised the Fraser alphabet as the official script for the Lisu language and has encouraged its use since then. There are 630,000 Lisu people in China, mainly in the regions of Nujiang, Diqing, Lijiang, Dehong, Baoshan, Kunming and Chuxiong in Yunnan Province. Another 350,000 Lisu live in Myanmar, Thailand and India. Other user communities are mostly Christians from the Dulong, the Nu and the Bai nationalities in China.

About the tool: Pickers allow you to quickly create phrases in a script by clicking on Unicode characters arranged in a way that aids their identification. Pickers are likely to be most useful if you don’t know a script well enough to use the native keyboard. The arrangement of characters also makes a picker much more usable than a regular character map utility.

Latest changes: This picker is new. The default view was modified from an original proposal by Benjamin Lee, and is likely to be more useful to people who are somewhat familiar with the alphabet and characters of Lisu. Characters are arranged to simplify entry, with consonants on the left, vowels to the right of the consonants, and tone marks to the right of the vowels.

There is also a keyboard view. Many of the character positions are based on keyboard layouts I have seen. Those keyboards, however, tended either to use ASCII characters for punctuation where the Unicode Standard recommends other characters (in particular, MODIFIER LETTER LOW MACRON and MODIFIER LETTER APOSTROPHE), or to omit some punctuation characters mentioned in the Unicode Standard. The current version of this keyboard therefore adds some extra characters.

The layout is adequate, given that pickers assume the availability of a QWERTY keyboard. However, if a real standardised keyboard layout is to be made, it should involve some further changes. For example, people who want to use syntax characters such as comma, period, semicolon, single quote, etc. while writing text in Lisu will need direct access to those characters. They are missing from this layout.
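To make the point about the two recommended punctuation characters concrete, here is a minimal Python sketch that looks them up by their Unicode names and swaps them in for ASCII stand-ins. The choice of `_` and `'` as the stand-ins, and the helper name `use_recommended_punctuation`, are assumptions for illustration only; the keyboards mentioned above may have used different ASCII characters.

```python
import unicodedata

# Assumed ASCII stand-ins mapped to the characters named in the text.
# The left-hand keys are a guess, not taken from any real Lisu keyboard.
ASCII_TO_RECOMMENDED = {
    "_": unicodedata.lookup("MODIFIER LETTER LOW MACRON"),
    "'": unicodedata.lookup("MODIFIER LETTER APOSTROPHE"),
}


def use_recommended_punctuation(text: str) -> str:
    """Replace the assumed ASCII stand-ins with the named modifier letters."""
    return text.translate(str.maketrans(ASCII_TO_RECOMMENDED))


# Show the code point and name of each replacement character.
for ch in ASCII_TO_RECOMMENDED.values():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

A blanket replacement like this would also change any apostrophe or underscore that really was meant as ASCII, so in practice it would need to be limited to Lisu text.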

Characters in the Unicode Bengali block.

If you’re interested, I just did a major overhaul of my script notes on Bengali in Unicode. There’s a new section about which characters to use when there are multiple options (e.g. RRA vs. DDA+nukta), and the page provides information about more characters from the Bengali block in Unicode (including those used in Bengali’s amazingly complicated currency notation prior to 1957).
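As a small illustration of why the RRA vs. DDA+nukta choice matters, the following Python sketch compares the two encodings using the standard `unicodedata` module. It only shows that the two spellings differ as raw strings but compare equal after Unicode normalisation; it does not say which form the notes recommend. The helper name `same_after_nfc` is made up for this example.

```python
import unicodedata

RRA = "\u09DC"               # BENGALI LETTER RRA (single code point)
DDA_NUKTA = "\u09A1\u09BC"   # BENGALI LETTER DDA + BENGALI SIGN NUKTA


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after applying Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw strings differ, so a naive search or comparison treats them as distinct.
print(RRA == DDA_NUKTA)
# After normalisation the two spellings are treated as equivalent.
print(same_after_nfc(RRA, DDA_NUKTA))
```

Normalising text before searching or comparing is one way to avoid treating the two spellings as different words.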

In addition, this has all been squeezed into the latest look and feel for script notes pages.

The new page is at a new location. There is a redirect on the old page.

Hope it’s useful.

>> Read it


Picture of the page in action.

About the tools: Pickers allow you to quickly create phrases in a script by clicking on Unicode characters arranged in a way that aids their identification. Pickers are likely to be most useful if you don’t know a script well enough to use the native keyboard. The arrangement of characters also makes a picker much more usable than a regular character map utility.

Latest changes: The Urdu and Tamil pickers have been upgraded to version 10. This provides new views of the data, but also involved a thorough overhaul and redesign of the pickers. Transliteration functions have also been added for the Tamil picker.

In addition, the Urdu notes page was updated and a new Tamil notes page was created. Database entries were also updated or, in the case of Tamil, created to support the notes pages. These notes pages are the first to use a new look and feel, based on the analyse-string tool I produced earlier this year. This adds information about each character from the Unicode descriptions data to that from my own database.

>> Read the notes

Today I put the finishing touches to my first draft of notes about the long-lost ishidic script, and uploaded them. See what you think.

Here’s a small section of the sample text shown at the bottom of the post. Click on it to see the whole transcript.

Part of a sample of text written in ishidic script.

>> Read it!

Picture of the page in action.

I finally got to the point, after many long early morning hours, where I felt I could remove the ‘Draft’ from the heading of my Myanmar (Burmese) script notes.

This page is the result of my explorations into how the Myanmar script is used for the Burmese language in the context of the Unicode Myanmar block. It takes into account the significant changes introduced in Unicode version 5.1 in April of this year.

Btw, if you have JavaScript running you can get a list of characters in the examples by mousing over them. If you don’t have JS, you can link to the same information.

There’s also a PDF version, if you don’t want to install the (free) fonts pointed to for the examples.

Here is a summary of the script:

Myanmar is a tonal language and is syllable-based. The script is an abugida, i.e. consonants carry an inherent vowel sound that is overridden using vowel signs.

Spaces are used to separate phrases, rather than words. Words can be separated with ZWSP to allow for easy wrapping of text.
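Here is a minimal Python sketch of the ZWSP idea: words inside a phrase are joined with U+200B ZERO WIDTH SPACE, which is invisible but gives a renderer a place to break the line. The two sample words and the helper names are chosen purely for illustration.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible, but a permitted line-break point


def join_words(words: list[str]) -> str:
    """Join the words of one phrase with ZWSP instead of a visible space."""
    return ZWSP.join(words)


def split_words(phrase: str) -> list[str]:
    """Recover the individual words from a ZWSP-separated phrase."""
    return phrase.split(ZWSP)


# Two illustrative words, written with escapes so the code points are visible.
words = ["\u1019\u103c\u1014\u103a\u1019\u102c", "\u1005\u102c"]
phrase = join_words(words)

print(len(phrase))          # counts the invisible separator too
print(split_words(phrase))  # the original word list
```

Because the separator is a real character in the string, code that counts or compares characters needs to take it into account.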

Words are composed of syllables. These start with a consonant or initial vowel. An initial consonant may be followed by a medial consonant, which adds the sound j or w. After the vowel, a syllable may end with a nasalisation of the vowel or an unreleased glottal stop, though these final sounds can be represented by various different consonant symbols.

At the end of a syllable a final consonant usually has an ‘asat’ sign above it, to show that there is no inherent vowel.

In multisyllabic words derived from an Indian language such as Pali, where two consonants occur internally with no intervening vowel, the consonants tend to be stacked vertically, and the asat sign is not used.
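In the Unicode encoding (from version 5.1 onwards) the visible asat is U+103A MYANMAR SIGN ASAT, while stacking is requested by placing U+1039 MYANMAR SIGN VIRAMA between the two consonants. The Python sketch below reports which of the two signs occurs in a word. The two sample words and the helper name `asat_and_virama_in` are illustrative choices, not taken from the notes page.

```python
import unicodedata

ASAT = "\u103A"    # MYANMAR SIGN ASAT: visible mark over a final consonant
VIRAMA = "\u1039"  # MYANMAR SIGN VIRAMA: invisible, stacks the next consonant


def asat_and_virama_in(word: str) -> list[str]:
    """Return the Unicode names of any asat or virama signs found in the word."""
    return [unicodedata.name(ch) for ch in word if ch in (ASAT, VIRAMA)]


# Illustrative words, given as escapes so the code points are explicit.
final_consonant_word = "\u101C\u1019\u103A\u1038"   # final consonant with asat
stacked_word = "\u1000\u1019\u1039\u1018\u102C"     # two consonants stacked

print(asat_and_virama_in(final_consonant_word))
print(asat_and_virama_in(stacked_word))
```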

Text runs from left to right.

There is a set of Myanmar numerals, which are used just like Latin digits.
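Because the Myanmar digits are ordinary decimal digits in Unicode (U+1040 to U+1049), software can treat them much like Latin digits. The Python sketch below parses a Myanmar-digit string and converts a number back again. The helper name `to_myanmar_digits` is made up for this example, and it assumes non-negative integers.

```python
import unicodedata

MYANMAR_DIGIT_ZERO = 0x1040  # MYANMAR DIGIT ZERO; the digits run to U+1049


def to_myanmar_digits(n: int) -> str:
    """Render a non-negative integer using Myanmar digits."""
    return "".join(chr(MYANMAR_DIGIT_ZERO + int(d)) for d in str(n))


myanmar_number = "\u1041\u1042\u1043"  # Myanmar digits one, two, three

# Python's int() accepts any Unicode decimal digits, not only ASCII ones.
print(int(myanmar_number))
# Each digit character also carries its numeric value in the Unicode data.
print([unicodedata.digit(ch) for ch in myanmar_number])
print(to_myanmar_digits(2008))
```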

So, what next? I’m quite keen to get to Mongolian. That looks really complicated. But I’ve been telling myself for a while that I ought to look at Malayalam or Tamil, so I think I’ll try Malayalam.

New article

In-progress draft of notes that list the symbols used to represent Bengali, describe their use, and relate them to appropriate characters for representation in Unicode. There is an index of shapes you can use to look up Bengali glyphs and track them down to their constituent Unicode codepoints.