Character pickers are especially useful for people who don't know a script well, as characters are displayed in ways that aid identification. See the notes for details. If you prefer, you can still use version 14.
Click on characters below to create text in the box, then copy & paste to your content.
To properly display the text you will need to choose a font that is loaded on your system or device, or use the web font downloaded with the page (Khmer OS Battambang). The font list indicates which fonts are standard for Mac (Snow Leopard/Lion) and Windows 7, as well as which are the web fonts. Note that the web fonts aren't guaranteed to work on every system/device. When one fails, it is typically because the font relies on rendering algorithms provided by the operating system. See more information about standard OS fonts in Mac and Windows.
You can also add codepoints and escapes via the "Add codepoint" field (hit return to add to the output field). You can also paste text into the output field to get information about it. Use the yellow boxes to set preferences or search. Regular expressions are allowed when searching – for example, to find characters with the word KA in their name, enter \bka\b, or the short form :ka:.
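As a rough illustration of what adding codepoints amounts to, the sketch below turns a list of hexadecimal code points into text. The function name codepoints_to_text and the accepted notations (bare hex or a U+ prefix, separated by spaces) are assumptions made for this example; the picker's own field may accept other forms.

```python
def codepoints_to_text(field: str) -> str:
    """Convert space-separated hex code points to a string.

    Assumed input forms (hypothetical, for illustration): '1780' or 'U+1780'.
    """
    chars = []
    for token in field.split():
        hex_part = token.upper()
        if hex_part.startswith("U+"):
            hex_part = hex_part[2:]  # drop the optional U+ prefix
        chars.append(chr(int(hex_part, 16)))
    return "".join(chars)


# KHMER LETTER BA, KHMER SIGN COENG, KHMER LETTER RO
text = codepoints_to_text("U+1794 17D2 179A")
print(text)
```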
When working on an iPad or similar device, you should turn off more controls/Autofocus. This prevents the keyboard popping up after you input every character. You may also need to select a character twice to add it to the output field.
About the chart
The chart includes all the characters in the Unicode Khmer and Khmer Symbols blocks (in the default panel).
All text is output in Unicode normalisation form NFC by default. You can change to NFD or no normalisation by clicking on the buttons in the yellow area. Note that normalisation only takes place when you click on a character: text pasted into the box won't be normalised until you click on another character above, or click on a button in the yellow area. (Note: normalisation is turned off for Han characters in this application.)
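The difference between the three settings can be sketched with Python's standard unicodedata module. This is only an illustration, not the picker's own code; the function name normalise_output is hypothetical, and a Latin example is used because its composed and decomposed forms are easy to see.

```python
import unicodedata


def normalise_output(text, form="NFC"):
    """Return text in the requested normalisation form.

    form is 'NFC', 'NFD', or None for no normalisation
    (mirroring the three choices described above).
    """
    if form is None:
        return text
    return unicodedata.normalize(form, text)


decomposed = "e\u0301"  # e + COMBINING ACUTE ACCENT
print(len(normalise_output(decomposed)))         # composed into one code point
print(len(normalise_output(decomposed, "NFD")))  # stays as two code points
print(len(normalise_output(decomposed, None)))   # left untouched
```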
The following options are available by clicking on the vertical grey bar to the left of the selection area.
Default Clicking on this turns off the other features described in this section. The default table is likely to be most useful to people who are somewhat familiar with the alphabet and characters of Khmer. Characters are arranged based on the order of input, to speed up picking.
Simple consonants are to the left in mostly alphabetic order. To their right are combining characters that follow the initial consonant, then subscript consonants, then vowels and other symbols. Independent vowels appear at the top, then combining vowel signs, then other combining marks. At the far right are digits and the currency symbol, and various other symbols and punctuation. Clicking on the subscript characters produces a coeng sign followed by a consonant.
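In terms of code points, a subscript consonant is the coeng sign (U+17D2) followed by the ordinary consonant. The sketch below shows the sequence such a click would add; the function name subscript is hypothetical, and the range check assumes the consonants run from U+1780 to U+17A2.

```python
COENG = "\u17D2"  # KHMER SIGN COENG


def subscript(consonant: str) -> str:
    """Return the coeng + consonant sequence for a subscript consonant."""
    # Assumption for this sketch: Khmer consonants occupy U+1780..U+17A2.
    if not ("\u1780" <= consonant <= "\u17A2"):
        raise ValueError("expected a single Khmer consonant")
    return COENG + consonant


# BA followed by subscript RO gives the cluster at the start of the
# example word used later in these notes.
cluster = "\u1794" + subscript("\u179A")
print(cluster, [f"U+{ord(c):04X}" for c in cluster])
```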
Open the expanding link for obsolete and other less commonly used characters.
Hints This changes the behaviour of the table view so that, when you mouse over a character, characters that are similar in appearance, and so easily confused, are automatically highlighted. This can be particularly useful for people who are not familiar with the script, either to avoid picking the wrong one of two similar characters or to find the right character when several look alike.
Shape lookup This adds a row of orange pictures that represent basic shapes associated with the Khmer characters. When you click on a picture, characters that incorporate that shape are highlighted. This is particularly helpful for those who don't know the script at all and want to pick characters based on their shape, or for those times when you just can't find the character you want and need a hint.
Transcriptions There are three transcription panels available in this picker: Latin, Huffman, and Gilbert. These panels allow you to follow a transcription to generate some Khmer text. Where there are multiple possible choices, these choices are presented in a small pop-up box; click on the choice you want to insert it into the output area.
The Latin panel provides additional characters you may need while typing in a Latin transcription from the keyboard.
The Huffman panel allows you to generate Khmer text from a transcription as used by Huffman in Cambodian System of Writing.
The Gilbert panel allows you to generate Khmer text from a transcription as used by Gilbert and Hang in Cambodian for Beginners.
A hyphen in a selection list for either the Huffman or the Gilbert panel indicates that the sound is produced without a Khmer character, i.e. by the inherent vowel.
In a small number of cases, you will need to click twice on the components that make up the sound (e.g. when bantoc is used on the following consonant). These cases are indicated by a red plus sign between two clickable shapes (one of which may be just a hyphen). Click on the item to the left of the plus sign, then add a consonant, then click on the item to the right of the plus sign. In several cases the item to the left is a hyphen (representing the inherent vowel), in which case just add another consonant followed by the item to the right.
The toIPA button produces an output that is intended to approximately reflect actual pronunciation. It uses the rules in Franklin Huffman's Cambodian System of Writing. However, it needs some assistance from the user. This is because Khmer doesn't use spaces between words, and it is often ambiguous as to whether a consonant represents a syllable-final sound or a syllable in its own right. It also needs help to identify unstressed syllables. I don't have the means to do automatic word segmentation, so you will need to provide this information.
After the first syllable on the line, put an ordinary space before each consonant or independent vowel sign that begins a new syllable (not word). (Note that this may split consonant clusters. The Khmer text will look strange but still work.) You should also indicate unstressed syllables by following the syllable with a hyphen, rather than a space. For many bisyllabic words, this means putting a hyphen after the first of the two syllables. For example, converting ប្រកាន់និទៀន to ប្រ-កាន់ និ-ទៀន will produce the transcription prɑkannitiən. Note that, if you don't know Khmer well enough to know when a syllable is unstressed, you can still get an approximation to the pronunciation using only spaces. For instance, the previous example separated by spaces only will yield prɑːkanniʔtiən.
The condense button removes the spaces from the highlighted range (or the whole output area, if nothing is highlighted).
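The space and hyphen markup, and the effect of condensing, can be sketched as follows. This is not the picker's code: the function names parse_syllables and condense, and the start/end parameters standing in for the highlighted range, are assumptions for illustration. No transcription rules are implemented here, only the segmentation.

```python
def parse_syllables(marked):
    """Split marked-up text into (syllable, stressed) pairs.

    A syllable followed by a hyphen is treated as unstressed; one followed
    by a space, or by the end of the line, is treated as stressed.
    """
    result = []
    current = ""
    for ch in marked:
        if ch in "- ":
            if current:
                result.append((current, ch == " "))
            current = ""
        else:
            current += ch
    if current:
        result.append((current, True))
    return result


def condense(text, start=None, end=None):
    """Remove ordinary spaces from text[start:end], or from the whole
    text when no range is given."""
    if start is None or end is None or start == end:
        return text.replace(" ", "")
    return text[:start] + text[start:end].replace(" ", "") + text[end:]


for syllable, stressed in parse_syllables("ប្រ-កាន់ និ-ទៀន"):
    print(syllable, "stressed" if stressed else "unstressed")
```

As described above, only spaces are removed when condensing; in this sketch any hyphens are left in place.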
Although the transcription is based on rules by Franklin Huffman in Cambodian System of Writing, some symbols are changed to be more recognizable to those familiar with IPA. While the transcription rules are quite detailed, and Khmer is largely regular, there are a few exceptions, particularly in words from Sanskrit or Pali, or ambiguities, for example in a few independent vowel signs, that cause problems for the transcription. The transcription is non-reversible. I created it to help me quickly reproduce (simple) phonetic alternatives for examples in my notes on Khmer.
Notes on other controls
Controls at the bottom of the page allow you to modify fonts used, the font size, and the height of the output box.
Searching by character name or codepoint. If you are searching for a particular character and know (at least part of) the name or the codepoint, type that in the search box and hit return. All characters with matching text in the name or codepoint number will be highlighted. The highlighting is only removed when you click on the X next to the search input field. You can also use regular expression syntax to improve your search results. For example, to find the letter 'ha', but not 'gha' etc, you can use \bha\b (or the shortcut, :ha:).
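A search of this kind can be approximated with Python's re and unicodedata modules. The sketch below is an illustration under stated assumptions rather than the picker's implementation: the function names expand_shortcut and search are hypothetical, the block ranges are those of the Unicode Khmer (U+1780..U+17FF) and Khmer Symbols (U+19E0..U+19FF) blocks, and the character names come from whatever Unicode version your Python ships with.

```python
import re
import unicodedata

# Unicode Khmer and Khmer Symbols blocks.
KHMER_BLOCKS = [(0x1780, 0x17FF), (0x19E0, 0x19FF)]


def expand_shortcut(pattern):
    """Rewrite the :word: short form as \\bword\\b."""
    return re.sub(r":(\w+):", r"\\b\1\\b", pattern)


def search(query):
    """Return characters whose name or hex code point matches the query."""
    regex = re.compile(expand_shortcut(query), re.IGNORECASE)
    hits = []
    for first, last in KHMER_BLOCKS:
        for cp in range(first, last + 1):
            name = unicodedata.name(chr(cp), "")
            if not name:
                continue  # unassigned code point
            if regex.search(name) or regex.search(f"{cp:04X}"):
                hits.append(chr(cp))
    return hits


for ch in search(":ha:"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The word boundaries are what keep KHA, CHA and similar names out of the results when searching for HA.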