Khmer script notes [Draft]

These notes are still in development. I am using them to explore the Khmer script.

This page sets out to list the symbols used to represent Khmer text, describe their use, and relate them to appropriate characters for representation in Unicode. Along the way I also describe the basic phonology associated with the graphical symbols.

In some cases there is some discussion about which Unicode characters are most appropriate, and it was to address these questions that I originally embarked on this.

You need a font for Khmer. See the sidebar for links to the fonts I used. Alternatively you can view the PDF version of this document.

Brief script introduction

The script is an abugida, ie. like most Brahmi-influenced scripts, each consonant carries with it an inherent vowel. The sound following a consonant can be modified by attaching vowel signs to the consonant when writing.

Direction of text is horizontal, left to right. However, glyphs constituting a single syllable can appear on all sides of the initial character.

A key feature of Khmer is that it has a large number of vowel sounds but only a few vowel signs, and a large number of consonant signs for only a small number of consonant sounds. This led to a system where there are generally two consonant signs for a given sound, each belonging to one of two classes (or registers). So to determine the pronunciation of a vowel sign you start by seeing which class of consonant it follows. For example, using the two symbols for the sound [k], ក is [kɑː] neck, and គ is [kɔː] mute.

Diacritics are available to change the class of a consonant. These are particularly useful when a particular sound has only one character associated with it, such as មយស etc.

The basic written elements for Khmer are as follows.

Consonants: KHMER LETTER KA KHMER LETTER KHA KHMER LETTER KO KHMER LETTER KHO KHMER LETTER NGO KHMER LETTER CA KHMER LETTER CHA KHMER LETTER CO KHMER LETTER CHO KHMER LETTER NYO KHMER LETTER DA KHMER LETTER TTHA KHMER LETTER DO KHMER LETTER TTHO KHMER LETTER NNO KHMER LETTER TA KHMER LETTER THA KHMER LETTER TO KHMER LETTER THO KHMER LETTER NO KHMER LETTER BA KHMER LETTER PHA KHMER LETTER PO KHMER LETTER PHO KHMER LETTER MO KHMER LETTER YO KHMER LETTER LA KHMER LETTER LO KHMER LETTER RO KHMER LETTER SA KHMER LETTER HA KHMER LETTER VO KHMER LETTER QA
Subscript forms, composed in Unicode by U+17D2 KHMER SIGN COENG plus a consonant character: KHMER LETTER KA KHMER LETTER KHA KHMER LETTER KO KHMER LETTER KHO KHMER LETTER NGO KHMER LETTER CA KHMER LETTER CHA KHMER LETTER CO KHMER LETTER CHO KHMER LETTER NYO KHMER LETTER DA KHMER LETTER TTHA KHMER LETTER DO KHMER LETTER TTHO KHMER LETTER NNO KHMER LETTER TA KHMER LETTER THA KHMER LETTER TO KHMER LETTER THO KHMER LETTER NO KHMER LETTER BA KHMER LETTER PHA KHMER LETTER PO KHMER LETTER PHO KHMER LETTER MO KHMER LETTER YO KHMER LETTER LA KHMER LETTER LO KHMER LETTER RO KHMER LETTER SA KHMER LETTER HA KHMER LETTER VO KHMER LETTER QA
Vowels: KHMER VOWEL SIGN AA KHMER VOWEL SIGN I KHMER VOWEL SIGN II KHMER VOWEL SIGN Y KHMER VOWEL SIGN YY KHMER VOWEL SIGN U KHMER VOWEL SIGN UU KHMER VOWEL SIGN UA KHMER VOWEL SIGN E KHMER VOWEL SIGN AE KHMER VOWEL SIGN AI KHMER VOWEL SIGN OO KHMER VOWEL SIGN AU KHMER VOWEL SIGN OE KHMER VOWEL SIGN YA KHMER VOWEL SIGN IE KHMER SIGN NIKAHIT KHMER SIGN REAHMUK
Some combinations of the above that are regarded as additional letters of the alphabet. KHMER VOWEL SIGN AA KHMER VOWEL SIGN U
Independent vowels: KHMER INDEPENDENT VOWEL QI KHMER INDEPENDENT VOWEL QII KHMER INDEPENDENT VOWEL QU KHMER INDEPENDENT VOWEL QUU KHMER INDEPENDENT VOWEL QUUV KHMER INDEPENDENT VOWEL QE KHMER INDEPENDENT VOWEL QAI KHMER INDEPENDENT VOWEL QOO TYPE ONE KHMER INDEPENDENT VOWEL QOO TYPE TWO KHMER INDEPENDENT VOWEL QAU KHMER INDEPENDENT VOWEL RY KHMER INDEPENDENT VOWEL RYY KHMER INDEPENDENT VOWEL LY KHMER INDEPENDENT VOWEL LYY
Some independent vowels have subscript forms: KHMER INDEPENDENT VOWEL QU KHMER INDEPENDENT VOWEL QE KHMER INDEPENDENT VOWEL RY KHMER INDEPENDENT VOWEL RYY
Combining marks: KHMER SIGN MUUSIKATOAN KHMER SIGN TRIISAP KHMER SIGN BANTOC KHMER SIGN ROBAT KHMER SIGN TOANDAKHIAT KHMER SIGN KAKABAT KHMER SIGN AHSDA KHMER SIGN SAMYOK SANNYA
Punctuation: KHMER SIGN KHAN KHMER SIGN BARIYOOSAN KHMER SIGN CAMNUC PII KUUH
Digits: KHMER DIGIT ZERO KHMER DIGIT ONE KHMER DIGIT TWO KHMER DIGIT THREE KHMER DIGIT FOUR KHMER DIGIT FIVE KHMER DIGIT SIX KHMER DIGIT SEVEN KHMER DIGIT EIGHT KHMER DIGIT NINE
Other signs and symbols: KHMER CURRENCY SYMBOL RIEL KHMER SIGN LEK TOO KHMER SIGN PHNAEK MUAN KHMER SIGN KOOMUUT
There are also a few more rarely used characters: KHMER LETTER SHA KHMER LETTER SSO KHMER SIGN AVAKRAHASANYA KHMER SIGN ATTHACAN KHMER SIGN VIRIAM
And a number of deprecated characters: KHMER INDEPENDENT VOWEL QAA KHMER INDEPENDENT VOWEL QAQ KHMER VOWEL INHERENT AQ KHMER VOWEL INHERENT AA KHMER SIGN BATHAMASAT KHMER INDEPENDENT VOWEL QUK KHMER SIGN BEYYAL

In addition, there is the coeng generator, which has no visual form in Cambodian, and sets of divination lore and lunar date symbols which are not described here (but are available from the picker).

There are two distinct styles of font in Modern Khmer: slanted អក្សរច្រៀង (with an upright variant) and round អក្សរឈរ. The round style includes more ligated forms. The upright style is used here. Style examples: slanted upright អក្សរ ខ្មែ, round អក្សរ ខ្មែ.

Script usage notes

Text boundaries

The syllable is fundamental in Cambodian.

Many native Cambodian words are monosyllabic. These start with one or more consonants or an independent vowel (or a vowel sign attached to ʔɑː, which is a combination of both). Short vowels in stressed syllables are always followed by a consonant. Long vowels may not be. There are many monosyllabic words that begin with consonant clusters, and some monosyllabic words that end with clusters, although only one consonant is pronounced in syllable final position.

There are also many bisyllabic words. In many cases the first syllable in a bisyllabic word is unstressed, and the vowel is usually rendered in colloquial speech as a schwa. Some bisyllabic words are compounds, however, and this may not apply.

Polysyllabic words are usually of Sanskrit, Pali or French origin. These words tend to alternate stress across their syllables, but may not.

Vowels

Several vowel characters are composed of separate parts visually, eg.  ើ [aw/əː]. The descendants of the anusvara and the visarga, called niʔkəhət និគ្គហិត and reə̆hmuk រះមុខ respectively, are also regarded as vowels in Khmer, even though their vowel sounds still end with [ŋ] and [h] respectively. Two combinations of these characters and other vowel sign characters are regarded as vowels in the alphabet but not encoded separately in Unicode (though they are named sequences), ie. អាំ [am/oə̆m] and អុំ [om/um].

Other diacritics also produce vowel sounds after or before the consonants they are attached to.

As mentioned above, an initial indicator of pronunciation is the class of the syllable initial consonant. Additional factors include whether this is an unstressed vowel, vowel harmony, and whether any of the special diacritics have been used to change the sound. For an in-depth treatment of pronunciation see Huffman in the sources section.

Inherent vowels. Khmer has two inherent vowels, [ɑː] and [ɔː]. The class of the consonant will initially dictate which sound is appropriate, eg. [kɑː] vs. [kɔː].

Inherent vowels are not pronounced after syllable final consonants.

Vowel signs. As mentioned above, in most cases, vowel signs attached to a consonant are pronounced differently, depending on the register of the consonant letter, eg. កា [kaː] vs. គា [kiə].
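The dependence of a vowel sign's sound on the register of its consonant can be sketched as a small lookup. This is only an illustration: the names `A_CLASS`, `O_CLASS`, `VOWEL_SOUNDS` and `vowel_sound` are invented for the example, the table covers just a handful of signs using the values given in the vowel sign entries of these notes, and it ignores stress, vowel harmony and diacritics.

```python
# Consonants with inherent vowel [ɑː] (first register), by code point.
A_CLASS = {chr(cp) for cp in (
    0x1780, 0x1781, 0x1785, 0x1786, 0x178A, 0x178B, 0x178E, 0x178F,
    0x1790, 0x1794, 0x1795, 0x179F, 0x17A0, 0x17A1, 0x17A2)}

# Consonants with inherent vowel [ɔː] (second register), by code point.
O_CLASS = {chr(cp) for cp in (
    0x1782, 0x1783, 0x1784, 0x1787, 0x1788, 0x1789, 0x178C, 0x178D,
    0x1791, 0x1792, 0x1793, 0x1796, 0x1797, 0x1798, 0x1799, 0x179A,
    0x179B, 0x179C)}

# (sound after an [ɑː] class consonant, sound after an [ɔː] class consonant)
VOWEL_SOUNDS = {
    "\u17B6": ("aː", "iə"),   # KHMER VOWEL SIGN AA
    "\u17B7": ("e", "i"),     # KHMER VOWEL SIGN I
    "\u17B8": ("əj", "iː"),   # KHMER VOWEL SIGN II
    "\u17BB": ("o", "u"),     # KHMER VOWEL SIGN U
    "\u17BC": ("ou", "uː"),   # KHMER VOWEL SIGN UU
    "\u17BD": ("uə", "uə"),   # KHMER VOWEL SIGN UA (same in both classes)
    "\u17C1": ("ei", "eː"),   # KHMER VOWEL SIGN E
}


def vowel_sound(consonant, sign):
    """Return the basic sound of a vowel sign after the given consonant."""
    after_a, after_o = VOWEL_SOUNDS[sign]
    if consonant in A_CLASS:
        return after_a
    if consonant in O_CLASS:
        return after_o
    raise ValueError("not a consonant covered by this sketch")


print(vowel_sound("\u1780", "\u17B6"))  # KA + AA
print(vowel_sound("\u1782", "\u17B6"))  # KO + AA
```

The two calls at the end correspond to the pair កា and គា mentioned above.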

Independent vowels. There are two ways of representing vowel sounds that are not preceded by a consonant.

The most common way is to add a vowel-sign to the character អ, eg. អី [ʔəj].

There are also some independent vowel letters, but unlike most South Asian scripts, there are fewer independent vowels than vowel signs, and some do not have direct correspondences with a vowel sign, eg. ឪ corresponds phonetically to the vowel plus consonant combination  ូវ.

Whether an independent vowel sound is represented using an independent vowel sign or the glottal consonant plus vowel sign varies from word to word. In Cambodian orthography the two are not interchangeable. The independent vowel signs appear in relatively few words, but some of those words are quite common, eg. ឪពុក [ʔəwpuk] father, ឲ្យ [ʔaoj] to give and ឮ [lɨː] to hear.

Vowel harmony. In two-syllable words, where the second syllable begins with one of the following consonants, ងញណនមយឡលរវ, the vowel class of the second syllable is the same as that of the first, eg. in ប្រយ័ត្ន [prɑjat] to be careful, the second syllable starts with an [ɔː] class consonant but the class of the preceding syllable turns the vowel to an [ɑː] class sound. There are, however, exceptions to this rule.

Consonants

Final consonants. Not all Khmer consonants can appear in syllable-final position. The most common syllable-final consonants include កងញតនបមល. The pronunciation of the consonant in final position may differ from its normal pronunciation.

Subscript consonants. It is common to find clusters of consonants with no intervening vowel sounds. In Khmer, this is very common at the beginning of a word, but clusters also occur medially in multisyllable words, and occasionally at the end of a word.

When two consonants occur together without an intervening vowel, the second is rendered in subscript form, called ជើងអក្សរ [cəːŋʔɑʔsɑː] consonant feet (called in Unicode 'coeng'). Cambodians see these subscripts as distinct letter forms, but in Unicode they are produced by inserting 17D2: KHMER SIGN COENG before the consonant that will become a subscript.
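As a minimal sketch of how a cluster is encoded, the snippet below builds a base consonant followed by one or more subscripts by placing U+17D2 before each subscripted letter. The helper name `with_subscripts` is invented for this example.

```python
COENG = "\u17D2"  # KHMER SIGN COENG


def with_subscripts(base, *subscripts):
    """Return base followed by each subscript, each preceded by COENG."""
    return base + "".join(COENG + letter for letter in subscripts)


# KHA + COENG + MO, then vowel sign AE and final RO.
khmae = with_subscripts("\u1781", "\u1798") + "\u17C2" + "\u179A"

# NO with two subscripts, TA and RO, then vowel sign AI.
ntraj = with_subscripts("\u1793", "\u178F", "\u179A") + "\u17C3"

print([f"U+{ord(ch):04X}" for ch in khmae])
print([f"U+{ord(ch):04X}" for ch in ntraj])
```

Note that the subscript has no code point of its own: what the reader sees as one subscript letter is always two characters in memory.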

Where the two consonants involved in the cluster are in different classes or registers, the pronunciation of any following vowel is normally determined by the register of the subscript consonant. For the following exceptions, however, the vowel pronunciation is determined by the register of the first consonant: ងញនមយរលវ.
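The rule for two-consonant clusters can be expressed as a short function. This is a sketch of the general rule only (the text says "normally"), and the names `NON_GOVERNING` and `governing_consonant` are made up for the illustration.

```python
COENG = "\u17D2"  # KHMER SIGN COENG

# Subscripts that leave the register to the first consonant:
# NGO, NYO, NO, MO, YO, RO, LO, VO.
NON_GOVERNING = {chr(cp) for cp in (
    0x1784, 0x1789, 0x1793, 0x1798, 0x1799, 0x179A, 0x179B, 0x179C)}


def governing_consonant(cluster):
    """Return the consonant whose register decides the following vowel.

    `cluster` is expected to be exactly: base consonant, COENG, consonant.
    """
    if len(cluster) != 3 or cluster[1] != COENG:
        raise ValueError("expected base + COENG + consonant")
    base, subscript = cluster[0], cluster[2]
    return base if subscript in NON_GOVERNING else subscript


# KHA + subscript MO: MO is in the exception list, so KHA governs.
print(governing_consonant("\u1781\u17D2\u1798") == "\u1781")
# SA + subscript KO: KO is not in the list, so the subscript governs.
print(governing_consonant("\u179F\u17D2\u1782") == "\u1782")
```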

Some subscripts change the sound of the preceding consonant.

Subscript consonants that appear at the end of a word are silent, eg. ពេទ្យ [peit]; រដ្ឋ [roat].

In some multisyllabic words a medial cluster may contain a final consonant for the first syllable and the initial consonant of the next syllable, eg. កម្មករ [kɑmmɔkɑː] worker.

There are some clusters involving two subscripts. These are, with three exceptions, composed of a final nasal, followed by a stop and r, eg. កន្ត្រៃ [kɑntraj] scissors, កញ្ជ្រេង [kɑɲcreːŋ] fox. The three exceptions are the loan words, អង្គ្លេស [ʔɑŋkleːh] English, សងស្ក្រិត [sɑŋskret] Sanskrit, and សាស្ត្រាចារ្យ [sɑstraːcaː] teacher.

It is rare but possible to find subscripts used after independent vowels. One common word spelled this way is ឲ្យ [ʔaoj] to give.

It is also possible to find subscript forms of independent vowels. Four of these are named sequences in Unicode. (See the table above.)

Shaping

There is very little in the way of interaction between characters other than the subscript shapes used after the coeng generator.

Some small joining features occur in relation to  ា and similarly shaped vowels. Unicode provides the following list of common forms:

  1. ក + ា = កា
  2. ប +  ា = បា (avoids confusion with ហ)
  3. ប +  ៅ = បៅ
  4.  ្ស +  ា =  ្សា

Some reshaping of glyphs is needed to cope with stacking of characters. Compare for example the length of the final element in ង្យ and ង្ខ្យ.

Also, when museʔkətoə̯n or trəisaɓ appears with a vowel sign above the consonant, the ក្បៀសក្រោម [kɓiəhkraom] form is used. This looks exactly like sra-o អុ, eg. compare យ៉ាង and ម៉ឺន [məɨn] 10,000 or ញ៉ាំ [ɲam] to eat. (This behaviour can be modified using the zero-width non-joiner.)

Another common feature is that ញ drops the swash below the baseline when followed by a subscript consonant, eg. បញ្ឆោត [ɓɑɲcʰaot] to trick. Also, when it appears as a subscript under itself it uses a special full form subscript. Compare កញ្ញា [kɑɲɲaa] young lady and ប្រាជ្ញា [praːcɲaa] intelligence.

Ordering of characters

Components of an 'orthographic syllable'* should be composed in the following order:

* An orthographic syllable is slightly different from a morphological syllable, since an orthographic syllable may begin with the final consonant of the previous morphological syllable. Alternatively, an orthographic syllable may be just a final consonant or consonant cluster in a morphological syllable.

 

  1. base consonant or independent vowel
  2. rɔɓaːt
  3. museʔkətoə̯n or trəisaɓ (register shifters)
  4. subscript (consonant or independent vowel)
  5. vowel sign
  6. zero-width joiner or non-joiner
  7. any other mark

This fixed ordering makes it easier to search for and collate text.
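The ordering list above can be turned into a simple check. The sketch below assigns each component of a single orthographic syllable its position in the list and tests that the positions never decrease. The function names are invented for the example, and it is an assumption of this sketch that nikahit, reahmuk and every mark not otherwise listed fall under item 7.

```python
COENG = "\u17D2"                 # KHMER SIGN COENG
ROBAT = "\u17CC"                 # KHMER SIGN ROBAT
SHIFTERS = {"\u17C9", "\u17CA"}  # MUUSIKATOAN, TRIISAP
JOINERS = {"\u200D", "\u200C"}   # zero-width joiner, non-joiner


def is_base(ch):
    # Consonants and independent vowels, U+1780..U+17B3.
    return "\u1780" <= ch <= "\u17B3"


def is_vowel_sign(ch):
    # Dependent vowel signs, U+17B6..U+17C5.
    return "\u17B6" <= ch <= "\u17C5"


def component_ranks(syllable):
    """Return the list position (1-7) of each component, in memory order."""
    ranks = []
    i = 0
    while i < len(syllable):
        ch = syllable[i]
        if ch == COENG and i + 1 < len(syllable) and is_base(syllable[i + 1]):
            ranks.append(4)  # COENG plus a letter counts as one subscript
            i += 2
            continue
        if is_base(ch):
            ranks.append(1)
        elif ch == ROBAT:
            ranks.append(2)
        elif ch in SHIFTERS:
            ranks.append(3)
        elif is_vowel_sign(ch):
            ranks.append(5)
        elif ch in JOINERS:
            ranks.append(6)
        else:
            ranks.append(7)  # assumption: everything else is "any other mark"
        i += 1
    return ranks


def follows_order(syllable):
    """True if the syllable starts with a base and ranks never decrease."""
    ranks = component_ranks(syllable)
    return bool(ranks) and ranks[0] == 1 and ranks == sorted(ranks)


# NO + subscript TA + subscript RO + vowel sign AI
print(follows_order("\u1793\u17D2\u178F\u17D2\u179A\u17C3"))
# Same letters with the vowel sign typed before a subscript
print(follows_order("\u1793\u17C3\u17D2\u178F"))
```

The check works on one orthographic syllable at a time; splitting running text into syllables is a separate problem that this sketch does not attempt.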

As mentioned above, although all combining characters follow the base in memory, the visual order of syllable components may not follow a linear progression from left to right. In the following example the order in which the glyphs are pronounced is far left, far right, down, left, left: កន្ត្រៃ [kɑntraj] scissors. In ច្រៀង the spoken order of the separate visible parts, numbered left to right, is 3, 2, 1+4, 5. Some vowel signs span two or three sides of the base consonant or cluster.
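To see the order in memory, as opposed to the visual order, the characters of a word can be listed with Python's standard `unicodedata` module. The code point sequence below is the one assumed here for the word for scissors; the variable names are only for illustration.

```python
import unicodedata

# Assumed stored sequence: KA, NO, COENG, TA, COENG, RO, vowel sign AI.
scissors = "\u1780\u1793\u17D2\u178F\u17D2\u179A\u17C3"

names = [unicodedata.name(ch) for ch in scissors]
for ch, name in zip(scissors, names):
    print(f"U+{ord(ch):04X} {name}")
```

The vowel sign is stored last, after both subscripts, even though its glyph is drawn to the left of the cluster.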

Punctuation

Space. Khmer words are not separated by spaces, so the space, ឃ្លា [kliə], is regarded as punctuation, similar to the comma. Huffman lists the following uses:

  1. between clauses within a sentence
  2. between sentences in a cohesive group of sentences
  3. after preposed adverbial phrases, such as 'usually', 'today', 'in that town', etc.
  4. before and after proper names
  5. before and after numbers
  6. before and after the symbols and and the terms ។ល។ and ។ប។
  7. between coordinate words in lists

Huffman gives the following example to show the use of the space:

ថ្ងៃនេះ ខ្ញុំទៅផ្សារ ទិញក្រច អង្ករ ហើយនឹងអីវ៉ាន់ផ្សេង ៗ
[tŋajnih kɲomtɨwpsaː tiɲkrouc ʔɑŋkɑː haəjnɨŋʔəjʋanpseiŋ pseiŋ]
Today ( ) I'm going to the market ( ) to buy oranges ( ) rice ( ) and various things.

Other punctuation. Khmer uses other punctuation marks described in the punctuation section below. In addition to its own punctuation characters, Khmer uses Western punctuation marks, such as the question mark (eg. ហេត៊អ្វី? [haetʰ aʋəi]) and the exclamation mark (eg. កុំ! [kom]).

Hyphens are used to indicate when part of a word has been wrapped onto a new line.

Hyphens are also used between the parts of a person's name, typically between the family name (written first) and the following names, but often between all names for Chinese Cambodians, eg. ញ៉ុក-ថែម [ɲok tʰaem], លី-ធាម-តេង [liː tʰiəm teiŋ].

Consonants

1780: KHMER LETTER KA

Khmer consonant, kɑː

[k] with inherent vowel [ɑː] or followed by a vowel, eg. ក៏ [kɑː] also.

[k] before a subscript consonant.

[kʰ] when followed by a subscript or , eg. ក្មូយ [kʰmuəj].

[k] in final position, eg. លើក [ləːk] to lift.

្ក as a subscript consonant.

1781: KHMER LETTER KHA

Khmer consonant, kʰɑː

[kʰ] with inherent vowel [ɑː] or before a vowel, eg. ខាង [kʰaːŋ] direction.

[k] before a subscript consonant.

[k] in final position.

្ខ as a subscript consonant.

1782: KHMER LETTER KO

Khmer consonant, kɔː

[k] with inherent vowel [ɔː] or before a vowel, eg. គេ [kei] they.

[k] before a subscript consonant.

[k] in final position.

[kʰ] when followed by a subscript or , eg. គ្នា [kʰniə].

្គ as a subscript consonant.

1783: KHMER LETTER KHO

Khmer consonant, kʰɔː

[kʰ] with inherent vowel [ɔː] or before a vowel.

[k] before a subscript consonant, eg. ឃ្លាន [kliən] hungry.

[k] in final position. Not common.

្ឃ as a subscript consonant. Seldom used.

1784: KHMER LETTER NGO

Khmer consonant, ŋɔː

[ŋ] with inherent vowel [ɔː] or before a vowel, eg. ងងឹត [ŋoŋət] dark. (Note that this sound appears in syllable initial position in Khmer.)

Not used before a subscript consonant.

[ŋ] in final position.

្ង as a subscript consonant. As a subscript this consonant doesn't determine the pronunciation of the vowel sound; that is determined by the class of the non-subscript consonant.

1785: KHMER LETTER CA

Khmer consonant, cɑː

[c] with inherent vowel [ɑː] or before a vowel, eg. ចង់ [cɑŋ] to want.

[c] before a subscript consonant.

[ik] in final position. [c] according to Huffman.

្ច as a subscript consonant.

1786: KHMER LETTER CHA

Khmer consonant, cʰɑː

[cʰ] with inherent vowel [ɑː] or before a vowel, eg. ឆា [cʰaː] stir fry.

[c] before a subscript consonant.

Not found in final position.

្ឆ as a subscript consonant. Seldom used.

1787: KHMER LETTER CO

Khmer consonant, cɔː

[c] with inherent vowel [ɔː] or before a vowel, eg. ជា [ciə] is.

[c] before a subscript consonant.

[ik] in final position. [c] according to Huffman.

្ជ as a subscript consonant.

1788: KHMER LETTER CHO

Khmer consonant, cʰɔː

[cʰ] with inherent vowel [ɔː] or before a vowel, eg. ឈឺ [cʰɨː] sick.

[c] before a subscript consonant.

Not found in final position.

្ឈ as a subscript consonant. Seldom used.

1789: KHMER LETTER NYO

Khmer consonant, ɲɔː

[ɲ] with inherent vowel [ɔː] or before a vowel, eg. ញី [ɲiː] female.

Not found before a subscript consonant.

[ɲ] in final position.

្ញ as a subscript consonant. As a subscript this consonant doesn't determine the pronunciation of the vowel sound; that is determined by the class of the non-subscript consonant.

The bottom of this character is dropped when followed by a subscript consonant, eg. បញ្ឆោត [ɓɑɲcʰaot] to trick.

There are two shapes used for the subscript. When this character appears twice in a cluster, the full form is used. Elsewhere a reduced form is used. For example, compare កញ្ញា [kɑɲɲaa] young lady and ប្រាជ្ញា [praːcɲaa] intelligence.

178A: KHMER LETTER DA

Khmer consonant, ɗɑː

[ɗ] with inherent vowel [ɑː] or before a vowel, eg. ដុល្លារ [ɗɑllaː] dollar.

[ɗ] before a subscript consonant.

[t] in final position.

្ដ as a subscript consonant. This is the same shape as the subscript of tɑː.

178B: KHMER LETTER TTHA

Khmer consonant, tʰɑː

[tʰ] with inherent vowel [ɑː] or before a vowel.

[t] before a subscript consonant.

[t] in final position.

This consonant is only used in a few words of Pali or Sanskrit origin.

្ឋ as a subscript consonant. Seldom used: often a silent final subscript.

178C: KHMER LETTER DO

Khmer consonant, ɗɔː

[ɗ] with inherent vowel [ɔː] or before a vowel.

Not found before a subscript consonant.

[t] in final position.

This consonant is rare and is only used in a few words of Pali or Sanskrit origin.

្ឌ as a subscript consonant. Seldom used.

178D: KHMER LETTER TTHO

Khmer consonant, tʰɔː

[tʰ] with inherent vowel [ɔː] or before a vowel.

Not found before a subscript consonant.

[t] in final position.

This consonant is only used in a few words of Pali or Sanskrit origin.

្ឍ as a subscript consonant. Obsolete, or rarely, if ever, used.

178E: KHMER LETTER NNO

Khmer consonant, nɑː

[n] with inherent vowel [ɑː] or before a vowel, eg. ណាស់ [nah] very.

[n] before a subscript consonant.

[n] in final position.

្ណ as a subscript consonant. Seldom used: often a silent final subscript.

178F: KHMER LETTER TA

Khmer consonant, tɑː

[t] with inherent vowel [ɑː] or before a vowel, eg. ត្រី [trəj] fish.

[ɗ] at the beginning of two syllable words where the first syllable ends with final nasal, eg. តង្វាយ [ɗɔŋʋaaj] gift.

[t] before a subscript consonant.

[t] in final position.

្ត as a subscript consonant. This is the same shape as the subscript of ɗɑː.

The pronunciation when a subscript in medial position is unpredictable, sometimes [t] and sometimes [ɗ]. As a general rule, but not always, it is pronounced [t] when a subscript to nɔː, and [ɗ] when a subscript to nɑː, eg. បន្តុះ [ɓɑntoh] to criticise, and បណ្តុះ [ɓɑnɗoh] to grow.

1790: KHMER LETTER THA

Khmer consonant, tʰɑː

[tʰ] with inherent vowel [ɑː] or before a vowel, eg. ថា [tʰaː] that.

[t] before a subscript consonant.

[t] in final position.

្ថ as a subscript consonant.

1791: KHMER LETTER TO

Khmer consonant, tɔː

[t] with inherent vowel [ɔː] or before a vowel, eg. ទម្ងន់ [tɔmŋɔn] weight.

[t] before a subscript consonant.

[t] in final position.

្ទ as a subscript consonant.

1793: KHMER LETTER NO

Khmer consonant, nɔː

[n] with inherent vowel [ɔː] or before a vowel, eg. នឹង [nəŋ] future tense marker.

Not found before a subscript consonant.

[n] in final position.

្ន as a subscript consonant. As a subscript this consonant doesn't determine the pronunciation of the vowel sound; that is determined by the class of the non-subscript consonant.

In some words it follows a silent ហ, which makes the following vowel behave as if this were an [ɑː] class consonant, eg. ហ្ន is [nɑː].

1792: KHMER LETTER THO

Khmer consonant, tʰɔː

[tʰ] with inherent vowel [ɔː] or before a vowel, eg. ធំ [tʰom] big.

[t] before a subscript consonant.

[t] in final position.

្ធ as a subscript consonant. Seldom used: often a silent final subscript.

1794: KHMER LETTER BA

Khmer consonant, ɓɑː

[ɓ] with inherent vowel [ɑː] or before a vowel, eg. បន្ទប់ [ɓɑntuɓ] room.

[p] when followed by a subscript consonant, eg. ប្រាំ [pram].

[p] in final position, eg. ឈប់ [cʰup] to stop.

[p] when below a museʔkətoə̯n, eg. ប៉ា [paː] father.

[p] in some words just by convention, eg. បច្ច័យ [paccaj] money.

្ប as a subscript consonant.

A ligature បា is used when this character is followed by sra-aː, to avoid similarity with hɑː ហ, eg. បាយ [ɓaaj] cooked rice. The same applies when followed by sra-ao បោ and sra-aw បៅ.

1795: KHMER LETTER PHA

Khmer consonant, pʰɑː

[pʰ] with inherent vowel [ɑː], eg. ផ្សារ [psaː] market.

[p] before a subscript consonant.

[p] in final position. Not common.

្ផ as a subscript consonant. Obsolete, or rarely, if ever, used.

1796: KHMER LETTER PO

Khmer consonant, pɔː

[p] with inherent vowel [ɔː] or before a vowel, eg. ពី [piː] from.

[p] before a subscript consonant.

[p] in final position.

្ព as a subscript consonant.

1797: KHMER LETTER PHO

Khmer consonant, pʰɔː

[pʰ] with inherent vowel [ɔː] or before a vowel, eg. ភាសា [pʰiəsaː] language.

[p] before a subscript consonant.

[p] in final position.

្ភ as a subscript consonant.

1798: KHMER LETTER MO

Khmer consonant, mɔː

[m] with inherent vowel [ɔː] or before a vowel, eg. មុខ [muk] ahead, front.

[m] before a subscript consonant.

[m] in final position.

្ម as a subscript consonant. As a subscript this consonant doesn't determine the pronunciation of the vowel sound; that is determined by the class of the non-subscript consonant.

In some words it follows a silent ហ, which makes the following vowel behave as if this were an [ɑː] class consonant, eg. ហ្ម is [mɑː].

1799: KHMER LETTER YO

Khmer consonant, jɔː

[j] with inherent vowel [ɔː] or before a vowel, eg. យល់ [jul] to understand.

Not found before a subscript consonant.

[iː] in final position. [j] according to Huffman.

្យ as a subscript consonant. As a subscript this consonant doesn't determine the pronunciation of the vowel sound; that is determined by the class of the non-subscript consonant.

179A: KHMER LETTER RO

Khmer consonant, rɔː

[r] with inherent vowel [ɔː] or before a vowel, eg. រូប [ruːɓ] picture.

Not found before a subscript consonant.

Silent in final position, eg. ការ [kaa] work; ខ្មែរ [kmae] Cambodian. There is no final r sound in Cambodian, but the r symbol can sometimes disambiguate homonyms, eg. កា [kaa] to address (a letter) and ការ [kaa] to work; ពី [piː] from and ពីរ [piː] two.

្រ as a subscript consonant. As a subscript this consonant doesn't determine the pronunciation of the vowel sound; that is determined by the class of the non-subscript consonant.

179B: KHMER LETTER LO

Khmer consonant, lɔː

[l] with inherent vowel [ɔː] or before a vowel, eg. លុយ [luj] money.

[l] before a subscript consonant.

[l] in final position.

្ល as a subscript consonant. As a subscript this consonant doesn't determine the pronunciation of the vowel sound; that is determined by the class of the non-subscript consonant.

In some words it follows a silent ហ, which makes the following vowel behave as if this were an [ɑː] class consonant, eg. ហ្ល is [lɑː].

179C: KHMER LETTER VO

Khmer consonant, ʋɔː

[ʋ] with inherent vowel [ɔː] or before a vowel.

Not found before a subscript consonant.

[w] in final position.

្វ as a subscript consonant. As a subscript this consonant doesn't determine the pronunciation of the vowel sound; that is determined by the class of the non-subscript consonant.

In some words it follows a silent ហ, which makes the following vowel behave as if this were an [ɑː] class consonant, eg. ហ្វូង [ʋouŋ] crowd.

In combination with a preceding ហ it also gives [f], eg. ហ្វឹក [fek] train; កាហ្វេ [kaafei] coffee.

179F: KHMER LETTER SA

Khmer consonant, sɑː

[s] with inherent vowel [ɑː] or before a vowel.

[s] before a subscript consonant.

[h] in final position.

្ស as a subscript consonant.

17A0: KHMER LETTER HA

Khmer consonant, hɑː

[h] with inherent vowel [ɑː] or before a vowel.

Silent before a subscript consonant.

Not found in final position.

្ហ as a subscript consonant.

In combination with subscript វ it gives [f], eg. ហ្វឹក [fek] train; កាហ្វេ [kaafei] coffee.

In some words it combines with one of the following [ɔː] class subscripts, វមនល, to make the following vowel behave as if they were [ɑː] class consonants, eg. ហ្វូង [ʋouŋ], ហ្ម [mɑː], ហ្ន [nɑː], ហ្ល [lɑː].

17A1: KHMER LETTER LA

Khmer consonant, lɑː

[l] with inherent vowel [ɑː] or before a vowel.

Not found before a subscript consonant or in final position.

Subscript consonant not used in Cambodia (only in Khmer spoken in Thailand).

17A2: KHMER LETTER QA

Khmer consonant, ʔɑː

[ʔ] with inherent vowel [ɑː] or before a vowel.

[ʔ] before a subscript consonant.

Not found in final position.

្អ as a subscript consonant.

When used as a subscript at the beginning of a word this adds an extra syllable after the initial consonant, eg. ផ្អែម [pʰaʔaem]; ស្អាត [saʔaːtʰ].

Vowel signs

17B6: KHMER VOWEL SIGN AA

Khmer vowel, sra-aː ស្រៈអា

[aː] after an [ɑː] class consonant, eg. ណា [naː] which, where

[iə] after an [ɔː] class consonant, eg. ជា [ciə] to be

In combination with a following nikahit អាំ this is regarded as a letter of the Khmer alphabet. Sounds are:

[am] after an [ɑː] class consonant.

[oə̯m] after an [ɔː] class consonant.

In combination with a following nikahit and ŋɔː អាំង:

[aŋ] after an [ɑː] class consonant.

[ɛaŋ] after an [ɔː] class consonant.

17B7: KHMER VOWEL SIGN I

Khmer vowel, sra-e ស្រៈអិ

[e] after an [ɑː] class consonant, eg. ចិត្ដ [cet] heart

[i] after an [ɔː] class consonant, eg. វិញ [ʋiɲ] instead, again

In combination with a following reahmuk អិះ this is regarded as a letter of the Khmer alphabet. (It has the same sound, followed by h), eg. ជិះ [cih] to ride. This combination has the same sound as អេះ, but this is much less common.

17B8: KHMER VOWEL SIGN II

Khmer sra-əj ស្រៈអី

[əj] after an [ɑː] class consonant.

[iː] after an [ɔː] class consonant.

17B9: KHMER VOWEL SIGN Y

Khmer sra-ə ស្រៈអឹ

[ə] after an [ɑː] class consonant.

[ɨ] after an [ɔː] class consonant.

17BA: KHMER VOWEL SIGN YY

Khmer sra-əɨ ស្រៈអឺ

[əɨ] after an [ɑː] class consonant.

[ɨː] after an [ɔː] class consonant.

17BB: KHMER VOWEL SIGN U

Khmer vowel, sra-o ស្រៈអុ

[o] after an [ɑː] class consonant.

[u] after an [ɔː] class consonant.

In combination with a following reə̯hmuk អុះ this is regarded as a letter of the Khmer alphabet. (It has the same sound, followed by h), eg. ចុះ [coh] so?.

17BC: KHMER VOWEL SIGN UU

Khmer sra-ou ស្រៈអូ

[ou] after an [ɑː] class consonant.

[uː] after an [ɔː] class consonant.

17BD: KHMER VOWEL SIGN UA

Khmer sra-uə ស្រៈអួ

[uə] after any class of consonant.

17BE: KHMER VOWEL SIGN OE

Khmer sra-aə ស្រៈអើ

[aə] after an [ɑː] class consonant.

[əː] after an [ɔː] class consonant.

17BF: KHMER VOWEL SIGN YA

Khmer sra-ɨə ស្រៈអឿ

[ɨə] after any class of consonant.

17C0: KHMER VOWEL SIGN IE

Khmer sra-iə ស្រៈអៀ

[iə] after any class of consonant.

17C1: KHMER VOWEL SIGN E

Khmer vowel, sra-ei ស្រៈអេ

[ei] after an [ɑː] class consonant.

[eː] after an [ɔː] class consonant.

Combined with reə̯hmuk អេះ:

  • [eh] after an [ɑː] class consonant, eg. សេះ [seh] horse
  • [ih] after an [ɔː] class consonant, eg. នេះ [nih] this
  • This combination has the same sound as អិះ, but this is much more common.

17C2: KHMER VOWEL SIGN AE

Khmer sra-ae ស្រៈអែ

[ae] after an [ɑː] class consonant.

[ɛː] after an [ɔː] class consonant.

17C3: KHMER VOWEL SIGN AI

Khmer sra-aj ស្រៈអៃ

[aj] after an [ɑː] class consonant.

[ɨj] after an [ɔː] class consonant.

17C4: KHMER VOWEL SIGN OO

Khmer vowel, sra-ao ស្រៈអោ

[ao] after an [ɑː] class consonant.

[oː] after an [ɔː] class consonant.

Combined with reə̯hmuk អោះ:

  • [ɑh] after an [ɑː] class consonant, eg. នៅណោះ [nəɨnɔh] over there
  • [uəh] or [uh] after an [ɔː] class consonant, eg. ឈ្មោះ [cʰmuəh] name, នោះ [nuh] that

17C5: KHMER VOWEL SIGN AU

Khmer sra-aw ស្រៈអៅ

[aw] after an [ɑː] class consonant.

[ɨw] after an [ɔː] class consonant.

17C6: KHMER SIGN NIKAHIT

Khmer Vowel niʔkəhət និគ្គហិត

Although it can be equated with the anusvara in Sanskrit, this is usually regarded as a vowel sign or a part of a vowel sign in Khmer.

[ɑm] after an [ɑː] class consonant, eg. កំពុង [kɑmpuŋ] present tense marker.

[um] after an [ɔː] class consonant, eg. រំភើប [rumɓəːpʰ] excited

Combined with sra-o អុ:

  • [om] after an [ɑː] class consonant, eg. ធំ [tʰɑm] big
  • [um] after an [ɔː] class consonant, eg. ខ្លាឃ្មុំ [kʰlaːkʰmum] bear

Combined with sra-aː អា:

  • [am] after an [ɑː] class consonant, eg. ដាំ [ɗam] to plant
  • [oə̯m] after an [ɔː] class consonant, eg. នាំ [noə̯m] to lead

Combined with sra-aː and ŋɔː អាំង:

  • [aŋ] after an [ɑː] class consonant, eg. ម្ហូបបារាំង [mhouɓɓaːraŋ] French food
  • [eə̯ŋ] after an [ɔː] class consonant, eg. ទាំង [teə̯ŋ] including, both

In some words of Sanskrit origin, the niʔkəhət represents [aŋ] or [an], eg. សំស្ក្រិត [sɑŋskret] Sanskrit and សំយោគសញ្ញា [sɑnjoːksaɲɲaː] name of a diacritic.

17C7: KHMER SIGN REAHMUK

Khmer vowel, reə̆hmuk រះមុខ

Although it can be equated with the visarga in Sanskrit, this is regarded as a vowel sign or part of a vowel sign in Khmer.

[ah] with an [ɑː] class inherent vowel, eg. ខ្លះ [klɑh] some.

[eə̆h] with an [ɔː] class inherent vowel or an [ɔː] class sra-aː អា, eg. ផ្ទះ [pteə̆h] house, home.

[h] after the normal sounds of short vowels sra-e អិ, sra-ə អឹ, and sra-o អុ, eg. ជិះ [cih] to ride, កឹះ [kəh] to scratch, ពុះ [puh] to boil.

[ih] with sra-ei អេ as an [ɔː] class vowel, eg. នេះ [nih] this.

[eh] with sra-ei អេ or sra-ae អែ as an [ɑː] class vowel, eg. សេះ [seh] horse, កែះ [keh] wild goat.

[əh] with sra-aə អើ as an [ɑː] class vowel, eg. ចង្កើះ [cɑŋkəh] chopsticks.

[ɑh] with sra-ao អោ as an [ɑː] class vowel, eg. កោះ [kɑh] island.

[uə̆h] with sra-ao អោ as an [ɔː] class vowel, eg. គោះ [kuə̆h] strike.

Independent vowels

17A3: KHMER INDEPENDENT VOWEL QAQ

Khmer deprecated independent vowel

This should be considered an error in the encoding. Use of this character is strongly discouraged; 17A2: KHMER LETTER QA should be used instead.

Originally intended only for Pali/Sanskrit transliteration, but not actually a separate character in Khmer.

17b2   17B2: KHMER INDEPENDENT VOWEL QOO TYPE TWO

Khmer independent vowel, sra-ao ស្រៈឱ

[ao]

This is a variant of KHMER INDEPENDENT VOWEL QOO TYPE ONE. According to Unicode it is used in only two words, although one of them, ឲ្យ [ʔaoj] to give, is very common.
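
To see how such a word maps onto Unicode characters, the code points can be listed with Python's standard `unicodedata` module. This is only a sketch: the escape sequence below is my assumption about how the word for 'to give' is encoded (17B2 followed by a coeng and YO), and the helper name `describe` is hypothetical.

```python
import unicodedata

# Assumed encoding of the word for 'to give': QOO TYPE TWO + COENG + YO.
WORD = "\u17b2\u17d2\u1799"


def describe(text: str) -> list[str]:
    """Return one 'HEX NAME' string per code point in text."""
    return [f"{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for line in describe(WORD):
    print(line)
```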

17a4   17A4: KHMER INDEPENDENT VOWEL QAA

Khmer deprecated independent vowel

This should be considered an error in the encoding. Use of this character is discouraged; the sequence 17A2: KHMER LETTER QA + 17B6: KHMER VOWEL SIGN AA should be used instead.

Originally intended only for Pali/Sanskrit transliteration, but not actually a separate character in Khmer.
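
The replacements recommended for 17A3 and 17A4 can be applied mechanically. The following Python sketch shows one way to do so; the function name `replace_deprecated_vowels` is hypothetical, and the mapping contains only the two substitutions stated in these notes.

```python
# Replacements recommended in these notes for the two deprecated
# independent vowels. The function name is illustrative only.
DEPRECATED = {
    "\u17a3": "\u17a2",        # QAQ -> KHMER LETTER QA
    "\u17a4": "\u17a2\u17b6",  # QAA -> KHMER LETTER QA + KHMER VOWEL SIGN AA
}


def replace_deprecated_vowels(text: str) -> str:
    """Return text with U+17A3 and U+17A4 replaced by the preferred forms."""
    return text.translate(str.maketrans(DEPRECATED))
```

All other characters pass through unchanged, so the function can be run over text that mixes Khmer with other scripts.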

17a5   17A5: KHMER INDEPENDENT VOWEL QI

Khmer Independent vowel sra-ʔəʔ ស្រៈឥ

[ʔə], eg. ឥត [ʔət] not

[ʔɨ], eg. ឥត [ʔət] not

17a6   17A6: KHMER INDEPENDENT VOWEL QII

Khmer Independent vowel sra-ei ស្រៈឦ

[ei]

17a7   17A7: KHMER INDEPENDENT VOWEL QU

Khmer Independent vowel sra-ou ស្រៈឧ

[ou]

17aa   17AA: KHMER INDEPENDENT VOWEL QUUV

Khmer Independent vowel sra-ou ស្រៈឪ

[ou]

17ab   17AB: KHMER INDEPENDENT VOWEL RY

Khmer Independent vowel sra-r̥ ស្រៈឫ

[r̥]

17ac   17AC: KHMER INDEPENDENT VOWEL RYY

Khmer Independent vowel sra-r̥̄ ស្រៈឬ

[r̥̄]

17ad   17AD: KHMER INDEPENDENT VOWEL LY

Khmer Independent vowel sra-l̥ ស្រៈឭ

[l̥]

17ae   17AE: KHMER INDEPENDENT VOWEL LYY

Khmer Independent vowel sra-l̥̄ ស្រៈឮ

[l̥̄]

17af   17AF: KHMER INDEPENDENT VOWEL QE

Khmer Independent vowel sra-ae ស្រៈឯ

[ae]

17b0   17B0: KHMER INDEPENDENT VOWEL QAI

Khmer Independent vowel sra-aiy ស្រៈឰ

[aiy]

17b1   17B1: KHMER INDEPENDENT VOWEL QOO TYPE ONE

Khmer Independent vowel sra-ao ស្រៈឱ

[ao]

Combining marks

17cb   17CB: KHMER SIGN BANTOC

Khmer Mark, ɓɑntɑk បន្តក់

Always placed above the final consonant. It generally shortens the preceding vowel, affecting its sound in one of the following ways:

  • After an inherent vowel
    • [ɑ] after an [ɑː] class consonant, eg. ចប់ [cɑp] to finish (cf. ចប [cɑːp] hoe)
    • [u] after an [ɔː] class consonant and before a labial consonant, eg. លប់ [lup] bird trap (cf. លប [lɔːp] fish trap)
    • [uə̯] otherwise after an [ɔː] class consonant, eg. លក់ [luə̯k] to sell (cf. លក [lɔːk] to channel)
  • After sra-aː អា following an [ɔː] class consonant
    • [eə̯] before a velar consonant, eg. ពាក់ [peə̯k] to wear (cf. ពាក្យ [piək] word)
    • [oə̯] elsewhere, eg. មាន់ [moə̯n] chicken (cf. មាន [miən] to have)
  • Otherwise, shortens a long vowel, eg. ចាប់ [cap] to catch (cf. ចាប [caːp] sparrow).
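
The rules above can be read as a small decision procedure. The Python sketch below is one way to write it down, assuming that the second group of rules refers to sra-aː អា (as its examples suggest) and covering only the cases spelled out in the list. The labels "a"/"o" for the consonant classes, "aa" for sra-aː, and "labial"/"velar"/"other" for the final consonant are hypothetical.

```python
from typing import Optional


def bantoc_vowel(consonant_class: str, vowel: Optional[str], final: str) -> str:
    """Vowel sound before a final consonant that carries BANTOC.

    consonant_class: "a" for an [ɑː] class consonant, "o" for [ɔː] class.
    vowel: None for the inherent vowel, "aa" for sra-aː.
    final: "labial", "velar" or "other" (place of the final consonant).
    """
    if vowel is None:
        if consonant_class == "a":
            return "ɑ"
        return "u" if final == "labial" else "uə̯"
    if vowel == "aa":
        if consonant_class == "o":
            return "eə̯" if final == "velar" else "oə̯"
        return "a"  # long [aː] shortened
    # Other vowels are only described generally ("shortens a long vowel").
    raise ValueError("case not covered by the rules listed in these notes")
```

For instance, an [ɑː] class consonant with sra-aː and a labial final gives "a", matching the last example in the list.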

17cf   17CF: KHMER SIGN AHSDA

Khmer mark, leːk ʔahsɗaː លេខអស្ដា

Used over two consonants to indicate that they represent two specific words:

  • ក៏ [kɑː], an auxiliary meaning also, then, therefore
  • ដ៏ [ɗɑː], a pronoun meaning which; very

17cd   17CD: KHMER SIGN TOANDAKHIAT

Khmer mark, tɔnɗɔkʰiət ទណ្ឌឃាត

Used over a consonant, particularly in loan words, to silence it and any attached vowels or subscripts, eg. សាសន៍ [saːh] race, ethnicity, and សប្ដាហ៍ [sɑpɗaː] week; រេហ៍ពល [rɔpuə̆l] army.

17c9   17C9: KHMER SIGN MUUSIKATOAN

Khmer mark, museʔkətoə̯n មូសិកទន្ត or tmɨɲ kɑnɗao ធ្មេញកណ្ដរ

Changes the class of a consonant from [ɔː] to [ɑː], affecting the inherent vowel and any other vowel following the consonant, eg. ម៉ត់ចត់ [mɑtcɑt] careful, រ៉ាប់ [rap] to guarantee. It is used for the following consonants that don't have equivalents in the [ɑː] class: ងញមយរវ. It is usually written over the right-hand side of the consonant glyph. This is also especially useful for spelling foreign names. Eg. យ៉ាង [jaːŋ] kind (cf. យាង [yiəŋ] to go (royalty)).

Changes the sound of ɓɑː from [ɓ] to [p], eg. ប៉ះ [pah] to touch. This is the only way to write an [ɑː] class [p]. Eg. ប៉ាន [paːn] to cover (cf. បាន [ɓaːn] to have).

When this appears with a vowel sign above the consonant, the ក្បៀសក្រោម [kɓiəhkraom] form is used. This looks exactly like sra-o អុ, eg. ម៉ឺន [məɨn] 10,000; ញ៉ាំ [ɲam] to eat.

You can prevent this behaviour using a zero-width non-joiner between this character and the following one, eg. ញ៉‌ាំ.

tmɨɲ kɑnɗao means "rat's teeth".

17ca   17CA: KHMER SIGN TRIISAP

Khmer mark, trəisaɓ ត្រីសព្ទ

Changes the class of a consonant from [ɑː] to [ɔː], affecting the inherent vowel and also any other vowel following the consonant, eg. ក្រុមហ៊ុន [kromhun] company; ហ៊ាន [hiən] to dare (cf. ហាន [haːn] shop); អ៊ូ [ʔuː] dry dock (cf. អូ [ʔou] exclamation). This is especially useful for spelling foreign names.

When this appears with a vowel sign above the consonant, the ក្បៀសក្រោម [kɓiəhkraom] form is used. This looks exactly like sra-o អុ, eg. in ស៊ី [siː] to eat.

You can prevent this behaviour using a zero-width non-joiner between this character and the following one, eg. ស៊‌ី.
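
The same zero-width non-joiner technique is described for both MUUSIKATOAN (17C9) and TRIISAP (17CA). The Python sketch below inserts U+200C after either sign wherever another character follows and no ZWNJ is already present. The function name `keep_shifter_above` is hypothetical, and the code only changes the character sequence; whether the displayed form actually changes depends on the font and rendering engine.

```python
ZWNJ = "\u200c"         # ZERO WIDTH NON-JOINER
MUUSIKATOAN = "\u17c9"  # KHMER SIGN MUUSIKATOAN
TRIISAP = "\u17ca"      # KHMER SIGN TRIISAP


def keep_shifter_above(text: str) -> str:
    """Insert ZWNJ after each MUUSIKATOAN or TRIISAP that is followed by
    another character and does not already have a ZWNJ after it."""
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        if ch in (MUUSIKATOAN, TRIISAP):
            following = text[i + 1] if i + 1 < len(text) else ""
            if following and following != ZWNJ:
                out.append(ZWNJ)
    return "".join(out)
```

Running the function twice gives the same result as running it once, so it is safe to apply to text that already contains some of these ZWNJ characters.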

17cc   17CC: KHMER SIGN ROBAT

Khmer mark, rɔɓaːt របាទ

Not a very common mark. It silences final consonants, eg. បរិបូណ៌ [ɓɑriɓou] abundant.

Over a word-medial syllable-initial consonant it introduces the sound [rə] before the syllable, eg. ទុគ៌ត [tuːrəkuə̆t] destitute.

It can also convert the vowel sound of the previous consonant from [ɔː] to [ɔə] as well as silencing the consonant it appears over, eg. ពណ៌ [pɔə] colour.

17d0   17D0: KHMER SIGN SAMYOK SANNYA

Khmer mark, sanjoːksaɲɲaː សំយោគសញ្ញា

[a] over an [ɑː] class consonant, eg. ស័កិ្ត [sak] rank.

[oə̯] over an [ɔː] class consonant, in general, eg. ទ័ព [toə̯p]