Arabic character notes

This page lists characters in the Unicode Arabic blocks and provides information about them. See also the companion document, Urdu Script Notes, which gives an overview of how the Arabic script is used for Urdu.

To view this page as intended, you need an Arabic naskh font, and preferably a nastaliq font also. The Noto Naskh Arabic and Scheherazade fonts are downloaded with this page as webfonts. You can also select alternate fonts for the descriptive text from the control to the right.

If you click on any red example text, you will see at the bottom right of the page a list of the characters that make up the example.

To find a character by codepoint, type #char0000 at the end of the URL in the address bar, where 0000 is a four-figure, hex codepoint number, all in uppercase.
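The fragment format described above can be generated mechanically. The following is a minimal Python sketch; the helper name `char_fragment` is hypothetical and not part of this page. It assumes the character lies in the Basic Multilingual Plane, so that four hex digits are enough.

```python
def char_fragment(ch: str) -> str:
    """Build the '#charXXXX' fragment for one character (hypothetical helper)."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # Zero-padded to four digits, with uppercase hex letters.
    return f"#char{ord(ch):04X}"


print(char_fragment("\u0621"))  # ARABIC LETTER HAMZA -> #char0621
print(char_fragment("\u06BE"))  # ARABIC LETTER HEH DOACHASHMEE -> #char06BE
```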

Index

Click on a character to jump to its description. Many of the characters are grouped differently from the rest of the page, to help in lookup. See the table of contents for the page to jump to the start of a section that has been merged.
 

Based on ISO 8859-6 ء آ أ ؤ إ ئ ا ب ة ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ـ ف ق ك ل م ن ه و ى ي
Extended Arabic letters ٱ ٲ ٴ ٵ ݳ ݴ ٶ ٷ ٸ ٹ ٺ ٻ ݐ ݑ ݒ ݓ ݔ ݕ ݖ ࢡ ټ ٽ پ ٿ ڀ ځ ڂ ݗ ݘ ڃ ڄ څ ݮ ݯ ݲ ݼ چ ڇ ڿ ڈ ډ ڊ ڋ ڌ ڍ ڎ ڏ ڐ ݙ ݚ ۮ ڑ ڒ ړ ڔ ڕ ږ ڗ ݱ ژ ڙ ݛ ݫ ݬ ۯ ࢲ ښ ڛ ڜ ݜ ݭ ݾ ݰ ݽ ۺ ڝ ڞ ڟ ۻ ڠ ݝ ݞ ݟ ۼ ڡ ڢ ڣ ڤ ڥ ڦ ݠ ݡ ڧ ڨ ک ݢ ݣ ݤ ػ ؼ ڪ ګ ڬ ڭ ڮ ݿ گ ڰ ڱ ڲ ڳ ڴ ڵ ڶ ڷ ڸ ݪ ݥ ݦ ڹ ں ڻ ڼ ڽ ݧ ݨ ݩ ھ ۀ ۿ ہ ۂ ە ۃ ۄ ۅ ۆ ۇ ۈ ۉ ۊ ۋ ۏ ݸ ݹ ی ؽ ؾ ؿ ݵ ݶ ݷ ۍ ێ ې ۑ ؠ ے ۓ ݺ ݻ
Arabic letters for European and Central Asian languages ࢭ ࢮ ࢯ ࢰ ࢱ
Points from ISO 8859-6  ً   ٌ   ٍ   َ   ُ   ِ   ّ   ْ
Other combining marks   ٓ   ٔ   ٕ   ٖ   ٗ   ٘   ٙ   ٚ   ٛ   ٜ   ٝ   ٞ   ٟ  ٰ  ࣿ
Extended vowel signs for African languages  ࣴ   ࣵ   ࣶ   ࣷ   ࣸ   ࣹ   ࣺ   ࣻ   ࣼ   ࣽ
Extended Arabic letters for Rohingya
Extended vowel signs for Rohingya  ࣤ   ࣥ   ࣦ   ࣧ   ࣨ   ࣩ
Tone marks for Rohingya  ࣪   ࣫   ࣬   ࣮   ࣯
Signs for Sindhi ۽ ۾
Archaic letters ٮ ٯ
Punctuation ؉ ؊ ؛ ؞ ؟ ٪ ٫ ٬ ٭ ۔
Currency sign ؋
Subtending marks ؀ ؁ ؂ ؃ ؄
Radix symbols ؆ ؇
Letterlike symbol ؈
Poetic marks ؎ ؏
Honorifics  ؐ   ؑ   ؒ   ؓ   ؔ
Koranic annotation signs  ؗ   ؘ   ؙ   ؚ   ۖ   ۗ   ۘ   ۙ   ۚ   ۛ   ۜ   ۝   ۞   ۟   ۠   ۡ   ۢ   ۣ   ۤ   ۥ   ۦ   ۧ   ۨ   ۩   ۪   ۫   ۬   ۭ   ࣰ   ࣱ   ࣲ   ࣳ
Deprecated letter ٳ
Arabic-Indic digits ٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩
Eastern Arabic-Indic digits ۰ ۱ ۲ ۳ ۴ ۵ ۶ ۷ ۸ ۹

Based on ISO 8859-6

ء

U+0621 ARABIC LETTER HAMZA

Description in the Unicode standard:

→ (modifier letter right half ring - 02BE)

Arabic ʔ (glottal stop)

For some historical reason this is treated as an orthographic sign rather than as a letter of the alphabet. It sometimes stands alone, but usually appears with a 'carrier' letter - alef, waw, or yeh (أ إ ؤ ئ) for which separate precomposed characters are available in Unicode.

This codepoint is used for representing the standalone hamza only. On its own it has no joining behaviour.

Combined with base characters: When the hamza is above or below another character you should typically use U+0654 ARABIC HAMZA ABOVE ٔ with the appropriate base character, although there are a number of exceptions.

Some exceptions arise because the NFC normalization form converts the base character and combining hamza to a precomposed character. These instances include
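The NFC behaviour mentioned here can be checked directly. The following is a minimal Python sketch using the standard `unicodedata` module; the helper name `compose_with_hamza` and the choice of base characters are assumptions made for illustration, not a list taken from this page.

```python
import unicodedata

HAMZA_ABOVE = "\u0654"  # ARABIC HAMZA ABOVE


def compose_with_hamza(base: str):
    """Return the precomposed character NFC produces for base + U+0654, or None."""
    composed = unicodedata.normalize("NFC", base + HAMZA_ABOVE)
    return composed if len(composed) == 1 else None


# Bases chosen for illustration; the last two are not expected to compose.
for base in ["\u0627", "\u0648", "\u064A", "\u06C1", "\u06D2", "\u0649", "\u06CC"]:
    result = compose_with_hamza(base)
    target = f"U+{ord(result):04X}" if result else "no precomposed form"
    print(f"U+{ord(base):04X} + U+0654 -> {target}")
```

Where the function returns None, the base and the combining hamza stay as two separate code points after normalization.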

Other exceptions arise where the hamza is an integral part of the character itself (ie. an ijam). Examples of these characters include

Cutting and joining hamza in orthography: Classical Arabic distinguishes between 'cutting' and 'joining' hamza. 'Cutting' means always pronounced, 'joining' means frequently elided. The joining hamza is of little practical importance in modern Arabic pronounced without the old case endings.

In modern printed Arabic, the hamza is rarely shown when it occurs at the beginning of a word.

The following are simplified rules for use of (cutting) hamza:

The sign indicating a joining hamza is called a wasla (see U+0671 ARABIC LETTER ALEF WASLA ٱ).

Urdu vowel separator / calendar indicator, hamzā hamzaː

This is the character code for the standalone hamza.

The hamza is also used in conjunction with other characters in Urdu, for which there are precomposed characters that can be used. See U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE ؤ, U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE ئ, U+06D3 ARABIC LETTER YEH BARREE WITH HAMZA ABOVE ۓ, and U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE ۂ.

U+066B ARABIC DECIMAL SEPARATOR ٫ looks like a hamza, but isn't.

A standalone hamza is sometimes used at the end of words derived from Arabic, though it is usually omitted in modern Urdu publications, eg. ضیاء ziaː light, ذکاء zakaː intelligence.

Vowel junctions: The hamzā is used to indicate the boundaries between vowel sounds when there is no intervening consonant. Depending on the vowels concerned, it is used in a number of different ways, usually combined with other characters.

In some cases this standalone form is used, eg. انشاءاللہ ɪnʃallaː God willing.

See other ways in which vowel junctions are formed when the hamza is combined with other characters.

Calendar indicator: Gregorian dates are indicated by placing the sanah sign below the year digits, together with the word عیسوی iːsviː (Christian era). This word is usually abbreviated to a hamza, eg. ۲۰۰۴؁ء.

Refs: Abdali; Matthews; Delacy

Not used in Persian.

آ

U+0622 ARABIC LETTER ALEF WITH MADDA ABOVE

Description in the Unicode standard:

≡ 0627 0653
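The ≡ line gives the canonical decomposition of the character. As a minimal sketch, the same information can be read from Python's standard `unicodedata` module; the helper name `canonical_decomposition` is hypothetical.

```python
import unicodedata


def canonical_decomposition(ch: str) -> list:
    """Return the canonical decomposition as hex strings, or [] if there is none."""
    fields = unicodedata.decomposition(ch).split()
    # Compatibility mappings begin with a tag such as '<isolated>'; ignore those.
    if not fields or fields[0].startswith("<"):
        return []
    return fields


print(canonical_decomposition("\u0622"))  # ['0627', '0653']
print(canonical_decomposition("\u0627"))  # [] (plain alef does not decompose)
```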

Arabic ʔaː

The madda sign is still very often shown in print. It is used when either of the two following combinations of hamza and a vowel appear in a word:

  • ʔaʔ (hamza, short a, hamza) eg. آثار (ʔaːθaːr)

  • ʔaː (hamza, long a) eg. قرآن (qur'ʔaːn)

Normal pronunciation in both cases is ʔaː.

Joining forms: ـآ

Urdu consonant, alif madd əlɪf mədd

ɑː (used word initially), eg. آب ɑːb water. Unlike the short vowel diacritics, the diacritic madd is never omitted.

As an exception, it is used in non-initial position in the word for Koran, القرآن.

madd means increasing.

See also 0627 ARABIC LETTER ALEF

Refs: Matthews; Delacy

أ

U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE

Description in the Unicode standard:

≡ 0627 0654

Arabic consonant

ʔa, ʔu, ʔ

This character represents the hamza (ء).

It is equivalent to U+0627 ARABIC LETTER ALEF ا + U+0654 ARABIC HAMZA ABOVE ٔ, but since NFC produces this character it is best to use it rather than the decomposed sequence.

At the beginning of a word hamza is always written on an alef carrier, regardless of which vowel it takes. In this case, where the hamza appears above the alef, the vowel could be a or u. Examples: أحمد 'aḥmad, أريد 'urīd.

This character is also used to represent hamza in the middle or at the end of a word. Which of the possible alternative sequences (أ, ؤ or ئ) is used mid-word depends on the vowels preceding and following the hamza. The rules are complicated (and a common source of spelling errors among Arabs).

At the end of a word this character is only used after a short vowel. Examples: سأل sa'al, قرأ qara'.

See U+0621 ARABIC LETTER HAMZA ء for more information about hamza. See also U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW إ, U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE ؤ, and U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE ئ.

Joining forms: ـأ

Not used in Persian or Urdu.

ؤ

U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE

Description in the Unicode standard:

≡ 0648 0654

Arabic consonant

ʔu, ʔ

This character represents the hamza (ء) in the middle of a word.

It is equivalent to U+0648 ARABIC LETTER WAW و + U+0654 ARABIC HAMZA ABOVE ٔ, but since NFC produces this character it is best to use it rather than the decomposed sequence.

In the middle of a word the hamza is almost always written above a carrier letter. Which one depends on the vowels preceding and following the hamza, and the rules are complicated (and a common source of spelling errors among Arabs), eg. مؤمن mu'min (but cf. سأل sa'al, نائم nā'im).

See U+0621 ARABIC LETTER HAMZA ء for more information about hamza. See also U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE أ, U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW إ, and U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE ئ.

Joining forms: ـؤ

Urdu vowel separator+vowel

uː or o immediately after a preceding vowel (see below).

Vowel junctions: The hamzā is used to indicate the boundaries between vowel sounds when there is no intervening consonant. Depending on the vowels concerned, it is used in a number of different ways. It can also have two different shapes, one like the initial form of 'ain and the other more like an italic 's'.

When the second vowel is uː or o represented by و, the hamzā typically sits directly on top of the و, eg. آؤ ɑːo come; جاؤں ʤɑːũː I may go. Often the hamzā is omitted in this situation.

Many words have the vowel combination iːo, where hamzā is not typically used, eg. لڑکیوں کا laɽkiːõ kɑː of the girls.

See other ways in which vowel junctions are formed when dealing with other combinations of vowels.

Refs: Abdali; Matthews; Delacy

إ

U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW

Description in the Unicode standard:

≡ 0627 0655

Arabic consonant

ʔi

This character represents the hamza (ء).

It is equivalent to U+0627 ARABIC LETTER ALEF ا + U+0655 ARABIC HAMZA BELOW ٕ, but since NFC produces this character it is best to use it rather than the decomposed sequence.

At the beginning of a word hamza is always written on an alef carrier, regardless of which vowel it takes. When it takes an i vowel it is written below the alef. Example: إكرام 'ikrām.

The mid-word and word-final equivalent of this character is U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE ئ.

See U+0621 ARABIC LETTER HAMZA ء for more information about hamza. See also U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE أ, U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE ؤ, and U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE ئ.

Joining forms: ـإ

Not used in Persian or Urdu.

ئ

U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE

Description in the Unicode standard:

≡ 064A 0654

Arabic ʔɪ, ʔ

This character represents the hamza (ء) in the middle of a word.

It is equivalent to U+064A ARABIC LETTER YEH ي + U+0654 ARABIC HAMZA ABOVE ٔ, but since NFC produces this character it is best to use it rather than the decomposed sequence. When yeh is used as a mid-word carrier it loses its dots.

In the middle of a word the hamza is almost always written above a carrier letter. Which one depends on the vowels preceding and following the hamza, and the rules are complicated (and a common source of spelling errors among Arabs), eg. نائم nā'im (but cf. سأل sa'al, مؤمن mu'min).

See U+0621 ARABIC LETTER HAMZA ء for more information about hamza. See also U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE أ, U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW إ, and U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE ؤ.

Joining forms: ئئئ

Urdu vowel separator / vowel

ɪ or a when following a vowel, eg. کوئلہ koɪlɑː coal; لائن lɑːɪn queue; ہیئت hɛat astronomy. The hamza indicates that this vowel is pronounced separately from the preceding one.

iːɛ when used as izafat (see below).

Otherwise functions as a soundless vowel junction indicator ('hamza on its chair').

Vowel junctions: The hamza is used to indicate the boundaries between vowel sounds when there is no intervening consonant. Depending on the vowels concerned, it is used in a number of different ways. It can also have two different shapes, one like the initial form of 'ain and the other more like an italic 's'.

When the second vowel is iː or e represented by ی or ے, the hamzā 'sits on a chair' before the letter representing the second vowel, eg. کئی kaiː several; تیئیس teiːs twenty-three; کوئی koiː someone; گئے gae they went; گائے gɑːe they sang.

Many words, however, have vowel combinations iːe, where hamzā is not typically used, eg. چلیے ʧaliːe come on.

See other ways in which vowel junctions are formed when dealing with other combinations of vowels.

Izāfat: ɪzɑːfat is the name given to the short vowel ɛ used to describe a relationship between two words. It may be translated as 'of', as in 'the Lion of Punjab'.

This sound is mostly represented using zer, but in certain cases can be represented with a combining hamza. One such case occurs when the preceding word ends in ye ی: eg. ولئکامل valiː ɛ kɑːmɪl perfect saint.

There are other ways in which izafat can be formed.

Refs: Abdali; Matthews; Delacy

ا

U+0627 ARABIC LETTER ALEF

Arabic vowel lengthener, hamza carrier

aː, a, ʔa, -

Formally speaking, this letter has no sound of its own. Its main uses in Arabic orthography are:

It also has one or two minor functions such as in conjunction with tanwin (nunation) (see U+064B ARABIC FATHATAN ً ).

Certain parts of the Arabic verb end in a long u-vowel that is conventionally written with a following alef that has no effect on pronunciation, eg. كتبوا kætæbuː. The alef is omitted if a suffix is added, eg. كتبوها kætæbuː-haa.

Joining forms: ـا

Persian ʔ, ɔ, æ, -

Urdu vowel, alif alɪf

a/ɪ/u on its own in word initial position.

iː/e/ɛ word initial, combined with a following ye, ای

uː/o/ɔ word initial, combined with a following vāū, او

ɑː with madd آ, but see 0622 ARABIC LETTER ALEF WITH MADDA ABOVE for this.

ʊ/∅ sometimes as part of the Arabic definite article (see below).

ɑː elsewhere, unless part of the Arabic definite article (see below).

The alternative sounds possible in the initial combinations can be disambiguated, when necessary, by the use of combining marks. The combining marks are rarely used in normal text (with the exception of madd shown above). See a table of combining marks for vowels.

Arabic definite article: The pronunciation of ال (alif followed by lām) varies when it represents the Arabic definite article. This affects many words in Urdu that have come from Arabic, in particular names and adverbial expressions.

Often the alif is not pronounced after a short preceding word that ends in a vowel. If the preceding vowel was long, it is shortened in this process. Examples: بالکل bɪlkul (absolutely); فی الحال filhɑːl (at present).

Often the vowel is pronounced ʊ, eg. دارالحکومت dɑːrʊlhʊkuːmat (capital).

(The lam may also not be pronounced. See 0644 ARABIC LETTER LAM.)

Refs: Matthews; Delacy

ب

U+0628 ARABIC LETTER BEH

Arabic consonant

b

Joining forms: ببب

Persian b

Urdu consonant, be

b

Looks like: ببب ب

bʰe together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated b in Urdu, a distinct letter of the Urdu alphabet called bhe.

Looks like: بھبھبھ بھ

Refs: Matthews; Delacy

ة

U+0629 ARABIC LETTER TEH MARBUTA

Arabic consonant

usually no sound, sometimes t.

Used for historical reasons to write the feminine ending, æ – the dots are borrowed from teh (ت). Pronounced as t in specific grammatical contexts. Example: مدرسة mædræsæ.

This letter is only used in final position. If any suffix is added the ending is spelled with U+062A ARABIC LETTER TEH ت, eg. مدرستنا mædræsæt-naː.
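The spelling change on suffixation can be expressed as a simple string rule. The following is a minimal Python sketch; the function name `attach_suffix` is hypothetical, and it assumes unvowelled text in which the teh marbuta is the very last code point of the word.

```python
TEH_MARBUTA = "\u0629"  # ARABIC LETTER TEH MARBUTA
TEH = "\u062A"          # ARABIC LETTER TEH


def attach_suffix(word: str, suffix: str) -> str:
    """Join word and suffix, respelling a final teh marbuta as teh."""
    if suffix and word.endswith(TEH_MARBUTA):
        word = word[:-1] + TEH
    return word + suffix


madrasa = "\u0645\u062F\u0631\u0633\u0629"  # the word glossed as mædræsæ above
print(attach_suffix(madrasa, "\u0646\u0627"))  # spelled with teh before the suffix
```

Vowelled text would need extra handling, since a diacritic may follow the teh marbuta.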

In modern Arabic it is not uncommon to find the two dots omitted, particularly on masculine proper names that have the feminine ending, eg. طلبة tulbæ.

Joining forms: ـة

Persian h, -, ɛ, æ Arabic fem. t

Not used in Urdu.

ت

U+062A ARABIC LETTER TEH

Arabic consonant

t

Joining forms: تتت

Persian t

Urdu consonant, te

t

Looks like: تتت ت

tʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated t in Urdu, a distinct letter of the alphabet called the.

Looks like: تھتھتھ تھ.

Refs: Matthews; Delacy

ث

U+062B ARABIC LETTER THEH

Arabic consonant

θ

Joining forms: ثثث

Persian s

Urdu consonant, se se

s Only occurs in words of Arabic and Persian origin, and is much less common than س 0633 ARABIC LETTER SEEN, which is also pronounced s.

Looks like: ثثث ث

Refs: Matthews; Delacy

ج

U+062C ARABIC LETTER JEEM

Arabic consonant

ʒ

Joining forms: ججج

Persian ʤ

Urdu consonant, jīm ʤiːm

ʤ

Looks like: ججج ج

ʤʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated ʤ in Urdu, a distinct letter of the alphabet called jhe.

Looks like: جھجھجھ جھ.

Refs: Matthews; Delacy

ح

U+062D ARABIC LETTER HAH

Arabic consonant

ħ

Joining forms: ححح

Persian h, -

Urdu consonant, baṛī he baɽiː he

h

Looks like: ححح ح

Refs: Matthews; Delacy

خ

U+062E ARABIC LETTER KHAH

Arabic consonant

x

Joining forms: خخخ

Persian x

Urdu consonant, xe xe

x

Looks like: خخخ خ

Refs: Matthews; Delacy

د

U+062F ARABIC LETTER DAL

Arabic consonant

d

Joining forms: ـد

Persian d

Urdu consonant, dāl dɑːl

d

Looks like: ـد د

dʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated d in Urdu, a distinct letter of the alphabet called dhe.

Looks like: ـدھ دھ.

Refs: Matthews; Delacy

ذ

U+0630 ARABIC LETTER THAL

Arabic consonant

ð

Joining forms: ـذ

Persian z

Urdu z Called zɑːl.

Nastaliq forms: ـذ ذ

In Urdu, this letter only occurs in words of Arabic and Persian origin, and is much less common than 0632 ARABIC LETTER ZAIN ز, which is also pronounced z. It is not counted as a regular letter of the Urdu alphabet.

ر

U+0631 ARABIC LETTER REH

Arabic consonant

r

Joining forms: ـر

Persian r

Urdu consonant, re re

r (pronounced with a trill).

Looks like: ـر ر

Refs: Matthews; Delacy

ز

U+0632 ARABIC LETTER ZAIN

Arabic consonant

z

Joining forms: ـز

Persian z

Urdu consonant, ze ze

z

Looks like: ـز ز

س

U+0633 ARABIC LETTER SEEN

Arabic consonant

s

Joining forms: سسس

Persian s

Urdu consonant, sīn siːn

s

Looks like: سسس س

In Urdu nastaliq text this can have two somewhat different shapes. The main part of the shape may be a wavy line, a little like a 'w', or can sometimes be a single swash, especially when two sīn characters are written together. Use the same character for both visual forms. When one or other of the possible shapes is desired, this should be produced by the font.

Refs: Matthews; Delacy

ش

U+0634 ARABIC LETTER SHEEN

Arabic consonant

ʃ

Joining forms: ششش

Persian ʃ

Urdu consonant, šīn ʃiːn

ʃ

Looks like: ششش ش

In Urdu nastaliq text this can have two somewhat different shapes. The main part of the shape may be a wavy line, a little like a 'w', or can sometimes be a single swash, especially when two šīn characters are written together. Use the same character for both visual forms. When one or other of the possible shapes is desired, this should be produced by the font.

Refs: Matthews; Delacy

ص

U+0635 ARABIC LETTER SAD

Arabic consonant

Joining forms: صصص

Persian s

Urdu consonant, svād svɑːd

s Only used in words of Arabic origin.

Looks like: صصص ص

Refs: Matthews; Delacy

ض

U+0636 ARABIC LETTER DAD

Arabic consonant

Joining forms: ضضض

Persian z

Urdu consonant, zvād zvɑːd

z Only used in words of Arabic origin.

Looks like: ضضض ض

Refs: Matthews; Delacy

ط

U+0637 ARABIC LETTER TAH

Arabic consonant

Joining forms: ططط

Persian t

Urdu consonant, toe toe

t Only used in words of Arabic origin.

Looks like: ططط ط

Refs: Matthews; Delacy

ظ

U+0638 ARABIC LETTER ZAH

Arabic consonant

Joining forms: ظظظ

Persian z

Urdu consonant, zoe zoe

z Only used in words of Arabic origin.

Looks like: ظظظ ظ

Refs: Matthews; Delacy

ع

U+0639 ARABIC LETTER AIN

Description in the Unicode standard:

→ (latin small letter ezh reversed - 01B9)
→ (modifier letter left half ring - 02BF)

Arabic consonant

ʕ

Joining forms: ععع

Persian ʔ, - Preceding V → Vː

Urdu consonant, 'ain ain.

Not pronounced when preserved in Arabic words.

If it occurs at the beginning of a word, it can fulfill a similar role to alif, allowing words to begin with a vowel, but also allowing for alternative spellings for different words with the same pronunciation, eg. عرب arab (Arab) vs. ارب arab (necessity).

Note that a word-initial ɑː sound when the spelling begins with alif is written as alif with madd, eg. آج ɑːʤ (today). The same word-initial sound with 'ain is represented by 'ain followed by alif, eg. عادت ɑːdat (habit).

In non-word-initial positions an ain can cause a change in sound to preceding short vowels. This results in long vowels, but not always the long form typically associated with a given short form.

  • a short a becomes ɑː, eg. بعد bɑːd (after).

  • a short ɪ becomes e, eg. شعر ʃer (verse).

  • a short ʊ becomes o, eg. شعلہ ʃolɑː (flame).

ʔ occasionally between two vowels, although this is often lost in Urdu, eg. معاف mʊʔɑːf or mɑːf (forgiven); سعآدت səʔɑːdət or sɑːdət (fortunate).

Looks like: ععع ع

Refs: Matthews, pp.xix, xxix; Delacy, pp.89-91

غ

U+063A ARABIC LETTER GHAIN

Arabic consonant

ɣ

Joining forms: غغغ

Persian ɣ Between vowels q, ɢ, x

Urdu consonant, ghain ɣain

ɣ

Used in words that came into Urdu from Arabic and Persian.

Looks like: غغغ غ

Refs: Matthews; Delacy

ـ

U+0640 ARABIC TATWEEL

Description in the Unicode standard:

= kashida
• inserted to stretch characters
• also used with Syriac

ف

U+0641 ARABIC LETTER FEH

Arabic consonant

f

Joining forms: ففف

In Arabic material printed in North Africa this letter sometimes has one dot below (like 06A2 ڢ) (and qaf has only one dot above).

Persian f

Urdu consonant, fe fe

f

Looks like: ففف ف

Refs: Matthews; Delacy

ق

U+0642 ARABIC LETTER QAF

Arabic consonant

q

Joining forms: ققق

In Arabic material printed in North Africa this letter sometimes has only one dot above (like 06A7 ARABIC LETTER QAF WITH DOT ABOVE ڧ) (and feh has one dot below).

Persian q, ɢ

Urdu consonant, qāf qɑːf

q

Used in words that came into Urdu from Arabic and Persian.

Looks like: ققق ق

Refs: Matthews; Delacy

ك

U+0643 ARABIC LETTER KAF

Arabic consonant

k

Joining forms: ككك

Persian k

Not used in Urdu. See 06A9: ARABIC LETTER KEHEH.

ل

U+0644 ARABIC LETTER LAM

Arabic consonant

l

Joining forms: للل

Persian l

Urdu consonant, lām lɑːm

l

Not pronounced in some cases when part of the Arabic definite article (see below).

Looks like: للل ل

Combined with a following alif, lām is usually written as لا, eg. گلاس gilɑːs (glass). Sometimes, however, especially in words of Arabic origin such as the equivalent of the English prefix 'un-', the more Arabic form لا is used, eg. لاعلاج lɑːʕilɑːʒ (incurable).

Note that I can't find a way to make this example work with a single font. To produce it I had to mix two different fonts!

Arabic definite article: The pronunciation of ال (alif followed by lām) varies when it represents the Arabic definite article. This affects many words in Urdu that have come from Arabic, in particular names and adverbial expressions.

The lām is not pronounced if it precedes one of the following characters: ت‎062A te, ث‎062B se, د‎062F dāl, ذ‎0630 zāl, ر‎0631 re, ز‎0632 ze, س‎0633 sīn, ش‎0634 šīn, ص‎0635 svād, ض‎0636 zvād, ط‎0637 toe, ظ‎0638 zoe, ل‎0644 lām, ن‎0646 nūn. Instead, the following sound is doubled. A tašdīd may sometimes be used to indicate this. Example: السلام علیکم asːalɑːm alaikum (greetings).
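The rule above amounts to a membership test on the letter that follows ال. The following is a minimal Python sketch; the names `ASSIMILATING_LETTERS` and `lam_is_silent` are hypothetical. It works purely on spelling, so it assumes the caller already knows that the initial ال is the definite article, and it assumes no diacritic sits between the lām and the next letter.

```python
# The fourteen letters listed above, by code point.
ASSIMILATING_LETTERS = frozenset(
    "\u062A\u062B\u062F\u0630\u0631\u0632\u0633"
    "\u0634\u0635\u0636\u0637\u0638\u0644\u0646"
)
ALEF, LAM = "\u0627", "\u0644"


def lam_is_silent(word: str) -> bool:
    """True if word starts with alif + lām followed by one of the listed letters."""
    return (
        len(word) >= 3
        and word[0] == ALEF
        and word[1] == LAM
        and word[2] in ASSIMILATING_LETTERS
    )


print(lam_is_silent("\u0627\u0644\u0633\u0644\u0627\u0645"))  # alif lām sīn ... -> True
print(lam_is_silent("\u0627\u0644\u062D\u0627\u0644"))        # alif lām baṛī he ... -> False
```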

The sound of the alif may also be affected. See 0627 ARABIC LETTER ALEF.

Refs: Matthews; Delacy

م

U+0645 ARABIC LETTER MEEM

Arabic consonant

m

Joining forms: ممم

Persian m

Urdu consonant, mīm miːm

m

Looks like: ممم م

Refs: Matthews; Delacy

ن

U+0646 ARABIC LETTER NOON

Arabic consonant

n

Joining forms: ننن

Persian n

Urdu consonant, nūn nuːn

n

Looks like: ننن ن

Within a word this looks exactly the same as U+06BA ARABIC LETTER NOON GHUNNA, which is used for nasalization of vowels, eg. ٹاںگ ʈɑː̃g (leg).

Refs: Matthews; Delacy

ه

U+0647 ARABIC LETTER HEH

Arabic consonant

h

Joining forms: ههه

Persian h, -, ɛ, æ, e

This character in final position is sometimes pronounced e, eg. خانه xaːne

Not used in Urdu.

و

U+0648 ARABIC LETTER WAW

Arabic consonant or lengthener of u-vowel

w

In certain foreign words, pronounced more like oː, eg. بنطلون bænt̴æloːn

The male proper name عمرو ʕæmr is written with an unpronounced final waw to distinguish it from the name عمر ʕumar that would otherwise be written identically.

Joining forms: ـو

Persian v, u, o, ow, - or lengthener of u-vowel

Not pronounced after خ, eg. خوابيدن xaːbiːdan.

Urdu consonant / vowel, vāū vɑːuː

β as consonant, eg. والد vaːlɪd (father), نومبر navambar (November).

uː or o or ɔ as a vowel, whether word initial after alif, او, or elsewhere on its own, eg. اوپر uːpər (above); لوگ log (people); شوق ʃɔq (keenness). The alternative vowel sounds can be disambiguated, when necessary, by the use of combining marks. The combining marks are rarely used in normal text. See a table of combining marks for vowels.

Not pronounced in a number of words of Persian origin beginning with خوا, eg. خواب xɑːb (dream).

ʊ in two very common words: خود xʊd (self), and خوش xʊʃ (happy).

Looks like: ـو و.

Refs: Matthews, pp. xxii-xxiv; Delacy

ى

U+0649 ARABIC LETTER ALEF MAKSURA

Description in the Unicode standard:

• represents YEH-shaped letter with no dots in any positional form
• not intended for use in combination with 0654
→ (arabic letter yeh with hamza above - 0626)

Arabic

The long a-vowel at the end of many words is written with yeh instead of an alef. In this case the yeh is typically printed without dots, to avoid confusion, although this is not universal. Example: معنى mæʕnaː. This spelling only occurs with certain words, and only when the final sound is aː. If any suffix is added, the spelling reverts to the normal alef, eg. معناهم mæʕnaː-hum. Do you still use the teh marbuta or heh? If you switch, you'll need a clever string match.

Vowelled text may miss out the short æ diacritic before the teh marbuta, because the sound is always the same.

Joining forms: ىىى. Note that older fonts may not show dual joining. Here is the same sequence using the Noto Naskh Arabic font, for those who have this font available: ىىى

Not used in Persian. See 06CC ARABIC LETTER FARSI YEH.

Not used in Urdu.

ي

U+064A ARABIC LETTER YEH

Description in the Unicode standard:

• loses its dots when used in combination with 0654
• retains its dots when used in combination with other combining marks

Arabic consonant

j and iː

In certain foreign words, pronounced more like eː, eg. سكرتير sɪkrɪteːr.

Use with hamza: When used with U+0654 ARABIC HAMZA ABOVE ٔ the two dots are suppressed in all positions. Text in NFC actually uses U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE ئ rather than the decomposed sequence, so that is recommended.

Unlike this character, U+08A8 ARABIC LETTER YEH WITH TWO DOTS BELOW AND HAMZA ABOVE retains the two dots in all forms, however it also represents a semantically different character.

Joining forms: ييي

Not used in Persian. See 06CC ARABIC LETTER FARSI YEH.

Not used in Urdu.

Points from ISO 8859-6

ً

U+064B ARABIC FATHATAN

Arabic æn

In classical Arabic indefinite nouns and adjectives were marked by a final n-sound, called تنوين tænwiːn or, in English, 'nunation'. This is normally indicated by doubling the vowel diacritic. On a word ending with an a-vowel (though not with a feminine ending or some other suffixes) an extra alef was also added at the end of the word. In modern Arabic printing the fathatan is usually dropped, but the alef is retained. The pronunciation of the ending æn is also retained in many words. Examples: كِتَابًا kɪtæːbæn, فَرَسًا færæsæn.

Urdu vowel

an

This is a doubled zabar. These marks appear at the ends of certain Arabic adverbs. The doubled zabar is the most common of the three marks of this type. Although the mark appears over an alif the vowel sound is short. Examples, یقیناً yakiːnan (certainly); مثلاً masalan (for example).

ٌ

U+064C ARABIC DAMMATAN

Arabic un

In classical Arabic indefinite nouns and adjectives were marked by a final n-sound, called تنوين tænwiːn or, in English, 'nunation'. This is normally indicated by doubling the vowel diacritic. Example: جَبَلٌ or جَبَلُ ُ ʒæbælun.

Not usually shown in modern text (exceptions in the Koran, difficult older texts, and children's schoolbooks).

Urdu vowel

un

Doubled peš.

ٍ

U+064D ARABIC KASRATAN

Arabic ɪn

In classical Arabic indefinite nouns and adjectives were marked by a final n-sound, called تنوين tænwiːn or, in English, 'nunation'. This is normally indicated by doubling the vowel diacritic. Example: قَلَمٍ qælæmɪn.

Not usually shown in modern text (exceptions in the Koran, difficult older texts, and children's schoolbooks).

Urdu vowel

in

Doubled zer.

َ

U+064E ARABIC FATHA

Arabic æ or a after ص ض ط ظ غ ق and sometimes after خ ر ل. Actual pronunciation varies with context.

Short vowel diacritic, eg. كَتَبَ kætæbæ.

Not usually shown in text (exceptions tend to be the Koran, difficult older texts, and children's schoolbooks)
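Since these marks are usually absent from ordinary text, it is sometimes useful to reduce vowelled text to its unvowelled spelling, for instance before comparing strings. The following is a minimal Python sketch; the names `POINTS` and `strip_points` and the `keep_shadda` option are hypothetical. It removes only the range U+064B to U+0652, which is an assumption about what counts as a vowel point here, and leaves all other combining marks alone.

```python
# U+064B..U+0652: fathatan through sukun.
POINTS = {chr(cp) for cp in range(0x064B, 0x0653)}
SHADDA = "\u0651"


def strip_points(text: str, keep_shadda: bool = False) -> str:
    """Remove the marks in POINTS, optionally leaving shadda in place."""
    drop = POINTS - {SHADDA} if keep_shadda else POINTS
    return "".join(ch for ch in text if ch not in drop)


vowelled = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kaf, teh, beh, each with fatha
print(strip_points(vowelled) == "\u0643\u062A\u0628")  # True
```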

Urdu vowel, zabar zəbər

Rarely used; only where pronunciation needs to be spelled out. Indicates a vowel following its base character. zabar means above.

ə above a consonant, eg. بَب bəb. At the beginning of a word it appears above alif or 'ain, eg. اَب əb (now), and عَرَب ərəb (Arab).

When the base consonant is followed by certain other letters, zabar represents different sounds, as shown below:

  • ɑː when followed by alif, silent choṭī he, or 'ain, eg. بَاغ bɑːɣ (garden), مکَہ makːɑː (Mecca), and بَعد bɑːd (after).

  • ɛ when followed by je (both forms), eg. جَیسا ʤɛsɑː (as), اَیسا ɛsɑː (such), and ہَے (is).

  • ɛ when followed by choṭī he or baṛī he, eg. اَحمد ɛhmad (Ahmed), and رَہنا rɛhnɑː (to remain).

  • ɔ when followed by vɑːuː, eg. شَوق ʃɔq (keenness), and اَور ɔr (and).

See a table of combining marks for vowels.

ُ

U+064F ARABIC DAMMA

Arabic u Actual pronunciation varies with context.

Short vowel diacritic (miniature waw), eg. كُتُب (kutub).

Not usually shown in text (exceptions tend to be the Koran, difficult older texts, and children's schoolbooks).

Urdu vowel, peš peʃ.

Rarely used; only where pronunciation needs to be spelled out. Indicates a vowel following its base character. peš means forward.

ʊ above a consonant, eg. بُب bʊb. At the beginning of a word it appears above alif or 'ain, eg. اُب ʊb.

When the base consonant is followed by certain other letters, peš represents different sounds, as shown below:

  • uː when followed by vɑːuː, eg. پُورا puːrɑː (full), and اُوپر uːpar (above).

  • o when followed by 'ain, eg. شُعلہ ʃolɑː (flame), and توُّع tavaqːo (hope).

  • ɔ when followed by ʧʰoʈiː he or baṛī he, eg. شُہرت ʃɔhrat (fame), and توجُّہ tavajːɔh (attention).

ʊ, rather than a long vowel, in two very common words with a following vɑːuː: خُود xʊd (self), and خُوش xʊʃ (happy).

The word وہ vo (that, he, she, it) is irregular.

See a table of combining marks for vowels.

ِ

U+0650 ARABIC KASRA

Arabic ɪ Actual pronunciation varies with context.

Short vowel diacritic, eg. بِهِ bɪhɪ.

Not usually shown in text (exceptions tend to be the Koran, difficult older texts, and children's schoolbooks.)

Urdu vowel, zer zer

Rarely used; only where pronunciation needs to be spelled out. Indicates a vowel following its base character. zer means below.

ɪ below a base consonant, eg. بِب bɪb. At the beginning of a word it appears below alif or 'ain, eg. اِتْنَا ɪtnɑː (so much) and عِلْم ɪlm (knowledge).

When the base consonant is followed by certain other letters, zer represents different sounds, as shown below:

  • iː when followed by je, eg. سِینہ siːnɑː (breast), and اِیمان iːmɑːn (faith).

  • e when followed by ain, eg. شِعر ʃer (verse), and واقِع vɑːqe (situated).

  • ɛ when followed by ʧʰoʈiː he or baɽiː he, eg. مِہربانی mɛhrbɑːniː (kindness), and واضِح vɑːzɛh (clear).

See a table of combining marks for vowels.

ɪzāfat: ɪzɑːfat is the name given to the short vowel ɛ when used to describe a relationship between two words. It may be translated as 'of', as in 'the Lion of Punjab'.

This sound is mostly represented using zer. Sometimes, however, the combining mark is not shown, even though pronounced. Examples: شیرِ پنجاب ʃer ɛ panʤɑːb (Lion of the Punjab); طالبِ علم tɑːlɪb ɛ ɪlm (seeker of knowledge (a student)).

There are other ways in which ɪzāfat can be formed.

ّ

U+0651 ARABIC SHADDA

Arabic

Diacritic that doubles the length of the supporting consonant, eg. رتّب rætːæbæ. Visible in arabic printing, but not always marked consistently.

Common practice is to display any combining kasra below the shadda, rather than below the base consonant, eg. قَبِّل qæbːɪl. (To achieve this, you need to order the diacritics as shadda followed by kasra, otherwise you get قَبِّل.)
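One caveat about the ordering above: Unicode assigns kasra a lower canonical combining class than shadda, so normalization moves the kasra in front of the shadda however the text was typed. The following is a minimal sketch using Python's standard unicodedata module; the variable names are only illustrative, and how a given font or renderer positions the reordered marks is not something the code can show.

```python
import unicodedata

BEH = "\u0628"
KASRA = "\u0650"
SHADDA = "\u0651"

# Canonical combining classes: kasra is 32, shadda is 33.
print(unicodedata.combining(KASRA), unicodedata.combining(SHADDA))

# Typed as shadda followed by kasra.
typed = BEH + SHADDA + KASRA

# Normalization sorts adjacent marks by combining class,
# so the kasra ends up before the shadda.
normalized = unicodedata.normalize("NFC", typed)
print([f"U+{ord(c):04X}" for c in normalized])
```

If the typed order matters for your rendering, it is worth checking whether any step in your text pipeline normalizes the data.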

The sign derives from a miniature nucleus of seen, without dots.

Urdu mark, tašdīd taʃdiːd.

Doubles the sound of the base consonant, eg. ستّر sattar seventy. More often than not, this is not written.

tašdīd means strengthening.

ْ

U+0652 ARABIC SUKUN

Description in the Unicode standard:

• marks absence of a vowel after the base consonant
• used in some Korans to mark a long vowel as ignored
• can have a variety of shapes, including a circular one and a shape that looks like '06E1'
→ (arabic small high dotless head of khah - 06E1)

Arabic

Indicates that no vowel follows the consonant to which this is attached, eg. مَكْتَب maktab.

Not usually shown in text (exceptions tend to be the Koran, difficult older texts, and children's schoolbooks).

Urdu mark, sukūn sukuːn or jazm ʤazm.

Rarely used; indicates absence of a vowel between consonants, eg. سَخْت saxt (hard).

It has various possible forms, including a small round circle, something that looks like peʃ, and something like a circumflex. (There is another Unicode character that provides an alternative visual form, 06E1: ARABIC SMALL HIGH DOTLESS HEAD OF KHAH, but it is better to use this character and select the variant required using a font.)

This diacritic is never written above the final character in a word, because as a rule a short vowel is not pronounced in this position.

Sukūn is an Arabic word meaning rest or pause.

Combining maddah and hamza

ٓ

U+0653 ARABIC MADDAH ABOVE

ٔ

U+0654 ARABIC HAMZA ABOVE

Description in the Unicode standard:

• not restricted to hamza semantics
• may also occur as a diacritic forming new letters

Arabic ʔ (glottal stop)

The hamza sometimes stands alone (see U+0621 ARABIC LETTER HAMZA ء), but usually appears with a 'carrier' letter - alef, waw, or yeh (أ إ ؤ ئ) for which separate precomposed characters are available in Unicode.

Combined with base characters: When the hamza is above or below another character you should typically use this character with the appropriate base character. However, there are a number of exceptions where you would not normally use this character.

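Some exceptions arise because the NFC normalization form converts the base character and combining hamza to a precomposed character. These instances include the carrier-letter combinations mentioned above (أ إ ؤ ئ), and the Urdu characters ۀ, ۂ and ۓ.

Other exceptions arise where the hamza is an integral part of the character itself (ie. an ijam). Examples of these characters include U+0681 ARABIC LETTER HAH WITH HAMZA ABOVE ځ and U+08A1 ARABIC LETTER BEH WITH HAMZA ABOVE ࢡ.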
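The two kinds of exception described above can be observed with Python's standard unicodedata module. This is only an illustrative sketch: the helper name nfc_codepoints is made up for the example, and the characters tested are a small sample rather than a complete list.

```python
import unicodedata

HAMZA_ABOVE = "\u0654"
HAMZA_BELOW = "\u0655"

def nfc_codepoints(text):
    """Return the NFC form of text as a list of U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", text)]

# Carrier letters: base + combining hamza composes to one precomposed character.
print(nfc_codepoints("\u0627" + HAMZA_ABOVE))  # alef + hamza above
print(nfc_codepoints("\u0648" + HAMZA_ABOVE))  # waw + hamza above
print(nfc_codepoints("\u064A" + HAMZA_ABOVE))  # yeh + hamza above
print(nfc_codepoints("\u0627" + HAMZA_BELOW))  # alef + hamza below

# Integral hamza: hah + hamza above stays as two code points,
# and U+0681 itself has no decomposition (empty string).
print(nfc_codepoints("\u062D" + HAMZA_ABOVE))
print(repr(unicodedata.decomposition("\u0681")))
```

In other words, the sequences in the first group are interchangeable with their precomposed characters, while a letter such as U+0681 has to be entered as the single character.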

Cutting and joining hamza in orthography: Classical Arabic distinguishes between 'cutting' and 'joining' hamza. 'Cutting' means always pronounced, 'joining' means frequently elided. The joining hamza is of little practical importance in modern Arabic pronounced without the old case endings.

In modern printed Arabic, the hamza is rarely shown when it occurs at the beginning of a word.

The following are simplified rules for use of (cutting) hamza:

The sign indicating a joining hamza is called a wasla (see U+0671 ARABIC LETTER ALEF WASLA ٱ).

ٕ

U+0655 ARABIC HAMZA BELOW

Other combining marks

ٖ

U+0656 ARABIC SUBSCRIPT ALEF

Urdu mark

Used to indicate that the vowel is iː or i rather than e, eg. نُحْیٖ.

This diacritic is not usually needed, and serves only to emphasise that the vowel is long.

ٗ

U+0657 ARABIC INVERTED DAMMA

Description in the Unicode standard:

= ulta pesh
• Kashmiri, Urdu

Urdu mark

Used to indicate that the vowel is uː or ʊ rather than ɔ, eg. حبل حلالہٗ.

This diacritic is not usually needed, and serves only to emphasise that the vowel is long.

٘

U+0658 ARABIC MARK NOON GHUNNA

Description in the Unicode standard:

• Baluchi
• indicates nasalization in Urdu

Urdu mark

Nasalisation of Urdu vowels is normally indicated by 06BA ARABIC LETTER NOON GHUNNA ں. At the end of a word this has no dot above, but in the middle of a word it looks exactly like 0646 ARABIC LETTER NOON ن (and some people may mix up the use of these characters).

This diacritic is used when people want to make it clear that this glyph represents nasalisation rather than the letter nūn.

It is not used in a standard way, just when the user prefers, and is fairly uncommon, eg. ساں٘گ. The CRULP fonts don't appear to show the diacritic as expected.

ٙ

U+0659 ARABIC ZWARAKAY

Description in the Unicode standard:

• Pashto
ٚ

U+065A ARABIC VOWEL SIGN SMALL V ABOVE

Description in the Unicode standard:

• African languages
ٛ

U+065B ARABIC VOWEL SIGN INVERTED SMALL V ABOVE

Description in the Unicode standard:

• African languages
ٜ

U+065C ARABIC VOWEL SIGN DOT BELOW

Description in the Unicode standard:

• African languages
ٝ

U+065D ARABIC REVERSED DAMMA

Description in the Unicode standard:

• Ormuri, African languages
ٞ

U+065E ARABIC FATHA WITH TWO DOTS

Description in the Unicode standard:

• Kalami
ٟ

U+065F ARABIC WAVY HAMZA BELOW

Description in the Unicode standard:

• Kashmiri
ࣿ

U+08FF ARABIC MARK SIDEWAYS NOON GHUNNA

Point

ٰ

U+0670 ARABIC LETTER SUPERSCRIPT ALEF

Description in the Unicode standard:

• actually a vowel sign, despite the name

Urdu vowel

ɑː

Used in a few Arabic words over the final form of 06CC ARABIC LETTER FARSI YEH ی to produce the sound ɑː: eg. اعلیٰ alɑː (paramount, highest); دعویٰ davɑː (law suit, claim).

Extended Arabic letters

ٱ

U+0671 ARABIC LETTER ALEF WASLA

Description in the Unicode standard:

• Koranic Arabic
ٲ

U+0672 ARABIC LETTER ALEF WITH WAVY HAMZA ABOVE

Description in the Unicode standard:

• Baluchi, Kashmiri
ٴ

U+0674 ARABIC LETTER HIGH HAMZA

Description in the Unicode standard:

• Kazakh
• forms digraphs
ٵ

U+0675 ARABIC LETTER HIGH HAMZA ALEF

Description in the Unicode standard:

• Kazakh
≈ 0627 0674
ٶ

U+0676 ARABIC LETTER HIGH HAMZA WAW

Description in the Unicode standard:

• Kazakh
≈ 0648 0674
ٷ

U+0677 ARABIC LETTER U WITH HAMZA ABOVE

Description in the Unicode standard:

• Kazakh
≈ 06C7 0674
ٸ

U+0678 ARABIC LETTER HIGH HAMZA YEH

Description in the Unicode standard:

• Kazakh
≈ 064A 0674
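The ≈ lines in the four Kazakh entries above are compatibility decompositions, which means they are applied by NFKD and NFKC but not by NFD and NFC. A small sketch with Python's standard unicodedata module (the list name is only for illustration):

```python
import unicodedata

KAZAKH_HIGH_HAMZA_LETTERS = ["\u0675", "\u0676", "\u0677", "\u0678"]

for ch in KAZAKH_HIGH_HAMZA_LETTERS:
    compat = unicodedata.normalize("NFKD", ch)   # applies the compatibility mapping
    canonical = unicodedata.normalize("NFD", ch) # leaves the character alone
    print(
        unicodedata.name(ch),
        "->",
        " ".join(f"{ord(c):04X}" for c in compat),
        "| unchanged by NFD:",
        canonical == ch,
    )
```

So a search or comparison that should treat the single character and the digraph as the same needs to fold with one of the compatibility forms.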
ٹ

U+0679 ARABIC LETTER TTEH

Description in the Unicode standard:

• Urdu

Not used in Arabic.

Urdu consonant, ṭe ʈe

ʈ

Looks like: ٹٹٹ ٹ

ʈʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated retroflex t in Urdu, a distinct letter of the alphabet called ṭhe.

Looks like: ٹھٹھٹھ ٹھ.

Refs: Matthews; Delacy

ٺ

U+067A ARABIC LETTER TTEHEH

Description in the Unicode standard:

• Sindhi
ٻ

U+067B ARABIC LETTER BEEH

Description in the Unicode standard:

• Sindhi
ټ

U+067C ARABIC LETTER TEH WITH RING

Description in the Unicode standard:

• Pashto
ٽ

U+067D ARABIC LETTER TEH WITH THREE DOTS ABOVE DOWNWARDS

Description in the Unicode standard:

• Sindhi
پ

U+067E ARABIC LETTER PEH

Description in the Unicode standard:

• Persian, Urdu, ...

Not used in Arabic.

Persian consonant, pe

p

Urdu consonant, pe

p

Looks like: پپپ پ

pʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated p in Urdu, a distinct letter of the alphabet called phe.

Looks like: پھپھپھ پھ.

Refs: Matthews; Delacy

ٿ

U+067F ARABIC LETTER TEHEH

Description in the Unicode standard:

• Sindhi
ڀ

U+0680 ARABIC LETTER BEHEH

Description in the Unicode standard:

• Sindhi
ځ

U+0681 ARABIC LETTER HAH WITH HAMZA ABOVE

Description in the Unicode standard:

• Pashto letter 'dze'

Pashto consonant

dz

This character does not decompose. It is treated as a separate letter, and is not equivalent to U+062D ARABIC LETTER HAH ح + U+0654 ARABIC HAMZA ABOVE ٔ.

Joining forms: ځځځ

ڂ

U+0682 ARABIC LETTER HAH WITH TWO DOTS VERTICAL ABOVE

Description in the Unicode standard:

• not used in modern Pashto
ڃ

U+0683 ARABIC LETTER NYEH

Description in the Unicode standard:

• Sindhi
ڄ

U+0684 ARABIC LETTER DYEH

Description in the Unicode standard:

• Sindhi
څ

U+0685 ARABIC LETTER HAH WITH THREE DOTS ABOVE

Description in the Unicode standard:

• Pashto, Khwarazmian

Pashto consonant

ts

Joining forms: څڅڅ

چ

U+0686 ARABIC LETTER TCHEH

Description in the Unicode standard:

• Persian, Urdu, ...

Not used in Arabic.

Urdu consonant, ce ʧe

ʧ

Looks like: چچچ چ

ʧʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated ʧ in Urdu, a distinct letter of the alphabet called che.

Looks like: چھچھچھ چھ.

Refs: Matthews; Delacy

ڇ

U+0687 ARABIC LETTER TCHEHEH

Description in the Unicode standard:

• Sindhi
ڈ

U+0688 ARABIC LETTER DDAL

Description in the Unicode standard:

• Urdu

Not used in Arabic.

Urdu consonant, ḍāl ɖɑːl

ɖ

Looks like: ـڈ ڈ

ɖʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated retroflex d in Urdu, a distinct letter of the alphabet called ḍhe.

Looks like: ـڈھ ڈھ.

Refs: Matthews; Delacy

ډ

U+0689 ARABIC LETTER DAL WITH RING

Description in the Unicode standard:

• Pashto
ڊ

U+068A ARABIC LETTER DAL WITH DOT BELOW

Description in the Unicode standard:

• Sindhi, early Persian
ڋ

U+068B ARABIC LETTER DAL WITH DOT BELOW AND SMALL TAH

Description in the Unicode standard:

• Lahnda
ڌ

U+068C ARABIC LETTER DAHAL

Description in the Unicode standard:

• Sindhi
ڍ

U+068D ARABIC LETTER DDAHAL

Description in the Unicode standard:

• Sindhi
ڎ

U+068E ARABIC LETTER DUL

Description in the Unicode standard:

• older shape for DUL, now obsolete in Sindhi
• Burushaski
ڏ

U+068F ARABIC LETTER DAL WITH THREE DOTS ABOVE DOWNWARDS

Description in the Unicode standard:

• Sindhi
• current shape used for DUL
ڐ

U+0690 ARABIC LETTER DAL WITH FOUR DOTS ABOVE

Description in the Unicode standard:

• old Urdu, not in current use
ڑ

U+0691 ARABIC LETTER RREH

Description in the Unicode standard:

• Urdu

Not used in Arabic.

Urdu consonant, ṛe ɽe

ɽ

Looks like: ـڑ ڑ

ɽʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated retroflex r in Urdu, a distinct letter of the alphabet called ṛhe.

Looks like: ـڑھ ڑھ.

Refs: Matthews; Delacy

ڒ

U+0692 ARABIC LETTER REH WITH SMALL V

Description in the Unicode standard:

• Kurdish
ړ

U+0693 ARABIC LETTER REH WITH RING

Description in the Unicode standard:

• Pashto
ڔ

U+0694 ARABIC LETTER REH WITH DOT BELOW

Description in the Unicode standard:

• Kurdish, early Persian
ڕ

U+0695 ARABIC LETTER REH WITH SMALL V BELOW

Description in the Unicode standard:

• Kurdish
ږ

U+0696 ARABIC LETTER REH WITH DOT BELOW AND DOT ABOVE

Description in the Unicode standard:

• Pashto
ڗ

U+0697 ARABIC LETTER REH WITH TWO DOTS ABOVE

Description in the Unicode standard:

• Dargwa
ژ

U+0698 ARABIC LETTER JEH

Description in the Unicode standard:

• Persian, Urdu, ...

Not used in Arabic.

Urdu consonant, že ʒe

ʒ

Looks like: ـژ ژ

Refs: Matthews; Delacy

ڙ

U+0699 ARABIC LETTER REH WITH FOUR DOTS ABOVE

Description in the Unicode standard:

• Sindhi
ښ

U+069A ARABIC LETTER SEEN WITH DOT BELOW AND DOT ABOVE

Description in the Unicode standard:

• Pashto
ڛ

U+069B ARABIC LETTER SEEN WITH THREE DOTS BELOW

Description in the Unicode standard:

• early Persian
ڜ

U+069C ARABIC LETTER SEEN WITH THREE DOTS BELOW AND THREE DOTS ABOVE

Description in the Unicode standard:

• Moroccan Arabic
ڝ

U+069D ARABIC LETTER SAD WITH TWO DOTS BELOW

Description in the Unicode standard:

• Turkic
ڞ

U+069E ARABIC LETTER SAD WITH THREE DOTS ABOVE

Description in the Unicode standard:

• Berber, Burushaski
ڟ

U+069F ARABIC LETTER TAH WITH THREE DOTS ABOVE

Description in the Unicode standard:

• old Hausa
ڠ

U+06A0 ARABIC LETTER AIN WITH THREE DOTS ABOVE

Description in the Unicode standard:

• old Malay
ڡ

U+06A1 ARABIC LETTER DOTLESS FEH

Description in the Unicode standard:

• Adighe
ڢ

U+06A2 ARABIC LETTER FEH WITH DOT MOVED BELOW

Description in the Unicode standard:

• Maghrib Arabic
ڣ

U+06A3 ARABIC LETTER FEH WITH DOT BELOW

Description in the Unicode standard:

• Ingush
ڤ

U+06A4 ARABIC LETTER VEH

Description in the Unicode standard:

• Middle Eastern Arabic for foreign words
• Kurdish, Khwarazmian, early Persian
ڥ

U+06A5 ARABIC LETTER FEH WITH THREE DOTS BELOW

Description in the Unicode standard:

• North African Arabic for foreign words
ڦ

U+06A6 ARABIC LETTER PEHEH

Description in the Unicode standard:

• Sindhi
ڧ

U+06A7 ARABIC LETTER QAF WITH DOT ABOVE

Description in the Unicode standard:

• Maghrib Arabic
ڨ

U+06A8 ARABIC LETTER QAF WITH THREE DOTS ABOVE

Description in the Unicode standard:

• Tunisian Arabic
ک

U+06A9 ARABIC LETTER KEHEH

Description in the Unicode standard:

• Persian, Urdu, ...

Not used in Arabic.

Urdu consonant, kāf kɑːf

k

Looks like: ککک ک

When followed by alif or lām, this has a special rounded shape, eg. کا kɑː (of); کل kal (yesterday).

kʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated k in Urdu, a distinct letter of the alphabet called khe.

Looks like: کھکھکھ کھ.

Refs: Matthews; Delacy

ڪ

U+06AA ARABIC LETTER SWASH KAF

ګ

U+06AB ARABIC LETTER KAF WITH RING

Description in the Unicode standard:

• Pashto
• may appear like an Arabic KAF (0643) with a ring below the base
ڬ

U+06AC ARABIC LETTER KAF WITH DOT ABOVE

Description in the Unicode standard:

• old Malay
ڭ

U+06AD ARABIC LETTER NG

Description in the Unicode standard:

• Uighur, Kazakh, old Malay, early Persian, ...
ڮ

U+06AE ARABIC LETTER KAF WITH THREE DOTS BELOW

Description in the Unicode standard:

• Berber, early Persian
گ

U+06AF ARABIC LETTER GAF

Description in the Unicode standard:

• Persian, Urdu, ...

Not used in Arabic.

Urdu consonant, gāf gɑːf

g

Looks like: گگگ گ

When followed by alif or lām, this has a special rounded shape, eg. گام gɑːm (step); گل gul (rose).

gʰ together with 06BE: ARABIC LETTER HEH DOACHASHMEE, to represent the aspirated g in Urdu, a distinct letter of the alphabet called ghe.

Looks like: گھگھگھ گھ.

Refs: Matthews; Delacy

ڰ

U+06B0 ARABIC LETTER GAF WITH RING

Description in the Unicode standard:

• Lahnda
ڱ

U+06B1 ARABIC LETTER NGOEH

Description in the Unicode standard:

• Sindhi
ڲ

U+06B2 ARABIC LETTER GAF WITH TWO DOTS BELOW

Description in the Unicode standard:

• not used in Sindhi
ڳ

U+06B3 ARABIC LETTER GUEH

Description in the Unicode standard:

• Sindhi
ڴ

U+06B4 ARABIC LETTER GAF WITH THREE DOTS ABOVE

Description in the Unicode standard:

• not used in Sindhi
ڵ

U+06B5 ARABIC LETTER LAM WITH SMALL V

Description in the Unicode standard:

• Kurdish
ڶ

U+06B6 ARABIC LETTER LAM WITH DOT ABOVE

Description in the Unicode standard:

• Kurdish
ڷ

U+06B7 ARABIC LETTER LAM WITH THREE DOTS ABOVE

Description in the Unicode standard:

• Kurdish
ڸ

U+06B8 ARABIC LETTER LAM WITH THREE DOTS BELOW

ڹ

U+06B9 ARABIC LETTER NOON WITH DOT BELOW

ں

U+06BA ARABIC LETTER NOON GHUNNA

Description in the Unicode standard:

• Urdu

Not used in Arabic.

Urdu nasalisation indicator, nun ghunna nuːn ɣunna.

Looks like: ںںں ں

Indicates that the preceding vowel is nasalised.

At the end of a word, an undotted form is used, eg. ماں mãː, mother, کروں karũː, I may do.

Nasalisation within a word uses a form with a dot that looks just like the letter ن 0646 ARABIC LETTER NOON, eg. ٹاںگ ʈãːg leg.

This is not counted as a regular letter of the Urdu alphabet.

ڻ

U+06BB ARABIC LETTER RNOON

Description in the Unicode standard:

• Sindhi
ڼ

U+06BC ARABIC LETTER NOON WITH RING

Description in the Unicode standard:

• Pashto
ڽ

U+06BD ARABIC LETTER NOON WITH THREE DOTS ABOVE

Description in the Unicode standard:

• old Malay
ھ

U+06BE ARABIC LETTER HEH DOACHASHMEE

Description in the Unicode standard:

• Urdu
• forms aspirate digraphs

Urdu aspiration marker / calendar indicator, do cašmī he.

Aspiration: Used to create the aspirated letters of the Urdu alphabet. Each letter is composed of two characters. The letters are: بھ bʰe, پھ pʰe, تھ tʰe, ٹھ ʈʰe, جھ ʤʰe, چھ ʧʰe, دھ dʰe, ڈھ ɖʰe, ڑھ ɽʰe, کھ kʰe, and گھ gʰe.
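Because each aspirated letter is stored as two characters, code that counts or sorts Urdu letters may want to keep the pair together. The following is a rough sketch under simple assumptions: the function name urdu_letters is hypothetical, the set of bases is just the eleven consonants listed above, and combining marks between the base and the do cašmī he are not handled.

```python
DO_CASHMI_HE = "\u06BE"

# The eleven base consonants listed above that form aspirated letters.
ASPIRABLE = {
    "\u0628",  # beh
    "\u067E",  # peh
    "\u062A",  # teh
    "\u0679",  # tteh
    "\u062C",  # jeem
    "\u0686",  # tcheh
    "\u062F",  # dal
    "\u0688",  # ddal
    "\u0691",  # rreh
    "\u06A9",  # keheh
    "\u06AF",  # gaf
}

def urdu_letters(word):
    """Split a word into letters, keeping base + do cashmi he together."""
    letters = []
    for ch in word:
        if ch == DO_CASHMI_HE and letters and letters[-1] in ASPIRABLE:
            letters[-1] += ch      # attach to the preceding consonant
        else:
            letters.append(ch)     # ordinary character, or a stray heh
    return letters

# peh + do cashmi he + lam is split into two letters, not three characters.
print(len(urdu_letters("\u067E\u06BE\u0644")))
```

A do cašmī he that does not follow one of these bases is left as a separate item, which can also help to spot places where it was typed in place of choṭī he.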

Until recently choṭī he 06C1 ARABIC LETTER HEH GOAL ہ and do cašmī he could be used interchangeably to express aspiration, eg. ہاں or ھاں for hãː yes. Modern practice is to use this character exclusively for aspiration, though people do still occasionally confuse the two.

Calendar indicator: Dates using the Muslim calendar are followed by the word ہجری hɪʤriː, which is abbreviated with the symbol ھ.

ڿ

U+06BF ARABIC LETTER TCHEH WITH DOT ABOVE

ۀ

U+06C0 ARABIC LETTER HEH WITH YEH ABOVE

Description in the Unicode standard:

= arabic letter hamzah on ha (1.0)
= izafet
• Urdu
• actually a ligature, not an independent letter
≡ 06D5 0654
ہ

U+06C1 ARABIC LETTER HEH GOAL

Description in the Unicode standard:

• Urdu

Not used in Arabic.

Urdu consonant, choṭī he ʧʰoʈiː he

h

ɑː as 'silent he' (see below).

ɛ occasionally as a variant of 'silent he' (see below).

h when doubled at the end of a word (see below).

Silent he: In Urdu words this letter is pronounced ɑː at the end of a word. Many Arabic and Persian words end in a he that is pronounced ɑː (just like alif), eg. مکّہ məkkɑː (Mecca).

A word like rɑːʤɑː (king), can be spelled with either an alif or a he, ie. راجا or راجہ. This is because the original Indian word was borrowed into Persian, then back into Urdu. Both spellings are now acceptable.

In a few words, the pronunciation of silent he is irregular, eg. کہ (that) and نہ (no).

Doubled he: In order to distinguish some words where the final h is pronounced rather than representing ɑː (or ɛ in irregular pronunciations), the choṭī he is sometimes doubled, eg. کہہ kɛh (say) vs. کہ.

Aspiration: Until recently choṭī he ہ and do cašmī he ھ could be used interchangeably, eg. ہاں or ھاں for hãː (yes). Modern practice is to use the latter exclusively for aspiration, though people do still occasionally confuse the two.

Vowel changes: choṭī he can change the preceding vowel as follows:

  • a to ɛ, eg. رَہنا rɛhnɑː (to remain).

  • ɪ to ɛ, eg. مہربانی mɛhrbɑːniː (kindness).

  • ʊ to o, eg. شہرت ʃohrət (fame).

Looks like: ہہہ ہ

The initial form is written with a hook beneath, eg. ہندو hinduː (Hindu). The medial can be written with or without, eg. کہاں kahãː (where).

A special initial form is used before alif or lam, eg. ہاں hãː (yes), and اہل ahl (people).

Refs: Matthews, pp. xxiv-xxvi, xxviii-xxix; Delacy, pp. 104-105

ۂ

U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE

Description in the Unicode standard:

• Urdu
• actually a ligature, not an independent letter
≡ 06C1 0654

Not used in Arabic.

Urdu consonant with izafat, ɪzɑːfat

ɛ when used as izafat.

NOTE: The Unicode Standard indicates that this grapheme should be represented using U+06C0 ARABIC LETTER HEH WITH YEH ABOVE. However, that doesn't work with the Nafees Nastaleeq font, and I have seen evidence elsewhere that in common use this HEH GOAL WITH HAMZA character is used for this purpose. Need to investigate further.
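Whichever of the two characters is used, it may help to know that they are not canonically equivalent, so normalization will not turn one into the other. The sketch below uses Python's standard unicodedata module to print the canonical decompositions quoted in the Unicode descriptions on this page; the helper name nfd_hex is only illustrative.

```python
import unicodedata

def nfd_hex(ch):
    """Return the NFD form of ch as space-separated hex code points."""
    return " ".join(f"{ord(c):04X}" for c in unicodedata.normalize("NFD", ch))

print(nfd_hex("\u06C0"))  # HEH WITH YEH ABOVE
print(nfd_hex("\u06C2"))  # HEH GOAL WITH HAMZA ABOVE
print(nfd_hex("\u06D3"))  # YEH BARREE WITH HAMZA ABOVE

# The two heh-based characters remain distinct after normalization.
same_after_nfc = (
    unicodedata.normalize("NFC", "\u06C0") == unicodedata.normalize("NFC", "\u06C2")
)
print(same_after_nfc)
```

Text that mixes the two will therefore not match in a plain string comparison, even after normalizing both sides.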

Izāfat ɪzɑːfat is the name given to the short vowel ɛ used to describe a relationship between two words. It may be translated of, eg. as in the Lion of Punjab.

This sound is mostly represented using zer, but in certain cases can be represented with a combining hamza. One such case occurs when the preceding word ends in choṭī he ہ: eg. قطرۂآب qatrah ɛ ɑːb drop of water.

There are other ways in which izafat can be formed.

ۃ

U+06C3 ARABIC LETTER TEH MARBUTA GOAL

Description in the Unicode standard:

• Urdu
ۄ

U+06C4 ARABIC LETTER WAW WITH RING

Description in the Unicode standard:

• Kashmiri
ۅ

U+06C5 ARABIC LETTER KIRGHIZ OE

Description in the Unicode standard:

• Kirghiz
ۆ

U+06C6 ARABIC LETTER OE

Description in the Unicode standard:

• Uighur, Kurdish, Kazakh, Azerbaijani
ۇ

U+06C7 ARABIC LETTER U

Description in the Unicode standard:

• Kirghiz, Azerbaijani
ۈ

U+06C8 ARABIC LETTER YU

Description in the Unicode standard:

• Uighur
ۉ

U+06C9 ARABIC LETTER KIRGHIZ YU

Description in the Unicode standard:

• Kazakh, Kirghiz
ۊ

U+06CA ARABIC LETTER WAW WITH TWO DOTS ABOVE

Description in the Unicode standard:

• Kurdish
ۋ

U+06CB ARABIC LETTER VE

Description in the Unicode standard:

• Uighur, Kazakh
ی

U+06CC ARABIC LETTER FARSI YEH

Description in the Unicode standard:

• Arabic, Persian, Urdu, Kashmiri, ...
• initial and medial forms of this letter have dots
→ (arabic letter alef maksura - 0649)
→ (arabic letter yeh - 064A)

Not used in Arabic.

Urdu consonant / vowel, ye je

The Urdu letter je has two distinct visual forms requiring the use of two Unicode characters: this one ی and ے. For more information on the latter, see 06D2 ARABIC LETTER YEH BARREE.

j as a consonant (word initial or medial), eg. یار jɑːr (friend) and سایہ sɑːjɑː (shadow).

iː or e or ɛ as an initial or medial vowel (initially it is used after alif, ای), eg. ایک ek (one), سینہ siːnɑː (breast), and کیسا kɛsɑː (how).

The alternative vowel sounds can be disambiguated, when necessary, by the use of combining marks. The combining marks are rarely used in normal text. See a table of combining marks for vowels.

iː in word final position, eg. لڑکی ləɽkiː (girl).

To represent the vowels e or ɛ in final position or in the isolated form, 06D2 ARABIC LETTER YEH BARREE ے is used, eg. لڑکے ləɽke boys.

Looks like: ییی ی

This character has two dots below it in initial and medial position, but no dots in final or independent form.

Refs: Matthews; Delacy

ۍ

U+06CD ARABIC LETTER YEH WITH TAIL

Description in the Unicode standard:

• Pashto, Sindhi
ێ

U+06CE ARABIC LETTER YEH WITH SMALL V

Description in the Unicode standard:

• Kurdish
ۏ

U+06CF ARABIC LETTER WAW WITH DOT ABOVE

ې

U+06D0 ARABIC LETTER E

Description in the Unicode standard:

• Pashto, Uighur
• used as the letter bbeh in Sindhi
ۑ

U+06D1 ARABIC LETTER YEH WITH THREE DOTS BELOW

Description in the Unicode standard:

• old Malay
ے

U+06D2 ARABIC LETTER YEH BARREE

Description in the Unicode standard:

• Urdu

Not used in Arabic.

Urdu vowel, baṛī ye baɽiː je

The Urdu letter je has two distinct visual forms requiring the use of two Unicode characters: this one ے and ی. For more information on the latter, see 06CC ARABIC LETTER FARSI YEH. The latter represents both a consonant and a vowel, but this form is used only for vowels. This form is used only in word final or isolated position.

e or ɛ in word-final or isolated position, eg. لڑکے laɽke, (boys).

The alternative sounds possible in the initial combinations can be disambiguated, when necessary, by the use of combining marks. The combining marks are rarely used in normal text. See a table of combining marks for vowels.

Looks like: ـے ے.

This shape is also used with a hamza to represent the izāfat ɪzɑːfat. For this you should use 06D3 ARABIC LETTER YEH BARREE WITH HAMZA ABOVE.

Refs: Matthews; Delacy

ۓ

U+06D3 ARABIC LETTER YEH BARREE WITH HAMZA ABOVE

Description in the Unicode standard:

• Urdu
• actually a ligature, not an independent letter
≡ 06D2 0654

Not used in Arabic.

Urdu Izāfat ɪzɑːfat marker

ɛ

Izāfat is the name given to the short vowel ɛ used to describe a relationship between two words. It may be translated of, eg. as in the Lion of Punjab.

This sound is mostly represented using zer, but can also be represented with a combining hamza in a couple of cases.

Izāfat may also be shown as ے with or without a combining hamza when the preceding word ends in a long vowel: eg. صدا ۓ بلند sadɑː ɛ buland a high voice; روۓزمین ruː ɛ zamiːn the surface of the ground.

There are other ways in which izafat can be formed.

See also 06D2 ARABIC LETTER YEH BARREE.

Refs: Matthews; Delacy

ە

U+06D5 ARABIC LETTER AE

Description in the Unicode standard:

• Uighur, Kazakh, Kirghiz
ۺ

U+06FA ARABIC LETTER SHEEN WITH DOT BELOW

ۻ

U+06FB ARABIC LETTER DAD WITH DOT BELOW

ۼ

U+06FC ARABIC LETTER GHAIN WITH DOT BELOW

ݐ

U+0750 ARABIC LETTER BEH WITH THREE DOTS HORIZONTALLY BELOW

ݑ

U+0751 ARABIC LETTER BEH WITH DOT BELOW AND THREE DOTS ABOVE

ݒ

U+0752 ARABIC LETTER BEH WITH THREE DOTS POINTING UPWARDS BELOW

ݓ

U+0753 ARABIC LETTER BEH WITH THREE DOTS POINTING UPWARDS BELOW AND TWO DOTS ABOVE

ݔ

U+0754 ARABIC LETTER BEH WITH TWO DOTS BELOW AND DOT ABOVE

ݕ

U+0755 ARABIC LETTER BEH WITH INVERTED SMALL V BELOW

ݖ

U+0756 ARABIC LETTER BEH WITH SMALL V

ݗ

U+0757 ARABIC LETTER HAH WITH TWO DOTS ABOVE

ݘ

U+0758 ARABIC LETTER HAH WITH THREE DOTS POINTING UPWARDS BELOW

ݙ

U+0759 ARABIC LETTER DAL WITH TWO DOTS VERTICALLY BELOW AND SMALL TAH

Description in the Unicode standard:

• Saraiki
ݚ

U+075A ARABIC LETTER DAL WITH INVERTED SMALL V BELOW

ݛ

U+075B ARABIC LETTER REH WITH STROKE

ݜ

U+075C ARABIC LETTER SEEN WITH FOUR DOTS ABOVE

Description in the Unicode standard:

• Shina
ݝ

U+075D ARABIC LETTER AIN WITH TWO DOTS ABOVE

ݞ

U+075E ARABIC LETTER AIN WITH THREE DOTS POINTING DOWNWARDS ABOVE

ݟ

U+075F ARABIC LETTER AIN WITH TWO DOTS VERTICALLY ABOVE

ݠ

U+0760 ARABIC LETTER FEH WITH TWO DOTS BELOW

ݡ

U+0761 ARABIC LETTER FEH WITH THREE DOTS POINTING UPWARDS BELOW

ݢ

U+0762 ARABIC LETTER KEHEH WITH DOT ABOVE

Description in the Unicode standard:

• old Malay, preferred to 06AC
→ (arabic letter kaf with dot above - 06AC)
ݣ

U+0763 ARABIC LETTER KEHEH WITH THREE DOTS ABOVE

Description in the Unicode standard:

• Moroccan Arabic, Amazigh, Burushaski
→ (arabic letter ng - 06AD)
ݤ

U+0764 ARABIC LETTER KEHEH WITH THREE DOTS POINTING UPWARDS BELOW

ݥ

U+0765 ARABIC LETTER MEEM WITH DOT ABOVE

ݦ

U+0766 ARABIC LETTER MEEM WITH DOT BELOW

Description in the Unicode standard:

• Maba
ݧ

U+0767 ARABIC LETTER NOON WITH TWO DOTS BELOW

Description in the Unicode standard:

• Arwi
ݨ

U+0768 ARABIC LETTER NOON WITH SMALL TAH

Description in the Unicode standard:

• Saraiki, Pathwari
ݩ

U+0769 ARABIC LETTER NOON WITH SMALL V

Description in the Unicode standard:

• Gojri
ݪ

U+076A ARABIC LETTER LAM WITH BAR

ݫ

U+076B ARABIC LETTER REH WITH TWO DOTS VERTICALLY ABOVE

Description in the Unicode standard:

• Torwali, Ormuri
ݬ

U+076C ARABIC LETTER REH WITH HAMZA ABOVE

Description in the Unicode standard:

• Ormuri

Ormuri consonant

ʑ

Joining forms: ـݬ

ݭ

U+076D ARABIC LETTER SEEN WITH TWO DOTS VERTICALLY ABOVE

Description in the Unicode standard:

• Kalami, Ormuri

Addition for Kashmiri

ؠ

U+0620 ARABIC LETTER KASHMIRI YEH

Additions for early Persian and Azerbaijani

ػ

U+063B ARABIC LETTER KEHEH WITH TWO DOTS ABOVE

ؼ

U+063C ARABIC LETTER KEHEH WITH THREE DOTS BELOW

ؽ

U+063D ARABIC LETTER FARSI YEH WITH INVERTED V

Description in the Unicode standard:

• Azerbaijani
ؾ

U+063E ARABIC LETTER FARSI YEH WITH TWO DOTS ABOVE

ؿ

U+063F ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE

ݾ

U+077E ARABIC LETTER SEEN WITH INVERTED V

ݿ

U+077F ARABIC LETTER KAF WITH TWO DOTS ABOVE

Extended Arabic letters for Parkari

ۮ

U+06EE ARABIC LETTER DAL WITH INVERTED V

ۯ

U+06EF ARABIC LETTER REH WITH INVERTED V

Description in the Unicode standard:

• also used in early Persian
ۿ

U+06FF ARABIC LETTER HEH WITH INVERTED V

Extended Arabic letters for African languages

U+08A0 ARABIC LETTER BEH WITH SMALL V BELOW

ࢡ

U+08A1 ARABIC LETTER BEH WITH HAMZA ABOVE

Adamawa Fulfulde

ɓ (bilabial implosive)

This character does not decompose. It is treated as a separate letter, and is not equivalent to U+0628 ARABIC LETTER BEH ب with U+0654 ARABIC HAMZA ABOVE ٔ.

Joining forms: ࢡࢡࢡ

U+08A2 ARABIC LETTER JEEM WITH TWO DOTS ABOVE

U+08A3 ARABIC LETTER TAH WITH TWO DOTS ABOVE

U+08A4 ARABIC LETTER FEH WITH DOT BELOW AND THREE DOTS ABOVE

U+08A5 ARABIC LETTER QAF WITH DOT BELOW

U+08A6 ARABIC LETTER LAM WITH DOUBLE BAR

U+08A7 ARABIC LETTER MEEM WITH THREE DOTS ABOVE

U+08A8 ARABIC LETTER YEH WITH TWO DOTS BELOW AND HAMZA ABOVE

Adamawa Fulfulde

(palatal implosive)

Unlike U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE ئ, which loses the two dots when combined with hamza, this character retains the two dots in all forms.

Note that this character does not decompose. It is treated as a separate letter.

Joining forms: ࢨࢨࢨ

U+08A9 ARABIC LETTER YEH WITH TWO DOTS BELOW AND DOT ABOVE

Extended vowel signs for African languages

U+08F4 ARABIC FATHA WITH RING

U+08F5 ARABIC FATHA WITH DOT ABOVE

U+08F6 ARABIC KASRA WITH DOT BELOW

Description in the Unicode standard:

• also used in Philippine languages

U+08F7 ARABIC LEFT ARROWHEAD ABOVE

U+08F8 ARABIC RIGHT ARROWHEAD ABOVE

U+08F9 ARABIC LEFT ARROWHEAD BELOW

U+08FA ARABIC RIGHT ARROWHEAD BELOW

U+08FB ARABIC DOUBLE RIGHT ARROWHEAD ABOVE

U+08FC ARABIC DOUBLE RIGHT ARROWHEAD ABOVE WITH DOT

U+08FD ARABIC RIGHT ARROWHEAD ABOVE WITH DOT

Dependent consonants for Rohingya

U+08AA ARABIC LETTER REH WITH LOOP

Description in the Unicode standard:

= bottya-reh

U+08AB ARABIC LETTER WAW WITH DOT WITHIN

Description in the Unicode standard:

= nota-wa

U+08AC ARABIC LETTER ROHINGYA YEH

Description in the Unicode standard:

= bottya-yeh

Extended vowel signs for Rohingya

U+08E4 ARABIC CURLY FATHA

U+08E5 ARABIC CURLY DAMMA

U+08E6 ARABIC CURLY KASRA

U+08E7 ARABIC CURLY FATHATAN

U+08E8 ARABIC CURLY DAMMATAN

U+08E9 ARABIC CURLY KASRATAN

Tone marks for Rohingya

U+08EA ARABIC TONE ONE DOT ABOVE

U+08EB ARABIC TONE TWO DOTS ABOVE

U+08EC ARABIC TONE LOOP ABOVE

U+08ED ARABIC TONE ONE DOT BELOW

U+08EE ARABIC TONE TWO DOTS BELOW

U+08EF ARABIC TONE LOOP BELOW

Signs for Sindhi

۽

U+06FD ARABIC SIGN SINDHI AMPERSAND

۾

U+06FE ARABIC SIGN SINDHI POSTPOSITION MEN

Additions for Khowar

ݮ

U+076E ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH BELOW

ݯ

U+076F ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH AND TWO DOTS

ݰ

U+0770 ARABIC LETTER SEEN WITH SMALL ARABIC LETTER TAH AND TWO DOTS

ݱ

U+0771 ARABIC LETTER REH WITH SMALL ARABIC LETTER TAH AND TWO DOTS

Addition for Torwali

ݲ

U+0772 ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH ABOVE

Additions for Burushaski

ݳ

U+0773 ARABIC LETTER ALEF WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE

ݴ

U+0774 ARABIC LETTER ALEF WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE

ݵ

U+0775 ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE

ݶ

U+0776 ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE

ݷ

U+0777 ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW

ݸ

U+0778 ARABIC LETTER WAW WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE

ݹ

U+0779 ARABIC LETTER WAW WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE

ݺ

U+077A ARABIC LETTER YEH BARREE WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE

ݻ

U+077B ARABIC LETTER YEH BARREE WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE

ݼ

U+077C ARABIC LETTER HAH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW

ݽ

U+077D ARABIC LETTER SEEN WITH EXTENDED ARABIC-INDIC DIGIT FOUR ABOVE

Arabic letters for European and Central Asian languages

ࢭ

U+08AD ARABIC LETTER LOW ALEF

Description in the Unicode standard:

• Bashkir, Tatar
ࢮ

U+08AE ARABIC LETTER DAL WITH THREE DOTS BELOW

Description in the Unicode standard:

• Belarusian
ࢯ

U+08AF ARABIC LETTER SAD WITH THREE DOTS BELOW

Description in the Unicode standard:

• Belarusian
ࢰ

U+08B0 ARABIC LETTER GAF WITH INVERTED STROKE

Description in the Unicode standard:

• Crimean Tatar, Chechen, Lak
ࢱ

U+08B1 ARABIC LETTER STRAIGHT WAW

Description in the Unicode standard:

• Tatar

Arabic letter for Berber

ࢲ

U+08B2 ARABIC LETTER ZAIN WITH INVERTED V ABOVE

Archaic letters

ٮ

U+066E ARABIC LETTER DOTLESS BEH

ٯ

U+066F ARABIC LETTER DOTLESS QAF

Punctuation

؉

U+0609 ARABIC-INDIC PER MILLE SIGN

Description in the Unicode standard:

→ (per mille sign - 2030)
؊

U+060A ARABIC-INDIC PER TEN THOUSAND SIGN

Description in the Unicode standard:

→ (per ten thousand sign - 2031)
؛

U+061B ARABIC SEMICOLON

Description in the Unicode standard:

• also used with Thaana and Syriac in modern text
→ (semicolon - 003B)
→ (turned semicolon - 2E35)

Urdu punctuation

؜

U+061C ARABIC LETTER MARK

؞

U+061E ARABIC TRIPLE DOT PUNCTUATION MARK

؟

U+061F ARABIC QUESTION MARK

Description in the Unicode standard:

• also used with Thaana and Syriac in modern text
→ (question mark - 003F)
→ (reversed question mark - 2E2E)

Arabic Called .

Looks like ؟

Urdu punctuation

٪

U+066A ARABIC PERCENT SIGN

Description in the Unicode standard:

→ (percent sign - 0025)

Urdu punctuation

٫

U+066B ARABIC DECIMAL SEPARATOR

Arabic Called .

Looks like ٫

Urdu punctuation, ašāriya əʃɑːrɪjɑ.

In Urdu this looks like a hamza ٫, eg. ۲۵۲۴٫۲۳ do hazɑːr pɑ̃ːʧ sau caubiːs aʃɑːrɪjɑː do tiːn (2524.23).
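
As a rough illustration of how the separator behaves in data, the following sketch parses a number such as the example above into a float. The function name parse_number is hypothetical, and the handling of U+066C ARABIC THOUSANDS SEPARATOR as a grouping character to be dropped is an assumption about typical usage rather than something stated on this page.

```python
import unicodedata

ARABIC_DECIMAL_SEPARATOR = "\u066B"
ARABIC_THOUSANDS_SEPARATOR = "\u066C"


def parse_number(text):
    """Convert a number written with Arabic-script digits to a float.

    Hypothetical helper: accepts decimal digits from any script,
    U+066B as the decimal point, and U+066C as a grouping mark.
    """
    ascii_chars = []
    for ch in text:
        if ch == ARABIC_DECIMAL_SEPARATOR:
            ascii_chars.append(".")
        elif ch == ARABIC_THOUSANDS_SEPARATOR:
            continue  # assumed to be grouping only, so it is dropped
        else:
            # unicodedata.decimal raises ValueError for non-digit characters
            ascii_chars.append(str(unicodedata.decimal(ch)))
    return float("".join(ascii_chars))


# The example from the text, written with escapes: digits 2524, U+066B, digits 23
print(parse_number("\u06F2\u06F5\u06F2\u06F4\u066B\u06F2\u06F3"))  # 2524.23
```

Digits are mapped individually with unicodedata.decimal so that the sketch does not depend on how any particular number parser treats non-ASCII digits.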

٬

U+066C ARABIC THOUSANDS SEPARATOR

Description in the Unicode standard:

→ (apostrophe - 0027)
→ (right single quotation mark - 2019)
٭

U+066D ARABIC FIVE POINTED STAR

Description in the Unicode standard:

• appearance rather variable
→ (asterisk - 002A)
۔

U+06D4 ARABIC FULL STOP

Description in the Unicode standard:

• Urdu

Arabic Called .

Looks like ۔

Urdu punctuation

Currency sign

؋

U+060B AFGHANI SIGN

Subtending marks

؀

U+0600 ARABIC NUMBER SIGN

Urdu symbol

Used to indicate the beginning of a number, eg. ۱۲۳؀.

The stroke may be elongated and pass under the number, but this is not a combining character.

؁

U+0601 ARABIC SIGN SANAH

Urdu symbol, sanh sənh.

Gregorian dates are indicated by placing this long sweep below the year digits, followed by the word عیسوی iːsviː (Christian era), which is usually abbreviated to a hamza ء.

Dates using the Muslim calendar are followed by the word ہجری hɪʤriː, which is abbreviated with the symbol ھ.

The sanh sign is typed before the digits (in a rtl context): eg. ۲۰۰۴؁ء ‎(2004). It is not a combining character, even though it displays beneath the digits.

The sanh is derived from the Arabic word for year سنة.
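
The typing order described above can be sketched as follows. This is only an illustration: the function name gregorian_year is hypothetical, and the sequence (sign, then digits, then hamza) simply follows the description in this entry. The last line checks, using Python's unicodedata module, that the sign is a format character with combining class 0 rather than a combining mark.

```python
import unicodedata

SANAH = "\u0601"  # ARABIC SIGN SANAH
HAMZA = "\u0621"  # ARABIC LETTER HAMZA, used here as the era abbreviation


def gregorian_year(digits):
    """Hypothetical helper: sign first, then the year digits, then hamza."""
    return SANAH + digits + HAMZA


year = gregorian_year("\u06F2\u06F0\u06F0\u06F4")  # digits for 2004

# List the code points in logical (typed) order
print([f"U+{ord(ch):04X}" for ch in year])

# General category and canonical combining class of the sign
print(unicodedata.category(SANAH), unicodedata.combining(SANAH))  # Cf 0
```

Because the sign is not a combining character, the rendering beneath the digits is the job of the font and shaping engine, not of the character order alone.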

؂

U+0602 ARABIC FOOTNOTE MARKER

Urdu symbol

Used to indicate that a number is a footnote, eg. ۵؎.

The number usually sits above the symbol, although this is not a combining character. I can't figure out whether it needs to be typed before or after the number, though I think before. None of the fonts I have put the number above it.

Do not confuse this with 060E ARABIC POETIC VERSE SIGN.
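
Since the two signs look alike in many fonts, one way to check which is actually present in a piece of text is to list the code points and names. The sketch below does this with Python's unicodedata module; the function name describe is hypothetical.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' string per character in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


# The two look-alike signs, written as escapes to avoid any ambiguity
for line in describe("\u0602\u060E"):
    print(line)
# U+0602 ARABIC FOOTNOTE MARKER
# U+060E ARABIC POETIC VERSE SIGN
```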

؃

U+0603 ARABIC SIGN SAFHA

Urdu symbol, safah səfəh

Used to indicate a page number, where English would use an abbreviation such as "pp. 35-45", eg. ؃۴۵. The stroke may be elongated and pass under the number.

The symbol is derived from the stroke used for 0635: ARABIC LETTER SAD.

؄

U+0604 ARABIC SIGN SAMVAT

Description in the Unicode standard:

• used for writing Samvat era dates in Urdu

Radix Symbols

؆

U+0606 ARABIC-INDIC CUBE ROOT

Description in the Unicode standard:

→ (cube root - 221B)
؇

U+0607 ARABIC-INDIC FOURTH ROOT

Description in the Unicode standard:

→ (fourth root - 221C)

Letterlike symbol

؈

U+0608 ARABIC RAY

Poetic marks

؎

U+060E ARABIC POETIC VERSE SIGN

Urdu Often used to mark the beginning of poetic verse. For an example see Figure 8 in Jonathan Kew's examples.

Do not confuse this with 0602 ARABIC FOOTNOTE MARKER.

؏

U+060F ARABIC SIGN MISRA

Urdu symbol misra misrə

Urdu poems are typically built from couplets. This symbol indicates a single line (misra) of a couplet (shayr) from an Urdu poem when it is quoted in text.

The sign is placed at the beginning of the quoted line, and is followed by the line of verse. See an example.

Honorifics

ؐ

U+0610 ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM

Description in the Unicode standard:

• represents sallallahu alayhe wasallam 'may God's peace and blessings be upon him'

Urdu Represents sallallahu alayhe wasallam sallallao alae va sallam (may God's peace and blessings be upon him) صلّى الله عليه وسلّم. Used over the name of Mohammed.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. محمّدؐ muhamːed sallallao alae va sallam.

ؑ

U+0611 ARABIC SIGN ALAYHE ASSALLAM

Description in the Unicode standard:

• represents alayhe assalam 'upon him be peace'

Urdu Represents alayhe asallam alejsallam (upon him be peace) عليه السّلام. Used over the name of prophets other than Mohammed.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. عیسؑیٰ isaː salejsallam Christ, upon him be peace!.

ؒ

U+0612 ARABIC SIGN RAHMATULLAH ALAYHE

Description in the Unicode standard:

• represents rahmatullah alayhe 'may God have mercy upon him'

Urdu Represents rahmatulla alayhe raːmatʊlla alee (may God have mercy upon him) رحمت الله عليه. Used over the names of saints, religious authorities, and other deceased pious persons.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. قاضی نور محمّدؒ kaziː nur mamed rahmatulla alayhe Qazi Nur Muhammad, may God have mercy upon him!.

ؓ

U+0613 ARABIC SIGN RADI ALLAHOU ANHU

Description in the Unicode standard:

• represents radi allahu 'anhu 'may God be pleased with him'

Urdu Represents radi allahu 'anhu raziallaːo ano (may God be pleased with him) رضي الله عنه. Used over the names of the Companions of the Prophet.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. ابوبکرؓ abu bakr, raziallaːo ano Abu Bakr, may God be pleased with him!.

ؔ

U+0614 ARABIC SIGN TAKHALLUS

Description in the Unicode standard:

• sign placed over the name or nom-de-plume of a poet, or in some writings used to mark all proper names

Urdu Sign placed over the name or nom-de-plume of a poet, or in some writings used to mark all proper names.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. عطاشادؔ ataː ʃaːd Ata Shad (author's name). There seems to be a problem displaying this with Nafees fonts.
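
Because the position of these marks within a word is left to the user, text comparison or search may need to ignore them. The following is a minimal sketch, assuming that simply removing U+0610 to U+0614 is acceptable for the task at hand; the names HONORIFIC_MARKS and strip_honorifics are hypothetical.

```python
import unicodedata

# U+0610 .. U+0614: the five signs described in this section
HONORIFIC_MARKS = {chr(cp) for cp in range(0x0610, 0x0615)}


def strip_honorifics(text):
    """Return text with the honorific and takhallus signs removed."""
    return "".join(ch for ch in text if ch not in HONORIFIC_MARKS)


# MEEM, HAH, MEEM, SHADDA, DAL followed by U+0610
name = "\u0645\u062D\u0645\u0651\u062F\u0610"
plain = strip_honorifics(name)
print([f"U+{ord(ch):04X}" for ch in plain])

# These signs are nonspacing marks, so they attach to the preceding character
print(unicodedata.category("\u0610"))  # Mn
```

Other combining marks, such as the shadda in the example, are left untouched.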

Koranic annotation signs

ؗ

U+0617 ARABIC SMALL HIGH ZAIN

ؘ

U+0618 ARABIC SMALL FATHA

Description in the Unicode standard:

• should not be confused with 064E FATHA
ؙ

U+0619 ARABIC SMALL DAMMA

Description in the Unicode standard:

• should not be confused with 064F DAMMA
ؚ

U+061A ARABIC SMALL KASRA

Description in the Unicode standard:

• should not be confused with 0650 KASRA
ۖ

U+06D6 ARABIC SMALL HIGH LIGATURE SAD WITH LAM WITH ALEF MAKSURA

ۗ

U+06D7 ARABIC SMALL HIGH LIGATURE QAF WITH LAM WITH ALEF MAKSURA

ۘ

U+06D8 ARABIC SMALL HIGH MEEM INITIAL FORM

ۙ

U+06D9 ARABIC SMALL HIGH LAM ALEF

ۚ

U+06DA ARABIC SMALL HIGH JEEM

ۛ

U+06DB ARABIC SMALL HIGH THREE DOTS

ۜ

U+06DC ARABIC SMALL HIGH SEEN

۝

U+06DD ARABIC END OF AYAH

۞

U+06DE ARABIC START OF RUB EL HIZB

۟

U+06DF ARABIC SMALL HIGH ROUNDED ZERO

Description in the Unicode standard:

• smaller than the typical circular shape used for 0652
۠

U+06E0 ARABIC SMALL HIGH UPRIGHT RECTANGULAR ZERO

ۡ

U+06E1 ARABIC SMALL HIGH DOTLESS HEAD OF KHAH

Description in the Unicode standard:

= Arabic jazm
• presentation form of 0652, using font technology to select the variant is preferred
• used in some Korans to mark absence of a vowel
→ (arabic sukun - 0652)
ۢ

U+06E2 ARABIC SMALL HIGH MEEM ISOLATED FORM

ۣ

U+06E3 ARABIC SMALL LOW SEEN

ۤ

U+06E4 ARABIC SMALL HIGH MADDA

ۥ

U+06E5 ARABIC SMALL WAW

ۦ

U+06E6 ARABIC SMALL YEH

ۧ

U+06E7 ARABIC SMALL HIGH YEH

ۨ

U+06E8 ARABIC SMALL HIGH NOON

۩

U+06E9 ARABIC PLACE OF SAJDAH

Description in the Unicode standard:

• there is a range of acceptable glyphs for this character
۪

U+06EA ARABIC EMPTY CENTRE LOW STOP

۫

U+06EB ARABIC EMPTY CENTRE HIGH STOP

۬

U+06EC ARABIC ROUNDED HIGH STOP WITH FILLED CENTRE

ۭ

U+06ED ARABIC SMALL LOW MEEM

U+08F0 ARABIC OPEN FATHATAN

Description in the Unicode standard:

= successive fathatan

U+08F1 ARABIC OPEN DAMMATAN

Description in the Unicode standard:

= successive dammatan

U+08F2 ARABIC OPEN KASRATAN

Description in the Unicode standard:

= successive kasratan

U+08F3 ARABIC SMALL HIGH WAW

Deprecated letter

ٳ

U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW

Description in the Unicode standard:

• Kashmiri
• this character is deprecated and its use is strongly discouraged
• use the sequence 0627 065F instead
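
The substitution recommended above can be applied to existing text with a plain string replacement. This is a minimal sketch; the function name replace_deprecated is hypothetical.

```python
DEPRECATED = "\u0673"         # ARABIC LETTER ALEF WITH WAVY HAMZA BELOW
REPLACEMENT = "\u0627\u065F"  # ALEF followed by ARABIC WAVY HAMZA BELOW


def replace_deprecated(text):
    """Replace each U+0673 with the recommended sequence 0627 065F."""
    return text.replace(DEPRECATED, REPLACEMENT)


converted = replace_deprecated("\u0673\u0628")
print([f"U+{ord(ch):04X}" for ch in converted])
# ['U+0627', 'U+065F', 'U+0628']
```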

Arabic-Indic digits

٠

U+0660 ARABIC-INDIC DIGIT ZERO

١

U+0661 ARABIC-INDIC DIGIT ONE

٢

U+0662 ARABIC-INDIC DIGIT TWO

٣

U+0663 ARABIC-INDIC DIGIT THREE

٤

U+0664 ARABIC-INDIC DIGIT FOUR

٥

U+0665 ARABIC-INDIC DIGIT FIVE

٦

U+0666 ARABIC-INDIC DIGIT SIX

٧

U+0667 ARABIC-INDIC DIGIT SEVEN

٨

U+0668 ARABIC-INDIC DIGIT EIGHT

٩

U+0669 ARABIC-INDIC DIGIT NINE

Eastern Arabic-Indic digits

۰

U+06F0 EXTENDED ARABIC-INDIC DIGIT ZERO

Urdu digit, sifr sɪfr.

Looks like: ۰

۱

U+06F1 EXTENDED ARABIC-INDIC DIGIT ONE

Urdu digit, ek ek

Looks like: ۱

۲

U+06F2 EXTENDED ARABIC-INDIC DIGIT TWO

Urdu digit, do do

Looks like: ۲

۳

U+06F3 EXTENDED ARABIC-INDIC DIGIT THREE

Urdu digit, tīn tiːn

Looks like: ۳

۴

U+06F4 EXTENDED ARABIC-INDIC DIGIT FOUR

Description in the Unicode standard:

• Persian has a different glyph than Sindhi and Urdu

Urdu digit, cār ʧɑːr

Shape is different from Persian and Arabic. Looks like: ۴

۵

U+06F5 EXTENDED ARABIC-INDIC DIGIT FIVE

Description in the Unicode standard:

• Persian, Sindhi, and Urdu share glyph different from Arabic

Urdu digit, pāṅc pɑ̃ːʧ

Shape is different from Arabic. Looks like: ۵

۶

U+06F6 EXTENDED ARABIC-INDIC DIGIT SIX

Description in the Unicode standard:

• Persian, Sindhi, and Urdu have glyphs different from Arabic

Urdu digit, che ʧʰe

Shape is different from Arabic. Looks like: ۶

۷

U+06F7 EXTENDED ARABIC-INDIC DIGIT SEVEN

Description in the Unicode standard:

• Urdu and Sindhi have glyphs different from Arabic

Urdu digit, sāt sɑːt

Shape is different from Arabic. Looks like: ۷

۸

U+06F8 EXTENDED ARABIC-INDIC DIGIT EIGHT

Urdu digit, āṭh ɑːʈʰ

Looks like: ۸

۹

U+06F9 EXTENDED ARABIC-INDIC DIGIT NINE

Urdu digit, nau nəʊ.

Looks like: ۹
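
The two digit sets above occupy parallel ranges (U+0660 to U+0669 and U+06F0 to U+06F9), so converting between them, or to ASCII digits, is a simple character mapping. The sketch below uses Python's str.translate; the table names are hypothetical. Note that this changes code points only. The glyph differences mentioned for four, five, six and seven between Persian, Sindhi and Urdu are not encoded in the text, so they remain a matter for the font.

```python
ASCII_DIGITS = "0123456789"
ARABIC_INDIC = "".join(chr(0x0660 + i) for i in range(10))
EXTENDED_ARABIC_INDIC = "".join(chr(0x06F0 + i) for i in range(10))

# ASCII or Arabic-Indic digits -> Extended Arabic-Indic digits
TO_EXTENDED = str.maketrans(
    ASCII_DIGITS + ARABIC_INDIC, EXTENDED_ARABIC_INDIC * 2
)

# Either Arabic-script digit set -> ASCII digits
TO_ASCII = str.maketrans(
    ARABIC_INDIC + EXTENDED_ARABIC_INDIC, ASCII_DIGITS * 2
)

print("2004".translate(TO_EXTENDED))                        # ۲۰۰۴
print(int("\u06F2\u06F0\u06F0\u06F4".translate(TO_ASCII)))  # 2004
```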

Author: Richard Ishida.

Content first published 3 February, 2014. This version 2014-10-13 6:47 GMT