High Surrogates

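This Unicode block contains no glyphs: its code points are reserved for the leading half of UTF-16 surrogate pairs and do not represent characters on their own. The High Surrogates block first appeared in Unicode 2.0, in 1996.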
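
A minimal Python sketch of why the block holds no glyphs: a high surrogate only has meaning when paired with a low surrogate to encode a supplementary-plane code point in UTF-16. The function name `to_surrogate_pair` and the constant `HIGH_SURROGATES` are hypothetical, introduced here for illustration; the block range U+D800..U+DB7F and the arithmetic follow the standard UTF-16 encoding form.

```python
from typing import Tuple

# Code points of the High Surrogates block, U+D800..U+DB7F (assumed range
# for illustration; U+DB80..U+DBFF belong to High Private Use Surrogates).
HIGH_SURROGATES = range(0xD800, 0xDB80)


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use surrogate pairs")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits select the low surrogate
    return high, low


# U+10840 is the first code point of the Imperial Aramaic block.
high, low = to_surrogate_pair(0x10840)
print(f"U+{high:04X} U+{low:04X}")
```

Neither half of the pair is a character in its own right, which is why the block has nothing to display.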

Unicode blocks High Surrogates
Alternate names
Timeframe 1996 to present
Regions
Type
Direction
Status
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 535 (Section 16.6).
Secondary sources
Proposal

IPA Extensions

IPA Extensions include the symbols of the International Phonetic Alphabet, which is a standardized system for representing speech sounds. The IPA first appeared in 1886 and has undergone several revisions of content and usage since that time. The Unicode Standard covers all single symbols and diacritics in the last published IPA revision (1999) as well as a few earlier IPA symbols.

Unicode blocks IPA Extensions
Alternate names
Timeframe 1886 to present
Regions
Type alphabet
Direction left to right
Status living
Number of speakers
Languages
Main sources MacMahon, M. 1996. “Phonetic Notation” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 821-846.
Secondary sources
Proposal

Ideographic Description Characters

The Ideographic Description Characters are a set of 12 characters used to describe unencoded ideographs. An unencoded ideograph can be described by combining these characters with encoded ideographs, so that the reader can form a mental picture of the ideograph from the description. They were first introduced in Unicode 3.0 (1999-2000).
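
As a rough illustration of such a description, the Python sketch below takes a short sequence and separates the description characters from the ideographs they arrange. The range U+2FF0..U+2FFB is assumed to be the original set of 12 characters (later Unicode versions add more), the helper names are hypothetical, and the sample sequence describes an ideograph that is in fact already encoded, chosen only so the result is easy to check.

```python
import unicodedata

# Assumed range of the 12 original Ideographic Description Characters.
IDC_FIRST, IDC_LAST = 0x2FF0, 0x2FFB


def is_idc(ch: str) -> bool:
    """Return True if ch is one of the 12 description characters."""
    return IDC_FIRST <= ord(ch) <= IDC_LAST


def split_description(sequence: str):
    """Separate a description sequence into its operators and components."""
    operators = [ch for ch in sequence if is_idc(ch)]
    components = [ch for ch in sequence if not is_idc(ch)]
    return operators, components


# U+2FF0 (left to right) followed by U+6C35 and U+6BCF: "water" on the
# left, "every" on the right, which together describe U+6D77 ("sea").
sequence = "\u2FF0\u6C35\u6BCF"
operators, components = split_description(sequence)
for op in operators:
    print(unicodedata.name(op))
print(components)
```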

Unicode blocks Ideographic Description Characters
Alternate names
Timeframe 20C to present
Regions
Type
Direction
Status
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 409-412 (Section 12.2).
Secondary sources
Proposal

Imperial Aramaic

The Imperial Aramaic script was used to write the Aramaic language from the middle of the 8C BCE. It became widely used when Aramaic became the principal administrative language of the Assyrian empire and then the official language of the Achaemenid Persian empire. Imperial Aramaic evolved from Phoenician and was the source of many other scripts, such as the square Hebrew script, the Arabic script, and scripts used for the Middle Persian languages (such as Inscriptional Parthian, Inscriptional Pahlavi, and Avestan).

Unicode blocks Imperial Aramaic
Alternate names
Timeframe 8C BCE to 4C BCE
Regions Middle Eastern
Type abjad
Direction right to left
Status historical
Number of speakers 0
Languages Aramaic
Main sources O’Connor, M. 1996. “Epigraphic Semitic scripts” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 88-107.
Secondary sources Skjaervo, P.O. 1996. "Aramaic Scripts for Iranian Languages" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 515-535.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3339.pdf

Inscriptional Pahlavi

Inscriptional Pahlavi is a historical script that was used to write a number of Iranian and Indo-European languages, chiefly Parthian and Middle Persian, in present-day Iran and surrounding areas. It descended from the Imperial Aramaic script and was used regularly as a monumental script until the 5C CE.

Unicode blocks Inscriptional Pahlavi
Alternate names
Timeframe 2C to 5C
Regions Middle Eastern
Type abjad
Direction right to left
Status historical
Number of speakers 0
Languages Parthian, Middle Persian
Main sources Skjaervo, P.O. 1996. "Aramaic Scripts for Iranian Languages" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 515-535.
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3286.pdf

Inscriptional Parthian

Inscriptional Parthian is a historical script that was used to write a number of Iranian and Indo-European languages, chiefly Parthian and Middle Persian, in present-day Iran and surrounding areas. It derives from the Imperial Aramaic script. By the 2C CE the script was used as an official script of the Sassanid Empire, alongside Inscriptional Pahlavi, which was used to write the Sassanians’ own language. Inscriptional Parthian continued to be used into the 3C CE; the last known Inscriptional Parthian inscription dates to 292 CE.

Unicode blocks Inscriptional Parthian
Alternate names
Timeframe 2C to 3C
Regions Middle Eastern
Type abjad
Direction right to left
Status historical
Number of speakers 0
Languages Parthian, Middle Persian
Main sources Skjaervo, P.O. 1996. "Aramaic Scripts for Iranian Languages" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 515-535.
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3286.pdf

Javanese

The Javanese script is used to write the Javanese language, as well as Sanskrit, Jawa Kuna (Sanskritized Javanese), Kawi transcriptions, and the modern languages Sundanese and Sasak. The script descends from the ancient Brahmi script. Although the Javanese script has largely been supplanted by the Latin alphabet, it is still used in ceremonial domains. Traditional Javanese texts appear, for example, on palm leaves, which are bound together in books called "lontar".

Unicode blocks Javanese
Alternate names
Timeframe 17C to present
Regions Southeast Asian
Type abugida
Direction left to right
Status living
Number of speakers 114 million
Languages Javanese, Sanskrit, Jawa Kuna, Sundanese, Sasak
Main sources Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484.
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3319.pdf

Kaithi

Kaithi is a Brahmi-derived script used to write Maithili and other languages of northern India. It has been used in administrative documents since the 16C, as well as in routine writing, commercial transactions, and religious and literary manuscripts. Kaithi is still used to a limited extent today, having been largely replaced by Devanāgarī in the early 20C.

Unicode blocks Kaithi
Alternate names Kaithinagari, Kayathi
Timeframe 16C to present
Regions South Asian
Type abugida
Direction left to right
Status living
Number of speakers 185 million (potential users)
Languages Awadhi, Bhojpuri, Magahi, Maithili, Urdu, and other Hindi-related languages
Main sources Grierson, George A. 1899. A Handbook to the Kaithi Character. 2nd rev. ed. of the title A Kaithi Handbook, 1881. Calcutta: Thacker, Spink & Co.
Secondary sources Grierson, George A. 1903. The Linguistic Survey of India. Vol. V. Indo-Aryan Family. Eastern Group. Part II. Specimens of the Bihari and Oriya languages. Calcutta: Office of the Superintendent of Government Printing, India.
Proposal http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3389.pdf

Kana Supplement

The Kana Supplement block is made up of historic and variant forms of Japanese kana characters, including those variants that are referred to as hentaigana in Japanese.

Unicode blocks Kana Supplement
Alternate names
Timeframe
Regions East Asian
Type syllabary
Direction variable
Status historical
Number of speakers 0
Languages Japanese
Main sources Okumura, T., and T. Ooya. 1977. Kogen'e eben: Kogen'e ebenshōho. Tōkyō: Benseisha.
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3388.pdf

Kanbun

The Kanbun block is composed of symbols used in Japanese texts to indicate the Japanese reading order of classical Chinese texts. These marks are widely used in literature, and are typically written in an annotation style to the left of each line of vertically rendered Chinese text.

Unicode blocks Kanbun
Alternate names
Timeframe
Regions East Asian
Type ideographic
Direction variable
Status historical
Number of speakers 122 million
Languages Japanese
Main sources Japanese Industrial Standards Committee. 2004. Nihongo Bunsho no Kumihan Houhou (Formatting rules for Japanese documents). Tokyo: Japanese Standards Association. (=JIS X 4051:2004).
Secondary sources
Proposal

Kangxi Radicals

Kangxi Radicals are East Asian ideographs or fragments of ideographs that are used to index dictionaries and word lists, and that serve as the basis for creating new ideographs. The set of 214 radicals in the Kangxi Radicals block derives from the 18C Kangxi dictionary, which provides a universally recognized set of radicals. (The CJK Radicals Supplement block contains variants of these radicals.) The Chinese standard CNS 11643-1992 includes 212 of the 214 radicals.
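
The count of 214 can be checked against the block itself. The Python sketch below assumes the block spans U+2F00..U+2FD5 and uses the standard `unicodedata` module; the helper name `unified_form` is hypothetical. It also shows that compatibility normalization (NFKC) maps a radical to the ordinary ideograph of the same shape, which matters when searching or indexing text that mixes the two.

```python
import unicodedata

# Assumed extent of the Kangxi Radicals block.
KANGXI_FIRST, KANGXI_LAST = 0x2F00, 0x2FD5

radicals = [chr(cp) for cp in range(KANGXI_FIRST, KANGXI_LAST + 1)]


def unified_form(radical: str) -> str:
    """Map a Kangxi radical to its compatibility-equivalent ideograph."""
    return unicodedata.normalize("NFKC", radical)


print(len(radicals))                       # number of radicals in the block
print(unicodedata.name(radicals[0]))       # name of the first radical
print(f"U+{ord(unified_form(radicals[0])):04X}")
```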

Unicode blocks Kangxi Radicals
Alternate names
Timeframe 18C to present
Regions East Asian
Type ideographic
Direction variable
Status living
Number of speakers
Languages
Main sources Zhongwen biaozhun jiaohuanma (Chinese standard interchange code). Taipei: 1992. (=CNS 11643-1992)
Secondary sources
Proposal

Kannada

The Kannada script is used to write the Kannada (or Kanarese) language of the Karnataka state in India and is also used to write minority languages such as Tulu. Kannada is a South Indian script that is very closely related to the Telugu script and shares many features common to other Indic scripts. The Kannada language is also spoken in many parts of Tamil Nadu, Kerala, Andhra Pradesh, and Maharashtra. The script dates to at least 1500.

Unicode blocks Kannada
Alternate names
Timeframe 1500 to present
Regions South Asian
Type abugida
Direction left to right
Status living
Number of speakers 35 million
Languages Kannada, Tulu
Main sources Bright, W. 1996. "Kannada and Telugu Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 413-419.
Secondary sources
Proposal

Katakana

Katakana is a syllabary used to write non-Japanese (usually Western) words phonetically in Japanese. It is also used to give Japanese words visual emphasis. Katakana syllables are phonetically equivalent to the corresponding Hiragana syllables. The script was developed in the early Heian Period (794-1185).
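
The equivalence between the two syllabaries is reflected in their Unicode layout: for the basic syllables, each Katakana code point sits a fixed distance above its Hiragana counterpart. The Python sketch below relies on that layout; the function name and constants are hypothetical, and the mapping is assumed to hold only for U+30A1..U+30F6, so other characters (such as the long vowel mark) are passed through unchanged.

```python
# Distance between a Katakana syllable and the matching Hiragana syllable.
KATAKANA_TO_HIRAGANA_OFFSET = 0x60

# Assumed range of Katakana syllables that have a direct Hiragana match.
KATAKANA_FIRST, KATAKANA_LAST = 0x30A1, 0x30F6


def katakana_to_hiragana(text: str) -> str:
    """Replace each basic Katakana syllable with its Hiragana equivalent."""
    converted = []
    for ch in text:
        cp = ord(ch)
        if KATAKANA_FIRST <= cp <= KATAKANA_LAST:
            converted.append(chr(cp - KATAKANA_TO_HIRAGANA_OFFSET))
        else:
            converted.append(ch)  # leave everything else untouched
    return "".join(converted)


# The word "katakana" written in Katakana, converted to Hiragana.
print(katakana_to_hiragana("\u30AB\u30BF\u30AB\u30CA"))
```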

Unicode blocks Katakana, Katakana Phonetic Extensions
Alternate names
Timeframe 8C to present
Regions East Asian
Type syllabary
Direction variable
Status living
Number of speakers 122 million
Languages Japanese
Main sources Japanese Industrial Standards Committee. 1997. 7 bitto oyobi 8 bitto no 2 baito jouhou koukan you fugouka kanji shuugou (7-bit and 8-bit double byte coded kanji sets for information interchange). Tokyo: Japanese Standards Association. (=JIS X 0208)
Secondary sources
Proposal

Kayah Li

The Kayah Li alphabet was devised by Htae Bu Phae in 1962 to write the Eastern and Western Kayah Li languages in the Kayah and Karen states of Myanmar. The script is also taught in schools in refugee camps in Thailand. The Kayah Li or Kayah language is a member of the Karen branch of the Sino-Tibetan language family.

Unicode blocks Kayah Li
Alternate names
Timeframe 1962 to present
Regions Southeast Asian
Type alphabet
Direction left to right
Status living
Number of speakers 570,000
Languages Eastern Kayah Li, Western Kayah Li
Main sources Bennett, J. Fraser. 1993. Kayah Li Script: A Brief Description. Urbana-Champaign: University of Illinois.
Secondary sources Solnit, David B. 1997. Eastern Kayah Li: Grammar, Texts, Glossary. Honolulu: University of Hawai‘i Press.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3038.pdf

Khmer

Khmer, also called aksaa khmae (“Khmer letters”), is the script of the Khmer language, also known as Cambodian. The Khmer script is also used to write a number of regional minority languages, such as Tampuan, Krung, and Cham. It is the official script of Cambodia and is descended from the Brahmi script of South India.

Unicode blocks Khmer, Khmer Symbols
Alternate names aksaa khmae
Timeframe 6C to present
Regions Southeast Asian
Type abugida
Direction left to right
Status living
Number of speakers 13.9 million
Languages Khmer, Tampuan, Krung, Cham
Main sources Schiller, E. 1996. "Khmer Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 467-473.
Secondary sources
Proposal

Lao

The Lao script (Aksone Lao) is used to write the Lao language and other minority languages in Laos. Both the language and script are closely related to Thai. The Lao script ultimately derives from Brahmi.

Unicode blocks Lao
Alternate names Aksone Lao
Timeframe 16C? to present
Regions Southeast Asian
Type abugida
Direction left to right
Status living
Number of speakers 3 million
Languages Lao
Main sources Diller, A. 1996. "Thai and Lao Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 457-466.
Secondary sources
Proposal

Latin

The Latin script (or Roman alphabet) is used to write a wide variety of languages and is one of the most widely used alphabetic writing systems in the world today. It was derived from a form of the Western Greek alphabet from Euboea, which was borrowed and modified by the Etruscans, and then further modified by the Romans to write the Latin language. The earliest documents date to the 7C BCE. In the process of adapting the Latin script to other languages, numerous extensions have been devised, and these appear in the various Latin Extended blocks. Some of the characters in the Latin Extended blocks come from earlier character sets.

Unicode blocks Latin Extended Additional, Latin Extended-A, Latin Extended-B, Latin Extended-C, Latin Extended-D, Latin-1 Supplement
Alternate names Roman alphabet
Timeframe 7C BCE to present
Regions European
Type alphabet
Direction left to right
Status living
Number of speakers 2.1 billion
Languages Latin and a wide variety of other languages
Main sources Tuttle, E., W. Senner, et al. 1996. "Adaptations of the Roman Alphabet" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 633-699.
Secondary sources
Proposal

Lepcha

The Lepcha (Róng) script is used to write the Lepcha language in the Indian state of Sikkim and in the Kalimpong area of West Bengal, as well as in Nepal and Bhutan. Lepcha is based on Tibetan writing with some influence from the Burmese script. Some believe it was inspired by Buddhist missionaries and invented by the Lepcha scholar Thikúng Men Salóng in 1720. Today the Lepcha script is used in newspapers, magazines, textbooks, and collections of poetry, prose, and plays.

Unicode blocks Lepcha
Alternate names Róng
Timeframe 1720 to present
Regions South Asian
Type abugida
Direction left to right
Status living
Number of speakers 65,000
Languages Lepcha
Main sources Van der Kuijp, L. 1996. "The Tibetan Script and Derivatives" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 431-441.
Secondary sources
Proposal http://www.evertype.com/standards/iso10646/pdf/leke.pdf

Letterlike Symbols

Letterlike symbols derive from ordinary letters of an alphabetic script, but have become symbols. This set includes symbols based on Latin, Greek, and Hebrew letters.

Unicode blocks Letterlike Symbols
Alternate names
Timeframe various
Regions
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 480-481 (Section 15.2).
Secondary sources
Proposal

Limbu

The Limbu script is a Brahmic script primarily used to write Limbu, a Tibeto-Burman language that is mainly spoken in eastern Nepal, but also in the Indian states of Sikkim and West Bengal. It is often called “Sirijanga” after the Limbu cultural hero Sirijanga, who is credited with inventing the script. It is also called “Kirat,” after a Sanskrit term that probably refers to some type of non-Aryan hill-dwellers. There are three forms of the script: the early form found in 19C manuscripts, the early-modern script used in publications from 1928 to the 1970s, and the modern script, dating to the 1970s.

Unicode blocks Limbu
Alternate names Sirijanga, Kirat
Timeframe 19C to present
Regions South Asian
Type abugida
Direction left to right
Status living
Number of speakers 421,500
Languages Limbu
Main sources Driem, George van. 1987. A Grammar of Limbu. Berlin, New York: Mouton de Gruyter.
Secondary sources Cemjonga, Imana Simha, and Bairagi Kaila, eds. 2059 [2002] Limbu-Nepali-Angreji sabdakos (Limbu-Nepali-English Dictionary). Kathmandu: Royal Nepal Academy.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2410.pdf