Small Form Variants

The Small Form Variants block contains small variants of ASCII punctuation marks, including a small ampersand, small percent sign, small question mark, and small comma. They were encoded in the Unicode Standard as compatibility characters from the Chinese standard CNS 11643.

Unicode blocks Small Form Variants
Alternate names
Timeframe
Regions
Type
Direction
Status
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 201 (Section 6.2).
Secondary sources CNS 11643-1992: Zhongwen biaozhun jiaohuanma (Chinese standard interchange code). Taipei: 1992.
Proposal
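
Because these are compatibility characters, compatibility normalization (NFKC) folds them back to ordinary ASCII punctuation. The following is a minimal sketch of that behaviour using Python's standard `unicodedata` module. The code points listed and the helper name `fold_small_forms` are illustrative additions, not taken from the source above.

```python
import unicodedata

# Four characters mentioned in the block description, paired with the
# ASCII character each one is a small variant of (assumed code points).
SMALL_FORMS = {
    "\uFE50": ",",  # SMALL COMMA
    "\uFE56": "?",  # SMALL QUESTION MARK
    "\uFE60": "&",  # SMALL AMPERSAND
    "\uFE6A": "%",  # SMALL PERCENT SIGN
}


def fold_small_forms(text: str) -> str:
    """Apply NFKC, which replaces compatibility characters with their equivalents."""
    return unicodedata.normalize("NFKC", text)


for small, ascii_char in SMALL_FORMS.items():
    folded = fold_small_forms(small)
    print(f"U+{ord(small):04X} {unicodedata.name(small)} -> {folded!r}")
```

NFKC also rewrites many other compatibility characters, so it suits search and comparison better than round-trip storage.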

Spacing Modifier Letters

The Spacing Modifier Letters block is primarily made up of a set of phonetic modifiers used to indicate that the pronunciation of an adjacent letter is different in some way, or to mark stress or tone. In some cases, the character may itself represent a sound. The block includes many characters required for the International Phonetic Alphabet, and a number of Uralic Phonetic Alphabet modifiers. Spacing clones of diacritics, specified in some corporate standards, are also included.

Unicode blocks Spacing Modifier Letters
Alternate names
Timeframe various
Regions
Type alphabet
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 228-229 (Section 7.8).
Secondary sources
Proposal

Specials

Characters in the Specials block are not interpreted as control or graphic characters but are provided to facilitate current software practices. These symbols include byte order marks, special character definitions, annotation characters and replacement characters. The first characters in this block appeared in Unicode 1.1 in 1993.

Unicode blocks Specials
Alternate names
Timeframe 1993 to present
Regions
Type
Direction
Status
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 537-541 (Section 16.8).
Secondary sources
Proposal
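
As a rough illustration of the "current software practices" mentioned above, the sketch below shows two common ones in Python: substituting the replacement character U+FFFD for undecodable bytes, and discarding a leading byte order mark when reading UTF-8. The helper names `decode_lenient` and `strip_bom` are hypothetical; the codec name `utf-8-sig` and the `errors="replace"` handler are part of the Python standard library.

```python
import unicodedata

REPLACEMENT = "\uFFFD"  # REPLACEMENT CHARACTER


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for any bytes that cannot be decoded."""
    return data.decode("utf-8", errors="replace")


def strip_bom(data: bytes) -> str:
    """Decode UTF-8 and drop a leading byte order mark if one is present."""
    return data.decode("utf-8-sig")


print(unicodedata.name(REPLACEMENT))
print(repr(decode_lenient(b"ab\xffcd")))       # the invalid byte 0xFF becomes U+FFFD
print(repr(strip_bom(b"\xef\xbb\xbfhello")))   # the three BOM bytes are removed
```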

Sundanese

The Sundanese script is used to write the Sundanese language, which is spoken in West Java, Indonesia. The script descends from Brahmi and is therefore related to many other Brahmi-derived scripts of South Asia and Southeast Asia. Today Sundanese is primarily written in the Latin script, but the Sundanese script is taught in schools and appears on signage. Old Sundanese (Sunda Kuna) dates from the 14C to the 18C and is handled by the characters in the Sundanese and Sundanese Supplement blocks. Modern Sundanese has been in use since the 17C. The current form of the script was made official in 1996.

Unicode blocks Sundanese
Alternate names
Timeframe 14C to present
Regions Southeast Asian
Type abugida
Direction left to right
Status living
Number of speakers 34 million
Languages Sundanese
Main sources Baidillah, Idin, Cucu Komara, and Deuis Fitni. [2002] Ngalagena: Panglengkep Pangajaran Aksara Sunda pikeun Murid Sakola Dasar/Dikdas 9 Taun. [Bandung]: CV Walatra.
Secondary sources
Proposal http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3022.pdf; http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3666.pdf

Superscripts and Subscripts

The Superscripts and Subscripts block includes letters and digits that are positioned above or below the baseline in typographical layout. Where the raised or lowered characters do not belong to plain text, superscripts and subscripts should usually be handled with styling or markup rather than with the characters from this block. The exception is when the superscript or subscript letters are part of a specialized phonetic alphabet, such as the Uralic Phonetic Alphabet. Several of the characters in this block derive from other standards or vendor code pages, and are considered compatibility characters.

Unicode blocks Superscripts and Subscripts
Alternate names
Timeframe various
Regions
Type
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 488-489 (Section 15.3).
Secondary sources
Proposal
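
One practical consequence of the compatibility status noted above is that compatibility normalization (NFKC) replaces these characters with plain digits and letters, discarding the raised or lowered positioning. The sketch below, with a hypothetical helper name `flatten_scripts`, illustrates this with Python's standard `unicodedata` module.

```python
import unicodedata


def flatten_scripts(text: str) -> str:
    """Apply NFKC, turning superscript/subscript characters into plain ones."""
    return unicodedata.normalize("NFKC", text)


water = "H\u2082O"   # U+2082 SUBSCRIPT TWO
power = "x\u2074"    # U+2074 SUPERSCRIPT FOUR

print(flatten_scripts(water))
print(flatten_scripts(power))
```

Since the positioning is lost, text where it carries meaning is usually better served by markup, as the description recommends.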

Syloti Nagri

The Syloti Nagri script is used for writing the Sylheti language, an Indo-European language spoken in the Barak Valley region of northeast Bangladesh and southeast Assam in India. The script is derived from Brahmi. It has traditionally been dated to the 14C, but may instead date from the 16C or 18C.

Unicode blocks Syloti Nagri
Alternate names
Timeframe 14C? to present
Regions South Asian
Type abugida
Direction left to right
Status living
Number of speakers 10.3 million
Languages Sylheti
Main sources Bhuiya, M.A. 2000. Jalalavadi Nagri: a unique script & literature of Sylheti Bangla. Badarpur, Assam, India: National Publishers.
Secondary sources Qadir, Dr. S.M. Ghulam. 1999. Sileti Nagri Lipi - Bhasha O Sahitya (The Sylheti Nagri script - language and literature). PhD thesis, Bangla Academy, Dhaka.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2591.pdf; http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2592.pdf

Syriac

The Syriac script is used for writing a number of modern languages and dialects, including literary usages, Neo-Aramaic dialects, Garshuni (Arabic written in the Syriac script), Christian Palestinian Aramaic, and historically for writing Armenian, Persian, and other languages. The earliest datable Syriac writing dates from the year 6 CE. Syriac is also the active liturgical language for several communities in the Middle East (Syrian Orthodox, Assyrian, Maronite, Syrian Catholic, and Chaldaean) and southeast India (Syro-Malabar and Syro-Malankara).

Unicode blocks Syriac
Alternate names
Timeframe 6C to present
Regions Middle Eastern
Type abjad
Direction right to left
Status living
Number of speakers 501000
Languages Syriac, Neo-Aramaic dialects, Garshuni, Christian Palestinian Aramaic
Main sources Daniels, P. 1996. "Aramaic Scripts for Aramaic Languages" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 499-510.
Secondary sources
Proposal

Tagalog

The Tagalog script was used to write Tagalog, Bisaya, Ilocano, and other languages in the Philippines. Accounts written by Spanish missionaries in the mid-1500s mention the Tagalog script, but it fell out of common usage by the mid-1700s. The modern Tagalog language, also known as Filipino, is today written in the Latin script. The Tagalog script is a Brahmi-derived script, distantly related to the South Indian scripts. It is closely related to the Buhid, Hanunóo, and Tagbanwa scripts of the Philippines, though it may not be their direct parent. The ancestor of all four Philippine scripts may have been transported to the Philippines via palaeographic scripts of western Java between the 10C and 14C CE.

Unicode blocks Tagalog
Alternate names
Timeframe 16C to mid-18C
Regions Southeast Asian
Type abugida
Direction left to right
Status historical
Number of speakers 0
Languages Tagalog, Bisaya, Ilocano
Main sources Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484.
Secondary sources Santos, Hector. 1994. The Living Scripts. Los Angeles: Sushi Dog Graphics. (Ancient Philippine scripts series, 2).
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1933.pdf

Tagbanwa

Tagbanwa is a living script used to write the Tagbanwa language (also known as Apurahuanoin) in Palawan, the Philippines. Tagbanwa is a Brahmi-derived script, distantly related to the South Indian scripts. It is closely related to the Hanunóo and Buhid scripts of the Philippines. All three scripts are related to Tagalog, but may not be directly descended from it. The ancestor of these Philippine scripts (including Tagalog) may have been transported to the Philippines via palaeographic scripts of western Java between the 10C and 14C CE.

Unicode blocks Tagbanwa
Alternate names
Timeframe pre-19C to present
Regions Southeast Asian
Type abugida
Direction left to right
Status living
Number of speakers 10000
Languages Tagbanwa
Main sources Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484.
Secondary sources Santos, Hector. 1994. The Living Scripts. Los Angeles: Sushi Dog Graphics. (Ancient Philippine scripts series, 2).
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1933.pdf

Tags

Characters in the Tags block were meant to provide a mechanism for language tagging in Unicode plain text. However, the characters in this block are deprecated and their use is highly discouraged. Instead, users should use higher-level protocols, such as HTML or XML, which allow language tagging via markup. Characters in this block have no visible appearance in normal text; the tags themselves are not displayed. Language tags appeared in Unicode 3.1, March 2001.

Unicode blocks Tags
Alternate names
Timeframe 2001 to present
Regions
Type
Direction
Status
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 541-545 (Section 16.9).
Secondary sources
Proposal
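
Since tag characters are invisible and discouraged, a common defensive step is to detect or remove them from incoming plain text. The sketch below assumes the block spans U+E0000 to U+E007F, that U+E0001 is the LANGUAGE TAG character, and that each tag character is the corresponding ASCII character offset by 0xE0000. These values and the helper names `to_tag_chars` and `strip_tags` are illustrative additions, not taken from the source above.

```python
TAG_BLOCK = range(0xE0000, 0xE0080)  # assumed extent of the Tags block
LANGUAGE_TAG = "\U000E0001"          # LANGUAGE TAG (introduces a language tag)
TAG_OFFSET = 0xE0000                 # tag character = ASCII character + offset


def to_tag_chars(ascii_text: str) -> str:
    """Map printable ASCII characters to their invisible tag counterparts."""
    return "".join(chr(ord(c) + TAG_OFFSET) for c in ascii_text)


def strip_tags(text: str) -> str:
    """Remove every character that falls inside the Tags block."""
    return "".join(c for c in text if ord(c) not in TAG_BLOCK)


# A plain-text string carrying an (invisible) "fr" language tag.
tagged = LANGUAGE_TAG + to_tag_chars("fr") + "bonjour"

print(len(tagged))                # counts the three invisible characters too
print(repr(strip_tags(tagged)))   # only the visible text remains
```

The language itself would then be carried by markup, for example an HTML `lang` attribute, as the description recommends.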

Tai Le

The Tai Le script is used to write the Tai Le language (also known as Tai Nüa, Dehong Dai, Tai Mau, Tai Kong, and Chinese Shan), spoken primarily in south central Yunnan, China. The script derives from Old Dehong Dai, whose history goes back some 700-800 years. The present form of the script dates to ca. 1954, when a systematic representation of the tones was introduced with the use of combining diacritics. The script was revised again in 1988.

Unicode blocks Tai Le
Alternate names
Timeframe ca. 1954 to present
Regions South Asian
Type alphabet
Direction left to right
Status living
Number of speakers 647400
Languages Tai Le
Main sources Coulmas, Florian. 1996. The Blackwell Encyclopedia of Writing Systems. Oxford, Cambridge: Blackwell, pp. 118-119.
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2672.pdf

Tai Tham

The Tai Tham script, sometimes called Lanna, Old Tai Lue, or Old Xishuangbanna Dai, descends from the Brahmi and Old Mon scripts. It is used for the Kam Mu'ang (Northern Thai), Tai Lue, and Khün languages. It is also used for religious purposes to write Lao Tham (Old Lao), and is found in old manuscripts in temples in Northern Thailand.

Unicode blocks Tai Tham
Alternate names Lanna, Old Tai Lue, Old Xishuangbanna Dai
Timeframe 13C to present
Regions Southeast Asian
Type abugida
Direction left to right
Status living
Number of speakers 100000
Languages Kam Mu'ang (Northern Thai), Tai Lue, Khün
Main sources Peltier, Anatole-Roger. 1996. Lanna Reader. Chiang Mai: Wat Tha Kradas.
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3207.pdf

Tai Viet

The Tai Viet script is used to write three Tai languages spoken primarily in northwestern Vietnam, northern Laos, and central Thailand: Tai Dam (also known as Black Tai or Tai Noir), Tai Dón (White Tai or Tai Blanc), and Thai Song (Lao Song or Lao Song Dam). The traditional form of the script varies considerably from community to community. There has been an attempt to establish a standard for the Tai script, which was called Unified Alphabet. The script is used today by the Tai people in Vietnam.

Unicode blocks Tai Viet
Alternate names
Timeframe 16C? to present
Regions Southeast Asian
Type abugida
Direction left to right
Status living
Number of speakers 1.3 million
Languages Tai Dam, Tai Dón, Thai Song
Main sources Cầm Trọng. 2005. “Thai Scripts in Vietnam” in Workshop on the Preservation and Digitization of Tai Scripts. Hanoi, Vietnam.
Secondary sources Baccam Don, Baccam Faluang, Baccam Hung, and Dorothy Fippinger. 1989. Tai Dam – English, English – Tai Dam Vocabulary Book. Summer Institute of Linguistics.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3220.pdf

Tai Xuan Jing Symbols

The Tai Xuan Jing symbols include sets of monogram, digram and tetragram signs. These symbols appeared in China in a text called Tai Xuan Jing (literally, “the exceedingly arcane classic”), composed in 2 BCE by Yang Xiong (53 BCE-18 CE). The text is known in the West by several titles, including The Alternative I Ching and The Elemental Changes. The work is still published today.

Unicode blocks Tai Xuan Jing Symbols
Alternate names
Timeframe x-2C to present
Regions East Asian
Type symbols
Direction variable
Status historical
Number of speakers 0
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 506-507 (Section 15.8).
Secondary sources
Proposal

Tamil

The Tamil script descends from the South Indian branch of Brahmi. It is used to write the Tamil language of the Tamil Nadu state in south India and surrounding states, as well as for minority languages such as Badaga, Irula, Paniya, and Saurashtra. Tamil is also spoken in Sri Lanka, Singapore, and parts of Malaysia.

Unicode blocks Tamil
Alternate names
Timeframe 6C or 7C? to present
Regions South Asian
Type abugida
Direction left to right
Status living
Number of speakers 66.5 million
Languages Tamil, Badaga, Irula, Paniya, Saurashtra
Main sources Steever, S. 1996. "Tamil Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 426-430.
Secondary sources
Proposal

Telugu

The Telugu script is used to write the Telugu language, spoken in the south central Indian state of Andhra Pradesh and nearby states. It is also used to write minority languages such as Gondi and Lambadi. It became a distinct script in the 13C CE. The Telugu script shares a common ancestor with the Kannada script.

Unicode blocks Telugu
Alternate names
Timeframe 13C to present
Regions South Asian
Type abugida
Direction left to right
Status living
Number of speakers 69.7 million
Languages Telugu, Gondi, Lambadi
Main sources Bright, W. 1996. "Kannada and Telugu Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 413-419.
Secondary sources
Proposal

Thaana

The Thaana (or Taana, Tāna) script is used to write the modern Dhivehi (Divehi) language of the Republic of Maldives. Although Thaana has borrowed many of its glyphs from Arabic and shares a number of features with Arabic writing, Thaana is a true alphabet because the writing of vowels is mandatory. Thaana also derives some of its letters from an earlier script that was used on the Maldives, Dhives Akuru. Thaana was developed in the 18C and largely replaced Dhives Akuru at that time.

Unicode blocks Thaana
Alternate names Taana, Tāna
Timeframe 18C to present
Regions South Asian
Type alphabet
Direction right to left
Status living
Number of speakers 371000
Languages Dhivehi (Maldivian)
Main sources Geiger, Wilhelm. 1996. Maldivian Linguistic Studies. New Delhi: Asian Educational Services.
Secondary sources Maniku, Hassan Ahmed. 1990. Say It in Maldivian (Dhivehi), [by] H. A. Maniku [and] J. B. Disanayaka. Colombo: Lake House Investments.
Proposal

Thai

The Thai script is used to write the Thai language and other languages, such as Kuy and Pali. The Thai alphabet is a member of the Indic family of scripts descended from Brahmi. Tradition holds that the script was created in 1283 by King Ramkhamhaeng.

Unicode blocks Thai
Alternate names
Timeframe 1283 to present
Regions Southeast Asian
Type abugida
Direction left to right
Status living
Number of speakers 35.9 million
Languages Thai, Kuy, Pali
Main sources Diller, A. 1996. "Thai and Lao Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 457-466.
Secondary sources
Proposal

Tibetan

The Tibetan script is used to write the Tibetan language in various countries throughout the Himalayas, including Tibet, Nepal, and northern India, where large Tibetan populations reside. It is also used in Bhutan to write Dzongkha. Tibetan also serves as the language of Buddhist traditions that spread from Tibet into the cultural area of Mongolia. The script and its grammar are reported to have been devised in the 6C CE by Thonmi Sambhota, who was sent to India to study its languages at the behest of Songtsen Gampo, the king of Tibet, as a means of bringing Buddhism to Tibet. As a result of its origin, the script has been used to represent Indic words.

Unicode blocks Tibetan
Alternate names
Timeframe 6C to present
Regions South Asian
Type abugida
Direction left to right
Status living
Number of speakers 1.2 million
Languages Tibetan, Dzongkha
Main sources Van der Kuijp, L. 1996. "The Tibetan Script and Derivatives" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 431-441.
Secondary sources
Proposal

Tifinagh

The Tifinagh script is used to write various languages commonly called Berber or Amazigh, which are spoken by people living in the Maghreb of North Africa and a few countries of West Africa (Mali, Burkina Faso, and Niger). The script comes from an older form of the alphabet called Libyan. The Tifinagh block in the Unicode Standard is based on the Neo-Tifinagh writing systems which were developed to cover the Maghreb Berber dialects. The characters are based on four Tifinagh character subsets: a basic set from the Institut Royal de la Culture Amazighe (IRCAM), an extended IRCAM set, additional Tifinagh letters, and modern Tuareg letters. Since September 2003, it has been taught in primary schools in Morocco.

Unicode blocks Tifinagh
Alternate names
Timeframe 3C to present
Regions African
Type alphabet
Direction left to right
Status living
Number of speakers 20 million
Languages Berber (Amazigh) languages
Main sources O’Connor, M. 1996. “The Berber Scripts” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 112-119.
Secondary sources
Proposal http://std.dkuug.dk/JTC1/SC2/WG2/docs/n2739.pdf