Braille Patterns

Braille is an international writing system for the blind. It was invented in 1821 in Paris by Louis Braille, who was himself blind, and is used worldwide today. Each Braille cell consists of six or eight raised dot positions arranged in two vertical columns of three or four dots each.
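
The dot layout maps directly onto the code points of the Braille Patterns block, which starts at U+2800: dot n (numbered 1 to 8) corresponds to bit n-1 of the offset from the block start. The following is a minimal Python sketch of that mapping; the function name `braille_cell` is a hypothetical helper introduced here for illustration, not part of any library.

```python
import unicodedata

BRAILLE_BASE = 0x2800  # first code point of the Braille Patterns block


def braille_cell(dots):
    """Return the Braille Patterns character whose raised dots are `dots` (1-8)."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot number out of range: {dot}")
        mask |= 1 << (dot - 1)  # dot n sets bit n-1
    return chr(BRAILLE_BASE + mask)


cell = braille_cell([1, 4])
print(cell, f"U+{ord(cell):04X}", unicodedata.name(cell))
```

Passing an empty list yields the blank cell at U+2800, and passing all eight dots yields the last character of the block, U+28FF.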

Unicode blocks Braille Patterns
Alternate names
Timeframe 1821 to present
Regions East Asian
Type alphabet
Direction left to right
Status living
Number of speakers
Languages International
Main sources Daniels, P. 1996. “Shorthand: Braille” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 807-820.
Secondary sources
Proposal

Buginese

The Buginese script, also known as Lontara, is used to write the Buginese (Bugis) language on the island of Sulawesi in Indonesia, primarily in southwest Sulawesi, but the language is also spoken in other areas. The script is a descendant of Brahmi, and may be related to Javanese. It shows some affinity to the Tagalog script. Buginese has been in use since the 14C. It has also been used to write the Makasar, Bima, and Mandar languages.

Unicode blocks Buginese
Alternate names Lontara
Timeframe 14C to present
Regions East Asian
Type abugida
Direction left to right
Status living
Number of speakers 4 million
Languages Buginese (Bugis), Makasar, Bima, Mandar
Main sources Matthes, B. F. 1875. Boeginesche Spraakkunst. Den Haag: Martinus Nijhoff.
Secondary sources Sirk, Ü. 1983. The Buginese language. Moscow: Nauka. (Languages of Asia and Africa.)
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2633r.pdf

Buhid

Buhid is a living minority script used in Mindoro in the Philippines to write the Buhid language. Buhid is a Brahmi-derived script, distantly related to the South Indian scripts. It is closely related to the Hanunóo and Tagbanwa scripts of the Philippines. All three scripts are related to Tagalog, but may not be directly descended from it. The ancestor of these Philippine scripts (including Tagalog) may have been transported to the Philippines via palaeographic scripts of western Java between the 10C and 14C CE.

Unicode blocks Buhid
Alternate names Mangyan
Timeframe pre-19C to present
Regions East Asian
Type abugida
Direction left to right
Status living
Number of speakers 8000
Languages Buhid
Main sources Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484.
Secondary sources Santos, Hector. 1994. The Living Scripts. Los Angeles: Sushi Dog Graphics. (Ancient Philippine scripts series, 2).
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1933.pdf

Byzantine Musical Symbols

Byzantine Musical Symbols are musical symbols used to write the religious music and hymns of the Christian Orthodox Church and some folk music manuscripts. These symbols first appeared in the 7C to 8C CE. In 1881, the Orthodox Patriarchy Musical Committee established the New Analytical Byzantine Musical Notation System, which is used today. Most of the manuscripts are in Greek, although a few are in Russian, Bulgarian, Romanian, and Arabic.

Unicode blocks Byzantine Musical Symbols
Alternate names
Timeframe 7C or 8C to present
Regions East Asian
Type symbols
Direction left to right
Status historical
Number of speakers 0
Languages
Main sources Hellenic Organization for Standardization (ELOT). 1997. The Greek Byzantine Musical Notation System. Athens. (=ELOT 1373)
Secondary sources
Proposal

CJK

"CJK" refers to the unified Han characters used to write the Chinese, Japanese, and Korean languages. Technically, CJK can also be used for the Vietnamese language, since early Vietnamese writing systems were based on Han. There are several blocks of CJK characters, including multiple blocks of unified ideographs, and blocks of compatibility ideographs, symbols and punctuation marks, strokes, and radicals. A description of the CJK blocks and the unification principles appears in section 12.1 of The Unicode Standard, and a history of the encoding appears in Appendix E of the Standard.

Unicode blocks CJK Compatibility Forms, CJK Compatibility Ideographs, CJK Compatibility Ideographs Supplement, CJK Radicals Supplement, CJK Strokes, CJK Symbols and Punctuation, CJK Unified Ideographs, CJK Unified Ideographs Extension A, CJK Unified Ideographs Extension B, CJK Unified Ideographs Extension C, CJK Unified Ideographs Extension D
Alternate names
Timeframe
Regions East Asian
Type logosyllabary
Direction variable
Status living
Number of speakers 1.3 billion
Languages Chinese, Japanese, Korean, Vietnamese
Main sources Mair, V. 1996. "Modern Chinese Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 200-208; Smith, J. 1996. "Japanese Writing" in Daniels & Bright, pp. 209-217; King, R. 1996. "Korean Writing" in Daniels & Bright, pp. 218-227.
Secondary sources Lunde, Ken. 2009. CJKV Information Processing. 2nd ed. Beijing, Cambridge, MA: O’Reilly.
Proposal

Carian

Carian is a partly undeciphered script, which has some relationship to the Greek alphabet. It was used to write the Carian language and dates to the first millennium BCE. A few inscriptions have been found in Caria, in the western part of present-day Turkey, and a fragmentary bilingual inscription has also been found in Athens. However, the bulk of extant texts have been found in Egypt, where they were left by Carian mercenaries. In 1996 an extensive Carian-Greek bilingual inscription was discovered at Kaunos in southwestern Turkey. It clearly demonstrated that Carian was a member of the Anatolian branch of Indo-European, though details of the language remain unclear.

Unicode blocks Carian
Alternate names
Timeframe x-7C to -3C
Regions East Asian
Type alphabet
Direction variable
Status historical
Number of speakers 0
Languages Carian
Main sources Melchert, H. C. 2004. "Carian" in The Cambridge Encyclopedia of the World’s Ancient Languages, ed. Roger Woodard. Cambridge: Cambridge University Press, pp. 609-613.
Secondary sources Swiggers, P., and W. Jenniges. 1996. “The Anatolian Alphabets” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 281-287.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3020.pdf

Chakma

The Chakma script is used to write the Chakma language, which is spoken in Bangladesh and India, and is being adapted for the Tanchangya language in Bangladesh. The script has been used for liturgical purposes, and is also used in teaching materials today. The Chakma language is also written in the Latin and Bengali scripts.

Unicode blocks Chakma
Alternate names Ojhopath, Ajha path
Timeframe ? to present
Regions East Asian
Type abugida
Direction left to right
Status living
Number of speakers 560000
Languages Chakma, Tanchangya
Main sources Khisa, Bhagadatta. 2001. Cāṅmā pattham pāt = Chakma primer. Rāṅamāṭi: Tribal Cultural Institute (TCI).
Secondary sources Cāṅmā, Cirajyoti, and Maṅgal Cāṅgmā. 1982. Cāṅmār āg pudhi (Chakma primer). Rāṅamāṭi: Cāṅmābhāṣā Prakāśanā Pariṣad.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3645.pdf

Cham

The Cham script, also known as Akhar Thrah, is a Brahmi-derived script used to write the Cham language, an Austronesian language. There are two main varieties of the Cham language: Western Cham, which is spoken in Cambodia (and to a lesser extent in Vietnam and Thailand), and Eastern Cham, spoken in Vietnam. Speakers of the former tend to use the Arabic script, while some speakers of the latter still use the Cham script.

Unicode blocks Cham
Alternate names Akhar Thrah
Timeframe 1000C to present
Regions East Asian
Type abugida
Direction left to right
Status living
Number of speakers 290000
Languages Western Cham, Eastern Cham
Main sources Aymonier, Étienne, and Antoine Cabaton. 1906. Dictionnaire cam-Français. Paris.
Secondary sources Kono Rokuro, Chino Eiichi, and Nishida Tatsuo. 2001. The Sanseido Encyclopaedia of Linguistics. Volume 7: Scripts and Writing Systems of the World (Gengogaku dai ziten (bekkan) sekai mozi ziten). Tokyo: Sanseido Press.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3120.pdf

Cherokee

The Cherokee script is used to write the indigenous Cherokee language. Cherokee is the native tongue of about 20,000 people, though most speakers today use it as a second language. The script was invented by Sequoyah, a Cherokee silversmith, in the early 19C, between 1815 and 1821. Sequoyah also devised a system of numbers for Cherokee, but today Latin numbers are used instead. The script is still taught today.

Unicode blocks Cherokee
Alternate names
Timeframe early 19C to present
Regions East Asian
Type syllabary
Direction left to right
Status living
Number of speakers 20000
Languages Cherokee
Main sources Scancarelli, J. 1996. “Cherokee Writing” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 587-592.
Secondary sources
Proposal

Combining Diacritical Marks

Diacritical marks are ancillary marks that are added to a base character, and can be used to indicate how the character is to be pronounced or stressed. The Combining Diacritical Marks block includes characters intended for general use with any script. Diacritical marks that are specific to a particular script are encoded with that script. The Combining Diacritical Marks Supplement comprises a set of lesser-used combining diacritical marks.
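
As an illustration of how a combining mark attaches to a base character, the sketch below uses Python's standard `unicodedata` module to compose a base letter with U+0301 COMBINING ACUTE ACCENT and to strip combining marks again. The helper name `strip_marks` is hypothetical, and it removes every character with a nonzero combining class, not only those from this block.

```python
import unicodedata

base = "e"
acute = "\u0301"  # COMBINING ACUTE ACCENT, from the Combining Diacritical Marks block

decomposed = base + acute                            # two code points
composed = unicodedata.normalize("NFC", decomposed)  # one precomposed code point


def strip_marks(text):
    """Decompose text (NFD) and drop characters with a nonzero combining class."""
    nfd = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in nfd if unicodedata.combining(ch) == 0)


print(len(decomposed), len(composed), strip_marks("caf\u00e9"))
```

Both forms render the same way; normalization converts between the sequence of base plus mark and the precomposed character where one exists.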

Unicode blocks Combining Diacritical Marks, Combining Diacritical Marks Supplement
Alternate names
Timeframe
Regions East Asian
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 234 (Section 7.9).
Secondary sources
Proposal

Combining Diacritical Marks for Symbols

The Combining Diacritical Marks for Symbols are, in general, applied to mathematical or technical symbols, and serve to extend the set of such symbols. This block also includes a number of compatibility enclosing marks, which can enclose the base character in different ways.

Unicode blocks Combining Diacritical Marks for Symbols
Alternate names
Timeframe
Regions East Asian
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 234-235 (Section 7.9).
Secondary sources
Proposal

Combining Half Marks

The Combining Half Marks set consists of a number of combining mark pieces that can be used to visually encode certain combining marks that extend over multiple base letterforms. They are included to facilitate the support of such marks in legacy implementations. However, double diacritics, such as U+0360 and U+0361, are to be preferred. The block also includes macron marks that are recommended for representing a style of supralineation in Coptic.
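
The difference between the two encodings can be sketched in Python. The example below writes a tie over the letters "t" and "s" once with the preferred double diacritic U+0361 and once with the legacy half marks U+FE20 and U+FE21; the choice of "ts" as the base letters is only an illustrative assumption.

```python
import unicodedata

# Preferred: one double diacritic placed between the two base letters.
preferred = "t" + "\u0361" + "s"

# Legacy: a left half mark on the first letter and a right half mark on the second.
legacy = "t" + "\ufe20" + "s" + "\ufe21"

for label, text in (("preferred", preferred), ("legacy", legacy)):
    marks = [unicodedata.name(ch) for ch in text if unicodedata.combining(ch)]
    print(label, len(text), marks)
```

The preferred form needs a single combining mark, while the legacy form needs two marks that a renderer must join visually.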

Unicode blocks Combining Half Marks
Alternate names
Timeframe
Regions East Asian
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 235 (Section 7.9).
Secondary sources
Proposal

Common Indic Number Forms

The Common Indic Number Forms are characters that are used to represent fractional values in various scripts of North India, Pakistan and Nepal. These signs were used to write currency, weight, measure, time, and other units. They have been used since the 16C and are still employed today in a limited capacity.

Unicode blocks Common Indic Number Forms
Alternate names
Timeframe 16C to present
Regions East Asian
Type numeric
Direction left to right
Status living
Number of speakers
Languages Gujarati, Gurmukhi, Devanagari, Bhojpuri, Magahi, Awadhi, Maithili, Urdu, Hindi, Marwari, Punjabi
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 486-487 (Section 15.3).
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3367.pdf

Control Pictures

Control pictures are conventional representations of nongraphic characters for use when it is necessary to show the position of a control code within a data stream. Three characters are included in this block to visibly represent ASCII space.
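
A minimal Python sketch of this use follows. It relies on the layout of the Control Pictures block, where the picture for a C0 control code (U+0000 to U+001F) sits at U+2400 plus the code, and U+2421 stands for DELETE. The function name `show_controls` is a hypothetical helper introduced for illustration.

```python
CONTROL_PICTURES_BASE = 0x2400  # first code point of the Control Pictures block
SYMBOL_FOR_DELETE = "\u2421"


def show_controls(text):
    """Replace C0 control codes and DELETE with their visible control pictures."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20:
            out.append(chr(CONTROL_PICTURES_BASE + code))
        elif code == 0x7F:
            out.append(SYMBOL_FOR_DELETE)
        else:
            out.append(ch)  # ordinary characters pass through unchanged
    return "".join(out)


print(show_controls("a\tb\n"))
```

Spaces are left untouched in this sketch; a caller who wants them visible could substitute one of the block's space representations, such as U+2420 SYMBOL FOR SPACE.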

Unicode blocks Control Pictures
Alternate names
Timeframe various
Regions East Asian
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 495 (Section 15.6).
Secondary sources
Proposal

Coptic

The Coptic script represents the final stage in the development of the Egyptian writing system and was used for writing the Coptic language. Coptic was based on the Greek uncial alphabets but several letters were added that were unique to Coptic. Although the language died out in the 14C, it is still maintained as a liturgical language by Coptic Christians. Before Unicode 4.1, Coptic was considered a stylistic variant of Greek, so 14 Coptic characters appear in the "Greek and Coptic" block. Coptic is now considered to be disunified from Greek, so one can use the 14 Coptic characters in the "Greek and Coptic" block as well as additional letters that appear in the new "Coptic" block (which also includes characters for Old Coptic and Nubian).

Unicode blocks Coptic, Greek and Coptic
Alternate names
Timeframe 4C to present
Regions East Asian
Type alphabet
Direction left to right
Status liturgical
Number of speakers 0
Languages Coptic, Nubian, Old Coptic
Main sources Ritner, R. 1996. "The Coptic Alphabet" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 287-290.
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2744.pdf

Counting Rod Numerals

Chinese counting-rod numerals were used to represent and manipulate numbers in pre-modern East Asian mathematical texts. The rods were small sticks, several centimeters in length, that were arranged in patterns on a gridded counting board. The glyph shapes represent the conventions of the Song dynasty (960-1279), when traditional Chinese mathematics was at its height. The symbols go back to the Warring States Period in China, ca. 4C or 5C BCE.

Unicode blocks Counting Rod Numerals
Alternate names
Timeframe x-4C or -5C to present
Regions East Asian
Type numeric
Direction variable
Status historical
Number of speakers 0
Languages
Main sources Pettersson, J. S. 1996. “Numerical Notation” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 795-806.
Secondary sources
Proposal

Cuneiform

Sumero-Akkadian Cuneiform is a logosyllabary that was used from the end of the third millennium BCE until the 1C CE. It spread beyond Mesopotamia to Elam, Assyria, eastern Syria, southern Anatolia, and Egypt, and was used for many languages besides Sumerian and Akkadian. The script developed from Proto-Cuneiform and Early Dynastic cuneiform. Cuneiform is one of the world's oldest writing systems.

Unicode blocks Cuneiform, Cuneiform Numbers and Punctuation
Alternate names
Timeframe x-2350 to 1C
Regions East Asian
Type logosyllabary
Direction left to right
Status historical
Number of speakers 0
Languages Sumerian, Akkadian (Babylonian, Assyrian), Elamite, Hittite, Hurrian, Luvian, Eblaite, Urartian
Main sources Cooper, J. 1996. “Sumerian and Akkadian” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 37-57.
Secondary sources Gragg, G. 1996. “Other languages” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 58-72.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2786.pdf

Currency Symbols

The Currency Symbols block includes customary symbols used to indicate certain currencies in general text. The signs may vary in shape and are often used for more than one currency. Not all currencies are represented by a currency symbol; some use sequences of multiple letters, and the abbreviations for currencies can vary by language. Some contemporary or historic currency symbols that are not in the Currency Symbols block may be found in other blocks.

Unicode blocks Currency Symbols
Alternate names
Timeframe various
Regions East Asian
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 478-480 (Section 15.1).
Secondary sources
Proposal

Cypriot Syllabary

The Cypriot syllabary is a historical script used to write the Cypriot dialect of Greek and a non-Indo-European language, "Eteo-Cypriot." The script was used from the middle of the 11C to the 3C BCE and appears to be descended from one of the Cypro-Minoan scripts of Cyprus. The Cypriot syllabary shares some orthographic conventions with Linear B, but the script is written right to left.

Unicode blocks Cypriot Syllabary
Alternate names
Timeframe x-11C to -3C
Regions East Asian
Type syllabary
Direction right to left
Status historical
Number of speakers 0
Languages Ancient Greek, "Eteo-Cypriot"
Main sources Woodard, Roger. 2004. "Greek dialects" in The Cambridge Encyclopedia of the World’s Ancient Languages, ed. Roger Woodard. Cambridge: Cambridge University Press, pp. 650-672.
Secondary sources Bennett, E. 1996. "Aegean Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 125-133.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2378.pdf

Cyrillic

The Cyrillic script has traditionally been used for writing various Slavic languages, including Russian. The script dates to the 9C or 10C CE, and is named after St. Cyril, a Byzantine missionary. It is one of several scripts that were ultimately derived from the Greek script. Cyrillic has been extended to write non-Slavic languages, particularly the minority languages of Russia and surrounding countries. The Cyrillic Extended-A block is made up of Old Church Slavonic superscripted combining letters, while Cyrillic Extended-B comprises various historic characters, such as those used in Old Cyrillic and Old Abkhasian. The Cyrillic Supplement block is made up of additional letters needed to write various non-Slavic languages.

Unicode blocks Cyrillic, Cyrillic Extended-A, Cyrillic Extended-B, Cyrillic Supplement
Alternate names
Timeframe 9C or 10C to present
Regions East Asian
Type alphabet
Direction left to right
Status living
Number of speakers 629 million
Languages Abkhazian, Abaza, Adyghe, Assyrian Neo-Aramaic, Southern Altai, Avaric, Azerbaijani, Bashkir, Belarusian, Bulgarian, Buriat, Russia Buriat, Chechen, Mari, Shor, Chukot, Crimean Turkish, Chuvash, Dargwa, Dungan, Evenki, Nanai, Ingush, Kara-Kalpak, Kabardian, Khanty, Khakas, Kazakh, Komi-Permyak, Komi-Zyrian, Koryak, Karachay-Balkar, Karelian, Kurdish, Kumyk, Komi, Kirghiz, Lak, Lezghian, Moksha, Macedonian, Mongolian, Mansi, Erzya, Nogai, Ossetic, Romany, Russian, Yakut, Serbian, Tabassaran, Tajik, Turkmen, Tatar, Muslim Tat, Tuvinian, Udihe, Udmurt, Ukrainian, Uzbek, Kalmyk, Nenets, Gagauz, Romanian, Northern Sami, Selkup, Uighur
Main sources Cubberly, P. 1996. "The Slavic Alphabets" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 346-355.
Secondary sources Comrie, B. 1996. "Adaptations of the Cyrillic Alphabet" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 700-726.
Proposal