ISO 15924, "Codes for the representation of names of scripts", is a standard list of script codes. The Unicode Consortium acts as the registration authority and maintains the standard on behalf of ISO, which defines and approves it. However, ISO 15924 is not part of the Unicode standard (which deals solely with distinctions between abstract characters).
Naming and organization of writing systems according to ISO 15924
The standard defines for each script:
- a descriptive name in English;
- a descriptive name in French;
- a four-letter alphabetic code element (normative);
- a numeric code value (normative) between 000 and 999; and finally
- a reference date for tracking changes (and any corrections) to each script's entry in the standard itself.
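The five items listed above can be pictured as one record per script. The following is only a minimal sketch: the class name `ScriptEntry` and its field names are hypothetical, and the French name and date in the usage note are illustrative values, not data quoted from the standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ScriptEntry:
    """One script entry holding the five items listed above (hypothetical layout)."""
    english_name: str  # descriptive name in English
    french_name: str   # descriptive name in French
    code: str          # four-letter alphabetic code element (normative)
    number: int        # numeric code value (normative), 0 to 999
    changed: date      # reference date for tracking changes

    def __post_init__(self) -> None:
        # Reject anything that is not exactly four basic Latin letters.
        if not (len(self.code) == 4 and self.code.isascii() and self.code.isalpha()):
            raise ValueError(f"not a four-letter code: {self.code!r}")
        if not 0 <= self.number <= 999:
            raise ValueError(f"numeric code out of range: {self.number}")

    @property
    def number_text(self) -> str:
        """Numeric code padded to three digits, as in '000' to '999'."""
        return f"{self.number:03d}"
```

For instance, `ScriptEntry("Latin", "latin", "Latn", 215, date(2004, 5, 1))` builds an entry for the Latin script, with an illustrative date.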
For the complete and current list of codes and names, see the website listed at the end of the article.
Nomenclature and numerical classification
The numeric code elements are grouped in series of one hundred according to the type and relative proximity of the writing systems (see examples below).
The codes and names are also defined to meet the needs of bibliographic description of texts and whole documents, and are not reserved for single characters. Also, different writing styles that use the same abstract alphabet have their own code elements, classified close together in the same series, consecutively if possible. For this reason, the numeric code values are not simply allocated in increments of 1 (there are "holes" in the numbering).
The following series are currently used:
- 000-099: hieroglyphic scripts (Egyptian or Mayan) and cuneiform scripts (including Ugaritic);
- 100-199: right-to-left alphabetic scripts (including the Phoenician alphabet, Tifinagh, the Semitic abjads, Mongolian, N'Ko and Old Hungarian);
- 200-299: left-to-right alphabetic scripts (including the European alphabets derived from ancient Greek, Hangul, Bopomofo and invented literary alphabets);
- 300-399: alphasyllabic scripts (including the many Brahmic abugidas of South and Southeast Asia);
- 400-499: syllabic scripts (including Linear A and Linear B, Cypriot, Hiragana and Katakana, Ethiopic, Canadian Aboriginal syllabics, Cherokee, etc.);
- 500-599: ideographic or symbolic scripts (including Braille);
- 600-699: undeciphered scripts (classification still unknown, such as the Indus script and Rongorongo);
- 700-799 and 800-899: series not yet used;
- 900-999: codes for private use, aliases (currently none) and special codes.
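The grouping by hundreds can be expressed as a simple range lookup. This is a sketch only: the table `SERIES`, the function `series_of` and the short labels are hypothetical names that paraphrase the list above, and the 700-899 range is given a neutral label because the list names no script category for it.

```python
# (low, high, label) for each series, paraphrasing the list above.
SERIES = [
    (0, 99, "hieroglyphic and cuneiform scripts"),
    (100, 199, "right-to-left alphabetic scripts"),
    (200, 299, "left-to-right alphabetic scripts"),
    (300, 399, "alphasyllabic scripts"),
    (400, 499, "syllabic scripts"),
    (500, 599, "ideographic or symbolic scripts"),
    (600, 699, "undeciphered scripts"),
    (700, 899, "no script category listed"),
    (900, 999, "private use, aliases and special codes"),
]


def series_of(number: int) -> str:
    """Return the label of the series that a numeric code value falls in."""
    for low, high, label in SERIES:
        if low <= number <= high:
            return label
    raise ValueError(f"numeric code out of range 000-999: {number}")
```

With this table, `series_of(215)` falls in the left-to-right alphabetic series and `series_of(371)` in the alphasyllabic series.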
Composition and allocation of alphabetic code values
The four-letter alphabetic code elements use the 26 letters of the basic Latin alphabet. Case is not significant in these code elements, but the recommended form is one capital letter followed by three lowercase letters. The alphabetic codes are derived from the script names for mnemonic reasons. However, variants of the same script differ, as far as possible, only in their fourth letter. These variants are also recognizable by their numeric code elements, which are close together in the same series. For example:
- Latn = 215 = (fr, translated) "Latin" = (en) "Latin";
- Latf = 216 = (fr, translated) "Latin (broken variant)" = (en) "Latin (Fraktur variant)";
- Latg = 217 = (fr, translated) "Latin (Gaelic variant)" = (en) "Latin (Gaelic variant)".
Or:
- Geor = 240 = (fr, translated) "Georgian (Mkhedruli)" = (en) "Georgian (Mkhedruli)";
- Geok = 241 = (fr, translated) "Khutsuri (Asomtavruli and Nuskhuri)" = (en) "Khutsuri (Asomtavruli and Nuskhuri)".
And also:
- Hani = 500 = (fr, translated) "Han ideographs" = (en) "Han (Hanzi, Kanji, Hanja)";
- Hans = 501 = (fr, translated) "Han ideographs (simplified variant)" = (en) "Han (Simplified variant)";
- Hant = 502 = (fr, translated) "Han ideographs (traditional variant)" = (en) "Han (Traditional variant)".
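Because case is not significant but one capital followed by three lowercase letters is recommended, software typically normalizes codes before comparing them. A minimal sketch, assuming a hypothetical helper named `normalize_script_code`:

```python
import re

# Exactly four letters of the basic Latin alphabet, in any case.
_CODE_RE = re.compile(r"[A-Za-z]{4}")


def normalize_script_code(code: str) -> str:
    """Return the code in the recommended form: one capital, three lowercase."""
    if not _CODE_RE.fullmatch(code):
        raise ValueError(f"not a four-letter alphabetic code: {code!r}")
    return code[0].upper() + code[1:].lower()
```

Comparing normalized forms makes `LATN`, `latn` and `Latn` equivalent, which matches the rule that case carries no meaning.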
However, two alphabetic code elements that begin with the same three letters do not necessarily denote two variants of the same writing system (which may be apparent from their numerical classification in separate series):
- Hani = 500 = (fr, translated) "Han ideographs" = (en) "Han (Hanzi, Kanji, Hanja)";
- Hano = 371 = (fr, translated) "Hanunoo" = (en) "Hanunoo (Hanunóo)".
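The two clues described above (a shared three-letter prefix and numeric values in the same series) can be combined into a rough check. This is a heuristic sketch, not a rule stated by the standard: the table `CODES` holds only the examples quoted in this article, and `may_be_variants` is a hypothetical name. Sharing a series is treated here as a necessary hint, not as proof.

```python
# Alphabetic code -> numeric code, limited to the examples quoted in the text.
CODES = {
    "Latn": 215, "Latf": 216, "Latg": 217,
    "Geor": 240, "Geok": 241,
    "Hani": 500, "Hans": 501, "Hant": 502,
    "Hano": 371,
}


def may_be_variants(a: str, b: str, codes: dict = CODES) -> bool:
    """Heuristic: same first three letters and same hundred-series."""
    same_prefix = a[:3] == b[:3]
    same_series = codes[a] // 100 == codes[b] // 100
    return same_prefix and same_series
```

Under this heuristic `Latn` and `Latf` are flagged as possible variants, while `Hani` and `Hano` are not, since 500 and 371 lie in different series.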
Special code values
If the standard entries are not sufficient, 50 code elements are available for use at the discretion of users (the names used are not normative and are subject to change):
- Qaaa = 900 = (fr, translated) "reserved for private use (start)" = (en) "Reserved for private use (start)";
- Qaab = 901 = (fr, translated) "reserved for private use (2nd)" = (en) "Reserved for private use (2nd)";
- ...
- Qaaz = 925 = (fr, translated) "reserved for private use (26th)" = (en) "Reserved for private use (26th)";
- Qaba = 926 = (fr, translated) "reserved for private use (27th)" = (en) "Reserved for private use (27th)";
- ...
- Qabx = 949 = (fr, translated) "reserved for private use (end)" = (en) "Reserved for private use (end)".
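The private-use range follows a regular pattern: the 50 numeric values 900 to 949 run in parallel with the alphabetic codes Qaaa to Qabx. A short sketch of that correspondence, using hypothetical helper names:

```python
import string

LETTERS = string.ascii_lowercase  # 'a' to 'z'


def private_use_code(number: int) -> str:
    """Map a numeric value in 900-949 to its alphabetic code (Qaaa-Qabx)."""
    if not 900 <= number <= 949:
        raise ValueError(f"not a private-use numeric code: {number}")
    offset = number - 900
    return "Qa" + LETTERS[offset // 26] + LETTERS[offset % 26]


def private_use_number(code: str) -> int:
    """Inverse mapping; case is ignored, as for all alphabetic codes."""
    c = code.lower()
    if len(c) != 4 or not c.startswith("qa") or c[2] not in "ab" or c[3] not in LETTERS:
        raise ValueError(f"not a private-use alphabetic code: {code!r}")
    offset = LETTERS.index(c[2]) * 26 + LETTERS.index(c[3])
    if offset > 49:  # Qaby and Qabz lie outside the 50 reserved codes
        raise ValueError(f"not a private-use alphabetic code: {code!r}")
    return 900 + offset
```

The first 26 values use the third letter `a` (Qaaa to Qaaz) and the remaining 24 use `b` (Qaba to Qabx), which reproduces the endpoints listed above.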
There are also code elements for the special case of unwritten languages (for example, for classifying photographs, video recordings or sound recordings in the collections of media libraries and museums), for cases where a script cannot be reliably determined or where several scripts are involved (from separate families, with no more precise code defined for the set), and for cases where the script was not specified but could possibly be given more precisely with another code:
- Zxxx = 997 = (fr, translated) "code for unwritten languages" = (en) "Code for unwritten languages";
- Zyyy = 998 = (fr, translated) "code for undetermined script" = (en) "Code for undetermined script";
- Zzzz = 999 = (fr, translated) "code for uncoded script" = (en) "Code for uncoded script".
History
This list of codes and names of scripts was created and is maintained by Michael Everson, a member of the Unicode Technical Committee (UTC). The text of ISO 15924, which sets out the general principles for defining the codes, was first approved on 9 January 2004.
The first list of codes, already extensive, was published online on 1 May 2004 on the website of the Unicode Consortium. It included, among others, all the scripts used or defined in the Unicode 4.0 and ISO/IEC 10646 standards. A significant number of corrections followed within weeks, and the list was finalized on 29 May 2004.
Since then, a few new entries have been added regularly for scripts being standardized in ISO/IEC 10646 and Unicode or needed for bibliographic use, as well as entries for scripts that are not yet standardized and are still under study.
Relations with other standards and recommendations
Relation to language code elements of the standard ISO 639
In addition, the alphabetic code elements of ISO 15924 begin, as far as possible, with the same letters as the three-letter language code elements of ISO 639-2 or its extension ISO 639-3 (which covers a more extensive list of languages) when the names of the script and the language are homonyms. For example:
- language name = (fr, translated) "Latin" = (en) "Latin"; ISO 639-2 alphabetic language code = lat;
- script name = (fr, translated) "Latin" = (en) "Latin", a homonym, so: ISO 15924 alphabetic script code = Latn.
The future standard ISO 639-6, in preparation, which should extend language code elements to four letters (to identify a larger number of language variants), incorporates this principle and, where possible, uses the same code elements as those already in ISO 15924 for homonymous languages, to preserve compatibility with the current RFC 5646 (BCP 47):
- script name = (fr, translated) "Latin" = (en) "Latin"; ISO 15924 alphabetic script code = Latn;
- language name = (fr, translated) "Latin" = (en) "Latin", a homonym, so: ISO/CD 639-6 alphabetic language code = latn.
Naming of locales in RFC 5646, with ISO 639 and ISO 3166
In practice, the alphabetic code values are preferable in internationalized applications that localize data. These alphabetic code values are the ones used in locale codes, together with the alphabetic language code elements of ISO 639 and the alphabetic or numeric code elements for countries and regions of ISO 3166.
Application locales are described in accordance with the current RFC 5646 (BCP 47), which takes into account the ISO 15924 script codes in addition to the ISO 639 language code elements and the ISO 3166 country and region code elements.
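To show how the three kinds of code elements combine in a locale code, here is a deliberately simplified sketch. It handles only the form language[-script][-region] and ignores the other subtags that RFC 5646 allows (extended language subtags, variants, extensions, private use). The names `Locale` and `parse_tag` are hypothetical, and the example tags in the usage note are illustrative.

```python
import re
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str          # ISO 639 language code element
    script: Optional[str]  # ISO 15924 alphabetic script code
    region: Optional[str]  # ISO 3166 alphabetic code or three-digit numeric code


# Simplified pattern: language, then an optional script, then an optional region.
_TAG_RE = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def parse_tag(tag: str) -> Locale:
    """Split a simple language tag and apply the conventional casing."""
    m = _TAG_RE.fullmatch(tag)
    if m is None:
        raise ValueError(f"unsupported or malformed tag: {tag!r}")
    script, region = m["script"], m["region"]
    return Locale(
        m["language"].lower(),
        script.capitalize() if script else None,  # e.g. 'hant' -> 'Hant'
        region.upper() if region else None,
    )
```

A tag such as `zh-Hant-TW` splits into language `zh`, script `Hant` and region `TW`. The four-letter length of the script subtag is what lets it be told apart from the shorter language and region subtags.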
Differences in names from those of ISO/IEC 10646
There is no exact bijection between the English and French names of scripts defined in ISO 15924 and the English and French names used normatively for characters and character blocks allocated in ISO/IEC 10646 (and therefore in Unicode).
However, future characters and character blocks standardized in ISO/IEC 10646 (and thus also in Unicode) will be named, where possible, in accordance with ISO 15924.
Differences in alphabetic code values from those of the Unicode standard
Similarly, there is no exact bijection between the alphabetic script code elements of ISO 15924 and the script codes used in the Unicode character property tables. Indeed, ISO 15924 contains additional elements that make distinctions needed for bibliographic use between scripts that were unified in the ISO/IEC 10646 and Unicode character encoding. ISO 15924 thus contains distinct code elements and names for scripts that were unified into a single script in Unicode (which treats them as typographical variants, with no distinction in the encoding of characters or in their normative or informative properties).
On the other hand, because ISO 15924 was created after the Unicode standard, the format of the normative ISO 15924 alphabetic code values may differ from the codes used in the Unicode property tables (which may be longer and contain underscores).
For information purposes only, ISO 15924 defines an alias (or "property value alias") for standardized scripts, so that the correspondence with the character properties defined in the Unicode standard can be found where such a difference exists. Since ISO 15924 was published, the Unicode Consortium has committed not to establish new codes different from those defined in ISO 15924, and to use the ISO 15924 alphabetic code elements whenever possible. Therefore not all synonyms of Unicode properties are listed in the ISO 15924 tables (see the codes actually used in the property files of the Unicode standard itself, and the synonyms added in the Unicode character property value aliases, which now make it possible to use only ISO 15924 code elements in Unicode-compliant applications).
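The lack of a one-to-one correspondence can be illustrated with a small lookup. The table below is a hand-picked sample of long Unicode script property value names (note the underscore in `Canadian_Aboriginal`), not a complete mapping. The fallback from a bibliographic variant to its base script is an assumption made for this sketch, and `unicode_script_alias` is a hypothetical name.

```python
# ISO 15924 code -> long Unicode Script property value name (small sample).
UNICODE_ALIAS = {
    "Latn": "Latin",
    "Grek": "Greek",
    "Cans": "Canadian_Aboriginal",
    "Hani": "Han",
    "Zyyy": "Common",
}

# Variants that ISO 15924 distinguishes but Unicode unifies with a base script.
# Assumption for this sketch: fall back to the base script's alias.
UNIFIED_INTO = {"Latf": "Latn", "Latg": "Latn", "Hans": "Hani", "Hant": "Hani"}


def unicode_script_alias(code: str):
    """Return the Unicode alias for a code, or None if the sample has none."""
    base = UNIFIED_INTO.get(code, code)
    return UNICODE_ALIAS.get(base)
```

Several ISO 15924 codes (`Latn`, `Latf`, `Latg`) lead to the same Unicode value here, while a private-use code such as `Qaaa` has no entry in this sample, so the mapping is neither injective nor total.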
See also
External Links
- (en) http://www.unicode.org/iso15924/ - Registration authority and maintenance agency for the codes for the representation of names of scripts, Unicode Consortium:
  - (en) (fr) ISO 15924 standard, the official online version;
  - (en) (fr) Table 1. Alphabetical list of four-letter script code elements;
  - (en) (fr) Table 2. Numerical list of script code elements;
  - (en) (fr) Table 3. Alphabetical list of script names in English;
  - (en) (fr) Table 4. Alphabetical list of script names in French.
- (en) BCP 47, the set of normative recommendations for choosing language tags to classify linguistic resources used in the localization of applications, IETF; it currently comprises:
  - (en) RFC 5646, Tags for Identifying Languages, Addison Phillips (ed.) and Mark Davis (ed.), September 2009;
  - (en) Errata for RFC 5646, search page for corrections made after the publication of RFC 5646 (no known correction in July 2010);
  - (en) RFC 4647, Matching of Language Tags, Addison Phillips (ed.) and Mark Davis (ed.), September 2006;
  - (en) Errata for RFC 4647, search page for corrections made after the publication of RFC 4647 (no known correction in July 2010);
- Annexes to BCP 47 (containing subtags derived from ISO 15924 for identifying the scripts used to write languages):
  - (en) Language Subtag Registry, normative data for BCP 47, IANA;
  - (en) RFC 5645 (informational), Update to the Language Subtag Registry, Addison Phillips (ed.) and Mark Davis (ed.), September 2009.
Related articles
- ISO 639 (language codes)
- ISO 3166 (codes of countries and regions)
- ISO/IEC 10646, Unicode (character encoding)
- International Phonetic Alphabet (IPA)