Diacritic

A diacritic, or diacritical mark (from the Greek for "distinguishing"), is a sign added to a letter or grapheme. Relative to the grapheme, the mark can be placed above (superscript diacritic), below (subscript diacritic), in or through (inscribed diacritic), after (adscript diacritic), before (prescript diacritic) or around (circumscribed diacritic).


The purpose of a diacritic is to:

  • change the phonetic value of the letter (or grapheme);
  • allow a more accurate reading (such diacritics are then optional);
  • avoid ambiguity between homographs.

There are also diacritical letters: silent letters that are necessarily written next to the letter they modify. Some of them have since become diacritics in their own right (cf. umlaut and ring above).

Like ligatures and additional letters invented later, adding diacritics expands the number of graphemes in a writing system. In many cases, a letter with a diacritic is not considered an independent grapheme but an allograph, that is to say another written variant of the plain letter. In that case the letter with a diacritic plays no part in alphabetical order.

For example, the acute accent in French modifies the phonetic value of e, generally pronounced ...

Diacritics by script and alphabet

Each script has developed its own diacritics:

Non-French diacritics of the Latin alphabet

  • German: for ...

Transcription of diacritics in computing

    ASCII transcriptions

    The standard ASCII character set, a product of the octal system widely used in the early days of computing, has 128 codes. Of these, 95 are printable characters, including 52 alphabetic characters (the 26 letters of the Latin alphabet in uppercase and lowercase), but no accented letters.
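    The counts above can be checked with a short Python sketch. It relies only on built-in functions; the variable names (`printable`, `letters`, `accented_in_ascii`) are illustrative and do not come from the original text.

```python
# Count ASCII codes by category (illustrative sketch).
all_codes = range(128)  # ASCII defines 128 codes, numbered 0 to 127

# Printable characters run from space (0x20) through tilde (0x7E).
printable = [c for c in all_codes if 0x20 <= c <= 0x7E]

# Alphabetic characters: A-Z and a-z.
letters = [c for c in printable if chr(c).isalpha()]

print(len(printable))  # 95
print(len(letters))    # 52

# Accented letters have no ASCII code, so encoding one fails.
try:
    "é".encode("ascii")
    accented_in_ascii = True
except UnicodeEncodeError:
    accented_in_ascii = False

print(accented_in_ascii)  # False
```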

    Several character sets, often referred to as "extended ASCII", have 256 codes; the 128 additional codes are used mainly to represent certain vowels and consonants of the Latin alphabet with diacritical marks.

    The first extended character sets, called code pages, were created by IBM for its PC. In this system, a code page (abbreviated "CP") is identified by a number associated with a particular set: CP437 is the "American" or "graphic" set, and CP850 is the "multilingual European" set.

    With the advent of graphical interfaces (Apple Macintosh, Microsoft Windows, X Window, etc.), the "graphic" characters of the code pages were no longer relevant, and a larger number of extended codes could be used for characters with diacritics. The sets created jointly by IBM and Microsoft for their two graphical platforms, Windows and OS/2 Presentation Manager, served as the basis for a series of ISO character sets, the ISO 8859 standard, which is divided into fifteen parts:

    • 8859-1 to 8859-4, 8859-9 to 8859-10, 8859-13 to 8859-15: "Latin-1" to "Latin-9", variants of the Latin alphabet with the diacritics used in various countries and regions (France, Italy, Spain, Albania, Turkey, Scandinavia, Hungary, Poland, etc.)
    • 8859-6: Latin and Arabic alphabets
    • 8859-7: Latin and Greek alphabets
    • 8859-8: Latin and Hebrew alphabets
    • 8859-11: Thai alphabet
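    To illustrate why the choice of 8-bit set matters, the following Python sketch decodes one and the same byte under ISO 8859-1, CP437 and CP850. It assumes the codec names used by Python's standard library (`latin-1`, `cp437`, `cp850`); the variable names are illustrative.

```python
# The same byte stands for different characters in different 8-bit sets.
byte = "é".encode("latin-1")        # ISO 8859-1 stores é as one byte, 0xE9

as_latin1 = byte.decode("latin-1")  # read back with the same set: é again
as_cp437 = byte.decode("cp437")     # read as IBM "American" code page
as_cp850 = byte.decode("cp850")     # read as IBM "multilingual European" page

print(byte, as_latin1, as_cp437, as_cp850)

# In both IBM code pages, é sits at a different position (0x82).
e_in_cp437 = "é".encode("cp437")
e_in_cp850 = "é".encode("cp850")
print(e_in_cp437, e_in_cp850)
```

    A file written with one set and read with another therefore shows the wrong accented letters, which is the incompatibility that later motivated Unicode.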

    When you do not have a French keyboard, or when an application does not support accented characters, you can represent these diacritics by adding a character before and/or after the letter to be accented. For example:

    Le garc,on ne pouvait 'e`tre l`a cet e'te'.

    See also the examples in each section on diacritics, and in section VIQR.
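    As a rough sketch of how such a transcription could be turned back into accented text, the Python function below assumes one particular convention, which is hypothetical and simpler than the sample sentence above: the ASCII mark always follows the letter it modifies. The function name `decode_postfix` and the `MARKS` table are illustrative. The sketch cannot tell a real comma or apostrophe from a diacritic, so it is only suitable for text known to follow the convention.

```python
import unicodedata

# Hypothetical postfix convention: the ASCII mark follows its letter.
MARKS = {
    "'": "\u0301",  # combining acute accent
    "`": "\u0300",  # combining grave accent
    "^": "\u0302",  # combining circumflex accent
    ",": "\u0327",  # combining cedilla
}

def decode_postfix(text: str) -> str:
    """Replace letter+mark pairs with the precomposed accented letter."""
    out = []
    for ch in text:
        if ch in MARKS and out:
            # Try to compose the previous character with the combining mark.
            candidate = unicodedata.normalize("NFC", out[-1] + MARKS[ch])
            # Accept only if a single precomposed character results.
            if len(candidate) == 1:
                out[-1] = candidate
                continue
        out.append(ch)
    return "".join(out)

print(decode_postfix("Le garc,on ne pouvait e^tre la` cet e'te'."))
# Le garçon ne pouvait être là cet été.
```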

    Transcriptions in Unicode

    The Unicode Consortium, which includes most of the big names in computing, grew out of work begun in the late 1980s to remedy the incompatibility of the character encodings developed for various hardware and software platforms (EBCDIC, the IBM/Microsoft code pages, the character sets specific to Apple, HP, Unix, etc.), in association with the development of the ISO 10646 standard.

    The initial goal was to develop a coding system using 16 bits instead of 8, allowing the encoding of 2^16, that is 65,536, characters. The standard has since been extended beyond 16 bits, because the variety of characters and symbols to be represented (including mathematical and scientific symbols) far exceeds this limit; Chinese script alone, with its variants, already exceeds 65,536 characters.
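    The 16-bit limit can be made concrete with a small Python sketch. The code point U+20000, taken from a CJK extension block, is chosen here only as an example of a character beyond the original range; the variable names are illustrative.

```python
# The original 16-bit design allowed 2**16 distinct codes.
limit = 2 ** 16
print(limit)  # 65536

e_acute = ord("é")   # 233, well inside the 16-bit range
cjk_ext = 0x20000    # example code point beyond the 16-bit range

fits_16 = e_acute < limit
beyond_16 = cjk_ext >= limit
print(fits_16, beyond_16)  # True True

# In UTF-16, a character beyond the limit needs two 16-bit units (4 bytes).
utf16 = chr(cjk_ext).encode("utf-16-be")
print(len(utf16))  # 4
```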

    The principle adopted was to group sets or subsets of characters and symbols into "pages" or "blocks" of 256 codes. For example, blocks 0 through 3 correspond to four subsets of the Latin alphabet, block 6 to the "combining diacritics" associated with characters of the Latin alphabet, block 7 to Greek and Coptic characters, block 11 to Hebrew, blocks 12 to 14 to the Arabic and Syriac alphabets, block 58 to currency symbols, blocks 63, 73, 77 and 78 to mathematical symbols, etc.

    In its final 16-bit version, Unicode did not retain pictographic scripts, which are covered by another standard.
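    The role of combining diacritics can be seen with Python's standard `unicodedata` module. The sketch below splits an accented letter into a base letter plus a combining mark (normalization form NFD) and uses that to remove diacritics; the helper names `decompose` and `strip_diacritics` are illustrative, not an established API.

```python
import unicodedata

def decompose(text: str) -> str:
    """Split accented letters into base letter + combining mark (NFD)."""
    return unicodedata.normalize("NFD", text)

def strip_diacritics(text: str) -> str:
    """Drop every combining mark left after decomposition."""
    return "".join(
        ch for ch in decompose(text) if not unicodedata.combining(ch)
    )

parts = decompose("é")
print([unicodedata.name(ch) for ch in parts])
# ['LATIN SMALL LETTER E', 'COMBINING ACUTE ACCENT']

print(strip_diacritics("garçon été"))  # garcon ete
```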

    There are three ways to insert a Unicode character in a document:

    • by value
    • by number
    • by alias

    Insertion by value consists of placing in the document the 16-bit numeric sequence that corresponds to a given character. The number and alias methods are used only in certain types of documents, including RTF and HTML files or similar formats (XML and PHP in particular). In all cases the principle is the same: precede or surround the number or the alias with an "escape" sequence.

    In HTML documents, place the sequence "&" (alias) or "&#" (number) at the beginning and the sign ";" at the end, with the number or alias in between.

    For example, the sequences "&#38;" and "&amp;" both represent the ampersand sign "&".
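    The HTML number and alias forms can be checked with Python's standard `html` module. The sketch below resolves both forms for the ampersand and for é, and also produces a numeric reference from a character; the variable names are illustrative.

```python
import html

# Number form and alias form of the ampersand.
by_number = html.unescape("&#38;")
by_alias = html.unescape("&amp;")
print(by_number, by_alias)  # & &

# The same letter é by decimal number, hexadecimal number and alias.
e_by_number = html.unescape("&#233;")
e_by_hex = html.unescape("&#xE9;")
e_by_alias = html.unescape("&eacute;")
print(e_by_number, e_by_hex, e_by_alias)  # é é é

# Going the other way: write é as a numeric reference in ASCII-only output.
escaped = "é".encode("ascii", "xmlcharrefreplace").decode("ascii")
print(escaped)  # &#233;
```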
