Intelligent Cultural Resources Information Management
Last update: April 07, 2008

Unicode

Data Input and Typefaces

On Linux platforms, data input is straightforward, since most tools support Unicode input out of the box, usually in the form of a character map. For the proprietary MS platform, we'd like to single out Sharmahd Computing's Unipad, for which we've found an additional keyboard for Polytonic Greek. We have lost track of the original download location, and unfortunately the original author's name isn't included.

Most modern typefaces support Extended Latin and Basic Greek. The Extended (Polytonic) Greek set is found, for example, in the Palatino Linotype, Sylfaen and Thorndale typefaces. A special mention must go to Victor Gaultney's Gentium, for which there are truly wonderful specimens, including the full Ancient Greek set. Another is Juan José Marcos' Alphabetum font, which covers not only classical and medieval Latin and ancient Greek, but also Old Italic (Etruscan, Oscan, Umbrian, Faliscan, Messapic, Picene), Gothic, Iberian, Celtiberian, Old and Middle English, Hebrew, Sanskrit, Runic, Ogham, Ugaritic, Old Persian cuneiform, old and medieval Nordic, and I.P.A.
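To judge whether a typeface with only Basic Greek is enough for a given text, it can help to check which blocks the text actually draws on. The following is a minimal Python sketch, not part of any tool mentioned here: the function name `blocks_needed` and the small `BLOCKS` table are our own illustrative choices, and the ranges are those of the Greek and Coptic (U+0370 to U+03FF) and Greek Extended (U+1F00 to U+1FFF) blocks. It assumes the text should be compared in precomposed (NFC) form.

```python
import unicodedata

# Hypothetical helper table: block name -> (first, last) code point.
BLOCKS = {
    "Greek and Coptic": (0x0370, 0x03FF),
    "Greek Extended": (0x1F00, 0x1FFF),
}

def blocks_needed(text: str) -> set:
    """Return the names of the listed blocks that `text` uses."""
    needed = set()
    # Compose first, so that a base letter plus combining accents is
    # counted as the precomposed polytonic character where one exists.
    for ch in unicodedata.normalize("NFC", text):
        for name, (first, last) in BLOCKS.items():
            if first <= ord(ch) <= last:
                needed.add(name)
    return needed

print(sorted(blocks_needed("Μῆνιν ἄειδε θεὰ")))
# → ['Greek and Coptic', 'Greek Extended']
```

A text that reports only "Greek and Coptic" should be displayable with a Basic Greek typeface; one that also reports "Greek Extended" needs a typeface with the polytonic set.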

Browser Support

Usually, the Firefox browser does the best job of rendering Unicode texts because it automatically tries to provide a complete display, even switching typefaces as necessary.

Here are short samples in Arabic and Polytonic Greek.

ما هي الشفرة الموحدة "يونِكود" ؟

أساسًا، تتعامل الحواسيب فقط مع الأرقام، وتقوم بتخزين الأحرف والمحارف الأخرى بعد أن تُعطي رقما معينا لكل واحد منها. وقبل اختراع "يونِكود"، كان هناك مئات الأنظمة للتشفير وتخصيص هذه الأرقام للمحارف، ولم يوجد نظام تشفير واحد يحتوي على جميع المحارف الضرورية. وعلى سبيل المثال، فإن الاتحاد الأوروبي لوحده، احتوى العديد من الشفرات المختلفة ليغطي جميع اللغات المستخدمة في الاتحاد. وحتى لو اعتبرنا لغة واحدة، كاللغة الإنجليزية، فإن جدول شفرة واحد لم يكف لاستيعاب جميع الأحرف وعلامات الترقيم والرموز الفنية والعلمية الشائعة الاستعما

ΙΛΙΑΔΟΣ

Α Μῆνιν ἄειδε θεὰ Πηληϊάδεω ᾿Αχιλῆος / οὐλομένην, ἣ μυρί’ ᾿Αχαιοῖς ἄλγε’ ἔθηκε, / πολλὰς δ’ ἰφθίμους ψυχὰς ῎Αϊδι προΐαψεν / ἡρώων, αὐτοὺς δὲ ἑλώρια τεῦχε κύνεσσιν / οἰωνοῖσί τε πᾶσι, Διὸς δ’ ἐτελείετο βουλή, / ἐξ οὗ δὴ τὰ πρῶτα διαστήτην ἐρίσαντε / ᾿Ατρεΐδης τε ἄναξ ἀνδρῶν καὶ δῖος ᾿Αχιλλεύς.

As a further test, here is a small selection of Unicode characters. How completely they display depends on which Unicode fonts are installed on your system (Arial Unicode MS, Gentium, Palatino Linotype, Lucida Sans Unicode, etc.).

ASCII + Latin 1 Supplement

! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ­ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

Latin Extended A

Ā ā Ă ă Ą ą Ć ć Ĉ ĉ Ċ ċ Č č Ď ď Đ đ Ē ē Ĕ ĕ Ė ė Ę ę Ě ě Ĝ ĝ Ğ ğ Ġ ġ Ģ ģ Ĥ ĥ Ħ ħ Ĩ ĩ Ī ī Ĭ ĭ Į į İ ı Ĳ ĳ Ĵ ĵ Ķ ķ ĸ Ĺ ĺ Ļ ļ Ľ ľ Ŀ ŀ Ł ł Ń ń Ņ ņ Ň ň ŉ Ŋ ŋ Ō ō Ŏ ŏ Ő ő Œ œ Ŕ ŕ Ŗ ŗ Ř ř Ś ś Ŝ ŝ Ş ş Š š Ţ ţ Ť ť Ŧ ŧ Ũ ũ Ū ū Ŭ ŭ Ů ů Ű ű Ų ų Ŵ ŵ Ŷ ŷ Ÿ Ź ź Ż ż Ž ž ſ

Latin Extended B

ƀ Ɓ Ƃ ƃ Ƅ ƅ Ɔ Ƈ ƈ Ɖ Ɗ Ƌ ƌ ƍ Ǝ Ə Ɛ Ƒ ƒ Ɠ Ɣ ƕ Ɩ Ɨ Ƙ ƙ ƚ ƛ Ɯ Ɲ ƞ Ɵ Ơ ơ Ƣ ƣ Ƥ ƥ Ʀ Ƨ ƨ Ʃ ƪ ƫ Ƭ ƭ Ʈ Ư ư Ʊ Ʋ Ƴ ƴ Ƶ ƶ Ʒ Ƹ ƹ ƺ ƻ Ƽ ƽ ƾ ƿ ǀ ǁ ǂ ǃ Ǆ ǅ ǆ Ǉ ǈ ǉ Ǌ ǋ ǌ Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ ǝ Ǟ ǟ Ǡ ǡ Ǣ ǣ Ǥ ǥ Ǧ ǧ Ǩ ǩ Ǫ ǫ Ǭ ǭ Ǯ ǯ ǰ Ǳ ǲ ǳ Ǵ ǵ Ƕ Ƿ Ǹ ǹ Ǻ ǻ Ǽ ǽ Ǿ ǿ Ȁ ȁ Ȃ ȃ Ȅ ȅ Ȇ ȇ Ȉ ȉ Ȋ ȋ Ȍ ȍ Ȏ ȏ Ȑ ȑ Ȓ ȓ Ȕ ȕ Ȗ ȗ Ș ș Ț ț Ȝ ȝ Ȟ ȟ Ƞ Ȣ ȣ Ȥ ȥ Ȧ ȧ Ȩ ȩ Ȫ ȫ Ȭ ȭ Ȯ ȯ Ȱ ȱ Ȳ ȳ

Latin Extended Additional

Ḁ ḁ Ḃ ḃ Ḅ ḅ Ḇ ḇ Ḉ ḉ Ḋ ḋ Ḍ ḍ Ḏ ḏ Ḑ ḑ Ḓ ḓ Ḕ ḕ Ḗ ḗ Ḙ ḙ Ḛ ḛ Ḝ ḝ Ḟ ḟ Ḡ ḡ Ḣ ḣ Ḥ ḥ Ḧ ḧ Ḩ ḩ Ḫ ḫ Ḭ ḭ Ḯ ḯ Ḱ ḱ Ḳ ḳ Ḵ ḵ Ḷ ḷ Ḹ ḹ Ḻ ḻ Ḽ ḽ Ḿ ḿ Ṁ ṁ Ṃ ṃ Ṅ ṅ Ṇ ṇ Ṉ ṉ Ṋ ṋ Ṍ ṍ Ṏ ṏ Ṑ ṑ Ṓ ṓ Ṕ ṕ Ṗ ṗ Ṙ ṙ Ṛ ṛ Ṝ ṝ Ṟ ṟ Ṡ ṡ Ṣ ṣ Ṥ ṥ Ṧ ṧ Ṩ ṩ Ṫ ṫ Ṭ ṭ Ṯ ṯ Ṱ ṱ Ṳ ṳ Ṵ ṵ Ṷ ṷ Ṹ ṹ Ṻ ṻ Ṽ ṽ Ṿ ṿ Ẁ ẁ Ẃ ẃ Ẅ ẅ Ẇ ẇ Ẉ ẉ Ẋ ẋ Ẍ ẍ Ẏ ẏ Ẑ ẑ Ẓ ẓ Ẕ ẕ ẖ ẗ ẘ ẙ ẚ ẛ Ạ ạ Ả ả Ấ ấ Ầ ầ Ẩ ẩ Ẫ ẫ Ậ ậ Ắ ắ Ằ ằ Ẳ ẳ Ẵ ẵ Ặ ặ Ẹ ẹ Ẻ ẻ Ẽ ẽ Ế ế Ề ề Ể ể Ễ ễ Ệ ệ Ỉ ỉ Ị ị Ọ ọ Ỏ ỏ Ố ố Ồ ồ Ổ ổ Ỗ ỗ Ộ ộ Ớ ớ Ờ ờ Ở ở Ỡ ỡ Ợ ợ Ụ ụ Ủ ủ Ứ ứ Ừ ừ Ử ử Ữ ữ Ự ự Ỳ ỳ Ỵ ỵ Ỷ ỷ Ỹ ỹ

Greek and Coptic

ʹ ͵ ͺ ; ΄ ΅ Ά · Έ Ή Ί Ό Ύ Ώ ΐ Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω Ϊ Ϋ ά έ ή ί ΰ α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ ς σ τ υ φ χ ψ ω ϊ ϋ ό ύ ώ ϐ ϑ ϒ ϓ ϔ ϕ ϖ Ϙ ϙ ϗ Ϛ ϛ Ϝ ϝ Ϟ ϟ Ϡ ϡ Ϣ ϣ Ϥ ϥ Ϧ ϧ Ϩ ϩ Ϫ ϫ Ϭ ϭ Ϯ ϯ ϰ ϱ ϲ ϳ ϴ ϵ ϶

Greek Extended

ἀ ἁ ἂ ἃ ἄ ἅ ἆ ἇ Ἀ Ἁ Ἂ Ἃ Ἄ Ἅ Ἆ Ἇ ἐ ἑ ἒ ἓ ἔ ἕ Ἐ Ἑ Ἒ Ἓ Ἔ Ἕ ἠ ἡ ἢ ἣ ἤ ἥ ἦ ἧ Ἠ Ἡ Ἢ Ἣ Ἤ Ἥ Ἦ Ἧ ἰ ἱ ἲ ἳ ἴ ἵ ἶ ἷ Ἰ Ἱ Ἲ Ἳ Ἴ Ἵ Ἶ Ἷ ὀ ὁ ὂ ὃ ὄ ὅ Ὀ Ὁ Ὂ Ὃ Ὄ Ὅ ὐ ὑ ὒ ὓ ὔ ὕ ὖ ὗ Ὑ Ὓ Ὕ Ὗ ὠ ὡ ὢ ὣ ὤ ὥ ὦ ὧ Ὠ Ὡ Ὢ Ὣ Ὤ Ὥ Ὦ Ὧ ὰ ά ὲ έ ὴ ή ὶ ί ὸ ό ὺ ύ ὼ ώ ᾀ ᾁ ᾂ ᾃ ᾄ ᾅ ᾆ ᾇ ᾈ ᾉ ᾊ ᾋ ᾌ ᾍ ᾎ ᾏ ᾐ ᾑ ᾒ ᾓ ᾔ ᾕ ᾖ ᾗ ᾘ ᾙ ᾚ ᾛ ᾜ ᾝ ᾞ ᾟ ᾠ ᾡ ᾢ ᾣ ᾤ ᾥ ᾦ ᾧ ᾨ ᾩ ᾪ ᾫ ᾬ ᾭ ᾮ ᾯ ᾰ ᾱ ᾲ ᾳ ᾴ ᾶ ᾷ Ᾰ Ᾱ Ὰ Ά ᾼ ᾽ ι ᾿ ῀ ῁ ῂ ῃ ῄ ῆ ῇ Ὲ Έ Ὴ Ή ῌ ῍ ῎ ῏ ῐ ῑ ῒ ΐ ῖ ῗ Ῐ Ῑ Ὶ Ί ῝ ῞ ῟ ῠ ῡ ῢ ΰ ῤ ῥ ῦ ῧ Ῠ Ῡ Ὺ Ύ Ῥ ῭ ΅ ` ῲ ῳ ῴ ῶ ῷ Ὸ Ό Ὼ Ώ ῼ ´ ῾

Runic

ᚠ ᚡ ᚢ ᚣ ᚤ ᚥ ᚦ ᚧ ᚨ ᚩ ᚪ ᚫ ᚬ ᚭ ᚮ ᚯ ᚰ ᚱ ᚲ ᚳ ᚴ ᚵ ᚶ ᚷ ᚸ ᚹ ᚺ ᚻ ᚼ ᚽ ᚾ ᚿ ᛀ ᛁ ᛂ ᛃ ᛄ ᛅ ᛆ ᛇ ᛈ ᛉ ᛊ ᛋ ᛌ ᛍ ᛎ ᛏ ᛐ ᛑ ᛒ ᛓ ᛔ ᛕ ᛖ ᛗ ᛘ ᛙ ᛚ ᛛ ᛜ ᛝ ᛞ ᛟ ᛠ ᛡ ᛢ ᛣ ᛤ ᛥ ᛦ ᛧ ᛨ ᛩ ᛪ ᛫ ᛬ ᛭ ᛮ ᛯ ᛰ
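Sample rows like the ones above need not be typed by hand. As a rough sketch, assuming Python and its standard `unicodedata` module, the following lists every named character in a code point range; the function name `block_sample` is our own, and the Runic range U+16A0 to U+16FF is taken from the Unicode block definition. Which code points count as assigned depends on the Unicode version shipped with the Python interpreter, so the output may contain more characters than the row shown here.

```python
import unicodedata

def block_sample(first: int, last: int) -> str:
    """Return the named characters in [first, last], separated by spaces."""
    chars = []
    for cp in range(first, last + 1):
        ch = chr(cp)
        # name() has no entry for unassigned code points or control
        # characters; the "" default lets us skip them silently.
        if unicodedata.name(ch, ""):
            chars.append(ch)
    return " ".join(chars)

# Runic block, U+16A0..U+16FF
print(block_sample(0x16A0, 0x16FF))
```

The same call with other bounds, for instance `block_sample(0x1F00, 0x1FFF)` for Greek Extended, produces a test row for any other block.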

The following are various characters used for scientific texts:

Character Hexadecimal Name
〚 U+301A LEFT WHITE SQUARE BRACKET
〛 U+301B RIGHT WHITE SQUARE BRACKET
‖ U+2016 DOUBLE VERTICAL LINE
| U+007C VERTICAL LINE
+ U+002B PLUS SIGN
⊂ U+2282 SUBSET OF
⊃ U+2283 SUPERSET OF
☧ U+2627 CHI RHO
✠ U+2720 MALTESE CROSS
※ U+203B REFERENCE MARK
Ϝ U+03DC GREEK LETTER DIGAMMA
ϲ U+03F2 GREEK LUNATE SIGMA SYMBOL
Ϡ U+03E0 GREEK LETTER SAMPI
Ϟ U+03DE GREEK LETTER KOPPA
Ϙ U+03D8 GREEK LETTER ARCHAIC KOPPA
Ϛ U+03DA GREEK LETTER STIGMA
▲ U+25B2 BLACK UP-POINTING TRIANGLE
▴ U+25B4 BLACK UP-POINTING SMALL TRIANGLE
▼ U+25BC BLACK DOWN-POINTING TRIANGLE
▾ U+25BE BLACK DOWN-POINTING SMALL TRIANGLE
⋮ U+22EE VERTICAL ELLIPSIS
〈 U+3008 LEFT ANGLE BRACKET
〉 U+3009 RIGHT ANGLE BRACKET
《 U+300A LEFT DOUBLE ANGLE BRACKET
》 U+300B RIGHT DOUBLE ANGLE BRACKET
〈 U+2329 LEFT-POINTING ANGLE BRACKET
〉 U+232A RIGHT-POINTING ANGLE BRACKET
⌜o U+231C TOP LEFT CORNER
o⌝ U+231D TOP RIGHT CORNER
′o U+2032 PRIME
o‵ U+2035 REVERSED PRIME
∘ U+2218 RING OPERATOR
∙ U+2219 BULLET OPERATOR
Ⅰ U+2160 ROMAN NUMERAL ONE
Ⅴ U+2164 ROMAN NUMERAL FIVE
Ⅹ U+2169 ROMAN NUMERAL TEN
Ⅼ U+216C ROMAN NUMERAL FIFTY
Ⅽ U+216D ROMAN NUMERAL ONE HUNDRED
Ⅾ U+216E ROMAN NUMERAL FIVE HUNDRED
Ⅿ U+216F ROMAN NUMERAL ONE THOUSAND
ↀ U+2180 ROMAN NUMERAL ONE THOUSAND C D
ↁ U+2181 ROMAN NUMERAL FIVE THOUSAND
ↂ U+2182 ROMAN NUMERAL TEN THOUSAND
◌̥ U+0325 COMBINING RING BELOW
◌̄ U+0304 COMBINING MACRON
◌̅ U+0305 COMBINING OVERLINE
◌́ U+0301 COMBINING ACUTE ACCENT
◌́ U+0341 COMBINING ACUTE TONE MARK
◌̂ U+0302 COMBINING CIRCUMFLEX ACCENT
◌̵ U+0335 COMBINING SHORT STROKE OVERLAY
◌̶ U+0336 COMBINING LONG STROKE OVERLAY
{ U+007B LEFT CURLY BRACKET
} U+007D RIGHT CURLY BRACKET
⁰ U+2070 SUPERSCRIPT ZERO
¹ U+00B9 SUPERSCRIPT ONE
² U+00B2 SUPERSCRIPT TWO
³ U+00B3 SUPERSCRIPT THREE
⁴ U+2074 SUPERSCRIPT FOUR
⁵ U+2075 SUPERSCRIPT FIVE
⁶ U+2076 SUPERSCRIPT SIX
⁷ U+2077 SUPERSCRIPT SEVEN
⁸ U+2078 SUPERSCRIPT EIGHT
⁹ U+2079 SUPERSCRIPT NINE
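The three columns of this table (character, hexadecimal code point, name) can be derived from the code point alone. Here is a small sketch in Python using the standard `unicodedata` module; the function name `table_row`, the tab-separated layout and the choice to show combining marks on a dotted circle (U+25CC) are our own assumptions, not something the table above prescribes.

```python
import unicodedata

def table_row(code_point: int) -> str:
    """Format one 'character, U+XXXX, NAME' row for a code point."""
    ch = chr(code_point)
    name = unicodedata.name(ch, "<unnamed>")
    # Assumption: combining marks get a dotted-circle base so that they
    # have something visible to attach to.
    shown = "\u25cc" + ch if unicodedata.combining(ch) else ch
    return f"{shown}\tU+{code_point:04X}\t{name}"

for cp in (0x301A, 0x2282, 0x03DC, 0x0325, 0x2074):
    print(table_row(cp))

# The reverse direction: from an official name back to the character.
digamma = unicodedata.lookup("GREEK LETTER DIGAMMA")
print(f"U+{ord(digamma):04X}")  # → U+03DC
```

Checking a hand-written table against `unicodedata.name` in this way is a cheap guard against mistyped code points or names.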