Cyrillic in LaTeX and Postscript and Unicode
Let's see how to deal with the Cyrillic alphabet in
LaTeX, Postscript, and Unicode.
You might also be interested in this
free on-line journal on Postscript and PDF.
And just in case you need a PDF to Word converter, use
OpenOffice
with its
PDF Import Extension.
You can import PDF and export as Word, all with free software!
Here are some other great sources of detailed information on
how to deal with LaTeX fonts:
Fonts and TeX
The LaTeX Font Catalogue
Cyrillic in LaTeX
PackageManagement
Install the texlive-lang-cyrillic and texlive-lang-greek packages using yum or apt.
Modify your source document file to include some packages. Do this in the block at the top of the file where you're setting things up.
[ ... ]
\usepackage[OT2,T1]{fontenc}
\usepackage[russian,greek,english]{babel}
[ ... ]
Now down within your LaTeX document file you could, for example, include the Cyrillic alphabet:
[ ... ]
This is the Cyrillic alphabet:
{\sffamily\foreignlanguage{russian}{
a b v g d e {\cyrzh} z i {\cyrishrt} k l m n o p
r s t u f h c q x w {\cyrhrdsn} y {\cyrsftsn} {\cyrerev} {\cyryu} {\cyrya}
A B V G D E {\CYRZH} Z I {\CYRISHRT} K L M N O P
R S T U F H C Q X W {\CYRHRDSN} Y {\CYRSFTSN} {\CYREREV} {\CYRYU} {\CYRYA}
}}
{\sffamily\foreignlanguage{russian}{
{\CYRYA} byl remontnikom v bol{\cyrsftsn}nice.
}}
[ ... ]
The result will look like this:
А Б В Г Д Е Ж З И Й К Л М Н О П
Р С Т У Ф Х Ц Ч Ш Щ Ъ Ы Ь Э Ю Я
Я был ремонтником в больнице
Find missing letters in this file,
part of the texlive-lang-cyrillic
package:
/usr/share/doc/texlive-doc/latex/cyrillic/cyoutenc.pdf
This page is very helpful for Cyrillic
and many other character sets:
http://www.bitjungle.com/isoent/index_files/isoent-ref.pdf
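To check a transliterated string before running LaTeX, the letter correspondence shown in the alphabet example above can be sketched in Python. This is only a rough preview under stated assumptions: the names `OT2_LETTERS`, `OT2_MACROS` and `preview_ot2` are hypothetical, the tables contain just the letters and macros used on this page, and any multi-letter ligatures the real font encoding may define are ignored.

```python
import re

# ASCII letter -> Cyrillic letter, read off the alphabet example on this page.
OT2_LETTERS = dict(zip(
    "abvgdezi" "klmnoprstufhcqxwy",
    "абвгдези" "клмнопрстуфхцчшщы",
))

# Macros used on this page for letters without a single ASCII key.
OT2_MACROS = {
    "cyrzh": "ж", "cyrishrt": "й", "cyrhrdsn": "ъ", "cyrsftsn": "ь",
    "cyrerev": "э", "cyryu": "ю", "cyrya": "я",
}


def preview_ot2(source):
    """Return an approximate Cyrillic preview of OT2-style ASCII input."""
    def convert(match):
        macro, letter = match.group(1), match.group(2)
        name = macro if macro is not None else letter
        table = OT2_MACROS if macro is not None else OT2_LETTERS
        cyrillic = table.get(name.lower())
        if cyrillic is None:
            return match.group(0)  # unknown macro or letter: keep as typed
        # An upper-case macro or letter selects the capital form.
        return cyrillic.upper() if name.isupper() else cyrillic

    # One pass: braced macros such as {\cyrzh} first, else single letters.
    return re.sub(r"\{\\([A-Za-z]+)\}|([A-Za-z])", convert, source)


print(preview_ot2("Moskva"))
print(preview_ot2(r"{\CYRYA} {\cyrzh}"))
```

Here `preview_ot2("Moskva")` returns "Москва". Anything the tables do not know, including digits and punctuation, passes through unchanged, so a stray Latin letter in the preview points at a character to look up in cyoutenc.pdf.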
Let's Do Greek
The preamble shown above also loaded Greek language support.
Let's see how to use it!
Put something like this in your LaTeX source document:
[ ... ]
\usepackage[OT2,T1]{fontenc}
\usepackage[russian,greek,english]{babel}
[ ... ]
This is the Greek alphabet:
{\sffamily\foreignlanguage{greek}{
a b g d e z h j i k l m n x o p r s c t u f q y w \\
'a 'e 'h 'i "i "u 'o 'u 'w \\
A B G D E Z H J I K L M N X O P R S T U F Q Y W \\
'A 'E 'H 'I "I "U 'O 'U 'W}}
{\sffamily\foreignlanguage{greek}{swkr{'a}ths}}
[ ... ]
That generates the following. Notice how σ/ς works: the same input letter s is typeset as σ inside a word and as ς, the final sigma, at the end of a word.
α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ ς σ τ υ φ χ ψ ω
ά έ ή ί ϊ ϋ ό ύ ώ
Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω
Ά Έ Ή Ί Ϊ Ϋ Ό Ύ Ώ
σωκράτης
An alternative is the T2A font encoding with the babel package and UTF-8 input,
but for that you need a Cyrillic keyboard for the Russian part:
\documentclass[letterpaper,12pt]{article}
% Font and input encodings first, so babel finds the Cyrillic encoding.
\usepackage[T1,T2A]{fontenc}
% utf8 (not latin1) is needed to type Cyrillic directly.
\usepackage[utf8]{inputenc}
% The language list belongs to babel, not textcomp.
\usepackage[russian,greek,english]{babel}
\begin{document}
The last language listed will be the active (or default) one.
The others can be chosen for large blocks:

\selectlanguage{russian}
Горбачёв

\selectlanguage{greek}
Ellhnik'o ke'imeno.

\selectlanguage{english}
You can also insert short pieces of text in arbitrary languages,
even within paragraphs of a different language:
% With T2A, Latin letters are not transliterated, so type Cyrillic directly.
The capital of Russia is \foreignlanguage{russian}{Москва.}
The capital of Greece is \foreignlanguage{greek}{Aj'hna.}
\end{document}
Also see:
LaTeX/Internationalization
Wikibooks
Multilingual LaTeX with the Babel Package
Reed College
How to make LaTeX2e understand Russian
Greek in LaTeX
Cyrillic in Postscript
The theory is that you can do something like the following and get Postscript that renders Cyrillic:
%!
%%Creator: Your Name Here
%%BoundingBox: 0 0 792 611
%%
%% Postscript Cyrillic demo
%%
%% Define measurements in millimeters, 1 mm = 2.834645 Postscript points
/mm { 2.834645 mul } def
%% Use the Cyrillic-Italic font. Could be just Cyrillic, etc:
/Cyrillic-Italic findfont 12 scalefont setfont
%% Move to the location (50mm, 50mm) and Russify my name:
50 mm 50 mm moveto (Robert Vilhelmoviq Kromvell) show
showpage
You have to figure out the quirky character-to-character mapping. Some letters are obvious: you type the ASCII letter that is pronounced in a Roman-alphabet language much like the corresponding Cyrillic letter is in a Slavic language. Others are not, like those in the following list.
The one that I cannot figure out is the Cyrillic character "ya" or я — if you know how to do this with the ASCII encoding, without remapping your keyboard to a Cyrillic character set, please let me know!
-/_ for "eh/EH"
j/J for "zh/ZH"
y/Y for "e-kratkaya/E-KRATKAYA"
[/{ for "yuri/YURI"
]/} for "yu/YU"
h/H for "kh/KH"
q/Q for "ch/CH"
w/W for "sh/SH"
x/X for "shch/SHCH"
c/C for "ts/TS"
+/\# for "YAT/yat"
Cyrillic in Unicode
The real answer is what you find at the Unicode organization's site. I have this HTML table for my own use — I have a copy on my laptop, and I don't have to bother with rendering the Unicode PDF file. Plus, you can see how well your browser renders Unicode... Both Firefox and Chrome (and even Konqueror last I checked) do a fine job on Linux and OpenBSD.
Unicode describes the codes as:
0400-040f — Cyrillic extensions
0410-044f — Basic Russian alphabet
0450-045f — Cyrillic extensions
0460-0481 — Historic letters
0482-0489 — Historic miscellaneous
048a-04f9 — Cyrillic extensions
04fa-04ff — Additions for Nivkh
0500-050f — Komi letters
0510-0513 — Cyrillic extensions
Codes 048a-04ff are mostly for Cyrillic representation of
non-Slavic languages like Sami, Azerbaijani, Yakut, Tatar,
and so on.
0500-0513 are entirely for Cyrillic representation of
Komi, Enets, Khanty, Chukchi, etc.
Read the Unicode pages
to see how arcane some of these are, and to get explanations
or at least names and language attributions for all the
characters.
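The ranges listed above can be turned into a small lookup. The sketch below copies the ranges and labels from this page as they stand; the names `CYRILLIC_RANGES` and `describe` are hypothetical, and the labels are this page's summary rather than official Unicode block names.

```python
# (first code point, last code point, label), copied from the list above.
CYRILLIC_RANGES = [
    (0x0400, 0x040F, "Cyrillic extensions"),
    (0x0410, 0x044F, "Basic Russian alphabet"),
    (0x0450, 0x045F, "Cyrillic extensions"),
    (0x0460, 0x0481, "Historic letters"),
    (0x0482, 0x0489, "Historic miscellaneous"),
    (0x048A, 0x04F9, "Cyrillic extensions"),
    (0x04FA, 0x04FF, "Additions for Nivkh"),
    (0x0500, 0x050F, "Komi letters"),
    (0x0510, 0x0513, "Cyrillic extensions"),
]


def describe(char):
    """Return the label of the range containing char, or None if outside."""
    code_point = ord(char)
    for first, last, label in CYRILLIC_RANGES:
        if first <= code_point <= last:
            return label
    return None


print(describe("д"))   # U+0434
print(describe("ѣ"))   # U+0463
```

For example, `describe("д")` returns "Basic Russian alphabet" and `describe("ѣ")` returns "Historic letters"; a Latin letter such as "A" gives None.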
To use this table, place the code between &#x and ; (a semicolon).
So, the Russian word да is created with:
&#x0434;&#x0430;
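Writing the references by hand gets tedious for longer words. As a minimal sketch, assuming you have the text available as a Unicode string, Python can generate them and the standard `html.unescape` function can decode them again as a check. The function name `to_hex_entities` is hypothetical.

```python
import html


def to_hex_entities(text):
    """Encode every character as an HTML hexadecimal character reference."""
    return "".join(f"&#x{ord(ch):04x};" for ch in text)


encoded = to_hex_entities("да")
print(encoded)                 # the markup to paste into the HTML source
print(html.unescape(encoded))  # decodes back to the original word
```

Note that this encodes every character it is given, including ASCII letters and spaces, so in practice you would pass it only the Cyrillic words.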
Basic Russian Alphabet | | | | | |
Ѐ 0400 | А 0410 | Р 0420 | а 0430 | р 0440 | ѐ 0450 |
Ё 0401 | Б 0411 | С 0421 | б 0431 | с 0441 | ё 0451 |
Ђ 0402 | В 0412 | Т 0422 | в 0432 | т 0442 | ђ 0452 |
Ѓ 0403 | Г 0413 | У 0423 | г 0433 | у 0443 | ѓ 0453 |
Є 0404 | Д 0414 | Ф 0424 | д 0434 | ф 0444 | є 0454 |
Ѕ 0405 | Е 0415 | Х 0425 | е 0435 | х 0445 | ѕ 0455 |
І 0406 | Ж 0416 | Ц 0426 | ж 0436 | ц 0446 | і 0456 |
Ї 0407 | З 0417 | Ч 0427 | з 0437 | ч 0447 | ї 0457 |
Ј 0408 | И 0418 | Ш 0428 | и 0438 | ш 0448 | ј 0458 |
Љ 0409 | Й 0419 | Щ 0429 | й 0439 | щ 0449 | љ 0459 |
Њ 040a | К 041a | Ъ 042a | к 043a | ъ 044a | њ 045a |
Ћ 040b | Л 041b | Ы 042b | л 043b | ы 044b | ћ 045b |
Ќ 040c | М 041c | Ь 042c | м 043c | ь 044c | ќ 045c |
Ѝ 040d | Н 041d | Э 042d | н 043d | э 044d | ѝ 045d |
Ў 040e | О 041e | Ю 042e | о 043e | ю 044e | ў 045e |
Џ 040f | П 041f | Я 042f | п 043f | я 044f | џ 045f |
Ѡ 0460 | Ѱ 0470 | Ҁ 0480 | Ґ 0490 | Ҡ 04a0 | Ұ 04b0 |
ѡ 0461 | ѱ 0471 | ҁ 0481 | ґ 0491 | ҡ 04a1 | ұ 04b1 |
Ѣ 0462 | Ѳ 0472 | ҂ 0482 | Ғ 0492 | Ң 04a2 | Ҳ 04b2 |
ѣ 0463 | ѳ 0473 | ҃ 0483 | ғ 0493 | ң 04a3 | ҳ 04b3 |
Ѥ 0464 | Ѵ 0474 | ҄ 0484 | Ҕ 0494 | Ҥ 04a4 | Ҵ 04b4 |
ѥ 0465 | ѵ 0475 | ҅ 0485 | ҕ 0495 | ҥ 04a5 | ҵ 04b5 |
Ѧ 0466 | Ѷ 0476 | ҆ 0486 | Җ 0496 | Ҧ 04a6 | Ҷ 04b6 |
ѧ 0467 | ѷ 0477 | ҇ 0487 | җ 0497 | ҧ 04a7 | ҷ 04b7 |
Ѩ 0468 | Ѹ 0478 | ҈ 0488 | Ҙ 0498 | Ҩ 04a8 | Ҹ 04b8 |
ѩ 0469 | ѹ 0479 | ҉ 0489 | ҙ 0499 | ҩ 04a9 | ҹ 04b9 |
Ѫ 046a | Ѻ 047a | Ҋ 048a | Қ 049a | Ҫ 04aa | Һ 04ba |
ѫ 046b | ѻ 047b | ҋ 048b | қ 049b | ҫ 04ab | һ 04bb |
Ѭ 046c | Ѽ 047c | Ҍ 048c | Ҝ 049c | Ҭ 04ac | Ҽ 04bc |
ѭ 046d | ѽ 047d | ҍ 048d | ҝ 049d | ҭ 04ad | ҽ 04bd |
Ѯ 046e | Ѿ 047e | Ҏ 048e | Ҟ 049e | Ү 04ae | Ҿ 04be |
ѯ 046f | ѿ 047f | ҏ 048f | ҟ 049f | ү 04af | ҿ 04bf |
Ӏ 04c0 | Ӑ 04d0 | Ӡ 04e0 | Ӱ 04f0 | Ԁ 0500 | Ԑ 0510 |
Ӂ 04c1 | ӑ 04d1 | ӡ 04e1 | ӱ 04f1 | ԁ 0501 | ԑ 0511 |
ӂ 04c2 | Ӓ 04d2 | Ӣ 04e2 | Ӳ 04f2 | Ԃ 0502 | Ԓ 0512 |
Ӄ 04c3 | ӓ 04d3 | ӣ 04e3 | ӳ 04f3 | ԃ 0503 | ԓ 0513 |
ӄ 04c4 | Ӕ 04d4 | Ӥ 04e4 | Ӵ 04f4 | Ԅ 0504 | |
Ӆ 04c5 | ӕ 04d5 | ӥ 04e5 | ӵ 04f5 | ԅ 0505 | |
ӆ 04c6 | Ӗ 04d6 | Ӧ 04e6 | Ӷ 04f6 | Ԇ 0506 | |
Ӈ 04c7 | ӗ 04d7 | ӧ 04e7 | ӷ 04f7 | ԇ 0507 | |
ӈ 04c8 | Ә 04d8 | Ө 04e8 | Ӹ 04f8 | Ԉ 0508 | |
Ӊ 04c9 | ә 04d9 | ө 04e9 | ӹ 04f9 | ԉ 0509 | |
ӊ 04ca | Ӛ 04da | Ӫ 04ea | Ӻ 04fa | Ԋ 050a | |
Ӌ 04cb | ӛ 04db | ӫ 04eb | ӻ 04fb | ԋ 050b | |
ӌ 04cc | Ӝ 04dc | Ӭ 04ec | Ӽ 04fc | Ԍ 050c | |
Ӎ 04cd | ӝ 04dd | ӭ 04ed | ӽ 04fd | ԍ 050d | |
ӎ 04ce | Ӟ 04de | Ӯ 04ee | Ӿ 04fe | Ԏ 050e | |
ӏ 04cf | ӟ 04df | ӯ 04ef | ӿ 04ff | ԏ 050f |
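A chart like the one above does not have to be typed by hand. The following sketch rebuilds one 16-row block in the same "character code" layout from the starting code point of each column; `chart_row` and `chart` are hypothetical names, and the column starts passed in are simply the ones used in the first block of this table.

```python
def chart_row(column_starts, offset):
    """Build one row: the character and 4-digit hex code for each column."""
    cells = [
        f"{chr(start + offset)} {start + offset:04x}"
        for start in column_starts
    ]
    return " | ".join(cells) + " |"


def chart(column_starts):
    """Build the 16 rows of a block whose columns begin at column_starts."""
    return [chart_row(column_starts, offset) for offset in range(16)]


# First block of the table: columns starting at 0400, 0410, ... 0450.
for row in chart([0x0400, 0x0410, 0x0420, 0x0430, 0x0440, 0x0450]):
    print(row)
```

Changing the list of column starts to, say, 0460 through 04b0 produces the second block. How the characters look still depends on the fonts available to whatever displays the output.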