Thesaurus Indogermanischer Text- und Sprachmaterialien
Titus Is Testing Unicode Script-management
TITUS continues to test UNICODE script management (see the contributions
by Carl-Martin Bunz and Jost Gippert to IUC 10
and by Carl-Martin Bunz to IUC 11 for a description of
the TITUS approach to using Unicode).
For this purpose, we have prepared some document pages containing a full account
of UNICODE characters with their equivalents in UTF-8 (the pages themselves are encoded
in UTF-8). They can be used to check how well your WWW browser
renders UNICODE / UTF-8 encoded text. If you have a browser that is able to
handle UTF-8 encoded files (e.g., Netscape Communicator 4.0 or higher, or Microsoft Internet Explorer 4.0 or 5.0) and if you use an
operating system that is able to handle UNICODE (e.g., MS Windows 95, 98, NT 4.0, 2000, or XP; Mac OS X), you should be
able to read at least the following parts of the tables contained in the pages:
- ASCII (U+0020 through U+007F);
- Latin-1 Supplement (U+0080 through U+00FF);
- Latin Extended-A (U+0100 through U+017F);
- parts from Latin Extended-B (U+01FA through U+01FF);
- Greek (U+0370 through U+03FF);
- Cyrillic (U+0401 through U+045F);
- parts from Latin Extended Additional (U+1E80 through U+1E85);
- parts from General Punctuation and following blocks (U+2000 through U+26FF).
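The relation between a code point and its UTF-8 equivalent, as tabulated in the test pages, can be illustrated with a minimal Python sketch. The ranges are copied from the list above; the names `RANGES`, `utf8_hex` and `table_rows` are hypothetical helpers chosen for this illustration and are not part of the TITUS pages.

```python
# Ranges copied from the list above: (label, first code point, last code point).
RANGES = [
    ("ASCII", 0x0020, 0x007F),
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Latin Extended-A", 0x0100, 0x017F),
    ("Latin Extended-B (part)", 0x01FA, 0x01FF),
    ("Greek", 0x0370, 0x03FF),
    ("Cyrillic", 0x0401, 0x045F),
    ("Latin Extended Additional (part)", 0x1E80, 0x1E85),
    ("General Punctuation and following blocks", 0x2000, 0x26FF),
]


def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 bytes of one code point as spaced hex, e.g. 'CE B1'."""
    return " ".join(f"{byte:02X}" for byte in chr(code_point).encode("utf-8"))


def table_rows(first: int, last: int):
    """Yield (code point label, UTF-8 hex) for every code point in the range."""
    for code_point in range(first, last + 1):
        yield f"U+{code_point:04X}", utf8_hex(code_point)


if __name__ == "__main__":
    # Print the first three rows of each range (ASCII-only output).
    for label, first, last in RANGES:
        print(label)
        for code, hex_bytes in list(table_rows(first, last))[:3]:
            print("   ", code, hex_bytes)
```

Code points up to U+007F occupy one byte in UTF-8, those up to U+07FF two bytes, and the remaining ranges listed above three bytes, which is why a browser that misreads the encoding shows two or three stray characters in place of one.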
On this server, we also provide several sample pages documenting the Unicode encoding of Latin and non-Latin scripts such as Ancient Greek, Cyrillic, Devanagari, and the like.
Additionally, we are currently developing a database that contains full information about the characters encodable in Unicode. A preliminary version of the retrieval engine is available here.
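To give an idea of the kind of per-character information such a lookup can return, here is a small sketch based on Python's standard `unicodedata` module. It is not the TITUS retrieval engine; the function name `describe` and the choice of fields are assumptions made for this illustration.

```python
import unicodedata


def describe(char: str) -> dict:
    """Collect a few standard Unicode properties of a single character."""
    return {
        "code_point": f"U+{ord(char):04X}",
        # Fall back to a placeholder for characters without a name.
        "name": unicodedata.name(char, "<unnamed>"),
        # Two-letter general category, e.g. 'Ll' for a lowercase letter.
        "category": unicodedata.category(char),
        # Canonical or compatibility decomposition; empty if there is none.
        "decomposition": unicodedata.decomposition(char),
    }


if __name__ == "__main__":
    for sample in ("\u03b1", "\u0416", "\u00e9"):
        print(describe(sample))
```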
To display the UNICODE encoded characters listed above, you will have
to prepare your system in the following way (a 32-bit processor
is required):
Users of MS Windows 95 and 98:
- a) Install Multi-Language Support ("Sprachunterstützung") from the Win95 / Win98 Setup CD-ROM;
- b) Enable at least one additional non-Latin keyboard (e.g., Greek):
  - from within the Win95 / Win98 control panel ("Systemsteuerung"), choose "keyboards",
  - choose "Greek" (you will need the Setup CD-ROM again for this step).
Users of MS Windows NT 4.0, MS Windows 2000, and MS Windows XP:
- These operating systems are preconfigured for usage of Unicode and need no further installation.
Users of Mac OS X (information kindly provided by H. Elbrecht):
- Upgrade to version 10.2, and choose "Custom Install" when installing or upgrading to Mac OS X 10.2 in order to add language support:
there is more language support available than the default installation provides!
- Then activate your keyboard(s) in "System Preferences/International/Input Menu", and don't forget to activate
"Character Palette", "U.S. Extended" and "Unicode Hex Input" as well!
BTW: other or customized keyboard layouts can be defined in XML format, so you can create your own using the information on the
Apple developers' site, or look for open source ones to come!
BTW: you can now actually "compose" characters with diacritical marks using the "U.S. Extended" keyboard: these characters, "composed" on the fly,
get "normalized" according to the Unicode Standard.
- You can install the ".ZIP" version of "TITUS Cyberbit Basic" in any of the three "Fonts" folders on Mac OS X 10.2, but
"Users/~/Library/Fonts" is the easiest to manage later!
BTW: complex glyph rendering is implemented "in" AAT fonts, which is why you need these new intelligent Apple Advanced Typography fonts.
Windows fonts such as "TITUS Cyberbit Basic" will display, but more advanced OpenType features are not yet supported on Mac OS X 10.2
(apart from some Adobe OT features "inside" InDesign only), because OT support needs to be implemented at system level in order to work.
- A Unicode compatible web browser for Mac OS X is "OmniWeb 4.1" (by OmniGroup), downloadable from: www.omnigroup.com.
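The remark above about "composed" characters being "normalized" presumably refers to the normalization forms of the Unicode Standard. The following sketch shows, with Python's standard `unicodedata` module, how a base letter plus a combining mark is mapped to a single precomposed character under form NFC. Which form the Mac OS X keyboard actually produces is not stated here, so the use of NFC is an assumption, and the helper names `compose` and `code_points` are hypothetical.

```python
import unicodedata


def compose(text: str) -> str:
    """Normalize text to NFC (canonical composition); NFC is assumed here."""
    return unicodedata.normalize("NFC", text)


def code_points(text: str) -> list:
    """List the code points of a string as 'U+XXXX' labels."""
    return [f"U+{ord(char):04X}" for char in text]


if __name__ == "__main__":
    typed = "e\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
    print(code_points(typed))           # two code points before normalization
    print(code_points(compose(typed)))  # one precomposed code point afterwards
```

Comparing strings only after normalizing both sides avoids treating the two spellings of the same accented letter as different text.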
Users of Netscape Communicator 4.0 and higher:
- choose "Preferences" under the "Edit" menu;
- choose "Fonts";
- under "for the Encoding", choose "Unicode";
- under "Variable width font", choose a font that is able to
handle Unicode, such as the TITUS Unicode font, "TITUS Cyberbit Basic";
- under "Fixed width font" choose "Courier New";
- select "Use document-specified fonts, including dynamic fonts";
- click the "OK" button.
Users of Microsoft Internet Explorer 4.0:
- Choose "Universal Alphabet" (!) as the encoding to be used (under "View",
"Internet Options", "Fonts") when browsing UTF-8 pages.
Users of Microsoft Internet Explorer 5 and 6:
- Check the fonts to be used for each script ("Latin based", "Greek", "Cyrillic" etc.)
under "Tools" (German version: "Extras"), "Internet Options", "Fonts".
Choose any font that is offered for displaying the individual scripts, including the TITUS Unicode font,
"TITUS Cyberbit Basic".
Please note that the font management will mostly depend on the font you select
for the rendering of "Latin based" scripts (German version: "Lateinischer
You should then be ready to display the characters as listed above on the screen.
If you still have difficulties, you can check your equipment's capacities by
If you want to see many more characters than the ones indicated above, you can
download and install the BITSTREAM CYBERBIT font, available here.
This font includes, among others, a (nearly) full set of Korean, Japanese, and Chinese (Han)
characters. Be careful: The font has a size of about 13 MB!
N.B. The font does not include a full implementation of Latin diacritics, Ancient Greek, Armenian,
Georgian and the like, however.
In cooperation with BITSTREAM, the TITUS project has prepared a Unicode Font (in Windows TTF format) to match the requirements of linguists and philologists working on several languages (ancient and modern).
This font (named "TITUS Cyberbit Basic"), now compliant with UNICODE 4.0, is available for non-commercial users only.
The font can be downloaded here (N.B. the preparation of the download file may take a minute).
If you want to use the font with your web browser, you will have to configure your system and your browser in the way indicated above.
On the basis of the tool "Uniqoder" developed by Östen Dahl, Andrej Perdih created UniqTitus, a UNICODE keyboard layout for MS Word. This tool,
which is available for non-commercial users only, can be downloaded here.
Copyright of this page: Jost Gippert, Frankfurt a/M 1996-2003.
No parts of this document may be republished in any form without prior permission by the copyright holder.