Character Sets

This chapter describes the pitfalls of language-dependent character sets.

Maitreya has supported Unicode character sets since release 4.0. Older versions used ANSI charsets. This code page change may lead to several problems.

Charset Problems in the Location Database, etc.

You may see broken characters in the location database if you copy your old file "locations.dat" to the directory of the Unicode release.

Older versions of the location database used ANSI format; the new database uses Unicode format. Solution: open the database file "locations.dat" in a Unicode-capable editor (like Word on Windows or gedit on Linux) and save the file in Unicode UTF-8 format. The problem should then disappear.

The same applies to astrological data files ("*.mtx", etc.).
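
On Linux, the same conversion could also be sketched on the command line with iconv instead of an editor. The sketch below assumes that the old file uses the Windows-1252 code page (one common "ANSI" encoding); the actual code page depends on the Windows installation that wrote the file. The helper name to_utf8 is hypothetical and not part of Maitreya.

```shell
# Convert a file from an ANSI code page to UTF-8 in place, keeping a backup.
# Assumption: the source encoding is Windows-1252; change it if your old
# installation used a different code page.
to_utf8() {
    src="$1"
    # Keep the original as <name>.bak, then write the converted copy.
    cp "$src" "$src.bak" &&
    iconv -f WINDOWS-1252 -t UTF-8 "$src.bak" > "$src"
}

# Usage:
#   to_utf8 locations.dat
```

Run it only once per file: converting an already converted file would garble the non-ASCII characters again or make iconv fail.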

Unicode Charsets on Windows Platforms

Unicode character sets are especially required for non-European translations (e.g. Telugu) and for the display of Sanskrit characters in the Sarvatobhadra view.

Problems in Unicode charset display can be a result of Windows configuration.

On Windows XP, additional support for Unicode languages can be installed as follows:
Start → Settings → Control Panel → Regional and Language Options.

In the Languages tab, check the Supplemental language support option(s) you want. Checking both options installs all optional fonts. These options add fonts as well as system support for the corresponding languages.

Installation on Windows 2000 is similar.

Unicode on Windows 9x

On Windows 95 and 98, the compilation must be done with libunicows (a Unicode library for Windows).

Setting up the Correct Language on Unix/Linux Systems

Language configuration on Linux/Unix is done with the environment variable LANG. This variable holds the ISO code of the desired language and country plus extra information about the character set.

Example

  • en - means English language.
  • en_US - means US American English language.
  • en_US.UTF-8 - means US American English with the UTF-8 Unicode character set.

Most systems have the correct configuration by default because installation programs generally set up the correct language.

If not, try to set the language manually:
export LANG=te for the Bourne Again Shell (bash) or setenv LANG te for csh.
Or try export LANG=te_IN (may work on some systems, such as Fedora 5).

Russian should be configured with export LANG=ru_RU, German with export LANG=de_DE, etc.
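
The structure of the LANG value (language, optional country, optional character set) can be sketched as a small shell function. make_lang is a hypothetical helper for illustration only; whether a given value actually works depends on the locales installed on your system.

```shell
# Compose a LANG value of the form language[_COUNTRY][.charset].
# make_lang is a hypothetical helper, not part of Maitreya.
make_lang() {
    lang="${1:-}"
    country="${2:-}"
    charset="${3:-}"
    value="$lang"
    # Append the country code only if one was given.
    if [ -n "$country" ]; then
        value="${value}_${country}"
    fi
    # Append the character set only if one was given.
    if [ -n "$charset" ]; then
        value="${value}.${charset}"
    fi
    printf '%s\n' "$value"
}

# Usage (bash or sh):
#   export LANG="$(make_lang te IN UTF-8)"   # sets LANG=te_IN.UTF-8
#   locale -a | grep '^te'                   # check that the locale is installed
```

If the chosen locale does not appear in the output of locale -a, it probably has to be installed or generated first; how that is done differs between distributions.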

Still having Problems

Please check the following list if you have problems with Unicode charsets on Linux/UNIX:

  • Your system should be new enough. Experience shows that most UNIX/Linux systems released before 2005 have Unicode problems.
  • wxWidgets must be compiled in Unicode mode. Be sure that you compiled with --enable-unicode or that you installed packages with Unicode support.
  • wxGTK should be compiled with GTK version 2.0 or later, not version 1.x, which is known to have Unicode problems. There is no Unicode support for wxMotif and wxX11 yet.
  • Unicode fonts must be installed. TrueType fonts are known to work (recommended).
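
Some of the points above can be inspected from a shell. The following is a minimal sketch that assumes wx-config (shipped with wxWidgets development packages) and fc-list (from fontconfig) may or may not be present; each check is skipped when its tool is missing. The function names have_tool and unicode_report are hypothetical.

```shell
# Return success if a command is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print information relevant to the checklist above.
unicode_report() {
    echo "LANG=${LANG:-unset}"
    if have_tool wx-config; then
        # Lists the installed wxWidgets builds; look for "unicode" and
        # the toolkit (for example gtk2) in the names.
        echo "wxWidgets builds:"
        wx-config --list
    else
        echo "wx-config: not found"
    fi
    if have_tool fc-list; then
        # Counts fontconfig entries whose file name ends in .ttf.
        echo "TrueType fonts: $(fc-list | grep -ci '\.ttf')"
    else
        echo "fc-list: not found"
    fi
}

# Usage:
#   unicode_report
```

The report only collects hints: a non-Unicode wxWidgets build or a font count of zero points to the corresponding item in the list.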