Sunday, February 25, 2007

Viewing Tamil web pages

I'm still researching this topic and will update this post as I find out more.

There are two things to consider when viewing non-English content such as Tamil:

a) Encoding
b) Font

There are various encoding schemes. For example, in Firefox, I have seen Tamil web pages requiring one of the following:
o User defined
o Unicode - UTF-8
o Western - Windows-1252
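
A page can announce its encoding in a meta tag, and the browser uses that to pick one of the settings above. Here is a minimal sketch of reading that declaration. The sample HTML and the function name declared_charset are made up for illustration, and real browsers also look at HTTP headers and other hints.

```python
import re

# Hypothetical page header, invented for this example.
sample_html = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=windows-1252"></head></html>'
)

def declared_charset(html_bytes):
    """Return the charset named in the page's markup, or None if absent."""
    match = re.search(rb'charset\s*=\s*["\']?([\w-]+)', html_bytes, re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()

print(declared_charset(sample_html))
```

If a page has no declaration at all, the function returns None, which is roughly the situation where you end up picking the encoding by hand from the browser menu.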

Web pages and their content are ultimately stored as a stream of bytes.
An encoding is the scheme by which the characters of a language are represented in these bytes. For example,
ASCII is the standard encoding for English, and in it the letter A is represented by the number 65.
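
To make the "characters as numbers" idea concrete, here is a small sketch in Python. It shows the ASCII code for A and, for comparison, the bytes UTF-8 uses for the Tamil letter A (U+0B85). The variable names are just for illustration.

```python
# ASCII: each character maps to one number, stored in one byte.
print(ord("A"))               # 65
print("A".encode("ascii"))    # b'A'

# The Tamil letter A (U+0B85) has no ASCII code.
# In UTF-8 it is stored as three bytes.
tamil_a = "\u0b85"
utf8_bytes = tamil_a.encode("utf-8")
print(list(utf8_bytes))       # [224, 174, 133]
```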

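Unicode UTF-8 is a standard encoding that has its own codes for Tamil letters. User defined and Western are different. Windows-1252 is itself a standard encoding, but only for Western European text, and it has no Tamil characters. Tamil pages that ask for these settings rely on a mapping the page author chose: the bytes are ordinary Western codes, and a custom font draws Tamil letters in their place.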
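
This is why picking the wrong encoding in the browser produces garbage. A rough sketch: the same bytes read as UTF-8 give Tamil text, but read as Windows-1252 they give a run of unrelated Western characters. The sample word and variable names are only for illustration.

```python
# The word "Tamil" written in Tamil script, as Unicode code points.
text = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

# What is actually stored or sent over the wire: a stream of bytes.
raw = text.encode("utf-8")

# Correct reading: decode the bytes with the encoding they were written in.
as_utf8 = raw.decode("utf-8")

# Wrong reading: treat every byte as one Western character.
# errors="replace" is needed because some byte values are unassigned in Windows-1252.
as_western = raw.decode("cp1252", errors="replace")

print(as_utf8 == text)     # True
print(len(raw))            # 15 bytes for 5 code points
print(as_western)          # accented Latin letters and symbols, not Tamil
```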

Font:
Given an encoding, a font maps each character code in the web page to a letter shape (glyph) drawn on the screen. The font and the encoding are therefore related. A "Unicode font", for example, is a font that can render content encoded in Unicode.
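
A Unicode font is keyed on Unicode code points, so it helps to see which code points a piece of Tamil text contains. The sketch below lists them using Python's standard unicodedata module. It does not touch any font file; it only shows the codes a Unicode font would need glyphs for.

```python
import unicodedata

# The word "Tamil" written in Tamil script.
text = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

# Pair each character's code point with its official Unicode name.
entries = [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code_point, name in entries:
    print(code_point, name)
```

All of these fall in the Tamil block of Unicode (U+0B80 to U+0BFF), which is the range a font has to cover to count as a Tamil Unicode font.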

I'm going to document the fonts that I know of for rendering Tamil in this post.

Vikatan.com: This uses a TrueType font: http://www.vikatan.com/vikat_tm.ttf
It uses the "Western - Windows-1252" encoding. Font configuration for Vikatan is documented at:
http://www.vikatan.com/trouble.asp

kumudam:

dinamani:

FONTS:

Latha: This is a Unicode font. I do not like this font. It doesn't show Tamil in its natural written form; it looks newfangled and confusing.


OS:
Windows XP: Unicode support for Tamil is not installed by default in XP. You need to add it as follows:

Start > Settings > Control Panel > Regional and Language Options.
On the Languages tab, check "Install files for complex script and right-to-left languages" and click OK. Windows may ask for the XP installation CD.



Vista: Vista reportedly comes with out-of-the-box support for Tamil.

Browsers:

Firefox: Firefox falls back to the operating system for Unicode fonts.

IE: IE has built-in support for Unicode and does not have to fall back to the OS.



References:
This site: http://www.alanwood.net/unicode/index.html
has very good information on Unicode settings, browser configs, etc.

WALTT: http://ccat.sas.upenn.edu/plc/tamilweb/

Best Unicode Tamil Webpage to view and test: Tamil Wikipedia
