Re: [Phpmyadmin-translators] Changes in translations

15 Jun 2002

      Hi,

 --- Siu Sun <siusun@best-view.net> wrote:
...
Chinese, Japanese, Korean etc. We call these
"double-byte characters".
The nickname of this group is CJK.
There's also CJKV, where V is for Vietnamese;
Vietnam has also used Chinese characters.
...
PS. I am a bit confused about UTF-8 and Unicode. Are UTF-8
and Unicode the same?
Unicode is just a general name: the name of the standard
and of the organization behind it.
If you need to name a specific encoding,
a popular one is UCS-2.

The UCS-2 encoding is what you described:
it uses two bytes for each character.
(UCS = Universal Character Set)

But as you also mentioned, processing two-byte
characters causes many problems with legacy systems.
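
A small sketch of why two-byte characters upset byte-oriented legacy
code. Python has no codec named UCS-2, so this assumes "utf-16-be",
which gives the same bytes as big-endian UCS-2 for characters in
U+0000..U+FFFF (surrogates aside). The helper name ucs2_bytes and the
sample strings are only illustrations.

```python
# Assumption: "utf-16-be" stands in for big-endian UCS-2 (same bytes for U+0000..U+FFFF).
def ucs2_bytes(text: str) -> bytes:
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError("UCS-2 cannot represent characters above U+FFFF")
    return text.encode("utf-16-be")

latin = ucs2_bytes("Hi")
cjk = ucs2_bytes("\u4e2d\u6587")  # two Chinese characters

print(latin.hex(" "))  # 00 48 00 69
print(cjk.hex(" "))    # 4e 2d 65 87

# Every character takes exactly two bytes.
assert len(latin) == 2 * len("Hi")
# Plain ASCII letters now carry a 0x00 byte, which C-style string
# handling treats as the end of the string.
assert b"\x00" in latin
```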

To solve this compatibility problem,
Unicode introduced the UTF-8 encoding.
UTF stands for UCS Transformation Format: UTF-8
re-encodes each UCS character as one or more bytes,
keeping ASCII characters in their single-byte form
(so a plain ASCII file is also a valid UTF-8 file).
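
The ASCII compatibility can be checked directly. A minimal sketch,
with an arbitrary sample string:

```python
text = "plain ASCII text"  # illustrative sample, any ASCII-only string works

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# ASCII-only text gives byte-for-byte identical output in both encodings,
# so an existing ASCII file can be read as UTF-8 without conversion.
assert ascii_bytes == utf8_bytes
assert ascii_bytes.decode("utf-8") == text
```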

U+0000..U+007F:  0xxxxxxx
U+0080..U+07FF:  110xxxxx 10xxxxxx
U+0800..U+FFFF:  1110xxxx 10xxxxxx 10xxxxxx

With this scheme,
characters in U+0000..U+007F (essentially
ASCII 0x00..0x7F) are unchanged and take one byte,
characters in U+0080..U+07FF take two bytes,
and characters in U+0800..U+FFFF take
three bytes.
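
The three-row table above can be turned into code almost literally.
This is only a sketch under stated assumptions: the function name
utf8_encode_bmp is made up for illustration, it covers just
U+0000..U+FFFF as in the table, and it is checked against Python's
built-in UTF-8 encoder.

```python
def utf8_encode_bmp(code_point: int) -> bytes:
    """Encode one code point in U+0000..U+FFFF following the table."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only covers U+0000..U+FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not valid in UTF-8")
    if code_point <= 0x7F:
        # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    # 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([0xE0 | (code_point >> 12),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])

# One sample per row of the table, plus a Thai and a Chinese character.
for ch in ("A", "\u00e9", "\u0e01", "\u4e2d"):
    encoded = utf8_encode_bmp(ord(ch))
    assert encoded == ch.encode("utf-8")  # agrees with the built-in codec
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ')}")
# U+0041 -> 41
# U+00E9 -> c3 a9
# U+0E01 -> e0 b8 81
# U+4E2D -> e4 b8 ad
```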

For more details, see:
http://www.unicode.org/glossary/
http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8

----

In brief:
* Unicode is not the official name of any encoding.
* Unicode != UTF-8

:)

Art

=====
----
FREE SOFTWARE   -->   free as in "freedom"
http://www.fsf.org/philosophy/free-sw.html
----
SIIT student community     http://siit.net
Sirindhorn Int'l Inst of Tech, Thammasat U
