6/1/07

utf8 and gb chinese to pinyin converter

This program converts Chinese text encoded in either UTF-8 or GB2312/GBK to pinyin. It's small and does not use an external lookup table. I wrote it for my own use in the Chihe Project.
Maybe someone will find it useful.

5/25/07

putty, utf8 and chinese

I often have to edit UTF-8 encoded files that contain Chinese on a remote Linux machine through PuTTY. To edit and display Chinese correctly, I have to configure both PuTTY and my environment settings on the remote host.


First, PuTTY. Open the PuTTY configuration and set:
Window -> Appearance -> Font settings -> Change... select NSimSun (新宋体)
Window -> Translation -> Character set ... select UTF-8


Save the configuration to a session.
Next, set up the remote host.
Put this in ~/.bashrc:
export LANG=en_US.UTF-8

Then re-source it with "source ~/.bashrc".

Run "locale" and make sure LC_CTYPE is set to en_US.UTF-8. If it is not, run "locale -a" to see which locales are available.
On Fedora or CentOS, "yum install glibc-common" installs all locales.
On Ubuntu, edit the file /var/lib/locales/supported.d/local and add this line:
en_US.UTF-8 UTF-8
Then do a "dpkg-reconfigure locales".
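
The LC_CTYPE check can also be scripted. Here is a minimal Python sketch that parses the text printed by "locale" and reports whether LC_CTYPE names a UTF-8 codeset. The function names parse_locale_output and ctype_is_utf8 are made up for this example, and the sample text stands in for real output from the remote host.

```python
def parse_locale_output(text):
    """Turn the output of the `locale` command into a dict of settings."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            # Values are sometimes quoted, e.g. LC_CTYPE="en_US.UTF-8".
            settings[name.strip()] = value.strip().strip('"')
    return settings


def ctype_is_utf8(settings):
    """Return True if LC_CTYPE names a UTF-8 codeset (UTF-8, utf8, ...)."""
    value = settings.get("LC_CTYPE", "")
    # Locale names look like language_TERRITORY.codeset@modifier.
    codeset = value.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"


# Sample text standing in for what `locale` prints on the remote host.
sample = 'LANG=en_US.UTF-8\nLC_CTYPE="en_US.UTF-8"\nLC_ALL=\n'
print(ctype_is_utf8(parse_locale_output(sample)))
```

If this reports False for the real output, the locale is probably missing and needs to be installed as described above.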

That's it. Now we can edit UTF-8 encoded Chinese files without problems.



5/23/07

lightbox with slideshow

Lightbox is very cool (I know it's hated by some), but you have to click the mouse to view the next image.

Why not make it run by itself, like a slideshow?

Googling didn’t turn up anything useful, so I added a few lines of code to lightbox.js to make it run as a slideshow.

If you have lightbox, replace your js/lightbox.js with lightbox-ss.js, and you’re ready to go.

Demo is here.

5/22/07

Vim, Gvim, utf8, and Chinese in Windows XP

See the post on using Chinese in DOS (cmd) under Windows XP (English).

If you have set the locale to PRC as described in that post, you are all set to input Chinese in Vim and Gvim. Run ":set encoding" in Vim or Gvim; it should report cp936 (code page 936, i.e. GBK/GB2312).
This is fine if you want to save files encoded in GB2312. However, I usually save all my files as UTF-8. I can issue ":set fileencoding=utf8" in Vim, but I have to do it for every new file. Also, when you open a file previously saved as UTF-8, Vim displays it incorrectly because it assumes the file is encoded in GB2312. You can ask Vim to reload the file as UTF-8 with ":e ++enc=utf8", but again you have to do it every time.

To solve these two problems, edit the _vimrc file (located in Program Files\vim) and add these two lines:
set fileencoding=utf8
set fileencodings=ucs-bom,utf8,prc

The first line tells Vim to save any new file as UTF-8; the second tells it to try those encodings, in that order, when detecting the encoding of a file it opens.
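
To show why the order matters, here is a rough Python sketch of the same idea: check for a byte order mark, then try strict UTF-8, then fall back to GBK. It is only an illustration, not how Vim implements detection. The function name detect_and_decode is made up, only the UTF-8 BOM is handled (Vim's ucs-bom also covers UTF-16 and UTF-32), and Python's "gbk" codec stands in for prc/cp936.

```python
import codecs


def detect_and_decode(data):
    """Guess the encoding of `data` (bytes) in the order ucs-bom, utf8, prc.

    Returns a (label, text) pair.
    """
    # 1. ucs-bom: a byte order mark identifies the encoding outright.
    #    Only the UTF-8 BOM is handled in this sketch.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-bom", data[len(codecs.BOM_UTF8):].decode("utf-8")

    # 2. utf8: strict decoding fails on byte sequences that are not valid
    #    UTF-8, which is the usual outcome for GBK-encoded Chinese text.
    try:
        return "utf-8", data.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # 3. prc: last resort. GBK is tried last because it accepts many byte
    #    sequences, including most UTF-8 text, and would turn it into garbage.
    return "gbk", data.decode("gbk")


label, text = detect_and_decode("中文".encode("gbk"))
print(label, text)
```

Putting the permissive encoding first would make every UTF-8 file look like GBK, which is the same reason prc comes last in fileencodings.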

That's it. Now we can edit both UTF-8 and GB encoded files.

If you want to make Gvim look prettier when displaying Chinese, add the following lines to _vimrc (assuming you already have a Chinese font installed):
if has("gui_running")
set guifont=NSimSun:h12:cGB2312
endif

5/21/07

DOS, Chinese under Windows XP

To use Chinese in DOS (cmd) under Windows XP (English), just change the locale to "Chinese (PRC)".
These are the steps.

Go to Control Panel -> Regional and Language Options -> Advanced tab -> Language for non-Unicode programs, and select "Chinese (PRC)".
Also, on the Languages tab, make sure "Install files for East Asian languages" is checked (Windows may prompt you to install fonts from the Windows XP CD).
After a reboot, you should be able to enter Chinese in DOS (cmd).

5/16/07

Chinglish

You can see some Chinglish examples in my lbss (lightbox-slideshow) demo.

Click "Start Demo Slideshow".

2/27/07

on DVBBS

What a piece of crap!

You think Discuz is bad? DVBBS is far worse.