1. Introduction
2. Unicode and the Internet
3. Learning tools
4. Input tools; hiragana, katakana and kanji.
5. Japanese on the Net; XChat and such.
6. Other Options
1. Introduction
Slackware is not the most user-friendly of distros; or rather, as the
Unix joke goes, it's picky about who its friends are. On the other
hand, it has distinct advantages in terms of flexibility, simplicity
and a charming literalness of mind. It does not, however, speak many
languages 'out of the box'.
As a seemingly eternal language student, I find that the character map
application in Gnome 2 (Accessories->Unicode character map) helps with
accent insertion in European languages, as does the 'Character
palette' applet, which may be added to the Gnome Panel as follows:
Right click on the panel; follow the
'Add to panel' arrow.
Select Utility->Character Palette.
You may now insert characters (such as ©, à or ñ) by
clicking on them, which enters them into the clipboard, and pasting
them where required. Custom palettes may be created in the preferences
dialogue of the applet.
If you're also a touch-typist and are used to multiple keyboards, it is
possible to use the Gnome keyboard switcher applet to switch between
input keyboard maps. The procedure for adding it to the panel is
similar to that for the Character palette, and it has the advantage of
allowing faster input once you're used to the mapping.
None of these things helps with the Japanese problem. Like Chinese or
Korean, Japanese does not use a Western alphabet; Japanese learners may
initially read in 'Romaji', a romanised version of the script, but
continued dependence on this representation tends to damage the
learner's progress later on. Furthermore, romaji representation is
simply not an option when surfing the 'Net. (In fact, this is not quite
true: there exists a Web proxy written in Perl, called Japana, that
converts Japanese characters into romaji on the fly. But...)
For completeness, let's briefly discuss the three varieties of Japanese
script. If you can't read the examples below, you'll want to first
ensure that the character coding on this page (in View-> Character
Coding) is appropriate, and if that doesn't help - especially if you
are seeing little square 'broken font' signs instead of Japanese - you
will want to check out Section 2 and return to this part later.
The first Japanese script is Hiragana, used somewhat like the western
alphabet. This is an example; ひらがな.
The second is Katakana, a sort of spikier script, used for emphasis in
situations where we might use italic or bold. It is also used for
foreign words, such as names. This is an example; カタカナ.
The third is the nasty one. The Japanese borrowed a number of
characters from the Chinese; now called Kanji characters, they are
based around ideograms. Not only are they difficult to remember, these
sharks in the otherwise calm sea of any beginner's Japanese are also
difficult to input. There are somewhere around 2000 of them in everyday
Japanese use. Example; 日本語.
2. Unicode and the Internet
Unicode is one of those relatively recent inventions that actually
seems to be a Good Thing. The need for it arises from the fact that
8-bit, ASCII-style character codings can hold a maximum of 256
characters; this led to the need for several different 8-bit codings,
approximately one for each language or group of languages. Unicode
brings all of
those character codings together, making one mega-giant-sized character
set that includes absolutely everything you might ever fear to
encounter on the Internet. One complete Unicode font can therefore
display characters from Cyrillic, Spanish, Arabic and Chinese, probably
including a number of Wingdings on the way.
Unicode is pretty handy.
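To make the point a little more concrete, here is a minimal shell
sketch (my own illustration, not part of any tool mentioned here) that
prints the bytes behind a piece of text. It assumes that the script and
your terminal are themselves in UTF-8; hex_bytes is a made-up helper
name.

```shell
#!/bin/sh
# hex_bytes: hypothetical helper that prints its stdin as a run of
# hexadecimal bytes, with od's spacing and line breaks removed.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# A plain ASCII letter is a single byte...
printf 'A' | hex_bytes; echo
# ...whereas each of these three kanji takes three bytes in UTF-8.
printf '日本語' | hex_bytes; echo
```

On a UTF-8 system the first line should come out as 41 and the second
as e697a5e69cace8aa9e, nine bytes for three characters, which is
exactly what an 8-bit coding has no room for.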
To use it, though, you need that complete Unicode font I mentioned.
Personally, I used a copy of 'arialuni.ttf'. Getting hold of the font
could be exciting, since Microsoft in their infinite wisdom provided it
initially as a download meant for users of their Publisher(?) software,
and then later on decided to remove the download, presumably on the
principle that aiding international communication is not a sin of which
they should like to be found guilty. There are others, however, such as
those described on this helpful site. If you have a copy of Windows, of
course, you can probably locate arialuni.ttf on the disk...
Once you've found a Unicode font to your liking, you simply place it in
your TTF directory - which, since you are using Slackware, is
/usr/X11R6/lib/X11/fonts/TTF/. You will then most likely have to renew
your font cache by running the program /usr/X11R6/bin/fc-cache. Now,
you should be able to display the entire Unicode character set, and
returning to this page should mean that those 日本語 squiggles are now
legible.
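The two steps above can be sketched as a small shell function. This is
only a sketch under my own assumptions: install_font is a made-up
name, and the font and directory are whatever you pass in. It falls
back quietly if fc-cache is not on your PATH.

```shell
#!/bin/sh
# install_font: copy a TrueType font into a font directory and refresh
# the fontconfig cache. Both arguments are supplied by the caller.
install_font() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "no such font: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    # fc-cache may not be on the PATH; on Slackware the text above
    # gives its location as /usr/X11R6/bin/fc-cache.
    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache "$dest_dir" || echo "warning: fc-cache failed" >&2
    fi
    return 0
}

# Example (run as root):
#   install_font /path/to/arialuni.ttf /usr/X11R6/lib/X11/fonts/TTF
# Afterwards, list the fonts that claim Japanese coverage:
#   fc-list :lang=ja family
```

If fc-list shows your new font in that last listing, the squiggles
ought to be on their way.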
If you use mplayer, be aware that placing this font, or a symlink to
it, in your $HOME/.mplayer/ directory as subfont.ttf (ln -s
/path/to/arialuni.ttf $HOME/.mplayer/subfont.ttf) will cause mplayer to
use it for on-screen display and subtitles. It appears that the mplayer
project currently distributes the arialttf font in its original form on
its FTP server, as one of its non-supported fonts, though due to
distribution limitations you will also have to get hold of a program
called cabextract to unpack it. Google is your friend here.
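For the symlink route, a minimal sketch follows. link_subfont is a
name I have invented for illustration; the second argument is optional
and defaults to the $HOME/.mplayer directory mentioned above.

```shell
#!/bin/sh
# link_subfont: point mplayer's subfont.ttf at an existing font file.
# The second argument (the mplayer config directory) defaults to
# $HOME/.mplayer.
link_subfont() {
    font=$1
    conf_dir=${2:-$HOME/.mplayer}
    [ -f "$font" ] || { echo "no such font: $font" >&2; return 1; }
    mkdir -p "$conf_dir" || return 1
    # -f replaces an existing subfont.ttf link rather than failing.
    ln -sf "$font" "$conf_dir/subfont.ttf"
}

# Example:
#   link_subfont /usr/X11R6/lib/X11/fonts/TTF/arialuni.ttf
```

Give it an absolute path to the font; a relative one would leave you
with a link that points nowhere useful.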
3. Learning tools
Now that you can display Japanese characters, you can install a number
of programs to make your life more fun.
The first of these is kanatest. This program will drill you through
learning hiragana and katakana. Since it has no dependence on Unicode,
you may install it before going through section 2. As a Slackware user,
you will not be surprised to find yourself compiling it in the usual
way (./configure, make, make install).
The second is kdrill. Whilst a bit graphically out-of-date, it's
probably your best option if you want to test yourself on kanji
recognition. Install is relatively straightforward, requiring the
following steps:
First edit the Imakefile, if there's anything you want to change. Pay
attention to this, by the way; kdrill requires a gzipped version of a
file known as kanjidic, whilst gjiten will require it in non-gzipped
form. Whilst it is a shocking waste of space to have both versions on
the system, the easiest way is merely to put both in places where they
do not conflict with each other.
Then type 'xmkmf'. Note that you cannot do this when su'd to root.
Now 'make'.
Finally, if all went well, type 'make install'.
You will now want to download the file mentioned in the documentation,
kanjidic.
Voilà!
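Keeping both forms of kanjidic around, as suggested in the first step
above, might look something like this. It is only a sketch: keep_both
is a made-up name, and both destination directories are your own
choice rather than anything kdrill or gjiten insists upon.

```shell
#!/bin/sh
# keep_both: from a plain kanjidic file, produce a gzipped copy in one
# directory (for kdrill) and a plain copy in another (for gjiten).
# Both destination directories are the caller's choice.
keep_both() {
    dict=$1
    gz_dir=$2
    plain_dir=$3
    [ -f "$dict" ] || { echo "no such file: $dict" >&2; return 1; }
    mkdir -p "$gz_dir" "$plain_dir" || return 1
    # gzip -c writes to stdout, so the original file is left untouched.
    gzip -c "$dict" > "$gz_dir/kanjidic.gz" || return 1
    cp "$dict" "$plain_dir/kanjidic"
}

# Example (hypothetical directories):
#   keep_both ./kanjidic /usr/local/lib/kdrill /usr/local/share/gjiten
```

You would then point each program at its own copy, in whatever way its
documentation describes.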
The third is kanjipad. This is a useful little program, trivial to
compile. It allows you to draw a kanji with the mouse and then choose
the matching character from a list of candidates; you will find it
extremely useful as an input device, since looking up kanji in an
electronic dictionary involves either a) knowing exactly what the kanji
is, which makes the lookup rather redundant, b) searching for it by
heuristic (number of strokes or the radicals of which it is composed),
or c) being able to reproduce the kanji itself. Kanjipad helps with
option c.
Last but not least, gjiten. Some part of me hates this program for a
crashy piece of *#@*!, but you can't live without it, unless you choose
to go for its KDE counterpart, kiten, which comes as part of the stock
Slackware install. A short note on kiten: for reasons of its own, it
chooses to default to a font which fails to display hiragana. To make
it useful, you will have to choose a different font in
Settings->Configure kiten->Fonts, such as fixed(Sony), regular, 16.
DOH! By preference, though, you want to install gjiten, since kiten
can't use the Japanese input we are about to give GNOME.
Installation of gjiten is theoretically a simple ./configure, make,
make install process. I give you fair warning that later you will
probably find gjiten crashing on startup due to a conflict between it
and the im-ja Japanese input module for GNOME that we will get to in
the next section. To remedy this, you can either use the configuration
file to force gjiten to quit trying to get Japanese input, or you can
comment out the 'force' line in gjiten.c (I chose that option - I am
not subtle. Here's an edited gjiten 2.1). Oddly enough, im-ja does not
crash gjiten when it is chosen manually after startup - it is only this
'force' line that causes it to crash.
Installation of the dictionaries is a slightly different problem.
gjiten likes its dictionaries to be in UTF-8, which means that you will
have to take the dictionaries we downloaded earlier and convert them to
UTF-8 by means of the following command line:
iconv -f EUC-JP -t UTF-8 dictfile -o dictfile.utf8
Remember to be careful not to overwrite anything; other programs might
need the non-UTF-8 dictionary files. Also, follow the instructions
about the other files gjiten needs (read the README in the source
directory).
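If you have several dictionaries to convert, a small wrapper saves
typing and guards against the overwriting problem. This is a sketch,
not gospel: to_utf8 is a name of my own, and it assumes your
dictionaries really are in EUC-JP.

```shell
#!/bin/sh
# to_utf8: convert one EUC-JP dictionary file to UTF-8, writing the
# result next to it as <name>.utf8 and refusing to overwrite anything.
to_utf8() {
    src=$1
    out=$src.utf8
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    # Redirect instead of using -o, which not every iconv supports.
    if ! iconv -f EUC-JP -t UTF-8 "$src" > "$out"; then
        rm -f "$out"    # do not leave a half-converted file behind
        return 1
    fi
}

# Example:
#   for f in kanjidic edict; do to_utf8 "$f"; done
```

The original files are never touched, so kdrill and friends can go on
using them.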
4. Input tools
This is very definitely the hard part. Firstly, you need to get hold of
a piece of 'kanji server' software, which sits around on your computer
and converts input characters to possible output characters (for
example, one line of hiragana might be representable by one or several
kanji character possibilities - the kanji server does that work). There
are two likely suspects - Canna and Wnn.
My experience with Canna leads me to suspect that whilst it will itself
compile without problems, nothing else will ever agree to compile with
it on Slackware. Therefore I recommend a recent Wnn, such as you might
download here.
At this point you will begin to appreciate the irritation value
inherent in installing input devices for a language that you are barely
able to read, since all the installation instructions are already in
Japanese. Panic not.
Download the latest version, probably
ftp://ftp.freewnn.org/pub/FreeWnn/alpha/FreeWnn-1.1.1-a020.tar.bz2.
Extract it (tar xvfj FreeWnn-1.1.1-a020.tar.bz2).
cd FreeWnn-1.10-pl020/
./configure
make
su
make install
You may now start the freewnn server for Japanese, otherwise known as
jserver, as follows;
cd /usr/local/bin/Wnn4/
./jserver
Put this in one of your /etc/rc.d/ startup scripts (rc.local is the
usual place for local additions) if you want it to start automatically
on reboot.
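A startup fragment along these lines should do; treat it as a sketch.
The path is the one from the install steps above, and the JSERVER
variable is merely my own shorthand for it.

```shell
# Fragment for a startup script such as /etc/rc.d/rc.local.
# JSERVER is the path from the install steps above; adjust it if your
# FreeWnn ended up somewhere else.
JSERVER=/usr/local/bin/Wnn4/jserver

# Only try to start the server if the binary is actually there.
if [ -x "$JSERVER" ]; then
    echo "Starting FreeWnn jserver: $JSERVER"
    "$JSERVER"
fi
```

The test for an executable file means the script stays silent, rather
than complaining at every boot, should you ever remove FreeWnn.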
Hopefully, if all has gone well, you may now telnet to port 22273, and
watch a connection being made to your brand new jserver. If you start
cserver, by the way, it will be on port 22289. I don't know who chose
those defaults.
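If you would rather not fire up telnet, the same check can be sketched
in bash. Note the assumptions: this relies on bash's /dev/tcp
redirection (so it needs bash rather than plain sh), and port_open is
a name I made up.

```shell
#!/bin/bash
# port_open: succeed if something accepts TCP connections on host:port.
# Relies on bash's /dev/tcp redirection, so it needs bash, not plain sh.
port_open() {
    local host=$1 port=$2
    # The subshell keeps the test descriptor from leaking into the caller.
    (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null
}

if port_open localhost 22273; then
    echo "jserver is listening on port 22273"
else
    echo "nothing is listening on port 22273"
fi
```

Swap in 22289 to check on cserver instead.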
Now that you have Wnn, you will want to download and install im-ja.
tar xvfz im-ja-0.9.tgz
cd im-ja-0.9
./configure --disable-canna
make
make install
Now edit /etc/gtk-2.0/gtk.immodules and add the following lines:
"/usr/lib/gtk-2.0/2.2.0/immodules/im-ja.so"
"im-ja" "Japanese" "gtk+" "/usr/share/locale" "ja"
Restart X and gnome. Hopefully, you can now right-click on any
true-blue Gnome application and choose Japanese from Input Methods.
Configure by means of im-ja-conf. Sadly, neither Galeon nor Mozilla
allow the use of input methods in this way, but you can always type
into abiword or the Gnome text editor and cut'n'paste.
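The gtk.immodules edit described above is easy to do twice by
accident, so here is a sketch that only appends the two lines when the
module is not already listed. add_im_ja is an invented name, and the
module path (including the 2.2.0 directory) is simply the one quoted
above; check where 'make install' actually put im-ja.so on your
system.

```shell
#!/bin/sh
# add_im_ja: append the im-ja entry to a gtk.immodules file unless an
# im-ja.so line is already present. The file path is the caller's.
add_im_ja() {
    file=$1
    module='"/usr/lib/gtk-2.0/2.2.0/immodules/im-ja.so"'
    entry='"im-ja" "Japanese" "gtk+" "/usr/share/locale" "ja"'
    # -F: fixed string, -q: quiet; only the exit status matters.
    if grep -qF "$module" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n%s\n' "$module" "$entry" >> "$file"
}

# Example (as root):
#   add_im_ja /etc/gtk-2.0/gtk.immodules
```

Running it a second time leaves the file alone.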
5. Japanese on the Net
Xchat, xchat. What would life be like without the ability to type to
people all over the world in Unicode? Here are the enabling steps;
1) edit the server list; choose UTF-8 as the character set for the
server of your choice.
2) Once connected, choose your Arial Unicode TTF font as the default
font in Settings->Preferences->Text box. Note that you might not
really need to do this - theoretically, most programs are smart enough
to choose a font that supports the given character set without being
specifically told to do so.
3) Right click; choose Japanese input method.
And there you have it: provided the other people in the chat room
are also using UTF-8 encoding, you may now converse in mixed hiragana,
katakana, kanji, Arabic and ancient Hebrew.
As for Japanese-enabled web pages, it should suffice to set the
character encoding as follows:
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
This should permit you to cut and paste Japanese into your page and be
fairly certain that any user with a Unicode-enabled browser can read it
as such.
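Declaring UTF-8 only helps if the file really is UTF-8, so a quick
sanity check does no harm. The following is a rough sketch; is_utf8
and declares_utf8 are names of my own, and the second is a crude text
search rather than anything that understands HTML.

```shell
#!/bin/sh
# is_utf8: succeed if the file decodes cleanly as UTF-8.
# Converting UTF-8 to UTF-8 makes iconv stop at the first bad byte.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# declares_utf8: succeed if the file mentions charset=UTF-8 somewhere
# (case-insensitive text search, not an HTML parser).
declares_utf8() {
    grep -qi 'charset=utf-8' "$1"
}

# Example:
#   is_utf8 index.html && declares_utf8 index.html && echo "looks fine"
```

If is_utf8 fails, your editor has most likely saved the page in some
other coding, and the pasted Japanese will not survive the trip.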
6. Other options
Murphy's law is in operation. No sooner had I succeeded in getting all
this working than I stumbled on a couple of helpful links: Japanese on
GNU/Linux and the Slackware 9.1 add-on package project. The sweetest
irony comes from the fact that I had found the add-on project before,
and fell onto a 404...
I installed both canna and kinput2 from the add-on package project
(download the packages, which I have mirrored here due to a weird
feeling that 404s often repeat; su to root, and type 'installpkg
packagename.tgz'). I then started the cannaserver, restarted X, and
started kinput2 with the command 'kinput2 -canna -xim &'.
I wrote a couple of small scripts following the advice of the Japanese
on GNU/Linux author, and placed them into my ~/bin/ directory.
Eventually, I discovered that Gnome applications take violent exception
to the Japanese locale (I suspect it is the same input bug that I
mentioned for gjiten that makes setting the locale to Japanese fatal to
application startup). Other applications are quite happy with it,
however; it makes kiten useful - and one can always use konqueror for
the Japanese web...
Example script:
#!/usr/bin/bash
XMODIFIERS="@im=kinput2" LANG="ja_JP.eucJP" konqueror&
One further point - if you're after getting a Japanese-enabled command
line, you can always use rxvt -
#!/usr/bin/bash
XMODIFIERS="@im=kinput2" LANG="ja_JP.eucJP" rxvt &
Works like a charm.
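Since the two scripts differ only in the program they start, they can
be folded into one helper. This is a sketch of my own rather than
anything from the Japanese on GNU/Linux page; jp_run is a made-up
name, and the two settings are exactly those used above.

```shell
#!/bin/sh
# jp_run: run any command with the kinput2 input method and the EUC-JP
# Japanese locale set for that command only. jp_run is a made-up name.
jp_run() {
    env XMODIFIERS="@im=kinput2" LANG="ja_JP.eucJP" "$@"
}

# Examples:
#   jp_run konqueror &
#   jp_run rxvt &
```

Because the settings are passed through env, the rest of your session
(and those touchy Gnome applications) stays in its usual locale.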
Em Tonkin / エッマ トンキン (contact:
em.mirrorscape.net).