Nihongo no Slackware

1. Introduction
2. Unicode and the Internet
3. Learning tools
4. Input tools; hiragana, katakana and kanji.
5. Japanese on the Net; XChat and such.
6. Other options

1. Introduction

Slackware is not the most user-friendly of distros; or rather, as the Unix joke goes, it's picky about who its friends are. On the other hand, it has distinct advantages in terms of flexibility, simplicity and a charming literalness of mind. It does not, however, speak many languages 'out of the box'.

As a seemingly eternal language student, I find that the character map application in Gnome 2 (accessories->unicode character map) helps with accent insertion in European languages, as does the 'Character palette' applet, which may be added to the Gnome Panel as follows:

Right click on the panel; follow the 'Add to panel' arrow.
Select Utility->Character Palette.
You may now insert characters (such as ©, à or ñ) by clicking on them, which enters them into the clipboard, and pasting them where required. Custom palettes may be created in the preferences dialogue of the applet.

If you're also a touch-typist and are used to multiple keyboards, it is possible to use the Gnome keyboard switcher applet to switch between input keyboard maps. It is added to the Gnome panel in much the same way as the Character Palette, and has the advantage of allowing faster input once you're used to the mapping.

None of these things helps with the Japanese problem. Like Chinese or Korean, Japanese does not use a Western alphabet; Japanese learners may initially read in 'romaji', a romanised transcription of the script, but continued dependence on this representation tends to damage the learner's progress later on. Furthermore, romaji is simply not an option when surfing the 'Net. (In fact, this is not quite true: there exists a Web proxy written in Perl, called Japana, that converts Japanese characters into romaji on the fly. But...)

For completeness, let's briefly discuss the three varieties of Japanese script. If you can't read the examples below, you'll want to first ensure that the character coding on this page (in View-> Character Coding) is appropriate, and if that doesn't help - especially if you are seeing little square 'broken font' signs instead of Japanese - you will want to check out Section 2 and return to this part later.

The first Japanese script is Hiragana, used somewhat like the western alphabet. This is an example; ひらがな.

The second is Katakana, a sort of spikier script, used for emphasis in situations where we might use italic or bold. It is also used for foreign words, such as names. This is an example; カタカナ.

The third is the nasty one. The Japanese borrowed a number of characters from the Chinese; now called Kanji characters, they are based around ideograms. Not only are they difficult to remember, these sharks in the otherwise calm sea of any beginner's Japanese are also difficult to input. Somewhere around 2000 of them are in common use. Example; 日本語.

2. Unicode and the Internet

Unicode is one of those relatively recent inventions that actually seems to be a Good Thing. The need for it arises from the fact that single-byte, ASCII-style character codings can hold a maximum of 256 characters; this led to the need for several different codings, approximately one for each language or group of languages. Unicode brings all of those character codings together, making one mega-giant-sized character set that includes absolutely everything you might ever fear to encounter on the Internet. One complete Unicode font can therefore display characters from Cyrillic, Spanish, Arabic and Chinese, probably including a number of Wingdings on the way.

Unicode is pretty handy.
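
To make the size difference concrete, here is a small sketch that counts the bytes a string occupies when stored as UTF-8, the Unicode encoding used later on this page. The function name utf8_bytes is my own invention for illustration, and the sketch assumes the script file itself is saved as UTF-8.

```shell
#!/bin/sh
# utf8_bytes is a hypothetical helper: print how many bytes a string occupies.
# Assumption: this file and your terminal are both using UTF-8.
utf8_bytes() {
    # wc -c counts bytes, not characters; tr strips the padding
    # that some versions of wc put in front of the number.
    printf '%s' "$1" | wc -c | tr -d ' '
}
```

Calling utf8_bytes 'abc' prints 3, while utf8_bytes '日本語' prints 9: each of those three kanji takes three bytes in UTF-8, which is why no single-byte coding could ever hold them.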

To use it, though, you need that complete Unicode font I mentioned. Personally, I used a copy of 'arialuni.ttf'. Getting hold of the font could be exciting, since Microsoft in their infinite wisdom provided it initially as a download meant for users of their Publisher(?) software, and then later on decided to remove the download, presumably on the principle that aiding international communication is not a sin of which they should like to be found guilty. There are others, however, such as described on this helpful site. If you have a copy of Windows, of course, you can probably locate arialuni.ttf on the disk...

Once you've found a Unicode font to your liking, you simply place it in your TTF directory - which, since you are using Slackware, is /usr/X11R6/lib/X11/fonts/TTF/. You will then have to renew your font cache, most likely, by running the program /usr/X11R6/bin/fc-cache. Now, you should be able to display the entire Unicode character set, and returning to this page should mean that those 日本語 squiggles are now legible.
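
The copy-and-refresh step can be sketched as a small shell function. This is only a sketch under assumptions: install_font is a name I made up, the default directory is the Slackware TTF path given above, and fc-cache is looked up on the PATH rather than at /usr/X11R6/bin/fc-cache.

```shell
#!/bin/sh
# install_font is a hypothetical helper, not something Slackware ships.
# Usage: install_font /path/to/font.ttf [font-directory]
install_font() {
    font_file=$1
    # Default to the Slackware TTF directory named in the text.
    font_dir=${2:-/usr/X11R6/lib/X11/fonts/TTF}
    if [ ! -f "$font_file" ]; then
        echo "no such font file: $font_file" >&2
        return 1
    fi
    mkdir -p "$font_dir" || return 1
    cp "$font_file" "$font_dir/" || return 1
    # Refresh the fontconfig cache for that directory, if fc-cache is available.
    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache "$font_dir" || echo "warning: fc-cache failed" >&2
    fi
    return 0
}
```

Run as root, install_font /path/to/arialuni.ttf would do the job described above. Afterwards, fc-list :lang=ja should list the fonts that fontconfig believes cover Japanese, which is a quick way to check that the new font was picked up.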

If you use mplayer, be aware that placing this font, or a symlink to it, in your $HOME/.mplayer/ directory as subfont.ttf (ln -s /path/to/arialuni.ttf $HOME/.mplayer/subfont.ttf) will cause mplayer to use it for on-screen display and subtitles. It appears that the mplayer project currently distributes the arialuni.ttf font in its original form on its FTP server, as one of its non-supported fonts, though due to distribution limitations you will have to also get hold of a program called cabextract to unpack it. Google is your friend here.
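
For the symlink route, a minimal sketch follows. The function name link_subfont is hypothetical; the only things taken from the text are the $HOME/.mplayer/ directory and the subfont.ttf name.

```shell
#!/bin/sh
# link_subfont is a hypothetical helper: point mplayer's subfont.ttf at a font.
# Usage: link_subfont /absolute/path/to/font.ttf
link_subfont() {
    font_file=$1
    if [ ! -f "$font_file" ]; then
        echo "no such font file: $font_file" >&2
        return 1
    fi
    mkdir -p "$HOME/.mplayer" || return 1
    # -s makes a symbolic link; -f replaces an existing subfont.ttf link.
    ln -sf "$font_file" "$HOME/.mplayer/subfont.ttf"
}
```

Give it an absolute path, since a relative link target would be resolved relative to the .mplayer directory rather than to wherever you happened to be standing.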

3. Learning tools

Now that you can display Japanese characters, you can install a number of programs to make your life more fun.

The first of these is kanatest. This program will drill you through learning hiragana and katakana. Since it has no dependence on Unicode, you may install it before going through step 2. As a Slackware user, you will not be surprised to find yourself compiling it in the usual way (./configure, make, make install).

The second is kdrill. Whilst a bit graphically out-of-date, it's probably your best option if you want to test yourself on kanji recognition. Install is relatively straightforward, requiring the following steps:

First edit the Imakefile, if there's anything you want to change. Pay attention to this, by the way; kdrill requires a gzipped version of a file known as kanjidic, whilst gjiten will require it in non-gzipped form. It is a shocking waste of space to have both versions on the system, but the easiest way is simply to put both in places where they do not conflict with each other.
Then type 'xmkmf'. Note that you cannot do this when su'd to root.
Now 'make'.
Finally, if all went well, type 'make install'.
You will now want to download the file mentioned in the documentation, kanjidic.

Voilà!
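
Keeping both forms of kanjidic side by side, as suggested in the first step, might look like the sketch below. The function name stage_kanjidic and both target directories are placeholders of my own; use whatever paths you set in the kdrill Imakefile and in gjiten's preferences.

```shell
#!/bin/sh
# stage_kanjidic is a hypothetical helper.
# Usage: stage_kanjidic kanjidic-file kdrill-directory gjiten-directory
stage_kanjidic() {
    src=$1
    kdrill_dir=$2
    gjiten_dir=$3
    if [ ! -f "$src" ]; then
        echo "no such dictionary: $src" >&2
        return 1
    fi
    mkdir -p "$kdrill_dir" "$gjiten_dir" || return 1
    # kdrill wants the gzipped form; gzip -c leaves the original untouched.
    gzip -c "$src" > "$kdrill_dir/kanjidic.gz" || return 1
    # gjiten wants the plain form.
    cp "$src" "$gjiten_dir/kanjidic"
}
```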

The third is kanjipad. This is a useful little program, trivial to compile. It allows you to draw kanji with the mouse and then look them up and choose the corresponding kanji; you will find it extremely useful as an input device, since looking up kanji in an electronic dictionary involves either a) knowing exactly what the kanji is, which makes the lookup rather redundant, b) searching for it by heuristic (number of strokes or the radicals of which it is composed), or c) being able to reproduce the kanji itself. Kanjipad helps with option c.

Last but not least, gjiten. Some part of me hates this program for a crashy piece of *#@*!, but you can't live without it, unless you choose to go for its KDE counterpart, kiten, which comes as part of the stock Slackware install. A short note on kiten: for reasons of its own, it defaults to a font which fails to display hiragana. To make it useful, you will have to choose a different font in settings->Configure kiten->fonts, such as fixed(Sony), regular, 16. DOH! By preference, though, you want to install gjiten, since kiten can't use the Japanese input we are about to give GNOME.

Installation of gjiten is theoretically a simple ./configure, make, make install process. I give you fair warning that later you will probably find gjiten crashing on startup due to a conflict between it and the im-ja Japanese input module for GNOME that we will get to in the next section. To remedy this, you can either use the configuration file to force gjiten to quit trying to get Japanese input, or you can comment out the 'force' line in gjiten.c (I chose that option - I am not subtle. Here's an edited gjiten 2.1). Oddly enough, im-ja does not crash gjiten when it is chosen manually after startup - it is only this 'force' line that causes it to crash.

Installation of the dictionaries is a slightly different problem. gjiten likes its dictionaries to be in UTF-8, which means that you will have to take the dictionaries we downloaded earlier and convert them to UTF-8 by means of the following command line; iconv -f EUC-JP -t UTF-8 dictfile -o dictfile.utf8. Remember to be careful not to overwrite anything; other programs might need the non-UTF-8 dictionary files. Also, follow the instructions about the other files gjiten needs (read the README in the source directory).
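
Since the warning about overwriting is easy to forget, here is a sketch of the conversion wrapped in a function that refuses to clobber an existing file. The name convert_dict is made up for illustration, and the sketch redirects iconv's output instead of using -o, purely because redirection works with every iconv I know of.

```shell
#!/bin/sh
# convert_dict is a hypothetical helper: EUC-JP dictionary in, UTF-8 copy out.
# Usage: convert_dict dictfile [output-file]   (default output: dictfile.utf8)
convert_dict() {
    src=$1
    dst=${2:-$src.utf8}
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    # Convert into a temporary file first, so that a failed conversion
    # does not leave a half-written dictionary behind.
    if iconv -f EUC-JP -t UTF-8 "$src" > "$dst.tmp"; then
        mv "$dst.tmp" "$dst"
    else
        rm -f "$dst.tmp"
        return 1
    fi
}
```

The original file is never modified, so kdrill and friends can carry on using the EUC-JP version.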

4. Input tools

This is very definitely the hard part. Firstly, you need to get hold of a piece of 'kanji server' software, which sits around on your computer and converts input characters to possible output characters (for example, one line of hiragana might be representable by one or several kanji character possibilities - the kanji server does that work). There are two likely suspects - Canna and Wnn.

My experience with Canna leads me to suspect that whilst it itself will compile without problems, nothing else will ever agree to compile with it on Slackware. Therefore I recommend a recent wnn, such as you might download here. At this point you will begin to appreciate the irritation value inherent in installing input devices for a language that you are barely able to read, since all the installation instructions are already in Japanese. Panic not.

Download the latest version, probably ftp://ftp.freewnn.org/pub/FreeWnn/alpha/FreeWnn-1.1.1-a020.tar.bz2.
Extract it (tar xvfj FreeWnn-1.1.1-a020.tar.bz2).
cd FreeWnn-1.10-pl020/
./configure
make
su
make install

You may now start the freewnn server for Japanese, otherwise known as jserver, as follows;

cd /usr/local/bin/Wnn4/
./jserver

Put this in your /etc/rc.d/rc.something or other, if you want it to automatically start on reboot.
Hopefully, if all has gone well, you may now telnet to port 22273, and watch a connection being made to your brand new jserver. If you start cserver, by the way, it will be on port 22289. I don't know who chose those defaults.
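
For the start-on-boot part, something along these lines could go into an rc script. It is a sketch only: start_jserver is a name I invented, the default path is the install location used above, and the choice of rc file (rc.local would be my guess) is yours.

```shell
#!/bin/sh
# start_jserver is a hypothetical helper: start jserver only if it is installed.
# Usage: start_jserver [path-to-jserver]
start_jserver() {
    jserver_bin=${1:-/usr/local/bin/Wnn4/jserver}
    if [ -x "$jserver_bin" ]; then
        echo "Starting FreeWnn jserver: $jserver_bin"
        "$jserver_bin"
    else
        # Complain rather than break the boot if FreeWnn has been removed.
        echo "jserver not found at $jserver_bin" >&2
        return 1
    fi
}
```

A bare start_jserver line after the function definition is then all the rc file needs.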

Now that you have Wnn, you will want to download and install im-ja.

tar xvfz im-ja-0.9.tgz
cd im-ja-0.9
./configure --disable-canna
make
make install

Now edit /etc/gtk-2.0/gtk.immodules and add the following lines:
"/usr/lib/gtk-2.0/2.2.0/immodules/im-ja.so"
"im-ja" "Japanese" "gtk+" "/usr/share/locale" "ja"

Restart X and Gnome. Hopefully, you can now right-click on any true-blue Gnome application and choose Japanese from Input Methods. Configure by means of im-ja-conf. Sadly, neither Galeon nor Mozilla allows the use of input methods in this way, but you can always type into abiword or the Gnome text editor and cut'n'paste.
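
If you would rather not hand-edit gtk.immodules, the two lines given above can be appended by a small function that does nothing when they are already present. This is a sketch: register_im_ja is a hypothetical name, and the module path is the one from this page, which will differ if your GTK version or install prefix differs.

```shell
#!/bin/sh
# register_im_ja is a hypothetical helper: add the im-ja entry once only.
# Usage: register_im_ja [path-to-gtk.immodules]
register_im_ja() {
    immodules=${1:-/etc/gtk-2.0/gtk.immodules}
    module='"/usr/lib/gtk-2.0/2.2.0/immodules/im-ja.so"'
    info='"im-ja" "Japanese" "gtk+" "/usr/share/locale" "ja"'
    # grep -F: fixed string, -x: match the whole line, -q: quiet.
    # If the module line is already there, leave the file alone.
    if grep -Fxq "$module" "$immodules" 2>/dev/null; then
        return 0
    fi
    printf '%s\n%s\n' "$module" "$info" >> "$immodules"
}
```

Running it twice leaves a single copy of the entry, so it is safe to call from an install script.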

5. Japanese on the Net

Xchat, xchat. What would life be like without the ability to type to people all over the world in Unicode? Here are the enabling steps;

1) edit the server list; choose UTF-8 as the character set for the server of your choice.
2) Once connected, choose your Arial Unicode TTF font as the default font in Settings->Preferences->Text box. Note that you might not really need to do this - theoretically, most programs are smart enough to choose a font that supports the given character set without being specifically told to do so.
3) Right click; choose Japanese input method.

And there you have it -> provided the other people in the chat room are also using UTF-8 encoding, you may now converse in mixed hiragana, katakana, kanji, Arabic and ancient Hebrew.

As for Japanese-enabled web pages, it should suffice to set the character encoding as follows:
<meta http-equiv="content-type" content="text/html; charset=UTF-8">

This should permit you to cut and paste Japanese into your page and be fairly certain that any user with a Unicode-enabled browser can read it as such.
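
The meta tag only tells the truth if the file really is saved as UTF-8. One rough way to check, sketched below, is to ask iconv to decode the file as UTF-8 and see whether it complains. The function name is_utf8 is my own, and the UTF-16 target is an arbitrary choice that simply forces iconv to decode every byte.

```shell
#!/bin/sh
# is_utf8 is a hypothetical helper: succeed if the file decodes as UTF-8.
# Usage: is_utf8 page.html && echo "looks like UTF-8"
is_utf8() {
    # iconv exits non-zero when it meets a byte sequence that is not valid
    # in the source encoding; the converted output itself is thrown away.
    iconv -f UTF-8 -t UTF-16 "$1" >/dev/null 2>&1
}
```

A page saved as EUC-JP or Shift-JIS by mistake will normally fail this check as soon as it contains any Japanese.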

6. Other options

Murphy's law is in operation. No sooner have I succeeded in getting all this working than I stumble on a couple of helpful links; Japanese on GNU/Linux and the Slackware 9.1 add-on package project. The sweetest irony comes from the fact that I found the add-on project before, and fell onto a 404...

I installed both canna and kinput2 from the add-on package project (download the packages, which I have mirrored here due to a weird feeling that 404s often repeat; su to root, and type 'installpkg packagename.tgz'). I then started the cannaserver, restarted X, and started kinput2 with the command 'kinput2 -canna -xim&'.

I wrote a couple of small scripts following the advice of the Japanese on GNU/Linux author, and placed them into my ~/bin/ directory. Eventually, I discovered that Gnome applications take violent exception to the Japanese locale (I suspect it is the same input bug that I mentioned on gjiten that makes setting the locale to Japanese fatal to application startup). Other applications are quite happy with it, however; it makes kiten useful - and one can always use konqueror for the Japanese web...

Example script:

#!/usr/bin/bash
XMODIFIERS="@im=kinput2" LANG="ja_JP.eucJP" konqueror&

One further point - if you're after getting a Japanese-enabled command line, you can always use rxvt -

#!/usr/bin/bash
XMODIFIERS="@im=kinput2" LANG="ja_JP.eucJP" rxvt &

Works like a charm.
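
Since the two scripts differ only in the program they launch, they could be folded into one wrapper. This is a sketch rather than what I actually run: with_japanese_env is a made-up name, and the two variables are exactly the ones set in the scripts above.

```shell
#!/bin/sh
# with_japanese_env is a hypothetical helper: run any command with the
# kinput2 input method and the ja_JP.eucJP locale set for that command only.
# Usage: with_japanese_env konqueror &
with_japanese_env() {
    # env sets the variables for the launched program without
    # touching the environment of the calling shell.
    env XMODIFIERS="@im=kinput2" LANG="ja_JP.eucJP" "$@"
}
```

with_japanese_env rxvt & then gives the same Japanese-enabled terminal as before, and the rest of the session, Gnome included, stays in its usual locale.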

Em Tonkin / エッマトンキン (contact: em.mirrorscape.net).