The basic format of the language data file is the same as that of the Aspell configuration file. It is named lang.dat and is located in the architecture-independent data directory for Aspell (option data-dir), which is usually prefix/share/aspell. Use aspell config to find out where it is in your installation. By convention, the language name should be the two-letter ISO 639 language code if one exists; if not, use the three-letter code.
The language data file has two mandatory fields and several optional ones. All fields are case sensitive and should be in all lower case.
The two mandatory fields are name and charset.
name is the name of the language and should be the same as the file name (without the .dat).
charset is the 8-bit character set in which Aspell will expect the word lists to be encoded. If possible, choose one of the standard ones provided with Aspell. These are `iso-8859-*', `koi8-*', and `viscii'. If your language does not require any non-ASCII characters, choose `iso-8859-1'. If none of these standard character sets is suitable for your language, you can create a new one. See Creating A New Character Set.
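As a rough illustration of the two mandatory fields, the following Python sketch parses the text of a hypothetical data file and reports any mandatory field that is missing. It assumes that each entry is a field name followed by whitespace and its value, one entry per line, and that `#' starts a comment. The helper names and the sample contents are invented for this example and are not part of Aspell.

```python
MANDATORY_FIELDS = ("name", "charset")


def parse_lang_data(text):
    """Parse 'field value' lines into a dict (assumed format, not Aspell's own parser)."""
    fields = {}
    for raw_line in text.splitlines():
        # Assumption: '#' starts a comment; this simple split would also
        # cut a line that uses '#' as a value.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        fields[parts[0]] = parts[1] if len(parts) > 1 else ""
    return fields


def missing_mandatory(fields):
    """Return the mandatory field names that are absent or empty."""
    return [name for name in MANDATORY_FIELDS if not fields.get(name)]


# Hypothetical contents of a data file named en.dat.
SAMPLE = """\
# language data for English (illustrative only)
name en
charset iso-8859-1
"""

fields = parse_lang_data(SAMPLE)
print(fields)
print(missing_mandatory(fields))
```

Here the value of name matches the file name without the .dat, as the convention above requires.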
The optional fields are as follows:
char is the non-letter character in question. begin, middle, and end are each either a `-' or a `*'. A `*' for begin means that the character can begin a word; a `-' means it cannot. The same is true for middle and end. For example, the entry for the `'' in English is:
special ' -*-
To include more than one middle character, list them one after another on the same line. For example, to make both the `'' and the `-' middle characters, use the following line in the language data file:
special ' -*- - -*-
However, be aware that adding special characters can have unintended consequences due to limitations of Aspell. For example, if the `-' were accepted as a middle character, then every word with a `-' in it would be flagged as a spelling error unless that exact word is in the dictionary, even if both parts are in the dictionary. Also, having a `.' as an end character will cause the `.' to be part of any misspelled word, which can get very annoying if you misspell a word at the end of a sentence.
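The begin, middle, and end flags described above can be read as a small lookup table. The Python sketch below is only an illustration of that reading, not Aspell's own parser: it splits the value of a special line into character and flag pairs and records, for each character, whether it may begin a word, appear in the middle, or end a word. The function names are hypothetical.

```python
def parse_special(value):
    """Map each special character to a (begin, middle, end) tuple of booleans."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating character and flag tokens")
    table = {}
    for char, flags in zip(tokens[0::2], tokens[1::2]):
        # Each flag group must be exactly three characters, each '-' or '*'.
        if len(char) != 1 or len(flags) != 3 or set(flags) - {"-", "*"}:
            raise ValueError(f"bad special entry: {char!r} {flags!r}")
        table[char] = tuple(flag == "*" for flag in flags)
    return table


def can_appear(table, char, position):
    """Return True if char is allowed at 'begin', 'middle', or 'end'."""
    index = ("begin", "middle", "end").index(position)
    flags = table.get(char)
    return bool(flags) and flags[index]


# The value of the two-character example line shown earlier.
table = parse_special("' -*- - -*-")
print(table)
print(can_appear(table, "'", "middle"))
```

With this table, both characters are accepted only in the middle of a word; a character with no entry, such as `.', is not accepted anywhere.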
If name is `simple' then a very simple soundslike is used. This is not as powerful as the full phonetic soundslike, but it can be computed a lot faster. See The Simple Soundslike.
If the soundslike name is `none', or this option is not specified, then no soundslike will be used. The effective soundslike is then the word converted to all lowercase, possibly with accents stripped, depending on the store-as option. For languages with phonetic spelling, the difference from using soundslike data will not be very noticeable. However, for languages with non-phonetic spelling there will be a noticeable difference. The difference you notice will depend on the quality of the soundslike data file. If you do not notice much of a difference for a language with non-phonetic spelling, that is a good indication that the soundslike data is not rough enough, or that the words you are trying are not that badly misspelled.
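To make the "effective soundslike" concrete, here is a minimal Python sketch of the transformation described above: lowercase the word and, optionally, strip its accents. It is an approximation under stated assumptions. Aspell works on an 8-bit character set, while this sketch uses Python's Unicode strings, and the strip_accents parameter is a hypothetical stand-in for whatever the store-as option selects.

```python
import unicodedata


def effective_soundslike(word, strip_accents=False):
    """Approximate the word form used when no soundslike is configured."""
    lowered = word.lower()
    if not strip_accents:
        return lowered
    # Decompose accented letters into base letter plus combining mark,
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(effective_soundslike("Caf\u00e9"))
print(effective_soundslike("Caf\u00e9", strip_accents=True))
```

Because this form keeps every letter of the word, it distinguishes words more finely than a phonetic code would, which is consistent with the small difference expected for languages with phonetic spelling.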
Additional options include those that control how run-together words are handled; they work the same way as they do in the normal configuration files. For more information, see Controlling the Behavior of Run-together Words.