5.6 Working With Affix Info in Word Lists

5.6.1 The Munch Command

The munch command takes a list of words from standard input and outputs a list of possible root words and affixes. The roots may, however, be invalid, as munch does not check them against the existing dictionary. For example, the command:

     echo brother | aspell -l en munch

produces
     brother broth/R brothe/R
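
Because munch prints every candidate on a single line, it can help to put one candidate per line before reviewing the roots by hand. The following is a minimal sketch, not part of Aspell: the helper name split_candidates is hypothetical, and it assumes candidates are separated by spaces and that a slash separates the root from its flags, as in the output above.

```shell
# split_candidates (hypothetical helper): read munch output on standard input
# and print one candidate per line as "root<TAB>flags". The flags field is
# empty when the candidate carries no affix flag.
split_candidates() {
    tr -s ' ' '\n' | awk -F/ 'NF { printf "%s\t%s\n", $1, $2 }'
}

# Sample input copied from the munch example, so aspell is not needed here.
printf 'brother broth/R brothe/R\n' | split_candidates
```

In practice the input would come from a pipeline such as `echo brother | aspell -l en munch | split_candidates`.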

5.6.2 The Expand Command

The expand command is the reverse of munch: it expands affix flags to produce a list of words. For example:

     echo both/R | aspell -l en expand

produces
     both bother

The formal usage is:

     aspell expand [level] [limit]

where level is the expansion level. Valid values are 1 to 3, and level 1 is the default. Level 2 causes the original root/affix to be included, for example:

     both/R both bother

Level 3 causes multiple lines to be printed, one for each generated word, with the original root/affix combination followed by the word it creates:

     both/R both
     both/R bother

Levels larger than 3 may also be supported, but should not be used as they may eventually be removed.
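
Since level 3 prints one "root/affix word" pair per line, its output is easy to post-process with standard text tools. As a rough sketch (the helper name count_expansions is hypothetical and not part of Aspell), the following counts how many words each root/affix entry generates, assuming the two fields are separated by whitespace as shown above.

```shell
# count_expansions (hypothetical helper): read level-3 expand output
# ("root/affix word" per line) and print each root/affix entry followed by
# the number of words it generates.
count_expansions() {
    awk '{ n[$1]++ } END { for (e in n) print e, n[e] }' | sort
}

# Sample input copied from the level 3 example, so aspell is not needed here.
printf 'both/R both\nboth/R bother\n' | count_expansions
```

With a real word list, the input would come from something like `aspell -l en expand 3 < infile | count_expansions`.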

If a limit parameter is given, only expansions that affect the first limit letters are performed. If a base word is not completely expanded for a given affix flag, that flag is left on the word. Note that prefixes are always expanded.

5.6.3 The Munch-list Command

The munch-list command reduces the size of a word list via affix compression. It reduces a list of words to a minimal (or close to minimal) set of roots and affixes that matches the same list of words. In some cases the set returned is provably the smallest possible. In the typical case the number of words returned is within 1% of the optimum.

The list of words is read from standard input and the result, the “munched” list, is written to standard output. Its usage is:

     aspell munch-list [keep] [single|multi] < infile > outfile

where keep, single, and multi are literal values.

By default Aspell will remove redundant affix flags. The keep flag will avoid removing them, which can be useful if you want to include all possible expansions for each base word.

When cross products are involved it may be beneficial to list a base word more than once. While Aspell 0.61 and later can correctly handle multiple base words in a dictionary, Aspell 0.60 cannot. Therefore, the current default behavior is to include only the one with the most expansions. All of them can be included via the multi flag, but be aware that the resulting word lists will not work correctly with Aspell 0.60. A future version of Aspell may default to including them all. The single flag can be used to include only one of them.
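
Because the munched list is meant to match the same words as the input, one way to sanity-check a run is to expand the result and compare it with the original list. The following is only a sketch under stated assumptions: the helper name roundtrip_check is hypothetical, the input is assumed to hold one plain word per line with no affix flags, and the default level 1 expansion is assumed to print space-separated words as in the expand example.

```shell
# roundtrip_check WORDLIST [LANG] (hypothetical helper, not part of Aspell):
# munch a word list, expand the result again, and compare it with the input.
# Returns 0 when both word sets are identical; otherwise diff prints the
# differences and a non-zero status is returned.
roundtrip_check() {
    list=$1
    lang=${2:-en}
    tmp=$(mktemp -d) || return 1
    # Sorted, de-duplicated copy of the original words.
    LC_ALL=C sort -u "$list" > "$tmp/orig"
    # Compress with munch-list, then expand again at the default level.
    aspell -l "$lang" munch-list < "$list" > "$tmp/munched"
    aspell -l "$lang" expand < "$tmp/munched" \
        | tr -s ' ' '\n' | LC_ALL=C sort -u > "$tmp/expanded"
    diff "$tmp/orig" "$tmp/expanded"
    rc=$?
    rm -rf "$tmp"
    return $rc
}
```

A call such as `roundtrip_check infile en` prints nothing when the two lists agree. Any lines reported by diff would point to words lost or gained by the compression and are worth inspecting by hand.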