11 Filter Interface

11.1 Overview

In Aspell there are 5 types of filters:

  1. Decoders, which take input in some standard format such as iso8859-1 or UTF-8 and convert it into a string of FilterChars.
  2. Decoding filters, which manipulate a string of FilterChars by decoding the text in some way, such as converting an SGML character into its Unicode value.
  3. True filters, which manipulate a string of FilterChars to make it more suitable for spell checking. These filters generally blank out text which should not be spell checked.
  4. Encoding filters, which manipulate a string of FilterChars by encoding the text in some way, such as converting certain Unicode characters to SGML characters.
  5. Encoders, which take a string of FilterChars and convert it into a standard format such as iso8859-1 or UTF-8.

Which types of filters are used depends on the situation:

  1. When decoding words for spell checking:
  2. When checking a document
  3. When encoding words such as those returned for suggestions:

A FilterChar is a struct defined in common/filter_char.hpp which contains two members: a character and a width. Its purpose is to keep track of the width of the character in the original format. This is important because, when a misspelled word is found, the exact location of the word needs to be returned to the application so that it can highlight it for the user. For example, if the filters translated this:

Mr. foo said &quot;I hate my namme&quot;.

to this

Mr. foo said "I hate my namme".

without keeping track of the original width of the characters, the application will likely highlight "e my" as the misspelling, because the spell checker will return 25 as the offset instead of 30. By keeping track of the width using FilterChar, however, the spell checker will know that the real position is 30, since each quote is really 6 characters wide. In particular, the text will be annotated something like the following:

1111111111111611111111111111161
Mr. foo said "I hate my namme".
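
The width bookkeeping can be illustrated with a small self-contained sketch. SketchFilterChar, decode_quot and original_offset are hypothetical names used only for illustration; they are not Aspell's real definitions (those live in common/filter_char.hpp). The sketch assumes a decoding step that turns each &quot; into a single quote character of width 6.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for Aspell's FilterChar: only the two members described above.
struct SketchFilterChar {
  unsigned int chr;    // the decoded character
  unsigned int width;  // how many characters it occupied in the original text
};

// Hypothetical decoding step: every "&quot;" becomes one '"' of width 6,
// every other character is copied with width 1.
std::vector<SketchFilterChar> decode_quot(const std::string& in) {
  static const std::string entity = "&quot;";
  std::vector<SketchFilterChar> out;
  std::size_t i = 0;
  while (i < in.size()) {
    if (in.compare(i, entity.size(), entity) == 0) {
      out.push_back({'"', static_cast<unsigned int>(entity.size())});
      i += entity.size();
    } else {
      out.push_back({static_cast<unsigned char>(in[i]), 1});
      i += 1;
    }
  }
  return out;
}

// Maps a 0-based index in the decoded text back to the 0-based offset
// in the original text by summing the widths of all preceding characters.
std::size_t original_offset(const std::vector<SketchFilterChar>& text,
                            std::size_t index) {
  assert(index <= text.size());
  std::size_t offset = 0;
  for (std::size_t i = 0; i < index; ++i) offset += text[i].width;
  return offset;
}
```

For the example sentence, the decoded index 24 (the first letter of namme, counting from 0) maps back to offset 29, which corresponds to positions 25 and 30 in the text above when counting from 1.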

The standard encoder and decoder filters are defined in common/convert.cpp. There should generally not be any need to deal with them, so they will not be discussed here. The other three filters (the encoding filter, the true filter, and the decoding filter) are all defined in exactly the same way: they inherit from the IndividualFilter class.

11.2 Adding a New Filter

A new filter is added by placing the corresponding loadable object inside a directory that Aspell can reach via the filter-path list. In addition, the corresponding filter description file has to be located in one of the directories listed in the option-path list.

The name of the loadable object has to conform to the following convention: libfiltername-filter.so, where filtername stands for the name of the filter which is passed to Aspell by the add-filter option. The same applies to the filter description file, which has to conform to the following naming scheme: filtername-filter.opt.
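
As a small illustration of the naming convention, the following sketch derives both file names from a filter name. The helper names loadable_object_name and description_file_name are hypothetical and exist only in this example; Aspell does not necessarily compute the names this way.

```cpp
#include <cassert>
#include <string>

// File name of the loadable object: lib<filtername>-filter.so
std::string loadable_object_name(const std::string& filtername) {
  assert(!filtername.empty());
  return "lib" + filtername + "-filter.so";
}

// File name of the filter description file: <filtername>-filter.opt
std::string description_file_name(const std::string& filtername) {
  assert(!filtername.empty());
  return filtername + "-filter.opt";
}
```

For a filter added with the (made-up) name example, these helpers yield libexample-filter.so and example-filter.opt.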

To add a new loadable filter object, create a new C++ source file ending in .cpp. The file should contain a new filter class inherited from IndividualFilter and a constructor function called new_filtertype (see Constructor Function) which returns a new filter object. The filter description file has to be generated manually. Finally, the resulting object has to be turned into a loadable filter object using libtool.

Alternatively, a new filter may extend the functionality of an existing filter. In this case the new filter has to be derived from the corresponding existing filter class instead of the IndividualFilter class.

11.3 IndividualFilter class

All filters are required to inherit from the IndividualFilter class found in indiv_filter.hpp. See that file for more details and the other filter modules for examples of how it is used.

11.4 Constructor Function

After the class is created, a function must be written which returns a new filter allocated with new. The function must have the following prototype:

     C_EXPORT IndividualFilter * new_aspell_filtername_filtertype

Filters are defined in groups, where each group contains an encoding filter, a true filter, and a decoding filter (see Filter Overview). Only one of them is required to be defined; however, each one that is defined needs its own constructor function.
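
The following sketch shows roughly how a filter class and its constructor function fit together. To keep it self-contained, FilterChar, IndividualFilter and C_EXPORT are replaced by simplified stand-ins; the member functions reset and process, and the assumption that C_EXPORT expands to extern "C", are illustrative only, so consult indiv_filter.hpp for the real interface. The class MarkupBlankFilter and the filter name markupblank are hypothetical.

```cpp
#include <cassert>

// --- Stand-ins so the sketch compiles on its own. A real filter would
// --- include Aspell's headers instead of declaring these.
#define C_EXPORT extern "C"
struct FilterChar { unsigned int chr; unsigned int width; };
class IndividualFilter {
public:
  virtual void reset() = 0;
  virtual void process(FilterChar*& start, FilterChar*& stop) = 0;
  virtual ~IndividualFilter() {}
};

// Hypothetical true filter: blanks out everything from '<' to '>'
// so that the spell checker does not see it.
class MarkupBlankFilter : public IndividualFilter {
  bool in_tag_;
public:
  MarkupBlankFilter() : in_tag_(false) {}
  void reset() override { in_tag_ = false; }
  void process(FilterChar*& start, FilterChar*& stop) override {
    assert(start <= stop);
    for (FilterChar* cur = start; cur != stop; ++cur) {
      if (cur->chr == '<') in_tag_ = true;
      const bool blank = in_tag_;
      if (cur->chr == '>') in_tag_ = false;
      // Only the character is changed; the width is left untouched
      // so that offsets into the original text stay valid.
      if (blank) cur->chr = ' ';
    }
  }
};

// Constructor function following the pattern
// new_aspell_filtername_filtertype, here for the true filter.
C_EXPORT IndividualFilter* new_aspell_markupblank_filter() {
  return new MarkupBlankFilter;
}
```

A group that also provided a decoding or encoding filter would, under the same assumptions, define one further constructor function for each of them.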

11.5 Filter Description File

This file contains the description of a filter and is loaded by Aspell as soon as the add-filter option is invoked. If this file is missing, Aspell will complain about it. It consists of comment lines, which must start with a # character, and lines containing key value pairs describing the filter. Each file has to contain at least the following two lines in the given order.

     ASPELL >=0.60
     DESCRIPTION this is short filter description

The first non-blank, non-comment line has to contain the keyword ASPELL followed by the version of Aspell with which the filter is usable. To denote multiple Aspell versions, the version number may be prefixed by `<', `<=', `=', `>=' or `>'. If the range prefix is omitted, `=' is assumed. The DESCRIPTION of the filter should be under 50 characters, begin in lower case, and not include any trailing punctuation characters. The keyword DESCRIPTION may be abbreviated to DESC.
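
To make the range prefixes concrete, here is a minimal sketch of how such a version constraint could be evaluated. It is not Aspell's implementation: the function names parse_version and version_matches are hypothetical, and the sketch assumes that versions are dotted numbers compared field by field, which this section does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a dotted version such as "0.60.8" into {0, 60, 8}.
std::vector<int> parse_version(const std::string& text) {
  std::vector<int> parts;
  std::istringstream in(text);
  std::string field;
  while (std::getline(in, field, '.')) parts.push_back(std::stoi(field));
  return parts;
}

// Checks a constraint such as ">=0.60" against the running version.
// A missing range prefix is treated as "=".
bool version_matches(const std::string& constraint,
                     const std::string& running) {
  const std::size_t pos = constraint.find_first_not_of("<>=");
  assert(pos != std::string::npos);  // a version number must follow the prefix
  std::string op = constraint.substr(0, pos);
  if (op.empty()) op = "=";
  const std::vector<int> want = parse_version(constraint.substr(pos));
  const std::vector<int> have = parse_version(running);
  // std::vector compares lexicographically, i.e. field by field.
  if (op == "<")  return have < want;
  if (op == "<=") return have <= want;
  if (op == "=")  return have == want;
  if (op == ">=") return have >= want;
  if (op == ">")  return have > want;
  return false;  // unknown prefix
}
```

Under these assumptions, a filter declaring ASPELL >=0.60 would be accepted by a running version 0.60.8 and rejected by 0.50.5.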

For each filter feature (see Filter Overview) provided by the corresponding loadable object, the option file has to contain the following line:

     STATIC filtertype

filtertype stands for one of decoder, filter or encoder, denoting the respective filter type. This line allows the filter to be statically linked into Aspell (see Link Filters Static) if requested by the user or by the system Aspell is built for.

     OPTION newoption
     DESCRIPTION this is a short description of newoption
     TYPE bool
     DEFAULT false
     ENDOPTION

An option is added by a line containing the keyword OPTION followed by the name of the option. If this name is not prefixed by the name of the filter Aspell will implicitly do that. For the DESCRIPTION of a filter option the same holds as for the filter description. The TYPE of the option may be one of bool, int, string or list. If the TYPE is omitted bool is assumed. The default value(s) for an option is specified via DEFAULT (short DEF) followed by the desired TYPE dependent default value. The table Filter Default Values shows the possible values for each TYPE.

Type | Default | Available
bool | true | true false
int | 0 | any number value
string | | any printable string
list | | any comma separated list of strings

Table 1. Shows the default values Aspell assumes if an option description lacks a DEFAULT or DEF line.

The ENDOPTION line may be omitted, as it is assumed implicitly when a line containing OPTION or STATIC follows.

Note: The keywords in a filter description file are case insensitive. The above examples use all uppercase to distinguish them from values and comments. Further, a filter description file may contain blank lines to enhance its readability.
Note: An option of list type may contain multiple consecutive lines starting with DEFAULT or DEF in order to specify several default values.
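
Putting the pieces of this section together, a complete description file might look like the following sketch. The filter name example, the option names verbose and skip-tags, and all values are made up for illustration; only the keywords and their order follow the rules described above.

```
# example-filter.opt: description of the hypothetical filter "example"
ASPELL >=0.60
DESCRIPTION example filter that blanks out marked text

STATIC filter

OPTION verbose
DESCRIPTION report each region that is skipped
TYPE bool
DEFAULT false
ENDOPTION

OPTION skip-tags
DESC names of the tags whose content is skipped
TYPE list
DEFAULT code
DEFAULT pre
ENDOPTION
```

The second option uses the abbreviated keyword DESC and gives two consecutive DEFAULT lines, as permitted for options of list type.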

11.6 Retrieve Options by a Filter

A filter always has to retrieve an option using its fully qualified name, as the following example shows.

     config->retrieve_bool("filter-filtername-newoption");

The prefix filter- allows the user to relate the option uniquely to the specific filter when filtername-newoption would be ambiguous with an existing option of Aspell. The filtername stands for the name of the filter the option belongs to, and -newoption is the name of the option as specified in the corresponding .opt file (see Filter Description File).
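
The way the fully qualified name is composed can be sketched as follows. The helper qualified_option_name is hypothetical and not part of Aspell; it merely assumes, as described in Filter Description File, that an option name in the .opt file may or may not already carry the filter name as a prefix.

```cpp
#include <cassert>
#include <string>

// Builds "filter-<filtername>-<option>" from the filter name and the
// option name as written in the .opt file.
std::string qualified_option_name(const std::string& filtername,
                                  const std::string& option) {
  assert(!filtername.empty() && !option.empty());
  const std::string prefix = filtername + "-";
  // Add the filter name only if the option does not carry it already.
  const std::string local =
      option.compare(0, prefix.size(), prefix) == 0 ? option
                                                    : prefix + option;
  return "filter-" + local;
}
```

With the names used above, both newoption and filtername-newoption resolve to filter-filtername-newoption, the string passed to retrieve_bool in the example.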

11.7 Compiling and Linking

See a good book on Unix programming on how to turn the filter source into a loadable object.

11.8 Programmer's Interface

A more convenient way to build a new filter, and the recommended one if the filter is to be added to the Aspell standard distribution, is provided by Aspell's programmer's interface for filters. It is provided by the loadable-filter-API.hpp file. Including this file gives access to a collection of macros which hide the nasty details of runtime construction of a filter and of filter debugging. Table Interface Macros shows the macros provided by the interface. For details on the individual macros see loadable-filter-API.hpp. An example of how to use these macros can be found in examples/loadable/ccpp-context.hpp and examples/loadable/ccpp-context.cpp.

Macro | Type | Description | Notes
ACTIVATE_ENCODER | M | makes the respective encoding filter callable by Aspell | do not call inside a class declaration; these macros define the new_<filtertype> function
ACTIVATE_DECODER | M | makes the respective decoding filter callable by Aspell | as above
ACTIVATE_FILTER | M | makes the respective filter callable by Aspell | as above
FDEBUGOPEN | D | initialises the macros for debugging a filter and opens the debug file stream | these macros are only active if the FILTER_PROGRESS_CONTROL macro is defined; it denotes the name of the file debug messages should be sent to. If debugging should go to Aspell's standard debugging output (right now stderr), use an empty string constant as the filename
FDEBUGNOTOPEN | D | same as “FDEBUGOPEN” but only if the debug file stream was not opened yet | as above
FDEBUGCLOSE | D | closes the debugging device opened by “FDEBUGOPEN” and reverts it to “stderr” | as above
FDEBUG | D | prints the filename and the line number it occurs on | as above
FDEBUGPRINTF | D | special printf for debugging | as above

Table 2. Shows the macros provided by loadable-filter-API.hpp (M mandatory, D debugging)

11.9 Adding a filter to Aspell standard distribution

Any filter which one day should be added to Aspell has to be built using the developer interface, described in Programmer's Interface. To add the filter the following steps have to be performed:

  1. Decide whether the filter should be kept loadable if possible, or always be statically linked to Aspell.
  2. Place the filter sources inside the appropriate directory of the Aspell source tree. Right now use $top_srcdir/modules/filter.
  3. Modify the Makefile.am file in the topmost directory of the Aspell distribution. Follow the instructions given in the #Filter Modules section.
  4. Run autoconf, automake, ...
  5. Reconfigure sources.
  6. Clear away any remains of a previous build and rebuild sources.
  7. Reinstall Aspell.
  8. Test whether the filter has been added properly; otherwise return to steps 2–7.
  9. Reconfigure the sources with the enable-static flag and repeat steps 2–7 until your filter builds and runs properly when statically linked.
  10. Add your source files to cvs and commit all your changes. If you are not allowed to commit to cvs, submit a patch (see How to Submit a Patch) containing your changes.