Contents
- Introduction
- License
- Author
- Installation
- API
- Methods
- Classes
- Exceptions
- AspellSpeller Object
- ConfigKeys() => dictionary
- setConfigKey(key, value)
- check(word) => boolean
- suggest(word) => list of suggestions
- addReplacement(incorrect, correct) => None
- addtoPersonal(word) => None
- addtoSession(word) => None
- saveAllwords() => None
- clearSession() => None
- getPersonalwordlist() => [list of strings]
- getSessionwordlist() => [list of strings]
- getMainwordlist() => [list of strings]
- Known problems
- Character encoding
GNU Aspell is a leading spelling engine: it is fast and has many dictionaries available. Take a look at the Python Cookbook, where Ryan Kelly has collected links to all Python bindings for spellers.
aspell-python is a Python wrapper for GNU Aspell. There are two variants:
- pyaspell.py: a Python library that uses the ctypes module; compatible with Python 3
- aspell-python: a C extension that exists in two versions, one compatible with Python 2.x and the other with Python 3.x
The Py2 version has been tested with Python 2.1, Python 2.3.4 and Python 2.4.1. It probably works fine with all Python versions not older than 2.0. The Py3 version has been tested with Python 3.2.
Both libraries are licensed under the BSD license.
Wojciech Muła, [email protected]
Thanks to:
- Adam Karpierz for convincing me to change the license from GPL to BSD and for compiling early versions of the C extension under Windows
- Gora Mohanty for reporting a bug.
To build and install the module for Python 2.x, use the script setup.2.py, i.e.:
$ python setup.2.py build
$ python setup.2.py install
The module for Python 3.x is built with setup.3.py:
$ python3 setup.3.py build
$ python3 setup.3.py install
Note the python3 name: many Linux distributions ship both Python 2 and Python 3 and use different executable names to distinguish the versions.
You need to have the libaspell headers installed. The Debian package is called libaspell-dev; other distributions should have a similar package.
To install aspell-python for all users, you must be root. If you are, type the following command:
$ python setup.py install
It builds the package and installs aspell.so in the directory /usr/lib/{python}/site-packages.
If you don't have root access, you can append --user to the install command to install the module for the current user in ~/.local/lib/{python}/site-packages.
To correctly install aspell's dictionaries on Windows, some additional work is needed. Eric Woudenberg has prepared detailed step-by-step instructions, available in the file windows.rst.
The aspell-python module is visible in Python under the name aspell, so it is imported in the following way:
import aspell
The module provides the Speller class, two methods, and three types of exceptions, all described below.
ConfigKeys() returns a dictionary whose keys are the names of configuration items and whose values are 3-tuples:
- key type (string, integer, boolean, list)
- default value for the key
- short description; "internal" means that aspell doesn't provide any description of the item and you shouldn't set or change it unless you know what you are doing
Aspell's documentation covers all keys and their meaning in detail.
Below is a list of the most useful and obvious options (it is a filtered output of ConfigKeys).
('data-dir', 'string', '/usr/lib/aspell-0.60', 'location of language data files')
('dict-dir', 'string', '/usr/lib/aspell-0.60', 'location of the main word list')
('encoding', 'string', 'ISO-8859-2', 'encoding to expect data to be in')
('home-dir', 'string', '/home/wojtek', 'location for personal files')
('ignore', 'integer', 1, 'ignore words <= n chars')
('ignore-accents', 'boolean', False, 'ignore accents when checking words -- CURRENTLY IGNORED')
('ignore-case', 'boolean', False, 'ignore case when checking words')
('ignore-repl', 'boolean', False, 'ignore commands to store replacement pairs')
('keyboard', 'string', 'standard', 'keyboard definition to use for typo analysis')
('lang', 'string', 'pl_PL', 'language code')
('master', 'string', 'pl_PL', 'base name of the main dictionary to use')
('personal-path', 'string', '/home/wojtek/.aspell.pl_PL.pws', 'internal')
('repl-path', 'string', '/home/wojtek/.aspell.pl_PL.prepl', 'internal')
('run-together', 'boolean', False, 'consider run-together words legal')
('save-repl', 'boolean', True, 'save replacement pairs on save all')
('warn', 'boolean', True, 'enable warnings')
('backup', 'boolean', True, 'create a backup file by appending ".bak"')
('reverse', 'boolean', False, 'reverse the order of the suggest list')
('suggest', 'boolean', True, 'suggest possible replacements')
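Since items described as "internal" should normally be left alone, it can be handy to filter them out before showing options to a user. The sketch below relies only on the dictionary layout described above (key name mapped to a 3-tuple of type, default value and description); settable_keys is a hypothetical helper, not part of aspell-python, and the sample data is copied from the listing.

```python
def settable_keys(config):
    """Return the sorted names of keys whose description is not 'internal'.

    `config` is assumed to have the layout returned by ConfigKeys():
    {key_name: (key_type, default_value, description)}.
    """
    return sorted(
        name
        for name, (_type, _default, description) in config.items()
        if description != 'internal'
    )


# Sample entries copied from the filtered listing above.
sample = {
    'lang': ('string', 'pl_PL', 'language code'),
    'ignore': ('integer', 1, 'ignore words <= n chars'),
    'personal-path': ('string', '/home/wojtek/.aspell.pl_PL.pws', 'internal'),
}
print(settable_keys(sample))
```

With the module installed, the same helper could presumably be called as settable_keys(aspell.ConfigKeys()).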
Speller() creates an AspellSpeller object, which is an interface to GNU Aspell.
Called with no parameters, Speller creates a speller using the default configuration. To change or set a parameter, pass a pair of strings: the key and its value. The available keys can be obtained with ConfigKeys.
>>> aspell.Speller("key", "value")
To set more than one key/value pair, pass each pair to Speller() as a separate tuple.
>>> aspell.Speller( ("k1","v1"), ("k2","v2"), ("k3","v3") )
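When options come from a settings dictionary, the pairs can be built programmatically. This is a minimal sketch; speller_args is a hypothetical helper name, and the option values used here are only illustrative. It assumes that Speller() accepts the pairs as separate positional tuples, as in the example above.

```python
def speller_args(options):
    """Turn {'key': 'value'} into the (key, value) string pairs Speller() takes."""
    return tuple((str(key), str(value)) for key, value in options.items())


args = speller_args({'lang': 'en', 'ignore': 2})
print(args)
# With the module installed, the pairs would be unpacked into the call:
#     speller = aspell.Speller(*args)
```

Values are converted with str() because the constructor is documented as taking pairs of strings.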
The module defines the following errors: AspellConfigError, AspellModuleError and AspellSpellerError.
Additionally, TypeError is raised when you pass wrong parameters to a method.
AspellConfigError is reported by the methods Speller and ConfigKeys. The most common cause is passing an unknown key.
>>> s = aspell.Speller('python', '2.3')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
aspell.AspellConfigError: The key "python" is unknown.
>>>
AspellModuleError is reported when the module can't allocate aspell structures.
AspellSpellerError is reported by libaspell.
>>> # we set the master dictionary file, but the file doesn't exist
>>> s = aspell.Speller('master', '/home/dictionary.rws')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
aspell.AspellSpellerError: The file "/home/dictionary.rws" can not be opened for reading.
>>>
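Because a bad key or a missing dictionary surfaces as an exception at construction time, a program may want to try several configurations in turn. The following is only a sketch: first_working and fake_speller are hypothetical names, and a stand-in factory is used so that the example runs without aspell installed.

```python
def first_working(factory, candidates, errors):
    """Return factory(*args) for the first candidate that does not raise.

    `errors` is a tuple of exception classes that mean "try the next one".
    """
    failures = []
    for args in candidates:
        try:
            return factory(*args)
        except errors as exc:
            failures.append('%r: %s' % (args, exc))
    raise RuntimeError('no usable configuration; ' + '; '.join(failures))


# Stand-in for aspell.Speller: it only accepts the value 'en'.
def fake_speller(key, value):
    if value != 'en':
        raise LookupError('no dictionary for %r' % value)
    return (key, value)


result = first_working(fake_speller, [('lang', 'xx'), ('lang', 'en')], (LookupError,))
print(result)
```

With the real module, the call would presumably pass aspell.Speller as the factory and (aspell.AspellConfigError, aspell.AspellSpellerError) as the errors to tolerate.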
The AspellSpeller object provides the interface to aspell. It has several methods, described below.
- ConfigKeys
- setConfigKey
- check
- suggest
- addReplacement
- addtoPersonal
- saveAllwords
- addtoSession
- clearSession
- getPersonalwordlist
- getSessionwordlist
- getMainwordlist
The examples assume that the following code has been executed earlier:
>>> import aspell
>>> s = aspell.Speller('lang', 'en')
>>> s
<AspellSpeller object at 0x40209050>
>>>
New in version 1.1, changed in 1.13.
ConfigKeys() returns the current configuration of the speller. The result has the same meaning as that of the module-level ConfigKeys() procedure.
New in version 1.14.
setConfigKey(key, value) alters a configuration value. Depending on the key's type, the value is expected to be a string, a boolean or an integer.
Although setting any key is possible, changes to some of them have no effect. For example, changing lang doesn't change the current language; this is an aspell limitation (feature).
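Settings read from a file or the command line usually arrive as strings, while setConfigKey expects a value that matches the key's type. The sketch below looks up the type in ConfigKeys() and converts accordingly. Both helper names are hypothetical, the accepted spellings of "true" are an assumption, and values of type list are passed through unchanged because the text above does not say how they should be given.

```python
def coerce_config_value(key_type, text):
    """Convert a string to the Python type that a key of `key_type` expects."""
    if key_type == 'boolean':
        # Assumption: these spellings are treated as true, anything else as false.
        return text.strip().lower() in ('1', 'true', 'yes', 'on')
    if key_type == 'integer':
        return int(text)
    return text  # 'string' (and, by assumption, 'list') stay as they are


def set_from_text(speller, key, text):
    """Look up the key's type via ConfigKeys() and call setConfigKey()."""
    key_type = speller.ConfigKeys()[key][0]
    speller.setConfigKey(key, coerce_config_value(key_type, text))


print(coerce_config_value('integer', '3'), coerce_config_value('boolean', 'True'))
```

Usage would be along the lines of set_from_text(s, 'ignore-case', 'true') for a speller s created earlier.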
check(word) checks the spelling of the given word. If the word is present in the main, personal (see addtoPersonal) or session (see addtoSession) dictionary, the method returns True, otherwise False.
>>> s.check('word') # correct word
True
>>> s.check('wrod') # incorrect
False
>>>
New in version 1.13.
It's possible to use the operator in or not in instead of check().
>>> 'word' in s
True
>>> 'wrod' in s
False
>>>
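Checking a whole text comes down to calling check() for each word. The sketch below is one way to do that. misspelled_words is a hypothetical helper, the ASCII-only tokenizer is a simplifying assumption, and FakeSpeller stands in for an AspellSpeller object so the example runs without aspell.

```python
import re


def misspelled_words(text, speller):
    """Return the distinct words of `text` that speller.check() rejects."""
    rejected = []
    # Assumption: words are runs of ASCII letters; real text needs more care.
    for word in re.findall(r'[A-Za-z]+', text):
        if word not in rejected and not speller.check(word):
            rejected.append(word)
    return rejected


class FakeSpeller:
    """Stand-in exposing the same check() method as AspellSpeller."""

    known = {'a', 'word', 'is', 'not'}

    def check(self, word):
        return word.lower() in self.known


found = misspelled_words('A wrod is not a word, a wrod', FakeSpeller())
print(found)
```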
suggest(word) returns a list of suggested spellings for the given word. The lookup is performed even if the word is correct, i.e. the method check returned True.
>>> s.suggest('wrod') # we made mistake, what aspell suggests?
['word', 'Rod', 'rod', 'Brod', 'prod', 'trod', 'Wood', 'wood', 'wried']
>>>
Warning! suggest() in aspell 0.50 is very, very slow. I recommend caching its results if the program calls the function several times with the same argument.
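One possible way to cache the results is functools.lru_cache from the standard library. This is a sketch under assumptions: cached_suggest and CountingSpeller are hypothetical names, and the cache is not invalidated when addReplacement or the dictionaries change the suggestions.

```python
from functools import lru_cache


def cached_suggest(speller, maxsize=1024):
    """Wrap speller.suggest() so that repeated words are looked up only once."""

    @lru_cache(maxsize=maxsize)
    def lookup(word):
        # Store an immutable tuple so that cached results cannot be modified.
        return tuple(speller.suggest(word))

    def suggest(word):
        # Hand out a fresh list, matching the return type of suggest().
        return list(lookup(word))

    return suggest


class CountingSpeller:
    """Stand-in for AspellSpeller that counts suggest() calls."""

    def __init__(self):
        self.calls = 0

    def suggest(self, word):
        self.calls += 1
        return ['word', 'rod']


demo = CountingSpeller()
suggest = cached_suggest(demo)
suggest('wrod')
suggest('wrod')
print(demo.calls)  # the second lookup is served from the cache
```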
addReplacement(incorrect, correct) adds a replacement pair, which affects the order of words in the suggest result.
>>> # we choose the 6th word from the previous result
>>> s.addReplacement('wrod', 'trod')
>>> # and the selected word appears at the 1st position
>>> s.suggest('wrod')
['trod', 'word', 'Rod', 'rod', 'Brod', 'prod', 'Wood', 'wood', 'wried']
If the config key save-repl is true, the method saveAllwords saves the replacement pairs to the file ~/.aspell.{lang_code}.prepl.
addtoPersonal(word) adds a word to the personal dictionary, which is stored in the file ~/.aspell.{lang_code}.pws. The added words are available to the AspellSpeller object, but they remain unsaved until the method saveAllwords is called.
# personal dictionary is empty now
$ cat ~/.aspell.en.pws
personal_ws-1.1 en 0
$ python
>>> import aspell
>>> s = aspell.Speller('lang', 'en')
# word 'aspell' doesn't exist
>>> s.check('aspell')
0
# we add it to the personal dictionary
>>> s.addtoPersonal('aspell')
# and now aspell knows it
>>> s.check('aspell')
1
# we save personal dictionary
>>> s.saveAllwords()
# new word appeared in the file
$ cat ~/.aspell.en.pws
personal_ws-1.1 en 1
aspell
# check it once again
$ python
>>> import aspell
>>> s = aspell.Speller('lang', 'en')
# aspell still knows its own name
>>> s.check('aspell')
1
>>> s.check('aaa')
0
>>> s.check('bbb')
0
# add incorrect words, they shouldn't be saved
>>> s.addtoPersonal('aaa')
>>> s.addtoPersonal('bbb')
>>> s.check('aaa')
1
>>> s.check('bbb')
1
# we exited without saving, so words 'aaa' and 'bbb' don't exist in the file
$ cat ~/.aspell.en.pws
personal_ws-1.1 en 1
aspell
$
addtoSession(word) adds a word to the session dictionary. The session dictionary is volatile: it is not saved to any file. It is destroyed together with the AspellSpeller object or when the method clearSession is called.
saveAllwords() saves all words from the personal dictionary.
clearSession() clears the session dictionary.
>>> import aspell
>>> s = aspell.Speller('lang', 'en')
>>> s.check('linux')
0
>>> s.addtoSession('linux')
>>> s.check('linux')
1
>>> s.clearSession()
>>> s.check('linux')
0
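The add-then-clear pattern from the session above fits a context manager, so that the temporary words are removed even if an exception occurs. This is a sketch only: session_words and FakeSpeller are hypothetical names, and note that clearSession() empties the whole session dictionary, including words added before the block.

```python
from contextlib import contextmanager


@contextmanager
def session_words(speller, words):
    """Accept `words` temporarily, then clear the session dictionary."""
    for word in words:
        speller.addtoSession(word)
    try:
        yield speller
    finally:
        # Runs on normal exit and on exceptions alike.
        speller.clearSession()


class FakeSpeller:
    """Stand-in with the three AspellSpeller methods used here."""

    def __init__(self):
        self.session = set()

    def addtoSession(self, word):
        self.session.add(word)

    def clearSession(self):
        self.session.clear()

    def check(self, word):
        return word in self.session


demo = FakeSpeller()
with session_words(demo, ['linux']) as s:
    inside = s.check('linux')
print(inside, demo.check('linux'))
```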
getPersonalwordlist() returns the list of words from the personal dictionary.
getSessionwordlist() returns the list of words from the session dictionary.
>>> s.addtoSession('aaa')
>>> s.addtoSession('bbb')
>>> s.getSessionwordlist()
['aaa', 'bbb']
>>> s.clearSession()
>>> s.getSessionwordlist()
[]
>>>
getMainwordlist() returns the list of words from the main dictionary.
All versions of aspell I've tested have the same error: calling the method getMainwordlist produces SIGKILL. It is an aspell problem; if you really need a full list of words, use the external program word-list-compress.
method | aspell 0.50.5 | aspell 0.60.2 | aspell 0.60.3 |
ConfigKeys | ok | ok | ok |
Speller | ok | ok | ok |
check | ok | ok | ok |
suggest | ok | ok | ok |
addReplacement | ok | ok | ok |
addtoPersonal | ok | ok | ok |
saveAllwords | ok | ok | ok |
addtoSession | ok | ok | ok |
clearSession | ok | AspellSpellerError | ok |
getPersonalwordlist | ok | SIGKILL | ok |
getSessionwordlist | ok | SIGKILL | ok |
getMainwordlist | SIGKILL | SIGKILL | SIGKILL |
Aspell uses an 8-bit encoding. The encoding depends on the dictionary setting and is stored in the key encoding. One can obtain this key using the speller's ConfigKeys.
If your application uses a different encoding than aspell, translation is needed. Here is a sample session (a Polish dictionary is used).
>>> import aspell
>>> s=aspell.Speller('lang', 'pl')
>>>
>>> s.ConfigKeys()['encoding']
('string', u'iso-8859-2', 'encoding to expect data to be in')
>>> enc = s.ConfigKeys()['encoding'][1]
>>> enc # dictionary encoding
'iso-8859-2'
>>> word # encoding of word is utf8
# 'gżegżółka' means 'cuckoo' in some Polish dialects
'g\xc5\xbceg\xc5\xbc\xc3\xb3\xc5\x82ka'
>>> s.check(word)
0
>>> s.check( unicode(word, 'utf-8').encode(enc) )
1
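The session above uses Python 2 syntax. The conversion step itself can be sketched in Python 3 as follows. Both helper names are hypothetical, and what check() accepts under Python 3 is not covered here, so only the transcoding is shown. A word may contain characters that the target 8-bit encoding cannot represent; in that case the helper returns None instead of raising.

```python
def to_dictionary_bytes(text, encoding):
    """Encode `text` in the dictionary encoding; None if it cannot be encoded."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


def recode_utf8(raw, encoding):
    """Transcode UTF-8 bytes into the dictionary encoding."""
    return to_dictionary_bytes(raw.decode('utf-8'), encoding)


word = 'g\u017ceg\u017c\u00f3\u0142ka'  # 'gżegżółka'
print(to_dictionary_bytes(word, 'iso-8859-2'))
print(to_dictionary_bytes(word, 'iso-8859-1'))  # not representable in Latin-1
```

Returning None keeps the caller in control of what to do with words that fall outside the dictionary's character set, for example treating them as misspelled.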