Educe is a library for working with discourse-annotated corpora. It also includes utility scripts for building, maintaining, and querying these corpora. Currently supported corpora are:
- (SDRT) STAC corpus
- (RST) RST Discourse Treebank (experimental, 2014-07-14)
- (PDTB) Penn Discourse Treebank (experimental, 2014-07-14)
If you have a discourse-annotated corpus, or are trying to build one, you may find it useful to add support for it to educe.
First, check whether pip is available:
pip --help
If this doesn't work, download this setup script and run:
python distribute_setup.py
easy_install pip
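Once pip is installed, install educe and its dependencies from the directory containing requirements.txt (recent versions of pip no longer accept the --use-mirrors option; drop it if pip rejects it):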
pip install -r requirements.txt --use-mirrors .
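After installing, it can be handy to confirm that Python can actually locate the package. The following is a minimal sketch using only the standard library; it assumes the library is importable under the module name `educe`, and the helper name `is_installed` is made up for this example and is not part of educe.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the package installs a top-level module named "educe".
    name = "educe"
    status = "found" if is_installed(name) else "not found"
    print(f"{name}: {status}")
```

If the module is reported as not found, the most likely cause is that pip installed into a different Python environment than the one running the check.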
- Documentation
- Attelo: a discourse parser