forked from oaqa/FlexNeuART

Flexible classic and NeurAl Retrieval Toolkit

License

Apache-2.0 (LICENSE); unknown license (LICENSE.RankLib)

alpers/FlexNeuART

 
 


FlexNeuART (flex-noo-art)

Flexible classic and NeurAl Retrieval Toolkit, or FlexNeuART for short (intended pronunciation: flex-noo-art), is a substantially reworked version of the knn4qa package. An overview can be found in our EMNLP OSS workshop paper: Leonid Boytsov and Eric Nyberg. "Flexible retrieval with NMSLIB and FlexNeuART." 2020.

From August to December 2020, we used this framework to generate the best traditional and/or neural runs in the MS MARCO Document ranking task. In fact, our best traditional (non-neural) run slightly outperformed a couple of neural submissions. The code for the best-performing neural model will be published within 2-3 months. This model is described in our ECIR 2021 paper: Boytsov, Leonid, and Zico Kolter. "Exploring Classic and Neural Lexical Translation Models for Information Retrieval: Interpretability, Effectiveness, and Efficiency Benefits." ECIR 2021.

FlexNeuART is under active development. A more detailed description and documentation are forthcoming. Currently we have:

The framework supports data in generic JSONL format. We provide conversion (and in some cases download) scripts for the following collections:

  • MS MARCO data (documents and passages)
  • Yahoo Answers collections
  • Cranfield
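As a rough illustration of what "generic JSONL" means here (one JSON object per line), the sketch below writes and reads such data in Python. The field names `doc_id` and `text` and the helper functions are hypothetical, chosen only for this example; they are not taken from FlexNeuART and may not match the schema its conversion scripts produce.

```python
import io
import json


def write_jsonl(entries, out_file):
    """Write each entry as one JSON object per line."""
    for entry in entries:
        out_file.write(json.dumps(entry) + "\n")


def read_jsonl(in_file):
    """Yield one parsed JSON object per non-empty line."""
    for line in in_file:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical entries: the field names are assumptions made for this sketch,
# not the actual FlexNeuART schema.
docs = [
    {"doc_id": "d1", "text": "flexible retrieval toolkit"},
    {"doc_id": "d2", "text": "neural ranking models"},
]

# An in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
write_jsonl(docs, buffer)
buffer.seek(0)
round_trip = list(read_jsonl(buffer))
```

Because every line is an independent JSON object, a large collection can be streamed entry by entry without loading the whole file into memory.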

For neural network training, FlexNeuART incorporates a reworked variant of CEDR (MacAvaney et al., 2019).

