JMTEB: Japanese Massive Text Embedding Benchmark

JMTEB is a benchmark for evaluating Japanese text embedding models. It consists of 5 tasks.

This is an easy-to-use evaluation script designed for JMTEB evaluation.

Quick start

git clone [email protected]:sbintuitions/JMTEB
cd JMTEB
poetry install
poetry run pytest tests

The following command evaluate the specified model on the all the tasks in JMTEB.

poetry run python -m jmteb \
  --embedder SentenceBertEmbedder \
  --embedder.model_name_or_path "<model_name_or_path>" \
  --save_dir "output/<model_name_or_path>"

By default, the evaluation tasks are read from src/configs/jmteb.jsonnet. If you want to evaluate the model on a specific task, you can specify the task via --evaluators option with the task config.

poetry run python -m jmteb \
  --evaluators "src/configs/tasks/jsts.jsonnet" \
  --embedder SentenceBertEmbedder \
  --embedder.model_name_or_path "<model_name_or_path>" \
  --save_dir "output/<model_name_or_path>"

Note: Some tasks (e.g., AmazonReviewClassification in classification, JAQKET and Mr.TyDi-ja in retrieval) are time-consuming and memory-consuming. Heavy retrieval tasks take hours to encode the large corpus, and use much memory for the storage of such vectors. If you want to exclude them, add --eval_exclude "['amazon_review_classification', 'mrtydi', 'jaqket']".

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github		.github
docs		docs
src/jmteb		src/jmteb
tests		tests
.flake8		.flake8
.gitignore		.gitignore
.markdownlint.yaml		.markdownlint.yaml
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

JMTEB: Japanese Massive Text Embedding Benchmark

Quick start

About

Releases

Packages

Languages

License

ftnext/JMTEB

Folders and files

Latest commit

History

Repository files navigation

JMTEB: Japanese Massive Text Embedding Benchmark

Quick start

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages