Commit

Rename Faiss indexes to be consistent with current naming scheme (cas…
lintool authored Apr 17, 2023
1 parent 59c1a8a commit b841b53
Showing 20 changed files with 272 additions and 3,298 deletions.
12 changes: 6 additions & 6 deletions docs/experiments-ance.md
@@ -14,7 +14,7 @@ Thus, while the scoring script provides results to much higher precision, we hav

```bash
python -m pyserini.search.faiss \
-  --index msmarco-passage-ance-bf \
+  --index msmarco-v1-passage.ance \
--topics msmarco-passage-dev-subset \
--encoded-queries ance-msmarco-passage-dev-subset \
--output runs/run.msmarco-passage.ance.bf.tsv \
@@ -58,7 +58,7 @@ recall_1000 all 0.9584

```bash
python -m pyserini.search.faiss \
-  --index msmarco-doc-ance-maxp-bf \
+  --index msmarco-v1-doc.ance-maxp \
--topics msmarco-doc-dev \
--encoded-queries ance_maxp-msmarco-doc-dev \
--output runs/run.msmarco-doc.passage.ance-maxp.txt \
@@ -103,7 +103,7 @@ recall_100 all 0.9033

```bash
python -m pyserini.search.faiss \
-  --index wikipedia-ance-multi-bf \
+  --index wikipedia-dpr-100w.ance-multi \
--topics dpr-nq-test \
--encoded-queries ance_multi-nq-test \
--output runs/run.ance.nq-test.multi.bf.trec \
@@ -117,7 +117,7 @@ To evaluate, first convert the TREC output format to DPR's `json` format:
```bash
$ python -m pyserini.eval.convert_trec_run_to_dpr_retrieval_run \
--topics dpr-nq-test \
-  --index wikipedia-dpr \
+  --index wikipedia-dpr-100w \
--input runs/run.ance.nq-test.multi.bf.trec \
--output runs/run.ance.nq-test.multi.bf.json

@@ -135,7 +135,7 @@ Top100 accuracy: 0.8787

```bash
python -m pyserini.search.faiss \
-  --index wikipedia-ance-multi-bf \
+  --index wikipedia-dpr-100w.ance-multi \
--topics dpr-trivia-test \
--encoded-queries ance_multi-trivia-test \
--output runs/run.ance.trivia-test.multi.bf.trec \
@@ -149,7 +149,7 @@ To evaluate, first convert the TREC output format to DPR's `json` format:
```bash
$ python -m pyserini.eval.convert_trec_run_to_dpr_retrieval_run \
--topics dpr-trivia-test \
-  --index wikipedia-dpr \
+  --index wikipedia-dpr-100w \
--input runs/run.ance.trivia-test.multi.bf.trec \
--output runs/run.ance.trivia-test.multi.bf.json

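The `Top100 accuracy` figures that the DPR evaluation reports in this file are the fraction of questions for which at least one of the top-k retrieved passages contains a gold answer string. A minimal sketch of that metric (the function name and input layout are illustrative, not the DPR script's API):

```python
def top_k_accuracy(hit_lists, k=100):
    """hit_lists: one list per question; entry i is True if the passage
    ranked at position i contains a gold answer string."""
    answered = sum(1 for hits in hit_lists if any(hits[:k]))
    return answered / len(hit_lists)
```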
4 changes: 2 additions & 2 deletions docs/experiments-bpr.md
@@ -24,7 +24,7 @@ BPR with brute-force index:

```bash
python -m pyserini.search.faiss \
-  --index wikipedia-bpr-single-nq-hash \
+  --index wikipedia-dpr-100w.bpr-single-nq \
--topics dpr-nq-test \
--encoded-queries bpr_single_nq-nq-test \
--output runs/run.bpr.rerank.nq-test.nq.hash.trec \
@@ -39,7 +39,7 @@ To evaluate, first convert the TREC output format to DPR's `json` format:

```bash
$ python -m pyserini.eval.convert_trec_run_to_dpr_retrieval_run \
-  --index wikipedia-dpr \
+  --index wikipedia-dpr-100w \
--topics dpr-nq-test \
--input runs/run.bpr.rerank.nq-test.nq.hash.trec \
--output runs/run.bpr.rerank.nq-test.nq.hash.json
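The `convert_trec_run_to_dpr_retrieval_run` step used in these files takes an `--index` argument because it must fetch passage texts to check them against the gold answers. A rough sketch of building one entry of the resulting DPR-style `json` (field names follow DPR's retrieval-results format as best recalled here; the builder function itself is hypothetical):

```python
def to_dpr_json_entry(question, answers, ranked):
    """Build one retrieval-run entry for a question.
    ranked: (docid, score, passage_text) tuples, best first."""
    return {
        "question": question,
        "answers": answers,
        "contexts": [
            {
                "docid": docid,
                "score": score,
                # A passage counts as a hit if any gold answer string
                # occurs in its text.
                "has_answer": any(ans in text for ans in answers),
            }
            for docid, score, text in ranked
        ],
    }
```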
12 changes: 6 additions & 6 deletions docs/experiments-distilbert_kd.md
@@ -12,10 +12,10 @@ Dense retrieval, with brute-force index:

```bash
python -m pyserini.search.faiss \
-  --index msmarco-passage-distilbert-dot-margin_mse-T2-bf \
+  --index msmarco-v1-passage.distilbert-dot-margin-mse-t2 \
--topics msmarco-passage-dev-subset \
--encoded-queries distilbert_kd-msmarco-passage-dev-subset \
-  --output runs/run.msmarco-passage.distilbert-dot-margin_mse-T2.bf.tsv \
+  --output runs/run.msmarco-passage.distilbert-dot-margin_mse-t2.bf.tsv \
--output-format msmarco \
--batch-size 36 --threads 12
```
@@ -26,7 +26,7 @@ To evaluate:

```bash
$ python -m pyserini.eval.msmarco_passage_eval msmarco-passage-dev-subset \
-    runs/run.msmarco-passage.distilbert-dot-margin_mse-T2.bf.tsv
+    runs/run.msmarco-passage.distilbert-dot-margin_mse-t2.bf.tsv

#####################
MRR @10: 0.3250
@@ -39,11 +39,11 @@ For that we first need to convert runs and qrels files to the TREC format:

```bash
$ python -m pyserini.eval.convert_msmarco_run_to_trec_run \
-  --input runs/run.msmarco-passage.distilbert-dot-margin_mse-T2.bf.tsv \
-  --output runs/run.msmarco-passage.distilbert-dot-margin_mse-T2.bf.trec
+  --input runs/run.msmarco-passage.distilbert-dot-margin_mse-t2.bf.tsv \
+  --output runs/run.msmarco-passage.distilbert-dot-margin_mse-t2.bf.trec

$ python -m pyserini.eval.trec_eval -c -mrecall.1000 -mmap msmarco-passage-dev-subset \
-    runs/run.msmarco-passage.distilbert-dot-margin_mse-T2.bf.trec
+    runs/run.msmarco-passage.distilbert-dot-margin_mse-t2.bf.trec

map all 0.3308
recall_1000 all 0.9553
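The `convert_msmarco_run_to_trec_run` step in this file rewrites MS MARCO's tab-separated `qid docid rank` run lines into the six-column TREC run format (`qid Q0 docid rank score tag`). A sketch of that transformation; the score convention is an assumption here, since MS MARCO runs carry no scores and only the column layout is fixed by TREC:

```python
def msmarco_to_trec(lines, tag="pyserini"):
    """Rewrite MS MARCO 'qid<TAB>docid<TAB>rank' run lines into the
    six-column TREC run format: qid Q0 docid rank score tag."""
    out = []
    for line in lines:
        qid, docid, rank = line.strip().split("\t")
        # Synthesize a score that decreases with rank (one common choice).
        score = 1.0 / int(rank)
        out.append(f"{qid} Q0 {docid} {rank} {score:.6f} {tag}")
    return out
```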
2 changes: 1 addition & 1 deletion docs/experiments-distilbert_tasb.md
@@ -12,7 +12,7 @@ Dense retrieval, with brute-force index:

```bash
python -m pyserini.search.faiss \
-  --index msmarco-passage-distilbert-dot-tas_b-b256-bf \
+  --index msmarco-v1-passage.distilbert-dot-tas_b-b256 \
--topics msmarco-passage-dev-subset \
--encoded-queries distilbert_tas_b-msmarco-passage-dev-subset \
--output runs/run.msmarco-passage.distilbert-dot-tas_b-b256.bf.tsv \
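Taken together, the visible renames move the prebuilt index names to a consistent `collection.model` scheme. For scripts that still reference the old names, a small lookup helper; the table is illustrative, not part of Pyserini, and covers only the mappings visible in this diff (the full commit touches 20 files):

```python
# Old -> new prebuilt-index names appearing in this diff (commit b841b53).
INDEX_RENAMES = {
    "msmarco-passage-ance-bf": "msmarco-v1-passage.ance",
    "msmarco-doc-ance-maxp-bf": "msmarco-v1-doc.ance-maxp",
    "wikipedia-ance-multi-bf": "wikipedia-dpr-100w.ance-multi",
    "wikipedia-dpr": "wikipedia-dpr-100w",
    "wikipedia-bpr-single-nq-hash": "wikipedia-dpr-100w.bpr-single-nq",
    "msmarco-passage-distilbert-dot-margin_mse-T2-bf":
        "msmarco-v1-passage.distilbert-dot-margin-mse-t2",
    "msmarco-passage-distilbert-dot-tas_b-b256-bf":
        "msmarco-v1-passage.distilbert-dot-tas_b-b256",
}

def migrate_index_name(name: str) -> str:
    """Map a pre-rename index name to its new form; unknown names pass through."""
    return INDEX_RENAMES.get(name, name)
```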
