Skip to content

Commit

Permalink
Renamed topics and qrels for consistency; updated documenation (casto…
Browse files Browse the repository at this point in the history
  • Loading branch information
lintool authored Nov 1, 2019
1 parent a61885d commit a76301a
Show file tree
Hide file tree
Showing 74 changed files with 1,058 additions and 624 deletions.
43 changes: 28 additions & 15 deletions docs/regressions-car17v1.5.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,17 @@
# Anserini: Regressions for [CAR17](http://trec-car.cs.unh.edu/) (v1.5)

This page documents regression experiments for the [TREC 2017 Complex Answer Retrieval (CAR)](http://trec-car.cs.unh.edu/) section-level passage retrieval task (v1.5).
The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/car17v1.5.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/car17v1.5.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.

## Indexing

Typical indexing command:

```
nohup sh target/appassembler/bin/IndexCollection -collection CarCollection \
-generator LuceneDocumentGenerator -threads 1 -input /path/to/car17v1.5 -index \
lucene-index.car17v1.5.pos+docvectors+rawdocs -storePositions -storeDocvectors \
-storeRawDocs >& log.car17v1.5.pos+docvectors+rawdocs &
nohup sh target/appassembler/bin/IndexCollection -collection CarCollection -input /path/to/car17v1.5 \
-index lucene-index.car17v1.5.pos+docvectors+rawdocs -generator LuceneDocumentGenerator -threads 1 \
-storePositions -storeDocvectors -storeRawDocs >& log.car17v1.5.pos+docvectors+rawdocs &
```

The directory `/path/to/car17v1.5` should be the root directory of Complex Answer Retrieval (CAR) paragraph corpus (v1.5), which can be downloaded [here](http://trec-car.cs.unh.edu/datareleases/).
Expand All @@ -19,27 +20,39 @@ For additional details, see explanation of [common indexing options](common-inde

## Retrieval

The "benchmarkY1-test" topics and qrels (v1.5) are stored in `src/main/resources/topics-and-qrels/`, downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):
The "benchmarkY1-test" topics and qrels (v1.5) are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/), downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):

+ `topics.car17v1.5.benchmarkY1test.txt`
+ `qrels.car17v1.5.benchmarkY1test.txt`
+ [`topics.car17v1.5.benchmarkY1test.txt`](../src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt)
+ [`qrels.car17v1.5.benchmarkY1test.txt`](../src/main/resources/topics-and-qrels/qrels.car17v1.5.benchmarkY1test.txt)

Specifically, this is the section-level passage retrieval task with automatic ground truth.

After indexing has completed, you should be able to perform retrieval as follows:

```
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.bm25.topics.car17v1.5.benchmarkY1test.txt -bm25 &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
-bm25 -output run.car17v1.5.bm25.topics.car17v1.5.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.bm25+rm3.topics.car17v1.5.benchmarkY1test.txt -bm25 -rm3 &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
-bm25 -rm3 -output run.car17v1.5.bm25+rm3.topics.car17v1.5.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.bm25+ax.topics.car17v1.5.benchmarkY1test.txt -bm25 -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
-bm25 -axiom -rerankCutoff 20 -axiom.deterministic -output run.car17v1.5.bm25+ax.topics.car17v1.5.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.ql.topics.car17v1.5.benchmarkY1test.txt -ql &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
-ql -output run.car17v1.5.ql.topics.car17v1.5.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.ql+rm3.topics.car17v1.5.benchmarkY1test.txt -ql -rm3 &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
-ql -rm3 -output run.car17v1.5.ql+rm3.topics.car17v1.5.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.ql+ax.topics.car17v1.5.benchmarkY1test.txt -ql -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
-ql -axiom -rerankCutoff 20 -axiom.deterministic -output run.car17v1.5.ql+ax.topics.car17v1.5.benchmarkY1test.txt &
```

Expand All @@ -66,11 +79,11 @@ With the above commands, you should be able to replicate the following results:

MAP | BM25 | +RM3 | +Ax | QL | +RM3 | +Ax |
:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
[TREC 2017 CAR: benchmarkY1test (v1.5)](http://trec-car.cs.unh.edu/datareleases/)| 0.1562 | 0.1295 | 0.1358 | 0.1386 | 0.1080 | 0.1048 |
[TREC 2017 CAR: benchmarkY1test (v1.5)](../src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt/)| 0.1562 | 0.1295 | 0.1358 | 0.1386 | 0.1080 | 0.1048 |


RECIP_RANK | BM25 | +RM3 | +Ax | QL | +RM3 | +Ax |
:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
[TREC 2017 CAR: benchmarkY1test (v1.5)](http://trec-car.cs.unh.edu/datareleases/)| 0.2331 | 0.1923 | 0.1949 | 0.2037 | 0.1599 | 0.1524 |
[TREC 2017 CAR: benchmarkY1test (v1.5)](../src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt/)| 0.2331 | 0.1923 | 0.1949 | 0.2037 | 0.1599 | 0.1524 |


46 changes: 29 additions & 17 deletions docs/regressions-car17v2.0-doc2query.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,17 +7,17 @@ This page documents regression experiments for the [TREC 2017 Complex Answer Ret
These experiments are integrated into Anserini's regression testing framework.
For more complete instructions on how to run end-to-end experiments, refer to [this page](experiments-doc2query.md).

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/car17v2.0-doc2query.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/car17v2.0-doc2query.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.

## Indexing

Typical indexing command:

```
nohup sh target/appassembler/bin/IndexCollection -collection JsonCollection \
-generator LuceneDocumentGenerator -threads 30 -input \
/path/to/car17v2.0-doc2query -index \
lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -storePositions \
-storeDocvectors -storeRawDocs >& log.car17v2.0-doc2query.pos+docvectors+rawdocs \
&
nohup sh target/appassembler/bin/IndexCollection -collection JsonCollection -input /path/to/car17v2.0-doc2query \
-index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -generator LuceneDocumentGenerator -threads 30 \
-storePositions -storeDocvectors -storeRawDocs >& log.car17v2.0-doc2query.pos+docvectors+rawdocs &
```

The directory `/path/to/car17v2.0-doc2query` should be the root directory of Complex Answer Retrieval (CAR) paragraph corpus (v2.0) that has been augmented with the Doc2query expansions, i.e., `collection_jsonl_expanded_topk10/` as described in [this page](experiments-doc2query.md).
Expand All @@ -26,27 +26,39 @@ For additional details, see explanation of [common indexing options](common-inde

## Retrieval

The "benchmarkY1-test" topics and qrels (v2.0) are stored in `src/main/resources/topics-and-qrels/`, downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):
The "benchmarkY1-test" topics and qrels (v2.0) are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/), downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):

+ `topics.car17v2.0.benchmarkY1test.txt`
+ `qrels.car17v2.0.benchmarkY1test.txt`
+ [`topics.car17v2.0.benchmarkY1test.txt`](../src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt)
+ [`qrels.car17v2.0.benchmarkY1test.txt`](../src/main/resources/topics-and-qrels/qrels.car17v2.0.benchmarkY1test.txt)

Specifically, this is the section-level passage retrieval task with automatic ground truth.

After indexing has completed, you should be able to perform retrieval as follows:

```
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0-doc2query.bm25.topics.car17v2.0.benchmarkY1test.txt -bm25 &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
-bm25 -output run.car17v2.0-doc2query.bm25.topics.car17v2.0.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0-doc2query.bm25+rm3.topics.car17v2.0.benchmarkY1test.txt -bm25 -rm3 &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
-bm25 -rm3 -output run.car17v2.0-doc2query.bm25+rm3.topics.car17v2.0.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0-doc2query.bm25+ax.topics.car17v2.0.benchmarkY1test.txt -bm25 -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
-bm25 -axiom -rerankCutoff 20 -axiom.deterministic -output run.car17v2.0-doc2query.bm25+ax.topics.car17v2.0.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0-doc2query.ql.topics.car17v2.0.benchmarkY1test.txt -ql &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
-ql -output run.car17v2.0-doc2query.ql.topics.car17v2.0.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0-doc2query.ql+rm3.topics.car17v2.0.benchmarkY1test.txt -ql -rm3 &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
-ql -rm3 -output run.car17v2.0-doc2query.ql+rm3.topics.car17v2.0.benchmarkY1test.txt &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0-doc2query.ql+ax.topics.car17v2.0.benchmarkY1test.txt -ql -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
-topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
-ql -axiom -rerankCutoff 20 -axiom.deterministic -output run.car17v2.0-doc2query.ql+ax.topics.car17v2.0.benchmarkY1test.txt &
```

Expand All @@ -73,11 +85,11 @@ With the above commands, you should be able to replicate the following results:

MAP | BM25 | +RM3 | +Ax | QL | +RM3 | +Ax |
:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
[TREC 2017 CAR: benchmarkY1test (v2.0)](http://trec-car.cs.unh.edu/datareleases/)| 0.1807 | 0.1521 | 0.1470 | 0.1752 | 0.1453 | 0.1339 |
[TREC 2017 CAR: benchmarkY1test (v2.0)](../src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt)| 0.1807 | 0.1521 | 0.1470 | 0.1752 | 0.1453 | 0.1339 |


RECIP_RANK | BM25 | +RM3 | +Ax | QL | +RM3 | +Ax |
:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
[TREC 2017 CAR: benchmarkY1test (v2.0)](http://trec-car.cs.unh.edu/datareleases/)| 0.2750 | 0.2275 | 0.2186 | 0.2653 | 0.2156 | 0.1981 |
[TREC 2017 CAR: benchmarkY1test (v2.0)](../src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt)| 0.2750 | 0.2275 | 0.2186 | 0.2653 | 0.2156 | 0.1981 |


Loading

0 comments on commit a76301a

Please sign in to comment.