Contains automated scrapers that collect data in the form of triples, i.e. `[head, relation, tail]` tuples.
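
For illustration, a triple in this format might look like the following sketch (the entity and relation names here are hypothetical examples, not values produced by the scrapers):

```python
# A triple is a [head, relation, tail] entry describing one fact.
# These entity/relation names are made-up examples for illustration.
triple = ["Mug", "isOnTopOf", "CounterTop"]
head, relation, tail = triple
```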
- Clone the repo.
- Change to the top-level directory of the repo.
- Run `./setup_repo.sh` to install dependencies and create the needed virtual environments. This only needs to be run once, when you first set up the repo.
- Run `./setup_env.sh` to set the needed environment variables and source the virtual environment. Run this once each time you begin working with code in the repo.
- Run `python thor/thor_scraper.py`. Triples extracted from the simulated rooms will be saved as pickle files inside the `triple-scrapers/thor/rooms/` folder (a loading sketch follows this list).
- To scrape the triples into text files instead of pickles, see the comment in the `main` function of `thor_scraper.py` inside the `thor` folder.
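
As a quick check of the scraper output, the sketch below reads the triples back from the pickle files. The file extension (`.pkl`) and the structure of the pickled object (assumed here to be a list of `[head, relation, tail]` entries) are assumptions, so adjust to match what the scraper actually writes.

```python
import glob
import pickle

# Load every pickle produced by thor_scraper.py (run from the repo's top-level directory).
# Assumes each file holds a list of [head, relation, tail] entries;
# adjust the unpacking if the scraper stores a different structure.
for path in glob.glob("thor/rooms/*.pkl"):
    with open(path, "rb") as f:
        triples = pickle.load(f)
    print(f"{path}: {len(triples)} triples")
    for head, relation, tail in triples[:5]:  # preview the first few
        print(f"  ({head}, {relation}, {tail})")
```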