Team project for CS6322: ACL citation analysis
- ./bootstrap #initializes the Python environment for your Unix machine. The environment is fully self-contained in the .pyenv folder, thanks to virtualenv
- ./index/manage runserver #launches the web server. It prints the IP:PORT to point your web browser at
- cd crawl
- ./crawl.sh #this crawls the aclweb site
- make -j <number of processors to use> -k #-k keeps going past errors; the pre-processor isn't perfect
- cd ../index
- ./pre-index #this took over 40 hours to load the SQL database. There's a memory leak, so the process needs to be restarted multiple times
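
Because the pre-index step has to be restarted by hand several times, a small retry wrapper may save some babysitting. The sketch below is not part of this repo: `retry_until_success` is a hypothetical helper, and it assumes that `./pre-index` exits with a non-zero status when it dies and that re-running it picks up where it left off. Neither assumption is confirmed here, so check both before relying on it.

```shell
#!/bin/sh
# retry_until_success MAX_ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper (not part of this repo): re-runs COMMAND until it
# exits with status 0 or MAX_ATTEMPTS runs have been made.
retry_until_success() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Progress goes to stderr so the command's stdout stays clean.
        echo "attempt $attempt of $max_attempts: $*" >&2
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "giving up after $max_attempts attempts" >&2
    return 1
}

# Assumed usage, run from the index directory (the limit of 20 is arbitrary):
# retry_until_success 20 ./pre-index
```

The attempt limit keeps a genuinely broken run from looping forever; raise or lower it to taste.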