I recommend installing everything in a virtual environment. I usually use virtualenv,
but any other environment manager should work.
virtualenv -p python3.10 venv
Activate the environment:
source venv/bin/activate
and install the dependencies:
pip install -r requirements.txt
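If virtualenv is not installed, the standard library's `venv` module is one of the other environment managers that should work. The snippet below is a minimal sketch, not part of this repository: the helper name `create_env` is hypothetical, and it assumes the same `venv` directory name used in the commands above.

```python
import venv
from pathlib import Path


def create_env(env_dir: str = "venv", with_pip: bool = True) -> Path:
    """Create a virtual environment using the standard-library venv module.

    Hypothetical helper for illustration only.
    """
    env_path = Path(env_dir)
    # The new environment uses the interpreter that runs this code,
    # so run it with python3.10 to mirror the virtualenv command above.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling `create_env()` from a Python 3.10 interpreter should leave a `venv` directory that can be activated and populated with the same `source` and `pip install` commands.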
Now we need to download the content of the blogs. I provide a list of example feeds, but feel free to use your own. To download the content, run:
python scripts/download_content.py --feed-path feeds.txt
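If you use your own feed list, it may help to sanity-check the file before crawling. The sketch below assumes the file holds one feed URL per line, which is not stated here, and treats blank lines and lines starting with `#` as ignorable, which is also an assumption. The function `load_feeds` is hypothetical and is not part of `scripts/download_content.py`.

```python
from pathlib import Path
from urllib.parse import urlparse


def load_feeds(feed_path: str) -> list[str]:
    """Return the feed URLs in a text file, assuming one URL per line."""
    urls = []
    for line in Path(feed_path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        # Assumption: blank lines and '#' comments are not feeds.
        if not url or url.startswith("#"):
            continue
        parsed = urlparse(url)
        # Reject anything that is not an absolute http(s) URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a valid feed URL: {url!r}")
        urls.append(url)
    return urls
```

Running `load_feeds("feeds.txt")` before the download step would surface a malformed entry early instead of partway through the crawl.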
## Launch app
Finally, once the content is crawled and stored, you can run the app with:
python -m microsearch.app --data-path output.parquet