This repository contains the code to convert English digital documents (PDF) into Hindi.
It consists of the two components described below:
We have used OpenNMT to train and serve the model. Follow the instructions below to start this component:
- Installing Dependencies
cd OpenNMT-py && pip install -r requirements.txt --no-cache-dir
- Model Download:
Download the model from [here] and copy it into the ./OpenNMT-py/available_models folder
- Start Server (will start a server on the default port 5000)
python server.py
Below is a sample curl request to test the results:
curl --header "Content-Type: application/json" --request POST --data '[{"id":100,"src":"You should refrain from doing this."}]' http://localhost:5000/translator/translate
Tools such as Postman can also be used to test the API.
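The same request as the curl example above can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes the server is running locally on port 5000 and that the `id` and `src` fields keep the meaning they have in the curl example; the helper names `build_payload` and `translate` are hypothetical and not part of this repository.

```python
import json
import urllib.request

# Assumption: endpoint and id value are copied from the curl example above.
TRANSLATE_URL = "http://localhost:5000/translator/translate"


def build_payload(sentences, model_id=100):
    """Build the JSON body: one {"id", "src"} object per input sentence."""
    return [{"id": model_id, "src": text} for text in sentences]


def translate(sentences, url=TRANSLATE_URL, model_id=100, timeout=60):
    """POST the sentences to the server and return the decoded JSON response."""
    body = json.dumps(build_payload(sentences, model_id)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server started, calling `translate(["You should refrain from doing this."])` should return the parsed JSON response. Its exact structure depends on the server, so inspect it before relying on particular fields.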
- Installing Dependencies
pip install -r requirements.txt
- Start Server (will start a server on port 5001)
export PYTHONPATH=$PWD && python src/app.py
Note: PDFs with many pages may take a while to process before the API returns the results. On successful processing, a text file containing the converted Hindi text will be generated.