documentation for text recognition
Bartzi committed Jul 25, 2017
1 parent 0b87084 commit a58d14d
Showing 2 changed files with 15 additions and 3 deletions.
17 changes: 15 additions & 2 deletions README.md
@@ -19,8 +19,13 @@ In order to use the code you will need the following software environment:
3. install all requirements with `pip install -r requirements.txt`
4. clone and install `warp-ctc` from [here](https://github.com/baidu-research/warp-ctc.git)
5. go into the folder `mxnet/metrics/ctc` and run `python setup.py install`
6. copy the resulting `.so` file from `mxnet/metrics/ctc/build` to `mxnet/metrics/ctc`
7. You should be ready to go!
6. clone the [mxnet repository](https://github.com/dmlc/mxnet.git)
7. copy the resulting `.so` file from `mxnet/metrics/ctc/build` to `mxnet/metrics/ctc`
8. checkout the tag `v0.9.3`
9. add the `warpctc` plugin to the project by enabling it in the file `config.mk`
10. compile mxnet
11. install the python bindings of mxnet
12. You should be ready to go! (A condensed shell sketch of these steps follows below.)
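
The following is a hypothetical shell sketch of steps 6-11; the clone location, the exact `config.mk` lines, and the number of build jobs are assumptions you will need to adapt to your setup:

```bash
# Hypothetical sketch of steps 6-11 (paths and build settings are placeholders).
git clone --recursive https://github.com/dmlc/mxnet.git
cd mxnet
git checkout v0.9.3
git submodule update --init --recursive   # make sure the submodules match the tag
cp make/config.mk .
# enable the warpctc plugin by uncommenting/adding these lines in config.mk
# (the exact lines may differ slightly in your mxnet version):
#   WARPCTC_PATH = $(HOME)/warp-ctc
#   MXNET_PLUGINS += plugin/warpctc/warpctc.mk
make -j4
cd python
pip install -e .                          # install the python bindings
```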

# Training

@@ -57,8 +62,16 @@ If you want to follow our experiments with svhn numbers placed in a regular grid
5. start the training using the following command: `python train_svhn.py <path to train.csv> <path to valid.csv> --gpus <gpu id you want to use> --log-dir <where to save the logs> -b <batch size you want to use> --lr 1e-5` (see the example invocation below)
6. If you are lucky it will work ;)
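
A hypothetical example invocation with concrete values filled in (all paths, the GPU id, and the batch size are placeholders you will need to adjust) could look like this:

```bash
python train_svhn.py /data/svhn/train.csv /data/svhn/valid.csv \
    --gpus 0 --log-dir logs/svhn_grid -b 20 --lr 1e-5
```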

## Text Recognition

Reproducing our text recognition experiments might be a little difficult, because we cannot offer the entire dataset we used.
It is, however, possible to perform the experiments based on the Synth-90k dataset provided by Jaderberg et al. [here](https://www.robots.ox.ac.uk/~vgg/data/text/#sec-synth).
After downloading and extracting the dataset, you will need to convert the groundtruth file provided with it into the format used by our code. Our format is straightforward:
you create a `csv` file with tab-separated values, where the first column is the absolute path to the image and the rest of the line contains the labels corresponding to this image.
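
A hypothetical sketch of this conversion for Synth-90k follows; it assumes the annotation files (e.g. `annotation_train.txt`) list one relative image path per line with the transcription embedded between underscores in the file name, and it writes a single label column per image, which you may need to adjust to the label layout your setup expects:

```bash
# Hypothetical conversion of a Synth-90k annotation file to the tab-separated csv format.
# Assumes lines like "./2697/6/466_MONIKER_49537.jpg 49537", where the word is
# the middle part of the file name.
DATA_ROOT=/data/synth90k   # adjust to where you extracted the dataset
while read -r rel_path _; do
    word=$(basename "$rel_path" | cut -d_ -f2)            # extract the transcription
    printf '%s\t%s\n' "$DATA_ROOT/${rel_path#./}" "$word"
done < "$DATA_ROOT/annotation_train.txt" > train.csv
```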

To train the network you can use the `train_text_recognition.py` script, which is started in a similar manner to the `train_svhn.py` script.
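
A hypothetical example call (assuming the csv files created as described above and that the script accepts the same basic arguments as `train_svhn.py`):

```bash
python train_text_recognition.py /data/synth90k/train.csv /data/synth90k/valid.csv \
    --gpus 0 --log-dir logs/text_recognition -b 8 --lr 1e-5
```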

## FSNS



1 change: 0 additions & 1 deletion requirements.txt
@@ -1,4 +1,3 @@
Cython == 0.26
matplotlib == 2.0.2
mxnet-cu80 == 0.10.0.post2
Pillow == 4.2.1
