TextSnake and Ecotron2018

Installation and usage

The scripts runs out of the box, no setup required.

`img_mean_and_std.py`

$ python3 data_mean_and_std.py 

Compute RGB channel-wise mean and std of given images.

Usage:
    img_mean_and_std.py IMG...

Output for first four time frames of T17, T19, T20, T21, T23 and T24:

Mean: 77.12583122033104 69.66186087747802 65.88590173118982
Std: 9.664289791969221 8.175145880877395 7.81001608502447

`rename_files.py`

$ python rename_files.py 

Replace occurrences of PATTERN with REPLACEMENT in each FILE's name.

Useful because TextSnake annotations generated by [1] are named differently
than the original minirhizotron images, but partition.py relies on matching file
names to link them together.

[1] https://gitlab.informatik.uni-halle.de/moeller/minirhizotron_annotation

Usage:
    rename_files.py PATTERN REPLACEMENT FILE...

`partition.py`

$ python partition.py --help
usage: partition.py [-h] --images DIR --roots DIR --centerlines DIR --radii DIR --sin DIR --cos DIR --crop-width INT --val-split
                    INT --test-split INT [-y] [--dry-run] [--vis] [-r INT]

Partition the Ecotron-EInsect-2018 dataset into training, validation, and testing sets.

optional arguments:
  -h, --help            show this help message and exit
  -y, --yes             Assume answer "yes" for all questions
  --dry-run             Only simulate the process, don't actually touch anything
  --vis                 Create visualization of selection in ./vis (requires matplotlib)
  -r INT, --random-seed INT
                        Seed for the random number generator

Data input:
  --images DIR          directory with images
  --roots DIR           directory with root masks for given images
  --centerlines DIR     directory with center line masks for given images
  --radii DIR           directory with radii maps for given images
  --sin DIR             directory with sine maps for given images
  --cos DIR             directory with cosine maps for given images
  --crop-width INT      crop width

Split control:
  --val-split INT       percentage of data going into validation set
  --test-split INT      percentage of data going into test set

Partition the Ecotron-EInsect-2018 dataset into training, validation, and
testing sets.

The set contains only a few minirhizotron images. This script also expects for
each image five feature maps and masks defining the input for the TextSnake
neural net. The maps and masks can be generated with
https://gitlab.informatik.uni-halle.de/moeller/minirhizotron_annotation.

Because the minirhizotron images are very big and wide (about 5000x700 pixels),
they (and of course all feature maps and masks) are cropped into smaller parts
before they are fed into the neural net.

This script partitions these crops into training, validation, and testing set.
But this is tricky because usually these crops overlap in both x and y
direction. We make sure that no data goes into more than one set. Because of
the overlaps, there is no other possibility than simply ignoring some of the
crops, i.e. do not put them in any set. The sets are moved into
subdirectories "training", "test", and "validation".

We want parts of every image in all splits. The idea is to divide an image into
three not-overlapping patches, one for each subset. The size of the validation
and test patch is configurable via the --val-split and --test-split arguments,
which makes the size of the training patch 1 minus the sum of these arguments.
Crops at the patch boundaries belonging to two patches are excluded.

This script is designed to work with the output of dl_cropImages.sh. However, it
should work with every input files as long as the file names match this format:
    <original image file name>-<arbitrary string>+<position x>+<position y>.<arbitrary file type>
where <original image file name> is used as hint for which crops belong to the
same image. <position x> and <position y> denote the crop's upper left corner in
its original image's coordinate system. Precisely, the names must match this
regular expression:
    .*-.*\+[0-9]+\+[0-9]+\.[a-zA-Z0-9_]+$
An example:
    EInsect_T017_CTS_09.05.18_000000_1_HMC-roots-00510x00510+00000+00000.tif
However, it can be easily adapted to other file names by changing the
VALID_FILE_NAME variable and get_metadata_from_filename function.

Name		Name	Last commit message	Last commit date
Latest commit Cannot retrieve latest commit at this time. History 13 Commits
.gitignore		.gitignore
README.md		README.md
data_mean_and_std.py		data_mean_and_std.py
partition.py		partition.py
rename_files.py		rename_files.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TextSnake and Ecotron2018

Installation and usage

`img_mean_and_std.py`

`rename_files.py`

`partition.py`

About

Releases

Packages

Languages

Limsande/eco2018-textsnake

Folders and files

Latest commit

History

Repository files navigation

TextSnake and Ecotron2018

Installation and usage

img_mean_and_std.py

rename_files.py

partition.py

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

`img_mean_and_std.py`

`rename_files.py`

`partition.py`

Packages