Also see the post here https://www.kaggle.com/c/data-science-bowl-2018/discussion/54742
Newbies' solution - as first-time Kagglers (LB = 0.545)
I'm @CPMP's teammate and colleague. Since most of us were first-time Kagglers, we implemented a very simple U-net, based on the U-net starter kernel by @Kjetil Åmdal-Sævik, that any DL newbies like us could follow.
Model
- Same as the U-net starter, except with two output channels, `masks` and `contours`, trained with the `binary cross entropy` loss (a sketch follows)
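As a rough illustration of the two-channel head, here is a minimal Keras sketch; the layer sizes are placeholders, not the starter kernel's actual architecture:

```python
from keras.layers import Conv2D, Input, MaxPooling2D, UpSampling2D, concatenate
from keras.models import Model

# Tiny stand-in for the starter U-net body, just to show the two-channel head.
inputs = Input((256, 256, 3))
c1 = Conv2D(16, (3, 3), activation='relu', padding='same')(inputs)
p1 = MaxPooling2D()(c1)
c2 = Conv2D(32, (3, 3), activation='relu', padding='same')(p1)
u1 = UpSampling2D()(c2)
d = Conv2D(16, (3, 3), activation='relu', padding='same')(concatenate([u1, c1]))

# Two sigmoid channels: channel 0 = nucleus mask, channel 1 = nucleus contour.
outputs = Conv2D(2, (1, 1), activation='sigmoid')(d)

model = Model(inputs, outputs)
# Per-pixel, per-channel binary cross entropy, as in the write-up.
model.compile(optimizer='adam', loss='binary_crossentropy')
```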
Pre-processing
- CLAHE (Contrast Limited Adaptive Histogram Equalization)
- L-channel only
- Inversion only when the background is lighter than the foreground
- Reference: https://www.kaggle.com/kmader/normalizing-brightfield-stained-and-fluorescence
- The goal is to make it easier for the U-net to learn features common across images (a sketch follows this list).
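A minimal OpenCV sketch of this pre-processing, assuming a `clipLimit` of 2.0, an 8x8 tile grid, and an Otsu-based heuristic for the inversion decision (none of these specifics are given in the post):

```python
import cv2

def preprocess(img_bgr):
    """CLAHE on the L channel, with inversion when the background is lighter."""
    lab = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2LAB)
    l_ch = lab[..., 0]
    l_ch = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l_ch)
    # Heuristic (our assumption): Otsu-split the L channel and treat the
    # larger class as background.
    _, otsu = cv2.threshold(l_ch, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    fg = otsu > 0
    if fg.mean() > 0.5:
        fg = ~fg
    # Invert only when the background is lighter than the foreground.
    if l_ch[~fg].mean() > l_ch[fg].mean():
        l_ch = 255 - l_ch
    return l_ch
```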
K-means clustering based on colours
- K-means clustering using `R, G, R-G deviation` features on the train/test images, to separate grey from colour images (see the sketch below). As @Heng suggested, if we train grey and colour images together, their common trait is the shape/boundary, so it'd be hard for the neural net to learn the features inside the nucleus. As a result, our U-net did fairly well on the colour images at the 2nd stage; okay, maybe not too well on @Allen's art :)
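A small sketch of how such a clustering might look with scikit-learn; the exact feature definitions and `n_clusters=2` (grey vs. colour) are our assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def colour_features(img):
    """Per-image features: mean R, mean G, and the deviation of R-G."""
    r = img[..., 0].astype(np.float32)
    g = img[..., 1].astype(np.float32)
    return [r.mean(), g.mean(), (r - g).std()]

# Stand-in for the real train/test images.
imgs = [np.random.randint(0, 256, (64, 64, 3), np.uint8) for _ in range(10)]
feats = np.array([colour_features(im) for im in imgs])
# Grey images have R == G everywhere, so their R-G deviation is near zero.
labels = KMeans(n_clusters=2, random_state=0).fit_predict(feats)
```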
Mosaic for training and predicting
- As suggested by @CPMP, we applied @Emil's mosaic idea (KNN on the 4 edges of each image to find matching neighbours) and merged matching images into mosaics to gain information at the edges (a rough sketch of the edge matching follows this list): https://www.kaggle.com/bonlime/train-test-image-mosaic
- Around `37%` of the grey test images and about `77%` of the colour test images could be formed into mosaics, and this ratio still held for the stage 2 test images. That means we gained more hidden traits than most teams.
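A rough sketch of the edge-matching idea, not @Emil's actual kernel; the fixed edge length and the use of raw border intensities are assumptions:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

EDGE_LEN = 64  # resample every border to a fixed length (assumed value)

def border(img, side):
    """Mean-over-channels border strip, resampled to EDGE_LEN values."""
    g = img.mean(axis=2)
    strip = {'top': g[0], 'bottom': g[-1], 'left': g[:, 0], 'right': g[:, -1]}[side]
    idx = np.linspace(0, len(strip) - 1, EDGE_LEN).astype(int)
    return strip[idx]

# Stand-in images for illustration.
imgs = [np.random.rand(64, 64, 3) for _ in range(8)]
rights = np.array([border(im, 'right') for im in imgs])
lefts = np.array([border(im, 'left') for im in imgs])

# For each image's right edge, the nearest left edge is a mosaic candidate;
# a distance threshold would be needed to reject non-neighbours.
nn = NearestNeighbors(n_neighbors=1).fit(lefts)
dist, match = nn.kneighbors(rights)
```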
Windowing images for training and predictions
- Thanks to the mosaics, we now have a lot more information at the edges, so we crop images to `256x256` using overlapping windows with a stride of `128`, because `256x256` yields the best mean IoU (a windowing sketch follows this list)
- Mirror the edges and the corners as suggested by the U-net paper. However, we have fewer edges than the people who don't use the mosaic approach
- Rotate and flip all windowed images; this augments the images by `8x`
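A minimal sketch of the overlapping windowing with reflect-padding at the borders (the exact padding scheme is our assumption; very small images would need extra handling):

```python
import numpy as np

WINDOW, STRIDE = 256, 128  # window size and stride from the write-up

def windows(img):
    """Overlapping 256x256 crops; the image is reflect-padded so the last
    window fits, mirroring the edges as in the U-net paper."""
    h, w = img.shape[:2]
    pad_h = (-(h - WINDOW) % STRIDE) if h > WINDOW else WINDOW - h
    pad_w = (-(w - WINDOW) % STRIDE) if w > WINDOW else WINDOW - w
    img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode='reflect')
    crops = []
    for y in range(0, img.shape[0] - WINDOW + 1, STRIDE):
        for x in range(0, img.shape[1] - WINDOW + 1, STRIDE):
            crops.append(img[y:y + WINDOW, x:x + WINDOW])
    return crops
```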
Data augmentation
- Perspective transform using the `imgaug` library (this approach was also used by the top 1 team); a sketch follows this list
- Additive Gaussian noise + speckle noise (also used by the top 1 team)
- Gaussian blur (no time to train)
- Image pyramid: upsampling and downsampling based on the average nuclei size per image (no time to train)
- Regional gamma correction (no time to train)
- Greyscale (no time to train)
- Inversion: it hurt our LB since it made the U-net difficult to train, so we discarded the idea, just like the top 1 team
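A sketch of the two augmentations we did train with, using `imgaug`; the sigma values follow the training commands further down, and the speckle implementation via `MultiplyElementwise` is an approximation, since the post doesn't say how speckle noise was generated:

```python
import numpy as np
from imgaug import augmenters as iaa

# Stand-in batch of images.
images = np.random.randint(0, 256, (4, 256, 256, 3), np.uint8)

perspective = iaa.PerspectiveTransform(scale=(0, 0.175))
gauss = iaa.AdditiveGaussianNoise(scale=0.1 * 255)
# Speckle (multiplicative) noise, approximated with per-pixel multipliers.
speckle = iaa.MultiplyElementwise((0.9, 1.1))

transformed = perspective.augment_images(images)
noisy = speckle.augment_images(gauss.augment_images(images))
```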
Average windowing predictions
- Predict on the overlapping windowed images in rotated/flipped/mirrored versions
- Average all windowed predictions to minimize artifacts; this approach was used by the winners of last year's satellite competition (a TTA sketch follows this list)
- Stitch the predictions back together to the original size
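A minimal sketch of averaging over the 8 rotated/flipped versions of a square window; `model.predict` here is a hypothetical stand-in for our prediction function:

```python
import numpy as np

def tta_predict(model, window):
    """Average predictions over the 8 rotations/flips of a square window.
    `model.predict` is assumed to map (N, H, W, C) -> (N, H, W, 2)."""
    preds = []
    for flip in (False, True):
        img = window[:, ::-1] if flip else window
        for k in range(4):
            p = model.predict(np.rot90(img, k)[None])[0]
            p = np.rot90(p, -k)                      # undo the rotation
            preds.append(p[:, ::-1] if flip else p)  # undo the flip
    return np.mean(preds, axis=0)
```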
Post-processing
- Otsu threshold on the predicted masks and predicted contours
- Find sure foreground: `(predicted masks - predicted contours)`
- Find sure background: dilate the predicted masks
- Find unknown area: `(sure background - sure foreground)`
- Find labels on the sure foreground
- Random walker or watershed on the masks + contours to avoid `contour erosion`; adding the thresholded contour gave us a big LB boost (a sketch follows this list)
- Reference: https://docs.opencv.org/3.3.1/d3/db4/tutorial_py_watershed.html
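A rough scikit-image version of these steps, assuming float predictions in `[0, 1]`; using `skimage.segmentation.watershed` in place of the OpenCV tutorial's `cv2.watershed` is our choice:

```python
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu
from skimage.segmentation import watershed

def postprocess(pred_mask, pred_contour):
    """pred_mask / pred_contour: float maps in [0, 1] from the U-net."""
    mask = pred_mask > threshold_otsu(pred_mask)
    contour = pred_contour > threshold_otsu(pred_contour)
    # Sure foreground: subtracting contours splits touching nuclei apart.
    sure_fg = mask & ~contour
    # One seed label per connected component of the sure foreground.
    markers, _ = ndi.label(sure_fg)
    # Flood from the seeds over the mask+contour region. Restricting the
    # flood with `mask=` stands in for the sure-background / unknown-area
    # bookkeeping of the OpenCV tutorial.
    return watershed(-pred_mask, markers, mask=mask | contour)
```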
Model ensemble using weighted majority voting
- @CPMP chose weights based on our public LB scores (see the sketch below):
`4 * ( baseline model + noise ) + 2 * ( baseline model + transform ) + 1 * ( baseline model )`
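Equivalently, with the prediction maps as arrays (stand-ins below), the weighted vote is just a normalized weighted average:

```python
import numpy as np

# Stand-ins for the three models' averaged prediction maps.
pred_noise, pred_transform, pred_base = (np.random.rand(256, 256) for _ in range(3))

# np.average normalizes by the weight sum, i.e. (4*a + 2*b + 1*c) / 7.
ensemble = np.average([pred_noise, pred_transform, pred_base],
                      axis=0, weights=[4, 2, 1])
```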
Acknowledgement
Thanks to @CPMP. If he hadn't joined our team in the last 3 days, we wouldn't have known that we had to train and submit models by the first-stage deadline, and we wouldn't have had any submitted models to use in the 2nd stage. Sadly, we didn't get to train all the models with the data augmentation we had planned; however, we're happy with the results as first-time Kagglers. Our approach is pretty easy to implement. Our uploaded instructions for the first stage are below.
`stage1_train_fixed.zip` with `md5sum 9a3e938a312baa30fcea84c476a278cb`
We merged the original `stage1_train.zip` with this fixed stage 1 train data, generated the combined masks and contours under `prep_masks` in each imageID directory, and programmatically removed the following imageIDs from the training set on the fly.
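The post doesn't show how the contour targets in `prep_masks` were generated; one common recipe, sketched here purely as an assumption, is the morphological gradient of each instance mask:

```python
import numpy as np
from skimage.morphology import dilation, erosion, disk

def masks_and_contours(instance_masks):
    """instance_masks: list of boolean arrays, one per nucleus.
    Returns the combined mask and a combined contour target."""
    combined_mask = np.zeros_like(instance_masks[0], dtype=bool)
    combined_contour = np.zeros_like(combined_mask)
    for m in instance_masks:
        combined_mask |= m
        # Morphological gradient: a thin band around each nucleus boundary.
        combined_contour |= dilation(m, disk(1)) & ~erosion(m, disk(1))
    return combined_mask, combined_contour
```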
- For stage 2, if allowed, we'll add `stage1_test.zip` with the released masks to our training data
- Train 3 models on full-size colour train images with data augmentation, then window all augmented data and rotate/flip all windowed data, for 10 epochs, with a cap of roughly 300k total training images, on AWS with one V100 GPU
- model 1c: mosaic with Gaussian noise and speckle noise data augmentation, `sigma = 0.1`
`python schwaebische_nuclei.py train --maxtrainsize 300000 --mosaic --noise 0.1 --rotate --epoch 10 --valsplit 0 --colouronly`
- model 2c: mosaic with perspective transform data augmentation, `sigma = 0.175`
`python schwaebische_nuclei.py train --maxtrainsize 300000 --mosaic --transform 0.175 --rotate --epoch 10 --valsplit 0 --colouronly`
- model 3c: mosaic only
`python schwaebische_nuclei.py train --maxtrainsize 300000 --mosaic --rotate --epoch 10 --valsplit 0 --colouronly`
- Train 3 models on full-size grey/colour train images with data augmentation, then window all augmented data and rotate/flip all windowed data, for 10 epochs, with the same cap of roughly 300k total training images, on AWS with one V100 GPU
- model 1g: mosaic with Gaussian noise and speckle noise data augmentation, `sigma = 0.1`
`python schwaebische_nuclei.py train --maxtrainsize 300000 --mosaic --noise 0.1 --rotate --epoch 10 --valsplit 0 --loadmodel output_from_model_1c/`
- model 2g: mosaic with perspective transform data augmentation, `sigma = 0.175`
`python schwaebische_nuclei.py train --maxtrainsize 300000 --mosaic --transform 0.175 --rotate --epoch 10 --valsplit 0 --loadmodel output_from_model_2c/`
- model 3g: mosaic only
`python schwaebische_nuclei.py train --maxtrainsize 300000 --mosaic --rotate --epoch 10 --valsplit 0 --loadmodel output_from_model_3c/`
- Window and rotate/flip the test images and predict on the grey test images using each model
- Compute the weighted average of the `model_1g`, `model_2g` and `model_3g` predictions with weight distribution `[4, 2, 1]`:
`predictions = ( 4 * model_1g + 2 * model_2g + 1 * model_3g ) / 7`
- Predictions on contours and masks will be generated automatically and saved to the disk
- Window and rotate/flip the test images and predict on the colour test images using each model
- Compute the weighted average of the `model_1c`, `model_2c` and `model_3c` predictions with weight distribution `[4, 2, 1]`:
`predictions = ( 4 * model_1c + 2 * model_2c + 1 * model_3c ) / 7`
by issuing the following command:
`python schwaebische_nuclei.py predict --loadmodel output_from_all_models/ --weights 4 2 1`
- After the weighted predictions are averaged, the above command generates a CSV, which we use for our submissions
- Predictions on contours and masks will be generated automatically and saved to the disk