shankarpd/satellite-image-deep-learning

 
 


Introduction

This document primarily lists resources for performing deep learning (DL) on satellite imagery. To a lesser extent, machine learning (ML, e.g. random forests, stochastic gradient descent) is also discussed, as are classical image processing techniques.

Top links

Table of contents

Datasets

WorldView - SpaceNet

Sentinel

  • As part of the EU Copernicus program, multiple Sentinel satellites are capturing imagery -> see wikipedia.
  • Sentinel-2: 13 spectral bands, spatial resolution of 10 m, 20 m or 60 m depending on the band, 290 km swath, temporal resolution (revisit time) of 5 days
  • Open access data on GCP
  • Paid access via sentinel-hub and python-api.
  • Example loading sentinel data in a notebook
  • so2sat on Tensorflow datasets - So2Sat LCZ42 is a dataset consisting of co-registered synthetic aperture radar and multispectral optical image patches acquired by the Sentinel-1 and Sentinel-2 remote sensing satellites, and the corresponding local climate zones (LCZ) label. The dataset is distributed over 42 cities across different continents and cultural regions of the world.
  • eurosat - The EuroSAT dataset is based on Sentinel-2 satellite images covering 13 spectral bands and consists of 10 classes with 27,000 labeled and geo-referenced samples. Dataset and usage in EuroSAT: Land Use and Land Cover Classification with Sentinel-2, where a CNN achieves a classification accuracy of 98.57%.
  • bigearthnet - The BigEarthNet is a new large-scale Sentinel-2 benchmark archive, consisting of 590,326 Sentinel-2 image patches. The image patch size on the ground is 1.2 x 1.2 km with variable image size depending on the channel resolution. This is a multi-label dataset with 43 imbalanced labels.
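
The "variable image size depending on the channel resolution" follows from simple arithmetic: a 1.2 km patch sampled at 10 m, 20 m and 60 m gives a different number of pixels per side. A minimal sketch of that calculation; the function and constant names are hypothetical and not part of any dataset API:

```python
# Ground footprint of one side of a BigEarthNet patch, in metres (1.2 km).
PATCH_SIZE_M = 1200

# Sentinel-2 band resolutions in metres, as listed above.
RESOLUTIONS_M = (10, 20, 60)


def patch_pixels(patch_size_m: int, resolution_m: int) -> int:
    """Return the number of pixels along one side of a square patch."""
    if patch_size_m % resolution_m != 0:
        raise ValueError("patch size must be a multiple of the resolution")
    return patch_size_m // resolution_m


# Map each resolution to the side length of the patch in pixels.
sizes = {res: patch_pixels(PATCH_SIZE_M, res) for res in RESOLUTIONS_M}
for res, px in sizes.items():
    print(f"{res} m bands -> {px} x {px} pixels")
```

This is why a model consuming all bands usually has to resample the coarser bands to a common grid before stacking them.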

Landsat

Shuttle Radar Topography Mission (digital elevation maps)

Aerial imagery (drones)

Kaggle

Kaggle hosts over 60 satellite image datasets (search results here). The Kaggle blog is an interesting read.

Kaggle - Amazon from space - classification challenge

Kaggle - DSTL - segmentation challenge

Kaggle - Airbus Ship Detection Challenge

Kaggle - Draper - place images in order of time

Kaggle - Deepsat - classification challenge

Not satellite but airborne imagery. Each sample is an image patch size-normalized to 28x28 pixels with 4 bands: red, green, blue and near infrared. The training and test labels are one-hot encoded vectors (1x6 for Sat6). The data is in Matlab .mat format; whether a JPEG version exists is unclear.

  • Imagery source
  • Sat4 - 500,000 image patches covering four broad land cover classes: barren land, trees, grassland, and a class that consists of all land cover classes other than the above three. Example notebook
  • Sat6 - 405,000 image patches, each of size 28x28, covering 6 land cover classes: barren land, trees, grassland, roads, buildings and water bodies.
  • Deep Gradient Boosted Learning article
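
The one-hot label format can be sketched as follows. This is an illustration only: the class order below is an assumption (check the dataset's annotation files for the actual column order), and the helper names are hypothetical.

```python
from typing import List, Sequence

# Assumed class order, for illustration only; the real dataset may differ.
SAT6_CLASSES = [
    "barren land", "trees", "grassland", "roads", "buildings", "water bodies",
]


def one_hot(class_name: str) -> List[int]:
    """Encode a class name as a 1x6 one-hot vector."""
    index = SAT6_CLASSES.index(class_name)
    return [1 if i == index else 0 for i in range(len(SAT6_CLASSES))]


def decode(vector: Sequence[float]) -> str:
    """Map a one-hot vector (or a vector of class scores) to a class name."""
    best = max(range(len(vector)), key=vector.__getitem__)
    return SAT6_CLASSES[best]


encoded = one_hot("grassland")
decoded = decode(encoded)
```

Because `decode` takes the position of the largest value, the same helper works on the softmax output of a classifier as well as on ground-truth labels.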

Kaggle - other

Alternative datasets

There are a variety of datasets suitable for land classification problems.

Tensorflow datasets

  • There are a number of remote sensing datasets
  • resisc45 - RESISC45 dataset is a publicly available benchmark for Remote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class.
  • eurosat - EuroSAT dataset is based on Sentinel-2 satellite images covering 13 spectral bands and consisting of 10 classes with 27000 labeled and geo-referenced samples.
  • bigearthnet - The BigEarthNet is a new large-scale Sentinel-2 benchmark archive, consisting of 590,326 Sentinel-2 image patches. The image patch size on the ground is 1.2 x 1.2 km with variable image size depending on the channel resolution. This is a multi-label dataset with 43 imbalanced labels.
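
With imbalanced multi-label data, one common option (not something prescribed by the dataset itself) is to weight each label by its inverse frequency. A minimal sketch, using toy annotations with hypothetical label names:

```python
from collections import Counter
from typing import Dict, List


def label_weights(samples: List[List[str]]) -> Dict[str, float]:
    """Inverse-frequency weight per label, normalised so the mean weight is 1."""
    # Count in how many samples each label appears (a sample may carry several).
    counts = Counter(label for labels in samples for label in set(labels))
    raw = {label: len(samples) / count for label, count in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {label: weight / mean for label, weight in raw.items()}


# Toy multi-label annotations: "forest" is common, the others are rare.
toy = [["forest"], ["forest", "water"], ["forest"], ["urban", "forest"]]
weights = label_weights(toy)
```

Rare labels receive larger weights, which can then be passed to a weighted loss so that frequent labels do not dominate training.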

UC Merced

AWS datasets

Quilt

  • Several people have uploaded datasets to Quilt

Google Earth Engine

Weather Datasets

UAV datasets

Synthetic data

Interesting deep learning projects

Raster Vision by Azavea

RoboSat

neat-EO

DeepOSM

DeepNetsForEO - segmentation

Skynet-data

Techniques

This section explores the different techniques (DL, ML & classical) people are applying to common problems in satellite imagery analysis. Classification problems are the most easily addressed with DL; object detection is harder, and cloud detection is harder still (though of niche interest).

Land classification

Semantic segmentation

Change detection

Image registration

Object detection

Cloud detection

  • A subset of the object detection problem, but surprisingly challenging
  • From this article on sentinelhub there are three popular classical algorithms that detect clouds by applying thresholds in multiple bands. In the same article they propose using semantic segmentation combined with a CNN for a cloud classifier (excellent review paper here), but state that this requires too many compute resources.
  • This article compares a number of ML algorithms: random forests, stochastic gradient descent, support vector machines and a Bayesian method.
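
To illustrate the general idea of threshold-based detection, the sketch below flags a pixel as cloud when it is bright in the visible bands and spectrally flat (grey or white). The thresholds, band choice and function names are assumptions chosen for illustration; this is not one of the algorithms from the articles above.

```python
from typing import List, Sequence

# Hypothetical thresholds on reflectance values in the range 0..1.
BRIGHTNESS_MIN = 0.3  # clouds are bright in the visible bands
WHITENESS_MAX = 0.1   # clouds reflect all visible bands about equally


def is_cloud(blue: float, green: float, red: float) -> bool:
    """Classify one pixel from its visible-band reflectances."""
    mean = (blue + green + red) / 3.0
    if mean <= BRIGHTNESS_MIN:
        return False  # too dark to be cloud
    # Largest deviation of any band from the mean, relative to the mean.
    whiteness = max(abs(band - mean) for band in (blue, green, red)) / mean
    return whiteness < WHITENESS_MAX


def cloud_mask(pixels: Sequence[Sequence[float]]) -> List[bool]:
    """Apply the per-pixel test to a sequence of (blue, green, red) values."""
    return [is_cloud(*pixel) for pixel in pixels]


# A bright white pixel, a dark pixel, and a bright but strongly coloured pixel.
mask = cloud_mask([(0.50, 0.52, 0.51), (0.05, 0.08, 0.04), (0.10, 0.40, 0.45)])
```

Fixed thresholds like these tend to confuse clouds with snow, sand and bright roofs, which is part of why the problem is harder than it looks.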

Wealth and economic activity measurement

The goal is to predict economic activity from satellite imagery rather than conducting labour-intensive ground surveys.

Super resolution

Pansharpening

Stereo imaging for terrain mapping & DEMs

Lidar

NDVI - vegetation index
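
NDVI (Normalized Difference Vegetation Index) is computed per pixel from the near-infrared and red reflectances as (NIR - Red) / (NIR + Red), which lies between -1 and 1 for non-negative reflectances. A minimal pure-Python sketch with a hypothetical function name; in practice the same formula would be applied to whole arrays:

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for a single pixel."""
    total = nir + red
    if total == 0:
        # Assumption: treat pixels with no signal as 0 rather than failing.
        return 0.0
    return (nir - red) / total


# Healthy vegetation reflects strongly in NIR and absorbs red light.
vegetation = ndvi(nir=0.5, red=0.1)
# Water absorbs NIR, so the index is typically negative.
water = ndvi(nir=0.02, red=0.06)
```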

SAR

Aerial imagery (drones)

Image formats and catalogues

STAC - SpatioTemporal Asset Catalog

State of the art

What are companies doing?

Batch processing

Online platforms for Geo analysis

  • This article discusses some of the available platforms -> TLDR Pangeo rocks, but must BYO imagery
  • Pangeo - open source resources for parallel processing using Dask and Xarray http://pangeo.io/index.html
  • Airbus Sandbox -> will provide access to imagery
  • Descartes Labs -> access to EO imagery from a variety of providers via python API -> not clear which imagery is available (Airbus + others?) or pricing
  • DigitalGlobe have a cloud hosted Jupyter notebook platform called GBDX. Cloud hosting means they can guarantee the infrastructure supports their algorithms, and they appear to be close/closer to deploying DL. Tutorial notebooks here. Only Sentinel-2 and Landsat data on free tier.
  • Planet have a Jupyter notebook platform which can be deployed locally and requires an API key (14 days free). They have a Python wrapper (2.7..) to their REST API. No pricing is given for use after the 14 day trial.
  • Earth-i Spectrum appears to allow processing of imagery, with the capability to perform segmentation, change detection, object recognition. This promo video contains some screenshots of the application.

Free online computing resources

Generally a GPU is required for DL, and this section lists a couple of free Jupyter environments with a GPU available. There is a good overview of online Jupyter environments on the fast.ai site.

Google Colab

  • Colaboratory notebooks with a GPU as a backend, free for 12 hours at a time. Note that the GPU may be shared with other users, so if you aren't getting good performance try reloading.
  • Also a pro tier for $10 a month -> https://colab.research.google.com/signup
  • Tensorflow available & pytorch can be installed, useful articles

Kaggle - also Google!

  • Free to use
  • GPU Kernels - may run for 1 hour
  • Tensorflow, pytorch & fast.ai available
  • Advantage that many datasets are already available
  • Read

Production

Custom REST API

Tensorflow Serving

  • https://www.tensorflow.org/serving/
  • TensorFlow Serving makes it easy to deploy new algorithms and experiments, while keeping the same server architecture and APIs. Multiple models, or indeed multiple versions of the same model, can be served simultaneously. TensorFlow Serving comes with a scheduler that groups individual inference requests into batches for joint execution on a GPU.
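
TensorFlow Serving exposes a REST predict endpoint of the form `/v1/models/<name>[/versions/<n>]:predict`, which is how a client selects between models and model versions. The sketch below builds such a request with the standard library without sending it; the host, port, model name and helper function are assumptions for illustration.

```python
import json
import urllib.request
from typing import Any, List, Optional


def build_predict_request(
    host: str, model: str, instances: List[Any], version: Optional[int] = None
) -> urllib.request.Request:
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    path = f"/v1/models/{model}"
    if version is not None:
        # Pin a specific model version; otherwise the server picks the latest.
        path += f"/versions/{version}"
    url = f"http://{host}{path}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, model name and input; 8501 is assumed as the REST port.
request = build_predict_request(
    "localhost:8501", "land_classifier", [[0.1, 0.2, 0.3]], version=2
)
```

Passing the request to `urllib.request.urlopen` against a running server should return a JSON body whose `predictions` field holds one result per instance.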

chip-n-scale-queue-arranger by developmentseed

Useful open source software

Movers and shakers on Github

  • Chris Holmes is doing great things at Planet
  • Christoph Rieke maintains a very popular imagery repo and has published his thesis on segmentation
  • Robin Wilson is a former academic who is very active in the satellite imagery space

Courses

Online communities

Geospatial companies

For fun

Useful References

About the author

My background is optical physics, and I have a PhD from Cambridge on the topic of Plasmon enhanced Raman spectroscopy. After doing a post doc I left academia and took a variety of roles, from industrial research at Sharp Labs Europe, to medical physics, to building optical telescopes at Surrey Satellites (SSTL). It was whilst at SSTL that I started this repo as a personal resource. I left SSTL, actually was made redundant along with 30% of the company, and after a brief stint at an IOT start up, I now work as a data engineer. Deep learning is currently a hobby, but I have ambitions to move into this domain when the right opportunity presents itself. Feel free to connect with me on LinkedIn.

About

Resources for deep learning with satellite & aerial imagery
