This project contains Python implementations of various climate indices which provide a geographical and temporal picture of the severity of precipitation and temperature anomalies. We attempt to provide best-of-breed implementations of various climate indices commonly used for climate and drought monitoring, to provide a codebase that is available for development by the climate science community, and to facilitate the use of climate indices datasets computed in a standardized, reproducible, and transparent fashion.
Currently provided climate indices:
- SPI, Standardized Precipitation Index
- SPEI, Standardized Precipitation Evapotranspiration Index
- PET, Potential Evapotranspiration: computed using Thornthwaite's equation
- PNP, Percentage of Normal Precipitation
- PDSI, Palmer Drought Severity Index
- scPDSI, Self-calibrated Palmer Drought Severity Index
- PHDI, Palmer Hydrological Drought Index
- Z-Index, Palmer moisture anomaly index
- PMDI, Palmer Modified Drought Index
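As an illustration of the simplest index in this list, percentage of normal precipitation is commonly defined as an observed value divided by the long-term mean for the same calendar month, times 100. The sketch below follows that definition; the function name, its signature, and the assumption that the input holds whole years of monthly values are ours for illustration and are not taken from this package's API.

```python
import numpy as np


def percent_of_normal(monthly_precip, data_start_year,
                      calibration_start_year, calibration_end_year):
    """Hypothetical helper: percent of normal for a monthly time series.

    Assumes the series starts in January of data_start_year and
    contains a whole number of years (length divisible by 12).
    """
    # Reshape to (years, 12) so each column holds one calendar month.
    values = np.asarray(monthly_precip, dtype=float).reshape(-1, 12)

    # Rows of the array that fall within the calibration period.
    first = calibration_start_year - data_start_year
    last = calibration_end_year - data_start_year + 1

    # The "normal" is the calibration-period mean for each calendar month.
    normals = np.nanmean(values[first:last], axis=0)

    # A zero normal yields inf/nan here; real code would need to handle it.
    return (100.0 * values / normals).reshape(-1)


# Two years of data: 10 mm every month, then 30 mm every month.
example = percent_of_normal([10.0] * 12 + [30.0] * 12, 2000, 2000, 2001)
print(example[0], example[12])
```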
This Python implementation of the above climate indices algorithms is being developed with the following goals in mind:
- to provide an open source software package to compute a suite of climate indices commonly used for drought monitoring, with well documented code that is faithful to the literature, and scientifically valid results
- to provide transparency into the operational code used for climate monitoring activities at NCEI, and reproducibility for users of datasets computed from this package
- to facilitate standardization and consensus on best-of-breed algorithms and accompanying implementations
- to serve as an example of an open source scientific development process, incorporating software engineering principles and programming best practices
The configuration and usage described below show the indices computation module being installed and used via shell commands that call scripts performing data management and computation of climate indices from provided inputs. Interaction with the module is assumed to be performed using a bash shell on Linux, Windows, or macOS.
Windows users will need to install and configure a bash shell in order to follow the usage shown below. We recommend babun or Cygwin for this.
Clone this repository:
$ git clone https://github.com/monocongo/indices_python.git
Move into the source directory:
$ cd indices_python
This project's code is written for Python 3. We recommend using an installation of the Anaconda Python 3 distribution. The instructions below are Anaconda-specific and are initially aimed at Linux users.
For users without an existing Python/Anaconda installation we recommend either
- installing the Miniconda (minimal Anaconda) distribution or
- installing the full Anaconda distribution
This library and the example processing scripts use the netCDF4, numpy, scipy, and numba Python modules. The NetCDF Operators (NCO) software package is also useful for the processing scripts, and can optionally be installed as a Python module via conda.
A new Anaconda environment containing all required modules can be created using the provided environment.yml file, which specifies an environment named indices_python:
$ conda env create -f environment.yml
The environment created by the above command can be activated using the following command:
$ source activate indices_python
Once the conda Python environment has been activated, subsequent Python commands will run in this environment, where the package dependencies for this project are present.
Now the indices_python package itself can be added into the environment via pip:
$ pip install .
Users who prefer not to use pip can instead install the required module dependencies piecemeal into an Anaconda environment via multiple conda install commands:
$ conda create --name <env_name> python=3
$ source activate <env_name>
$ conda install netCDF4
$ conda install numba
$ conda install numpy
$ conda install pandas
$ conda install scipy
Optionally install the package into the local site-packages:
$ python setup.py install
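After either installation route, it can be useful to confirm that the dependencies are importable from the active environment. The following is a minimal sketch, not part of this project: the helper name `missing_modules` is hypothetical, and the module list simply mirrors the conda install commands above.

```python
import importlib.util

# Module names taken from the conda install commands in this README.
REQUIRED_MODULES = ("netCDF4", "numba", "numpy", "pandas", "scipy")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the current environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```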
- indices_python: main module
- tests: unit tests for main module
- scripts/compare: scripts to compare results of indices processing on grids or climate divisions, comparing against expected/known results (for example nClimDivs from NCEI, PRISM grids from WRCC)
- scripts/ingest: scripts to ingest grid or climate divisions datasets from ASCII to NetCDF
- scripts/process: scripts to process indices computations on either grids or climate divisions datasets
- scripts/task: scripts that perform a combination of ingest and process for either grids or climate divisions datasets, useful as cron jobs for monthly processing
Initially all tests should be run for validation:
$ export NUMBA_DISABLE_JIT=1
$ python -m unittest tests/test_*.py
$ unset NUMBA_DISABLE_JIT
If you run the above from the main branch and get an error, please send a report and/or add an issue, as all tests should pass on the main branch.
The NUMBA_DISABLE_JIT environment variable is set before the test run and unset afterwards in order to bypass the numba just-in-time compilation process, which reduces testing times.
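The same set/run/unset pattern can be expressed from Python by passing a modified environment to a child process, which leaves the calling shell untouched. This is a sketch under our own assumptions: the helper names are hypothetical, and it uses unittest discovery on the tests directory rather than the exact shell glob shown above.

```python
import os
import subprocess
import sys


def jit_disabled_env(base_env=None):
    """Return a copy of the environment with numba's JIT switched off."""
    env = dict(os.environ if base_env is None else base_env)
    # numba reads this variable at import time, so it must be set
    # before the test process starts.
    env["NUMBA_DISABLE_JIT"] = "1"
    return env


def run_unit_tests(start_dir="tests", pattern="test_*.py"):
    """Run unittest discovery in a child process with JIT disabled."""
    command = [sys.executable, "-m", "unittest", "discover",
               "-s", start_dir, "-p", pattern]
    # Only the child sees the variable; nothing needs to be unset afterwards.
    return subprocess.run(command, env=jit_disabled_env()).returncode
```

Calling `run_unit_tests()` from the repository root returns the exit code of the test run, with zero indicating success.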
There are example climate indices processing scripts provided which compute the full suite of indices for various input dataset types. These process input files in the NetCDF format, and produce output NetCDF files in a corresponding format.
The script process_grid.py (found under the scripts/process subdirectory) is used to compute climate indices from nClimGrid input datasets. Usage of this script requires specifying the input file names and corresponding variable names for precipitation, temperature, and soil constant datasets, as well as the month scales over which the scaled indices (SPI, SPEI, and PNP) are to be computed, plus the base output file name and the initial and final years of the calibration period.
This script has the following required command line arguments:
Option | Description |
---|---|
precip_file | input NetCDF file containing nClimGrid precipitation dataset |
precip_var_name | name of the precipitation variable within the input nClimGrid precipitation dataset NetCDF |
temp_file | input NetCDF file containing nClimGrid temperature dataset |
temp_var_name | name of the temperature variable within the input nClimGrid temperature dataset NetCDF |
awc_file | input NetCDF file containing a soil constant (available water capacity of the soil) dataset NetCDF, should correspond dimensionally with the input nClimGrid temperature and precipitation datasets |
awc_var_name | name of the soil constant (available water capacity of the soil) variable within the input soil constant dataset NetCDF |
output_file_base | base file name for all output files; each computed index will have an output file whose name begins with this base, followed by the index's abbreviation, a month scale (if applicable), and ".nc" as the extension (e.g. for SPI/Gamma at 3-month scale the resulting output file will be named <output_file_base>_spi_gamma_03.nc) |
month_scales | month scales over which the PNP, SPI, and SPEI values are to be computed, valid range is 1-72 months |
calibration_start_year | initial year of calibration period |
calibration_end_year | final year of calibration period |
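The output file naming convention described for output_file_base can be sketched as follows. The function is hypothetical and written only to illustrate the pattern from the table; apart from `spi_gamma`, the abbreviations the script actually uses are not stated here, so `pdsi` in the usage lines is an assumption.

```python
def output_file_name(base, index_abbreviation, month_scale=None):
    """Build <base>_<index>[_<two-digit scale>].nc as described in the table."""
    parts = [base, index_abbreviation]
    if month_scale is not None:
        # Month scales are zero-padded to two digits, e.g. 3 -> "03".
        parts.append(f"{month_scale:02d}")
    return "_".join(parts) + ".nc"


print(output_file_name("nclimgrid_lowres", "spi_gamma", 3))
print(output_file_name("nclimgrid_lowres", "pdsi"))
```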
Example command line invocation:
$ nohup python -u process_grid.py --precip_file example_inputs/nclimgrid_lowres_prcp.nc --temp_file example_inputs/nclimgrid_lowres_tavg.nc --awc_file example_inputs/nclimgrid_lowres_soil.nc --precip_var_name prcp --temp_var_name tavg --awc_var_name awc --month_scales 1 2 3 6 12 24 --calibration_start_year 1931 --calibration_end_year 1990 --output_file_base nclimgrid_lowres
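For readers who want to see how an invocation like this maps onto named values, here is a minimal argparse sketch that accepts the same options. It is not the parser used by process_grid.py; the option names come from the table and example above, while the types, the `choices` check for the 1-72 range, and the function name `build_parser` are our assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented process_grid.py options."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented command line options")
    # File and variable name options are plain required strings.
    for name in ("precip_file", "precip_var_name", "temp_file",
                 "temp_var_name", "awc_file", "awc_var_name",
                 "output_file_base"):
        parser.add_argument(f"--{name}", required=True)
    # One or more month scales, each restricted to the range 1-72.
    parser.add_argument("--month_scales", type=int, nargs="+",
                        choices=range(1, 73), metavar="N", required=True)
    parser.add_argument("--calibration_start_year", type=int, required=True)
    parser.add_argument("--calibration_end_year", type=int, required=True)
    return parser


args = build_parser().parse_args([
    "--precip_file", "example_inputs/nclimgrid_lowres_prcp.nc",
    "--temp_file", "example_inputs/nclimgrid_lowres_tavg.nc",
    "--awc_file", "example_inputs/nclimgrid_lowres_soil.nc",
    "--precip_var_name", "prcp",
    "--temp_var_name", "tavg",
    "--awc_var_name", "awc",
    "--month_scales", "1", "2", "3", "6", "12", "24",
    "--calibration_start_year", "1931",
    "--calibration_end_year", "1990",
    "--output_file_base", "nclimgrid_lowres",
])
print(args.month_scales, args.calibration_start_year, args.calibration_end_year)
```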
The script process_divisions.py (found under the scripts/process subdirectory) is used to compute climate indices from nClimDiv input datasets. Usage of this script requires specifying the input file name and corresponding variable names for precipitation, temperature, and soil constant datasets, as well as the month scales over which the scaled indices (SPI, SPEI, and PNP) are to be computed, plus the initial and final years of the calibration period.
This script has the following required command line arguments:
Option | Description |
---|---|
input_file | input NetCDF file containing nClimDiv temperature, precipitation, and soil constant datasets; an output variable for each computed index is added to (or updated in) this same file |
precip_var_name | name of the precipitation variable within the input nClimDiv dataset NetCDF |
temp_var_name | name of the temperature variable within the input dataset NetCDF |
awc_var_name | name of the soil constant (available water capacity of the soil) variable within the input dataset NetCDF |
month_scales | month scales over which the PNP, SPI, and SPEI values are to be computed, valid range is 1-72 months |
calibration_start_year | initial year of calibration period |
calibration_end_year | final year of calibration period |
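The month_scales option refers to the number of trailing months over which values are accumulated before a scaled index is fitted. A common way to form such accumulations is a rolling sum, sketched below with NumPy. This is an illustration under our own assumptions (the function name `sum_to_scale` and the NaN padding of the first months are ours), not a description of this package's internals.

```python
import numpy as np


def sum_to_scale(monthly_values, scale):
    """Rolling sum over `scale` trailing months, NaN where the window is incomplete."""
    if not 1 <= scale <= 72:
        raise ValueError("month scale must be within 1-72")
    values = np.asarray(monthly_values, dtype=float)
    if values.size < scale:
        # Not enough months to fill even one window.
        return np.full(values.size, np.nan)
    # Convolving with a window of ones yields the sum of each full window.
    sums = np.convolve(values, np.ones(scale), mode="valid")
    # Pad the first (scale - 1) months, which lack a complete window.
    return np.concatenate([np.full(scale - 1, np.nan), sums])


print(sum_to_scale([1.0, 2.0, 3.0, 4.0], 3))
```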
Example command line invocation:
$ nohup python -u process_divisions.py --input_file example_inputs/nclimdiv_20170404.nc --precip_var_name prcp --temp_var_name tavg --awc_var_name awc --month_scales 1 2 3 6 12 24 --calibration_start_year 1931 --calibration_end_year 1990
Please use, make suggestions, and contribute to this code. Without diverse participation and community adoption this project will not reach its potential.
Are you aware of other indices that would be a good addition here? Can you find bottlenecks and help improve performance? Can you suggest new ways of comparing these implementations against others (or other criteria) in order to determine best-of-breed? Please fork the code and have at it, and/or contact us to see if we can help.
- Read our contributing guidelines
- File an issue, or submit a pull request
- Send us an email
This is a developmental version of code that was originally developed at NCEI/NOAA. The official release version is available on drought.gov. Please read more on our license page.