Although vanilla Python is fairly slow and hence not an obvious candidate for computationally intensive work, there are several options to significantly increase the efficiency of Python programs.

Learning outcomes

When you complete this training you will

  • understand and identify performance bottlenecks of Python;
  • know some libraries that can help improve performance for scientific computing such as numpy, numexpr and numba;
  • be able to use Cython to improve your code's performance;
  • be able to wrap C, C++ and Fortran code to use it from Python;
  • understand the opportunities and pitfalls of multi-threaded programming with Python;
  • be able to write distributed applications using MPI;
  • have an understanding of how frameworks for distributed computing such as dask and pyspark work.
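
As a taste of the first outcome, identifying bottlenecks, here is a minimal sketch that uses only the standard library's cProfile and pstats modules. The functions slow_sum_of_squares, fast_sum_of_squares and workload are hypothetical examples made up for illustration; they are not taken from the training materials.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive: build a list with repeated appends, then loop again."""
    squares = []
    for i in range(n):
        squares.append(i * i)
    total = 0
    for value in squares:
        total += value
    return total


def fast_sum_of_squares(n):
    """Same result, using a generator expression and the built-in sum."""
    return sum(i * i for i in range(n))


def workload(n=200_000):
    # Hypothetical workload: both helpers compute the same quantity.
    return slow_sum_of_squares(n), fast_sum_of_squares(n)


# Profile only the workload call.
profiler = cProfile.Profile()
profiler.enable()
slow_result, fast_result = workload()
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

The report lists each function with its call count and cumulative time, which usually shows where optimisation effort is best spent. Note that the profiler adds overhead to every function call, so the relative timings it reports should be read as a guide rather than as exact measurements.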

Schedule

Total duration: 4 hours.

Subject                           Duration
introduction and motivation       5 min.
performance and profiling         25 min.
libraries                         10 min.
Cython                            60 min.
coffee break                      10 min.
interfacing with C/C++/Fortran    30 min.
multi-threaded programming        10 min.
MPI                               45 min.
dask                              15 min.
pyspark                           20 min.
wrap up                           10 min.

Training materials

Slides are available in the GitHub repository, as well as example code and hands-on material.

Software environment

Instructions on how to create the required software environment are available.

Target audience

This training is for you if you need to use Python for computationally intensive scientific computing.

Prerequisites

You will need experience programming in Python and using numpy, as well as a passing familiarity with C/C++. This is not a training that starts from scratch.

If you plan to do Python programming in a Linux or HPC environment, you should be familiar with that environment as well.

Trainer(s)