Skip to content

kthohr/mcmc

Folders and files

NameName
Last commit message
Last commit date

Latest commit

2abb7d4 · Feb 8, 2024

History

85 Commits
Aug 14, 2022
Feb 8, 2024
May 5, 2023
May 5, 2023
May 14, 2023
Aug 14, 2022
Jan 16, 2022
Aug 14, 2022
Dec 26, 2023
Apr 30, 2018
Apr 16, 2018
Apr 30, 2023
Apr 23, 2023
Feb 8, 2024
Dec 24, 2023

Repository files navigation

MCMCLib   Build Status Coverage Status License Documentation Status

MCMCLib is a lightweight C++ library of Markov Chain Monte Carlo (MCMC) methods.

Features:

  • A C++11/14/17 library of well-known MCMC algorithms.
  • Parallelized samplers designed for multi-modal distributions, including:
    • Adaptive Equi-Energy Sampler (AEES)
    • Differential Evolution (DE)
  • For fast and efficient matrix-based computation, MCMCLib supports the following templated linear algebra libraries:
  • Automatic differentiation functionality is available through use of the Autodiff library
  • OpenMP-accelerated algorithms for parallel computation.
  • Straightforward linking with parallelized BLAS libraries, such as OpenBLAS.
  • Available as a single precision (float) or double precision (double) library.
  • Available as a header-only library, or as a compiled shared library.
  • Released under a permissive, non-GPL license.

Contents:

Algorithms

A list of currently available algorithms includes:

  • Adaptive Equi-Energy Sampler (AEES)
  • Differential Evolution (DE-MCMC)
  • Hamiltonian Monte Carlo (HMC)
  • Metropolis-adjusted Langevin algorithm (MALA)
  • No-U-Turn Sampler (NUTS)
  • Random Walk Metropolis-Hastings (RWMH)
  • Riemannian Manifold Hamiltonian Monte Carlo (RM-HMC)

Documentation

Full documentation is available online:

Documentation Status

A PDF version of the documentation is available here.

API

The MCMCLib API follows a relatively simple convention, with most algorithms called in the following manner:

algorithm_id(<initial values>, <log posterior kernel function of the target distribution>, <storage for posterior draws>, <additional data for the log posterior kernel function>);

The inputs, in order, are:

  • A vector of initial values used to define the starting point of the algorithm.
  • A user-specified function that returns the log posterior kernel value of the target distribution.
  • An array to store the posterior draws.
  • The final input is optional: it is any object that contains additional data necessary to evaluate the log posterior kernel function.

For example, the RWMH algorithm is called using:

bool rwmh(const ColVec_t& initial_vals, std::function<fp_t (const ColVec_t& vals_inp, void* target_data)> target_log_kernel, Mat_t& draws_out, void* target_data);

where ColVec_t is used to represent, e.g., arma::vec or Eigen::VectorXd types.

Installation

MCMCLib is available as a compiled shared library, or as header-only library, for Unix-alike systems only (e.g., popular Linux-based distros, as well as macOS). Use of this library with Windows-based systems, with or without MSVC, is not supported.

Requirements

MCMCLib requires either the Armadillo or Eigen C++ linear algebra libraries. (Note that Eigen version 3.4.0 requires a C++14-compatible compiler.)

Before including the header files, define one of the following:

#define MCMC_ENABLE_ARMA_WRAPPERS
#define MCMC_ENABLE_EIGEN_WRAPPERS

Example:

#define MCMC_ENABLE_EIGEN_WRAPPERS
#include "mcmc.hpp"

Installation Method 1: Shared Library

The library can be installed on Unix-alike systems via the standard ./configure && make method.

First clone the library and any necessary submodules:

# clone mcmc into the current directory
git clone https://github.com/kthohr/mcmc ./mcmc

# change directory
cd ./mcmc

# clone necessary submodules
git submodule update --init

Set (one) of the following environment variables before running configure:

export ARMA_INCLUDE_PATH=/path/to/armadillo
export EIGEN_INCLUDE_PATH=/path/to/eigen

Finally:

# build and install with Eigen
./configure -i "/usr/local" -l eigen -p
make
make install

The final command will install MCMCLib into /usr/local.

Configuration options (see ./configure -h):

      Primary

  • -h print help
  • -i installation path; default: the build directory
  • -f floating-point precision mode; default: double
  • -l specify the choice of linear algebra library; choose arma or eigen
  • -m specify the BLAS and Lapack libraries to link with; for example, -m "-lopenblas" or -m "-framework Accelerate"
  • -o compiler optimization options; defaults to -O3 -march=native -ffp-contract=fast -flto -DARMA_NO_DEBUG
  • -p enable OpenMP parallelization features (recommended)

      Secondary

  • -c a coverage build (used with Codecov)
  • -d a 'development' build
  • -g a debugging build (optimization flags set to -O0 -g)

      Special

  • --header-only-version generate a header-only version of MCMCLib (see below)

Installation Method 2: Header-only Library

MCMCLib is also available as a header-only library (i.e., without the need to compile a shared library). Simply run configure with the --header-only-version option:

./configure --header-only-version

This will create a new directory, header_only_version, containing a copy of MCMCLib, modified to work on an inline basis. With this header-only version, simply include the header files (#include "mcmc.hpp) and set the include path to the head_only_version directory (e.g.,-I/path/to/mcmclib/header_only_version).

R Compatibility

To use MCMCLib with an R package, first generate a header-only version of the library (see above). Then simply add a compiler definition before including the MCMCLib files.

  • For RcppArmadillo:
#define MCMC_USE_RCPP_ARMADILLO
#include "mcmc.hpp"

At this time, builds using RcppEigen are not supported as MCMCLib requires a version of Eigen >= v3.4.0.

Example

To illustrate MCMCLib at work, consider the problem of sampling values of the mean parameter of a normal distribution.

Code:

#define MCMC_ENABLE_EIGEN_WRAPPERS
#include "mcmc.hpp"

inline
Eigen::VectorXd
eigen_randn_colvec(size_t nr)
{
    static std::mt19937 gen{ std::random_device{}() };
    static std::normal_distribution<> dist;

    return Eigen::VectorXd{ nr }.unaryExpr([&](double x) { (void)(x); return dist(gen); });
}

struct norm_data_t {
    double sigma;
    Eigen::VectorXd x;
 
    double mu_0;
    double sigma_0;
};

double ll_dens(const Eigen::VectorXd& vals_inp, void* ll_data)
{
    const double pi = 3.14159265358979;

    //

    const double mu = vals_inp(0);
 
    norm_data_t* dta = reinterpret_cast<norm_data_t*>(ll_data);
    const double sigma = dta->sigma;
    const Eigen::VectorXd x = dta->x;
 
    const int n_vals = x.size();
 
    //
 
    const double ret = - n_vals * (0.5 * std::log(2*pi) + std::log(sigma)) - (x.array() - mu).pow(2).sum() / (2*sigma*sigma);
 
    //
 
    return ret;
}
 
double log_pr_dens(const Eigen::VectorXd& vals_inp, void* ll_data)
{
    const double pi = 3.14159265358979;

    //

    norm_data_t* dta = reinterpret_cast< norm_data_t* >(ll_data);
 
    const double mu_0 = dta->mu_0;
    const double sigma_0 = dta->sigma_0;
 
    const double x = vals_inp(0);
 
    const double ret = - 0.5*std::log(2*pi) - std::log(sigma_0) - std::pow(x - mu_0,2) / (2*sigma_0*sigma_0);
 
    return ret;
}
 
double log_target_dens(const Eigen::VectorXd& vals_inp, void* ll_data)
{
    return ll_dens(vals_inp,ll_data) + log_pr_dens(vals_inp,ll_data);
}
 
int main()
{
    const int n_data = 100;
    const double mu = 2.0;
 
    norm_data_t dta;
    dta.sigma = 1.0;
    dta.mu_0 = 1.0;
    dta.sigma_0 = 2.0;
 
    Eigen::VectorXd x_dta = mu + eigen_randn_colvec(n_data).array();
    dta.x = x_dta;
 
    Eigen::VectorXd initial_val(1);
    initial_val(0) = 1.0;

    //

    mcmc::algo_settings_t settings;

    settings.rwmh_settings.par_scale = 0.4;
    settings.rwmh_settings.n_burnin_draws = 2000;
    settings.rwmh_settings.n_keep_draws = 2000;

    //

    Eigen::MatrixXd draws_out;
    mcmc::rwmh(initial_val, log_target_dens, draws_out, &dta, settings);

    //
  
    std::cout << "rwmh mean:\n" << draws_out.colwise().mean() << std::endl;
    std::cout << "acceptance rate: " << static_cast<double>(settings.rwmh_settings.n_accept_draws) / settings.rwmh_settings.n_keep_draws << std::endl;
    
    //
 
    return 0;
}

On x86-based computers, this example can be compiled using:

g++ -Wall -std=c++14 -O3 -mcpu=native -ffp-contract=fast -I$EIGEN_INCLUDE_PATH -I./../../include/ rwmh_normal_mean.cpp -o rwmh_normal_mean.out -L./../.. -lmcmc

Check the /examples directory for additional examples, and https://mcmclib.readthedocs.io/en/latest/ for a detailed description of each algorithm.

Automatic Differentiation

By combining Eigen with the Autodiff library, MCMCLib provides experimental support for automatic differentiation.

The example below uses forward-mode automatic differentiation to compute the gradient of the Gaussian likelihood function, and the HMC algorithm to sample from the posterior distribution of the mean and variance parameters.

#define MCMC_ENABLE_EIGEN_WRAPPERS
#include "mcmc.hpp"

#include <autodiff/forward/real.hpp>
#include <autodiff/forward/real/eigen.hpp>

inline
Eigen::VectorXd
eigen_randn_colvec(size_t nr)
{
    static std::mt19937 gen{ std::random_device{}() };
    static std::normal_distribution<> dist;

    return Eigen::VectorXd{ nr }.unaryExpr([&](double x) { (void)(x); return dist(gen); });
}

struct norm_data_t {
    Eigen::VectorXd x;
};

double ll_dens(const Eigen::VectorXd& vals_inp, Eigen::VectorXd* grad_out, void* ll_data)
{
    const double pi = 3.14159265358979;
  
    norm_data_t* dta = reinterpret_cast<norm_data_t*>(ll_data);
    const Eigen::VectorXd x = dta->x;
  
    //

    autodiff::real u;
    autodiff::ArrayXreal xd = vals_inp.eval();

    std::function<autodiff::real (const autodiff::ArrayXreal& vals_inp)> normal_dens_log_form \
    = [x, pi](const autodiff::ArrayXreal& vals_inp) -> autodiff::real
    {
        autodiff::real mu    = vals_inp(0);
        autodiff::real sigma = vals_inp(1);

        return - x.size() * (0.5 * std::log(2*pi) + autodiff::detail::log(sigma)) - (x.array() - mu).pow(2).sum() / (2*sigma*sigma);
    };
  
    //

    if (grad_out) {
        Eigen::VectorXd grad_tmp = autodiff::gradient(normal_dens_log_form, autodiff::wrt(xd), autodiff::at(xd), u);

        *grad_out = grad_tmp;
    } else {
        u = normal_dens_log_form(xd);
    }
  
    //
  
    return u.val();
}
  
double log_target_dens(const Eigen::VectorXd& vals_inp, Eigen::VectorXd* grad_out, void* ll_data)
{
    return ll_dens(vals_inp,grad_out,ll_data);
}

int main()
{
    const int n_data = 1000;

    const double mu = 2.0;
    const double sigma = 2.0;
  
    norm_data_t dta;
  
    Eigen::VectorXd x_dta = mu + sigma * eigen_randn_colvec(n_data).array();
    dta.x = x_dta;
  
    Eigen::VectorXd initial_val(2);
    initial_val(0) = mu + 1; // mu
    initial_val(1) = sigma + 1; // sigma
  
    mcmc::algo_settings_t settings;
  
    settings.hmc_settings.step_size = 0.08;
    settings.hmc_settings.n_burnin_draws = 2000;
    settings.hmc_settings.n_keep_draws = 2000;

    //
  
    Eigen::MatrixXd draws_out;
    mcmc::hmc(initial_val, log_target_dens, draws_out, &dta, settings);

    //
  
    std::cout << "hmc mean:\n" << draws_out.colwise().mean() << std::endl;
    std::cout << "acceptance rate: " << static_cast<double>(settings.hmc_settings.n_accept_draws) / settings.hmc_settings.n_keep_draws << std::endl;

    //
 
    return 0;
}

Compile with:

g++ -Wall -std=c++17 -O3 -march=native -ffp-contract=fast -I/path/to/eigen -I/path/to/autodiff -I/path/to/mcmc/include hmc_normal_autodiff.cpp -o hmc_normal_autodiff.cpp -L/path/to/mcmc/lib -lmcmc

See the documentation for more details on this topic.

Author

Keith O'Hara

License

Apache Version 2