This folder contains the source code for our AISTATS 2024 paper "Offline Policy Evaluation and Optimization under Confounding", which you can find at https://arxiv.org/abs/2211.16583.
The folder is structured as follows:
core/
contains utilities, helper classes and functions generously provided by David Bruns-Smith as part of the source code for his paper "Model-Free and Model-Based Policy Evaluation when Causality is Uncertain".mcmix/
contains the source code for the global confounders portion of our paper.- The
subspace.py
,clustering.py
,emalg.py
, andhelpers.py
files were obtained from the source code for "Learning Mixtures of Markov Chains and MDPs" by Kausik et. al. - The folder
sepsisSimDiabetes/
contains the sepsis simulator of Oberst and Sontag, "Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models".mdptoolboxSec/
andcf/
contain code provided by them necessary to obtain the files indata/
. - The
data/
folder contains (1) the sepsis simulator's transition matrix indiab_txr_mats-replication.pkl
, (2) the epsilon-greedy behavior policy insepsisPol.npy
. The former can be re-obtained by running the notebooklearn_mdp_parameters.ipynb
, and the latter can be re-obtained by runningbehavior_policy.ipynb
. - The main experiment for this portion of the paper can be reproduced by running
sepsisOPELarge.ipynb
.
- The
COPE/
contains the source code for the history-independent confounders portion of the paper.histIndep.ipynb
is self-contained and contains the main experiment for this portion of the paper.