Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: HDI intervals #89

Open
JorisDeRidder opened this issue Mar 27, 2023 · 6 comments
Open

Feature request: HDI intervals #89

JorisDeRidder opened this issue Mar 27, 2023 · 6 comments

Comments

@JorisDeRidder
Copy link

Bayesian credible intervals can currently be derived using quantiles. Would it be possible to implement an HDI interval as well?

@JohannesBuchner
Copy link
Owner

That's a very good idea, and it should be possible. Maybe you could try implementing it for one parameter based on the posterior samples?

General recipe:

  1. make a histogram with a reasonable number of bins
  2. iterate from the highest density bin to the lowest density bin
  3. add to a set of bins, add up the posterior probability in the bins
  4. stop if x% of the posterior have been accumulated & report the interval

The approach can in principle give you multimodal regions, which is fine or not depending on what you want. If you do not want it multimodal, you would need to add in also the bins in between in step (3).

Step 1 may also not be trivial.

@JohannesBuchner
Copy link
Owner

Another approach is to use a kernel-density estimation library (fastkde, getdist, etc) to do the job. I am not sure which one, perhaps arviz also has an implementation.

I'd be interested to hear how you solve this.

@JorisDeRidder
Copy link
Author

Your pointer to arviz proved to be the quickest way to obtain a HDI.
An example for other users. I'm assuming the following imports

import numpy as np
import pandas as pd
import arviz as az
import xarray as xr

and that you ran UltraNest:

sampler = ultranest.ReactiveNestedSampler(param_names, my_loglikelihood, my_prior_transform, derived_param_names)
results = sampler.run()

To convert the UltraNest output to an Arviz InferenceData object I used

results_df = pd.DataFrame(data=results['samples'], columns=results['paramnames'])
results_df["chain"] = 0
results_df["draw"] = np.arange(len(results_df), dtype=int)
results_df = results_df.set_index(["chain", "draw"])
xdata = xr.Dataset.from_dataframe(results_df)
trace = az.InferenceData(posterior=xdata)

which is quite likely not the shortest way. :-)
A 95% HDI interval can then be obtained using Arviz built-in hdi() function:

hdi = az.hdi(trace, hdi_prob=0.95)
for name in results['paramnames']: 
    print(name, ": ", hdi[name].values)  

leading in my example to

intercept :  [0.77835208 1.14636967]
alpha :  [1.07772564 1.18964846]
sigma :  [0.27528108 0.40888707]
slope :  [1.84139496 2.47128276]

@JohannesBuchner
Copy link
Owner

@JorisDeRidder btw, how do you get the highest density point though?

Repository owner deleted a comment from JorisDeRidder Mar 18, 2024
@JorisDeRidder
Copy link
Author

You mean the MAP? Some software packages (e.g. PyMC I believe) simply numerically maximize the logPosterior = logLikelihood + logPrior function. It works well if your posterior is monomodal. If that's not the case, you might want to consider constructing a KDE of your posterior and determine the MAP from that.

@JohannesBuchner
Copy link
Owner

Here is an approach based on getdist, which supports bounds: https://gist.github.com/JohannesBuchner/2027d0f313521387c2cded2424cdcfeb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants