Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add structure for likelihood to compute predictive distributions based on draws #112

Open
MansMeg opened this issue Dec 4, 2019 · 13 comments

Comments

@MansMeg
Copy link
Collaborator

MansMeg commented Dec 4, 2019

In model slot.

@MansMeg MansMeg added this to the Beta 0.2 milestone Dec 12, 2019
@eerolinna
Copy link
Contributor

Would this be an additional slot in the model info file like blr.info.json or in a model code file like blr.stan?

@MansMeg
Copy link
Collaborator Author

MansMeg commented Dec 14, 2019

Im actually starting to think of this as a separate stan function for now (a separate stan file). Then we can add it along the way.

@eerolinna
Copy link
Contributor

So you're thinking that blr.info.json would be something like this?

{
  "name": "blr",
  "keywords": [],
  "title": "A Bayesian linear regression model with vague priors",
  "description": "A Bayesian linear regression model with vague priors.",
  "urls": [],
  "model_code": {
    "stan": "models/stan/blr.stan"
  },
  "likelihood_code": {
    "stan": "models/stan/blr_likelihood.stan" 
  },
  "references": null,
  "added_date": "2019-11-29",
  "added_by": "Mans Magnusson"
}

@MansMeg
Copy link
Collaborator Author

MansMeg commented Dec 14, 2019

Yes. Exactly!

@eerolinna
Copy link
Contributor

With PyMC there is no need for any extra code to compute predictive distributions

posterior_predictive_samples = pymc.sample_posterior_predictive(draws, model=model)

where model is any pymc model and draws are the posterior draws computed with pymc.
Source: https://docs.pymc.io/notebooks/posterior_predictive.html

It seems backwards to include likelihood_code in the model info file when not all model implementations will require it.

Ideal solution

From my perspective the ideal solution seems like this

  1. change blr.info.json to this
{
  "name": "blr",
  "model_implementations": {
    "stan": "models/stan/blr.info.json",
    "pymc": "models/pymc/blr.info.json"
  }
}

(some slots like keywords have been omitted for the sake of clarity, but they still would remain here)

  1. models/stan/blr.info.json would be
{
  "code": "models/stan/blr.stan",
  "likelihood": "models/stan/blr_likelihood.stan"
}
  1. models/pymc/blr.info.json would be
{
  "code": "models/pymc/blr.py"
}

@MansMeg
Copy link
Collaborator Author

MansMeg commented Dec 16, 2019

I agree that this is probably the right way to go. Well spotted.

@eerolinna
Copy link
Contributor

eerolinna commented Dec 17, 2019

How should we expose the likelihood/predictive distribution to users?

In other words, if I have a posterior object po <- posterior("8_schools"), what would the API be like to access the likelihood/predictive distribution?

Would it be something like stan_predictive_draws(po, posterior_draws)? Would we also want something like stan_likelihood(po, posterior_draws)? What would this return?

These could be generalized to predictive_draws(po, posterior_draws, framework = "stan") and the same for stan_likelihood.

@MansMeg
Copy link
Collaborator Author

MansMeg commented Dec 18, 2019

Yes, that would probably be a good idea

@MansMeg
Copy link
Collaborator Author

MansMeg commented Dec 19, 2019

Write a suggestion for the 8-schools example of how it should look and review it with Paul.

The structure change suggested by @eerolinna needs to be done before @jarnefeltoliver starts to fill out the db.

@MansMeg MansMeg self-assigned this Dec 19, 2019
@MansMeg
Copy link
Collaborator Author

MansMeg commented Jan 3, 2020

I have now implemented it, but I think it is simpler to just add a JSON object directly:

{
  "name": "blr",
  "model_implementations": {
    "stan": {
         "model_code": "models/stan/blr.stan",
         "likelihood_code": "models/stan/blr_likelihood.stan"}
    "pymc": {
          "model_code": "models/pymc/blr.py"
     }
  }
}

@eerolinna
Copy link
Contributor

That's a good idea!

@eerolinna
Copy link
Contributor

Do we want to have an API that exposes the likelihood code? Something like

stan_likelihood_code_file(po)

that would for the blr posterior return models/stan/blr_likelihood.stan.

Or is it sufficient that we have something like

posterior_predictive_draws(po, framework = "stan")

that essentially keeps the likelihood code file as an implementation detail instead of public API?

Arguments for keeping the likelihood code file out of the public API

  • Smaller API surface is easier to learn, understand and maintain
  • If Stan ever adds a way to automatically compute predictive draws without a separate likelihood definition we can remove the likelihood code files without breaking anyone's code

Arguments for adding the likelihood code to the public API

  • Someone might need the likelihood code outside of computing predictive draws?

@MansMeg
Copy link
Collaborator Author

MansMeg commented Jan 6, 2020

We would like to be able to produce the predictive distribution using stan likelihood file without any restrictions on how the posterior was computed. So in R we would have a function that uses the likelihood file but returns a predictive distribution. Although this does not necessarily need to expose the likelihood code as you say. I . have no good answer yet.

@MansMeg MansMeg removed this from the Beta 0.2 milestone Jul 6, 2020
@MansMeg MansMeg removed their assignment Jul 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants