Add expectations and variances in gold standard #108
Is the idea here that we can get more accurate expectations by using more samples? So we might have a gold standard with 10 000 draws but expectations computed with 100 000 draws? Or is it more that for some posteriors we might have a way of computing accurate expectations but not accurate draws? I guess simulated posteriors also fall under this case as we can have ground truth expectations while the expectations computed from draws are only estimates. |
Yes. That's the reason. We could have much more accurate estimates of the expectations and the covariance. Good point. We should also add how the expectations were calculated. |
So

```json
{
  "name": "eight_schools-eight_schools_noncentered",
  "keywords": ["stan_benchmark"],
  "model_name": "eight_schools_noncentered",
  "reference_draws_name": "eight_schools-eight_schools_noncentered",
  "reference_expectations_name": "something",
  "data_name": "eight_schools",
  "dimensions": {"theta": 8, "mu": 1, "tau": 1},
  "added_by": "Mans Magnusson",
  "added_date": "2019-08-12"
}
```

(I'm using reference instead of gold standard here) I guess the |
Exactly! |
Here are some things that popped to my mind.

1. Let's say we have a posterior where the expectations computed from 100 000 draws are more accurate than the ones computed from a sample of 10 000 draws. Do we even want to expose the smaller sample? One could argue that if the expectation from the smaller sample is less accurate, then those draws do not represent the posterior well and should never be used.

1.5. Let's continue the previous case. We know that 100 000 draws gave a better result. Should we also try 1 000 000 draws to see if that gives an even better result?

2. Let's say we have a small posterior where storing a large sample (say 100 000 draws) takes less space than storing a small sample (say 10 000 draws) of a "normal-sized" posterior. For the small posterior the large sample gives a more accurate estimate than the small one. Should we just store the large sample (100 000 draws) in this case? In other words, do we want a fixed number of draws in the first place?

3. How can we recognize that one estimate is more accurate than another? It is more likely that a larger sample gives a more accurate estimate, but it is still possible that the smaller sample sometimes gives a better one. Or is the chance of this small enough to be ignored? |
There are different use cases. If you only want to check that you get the expectations right, then the larger sample is better. Storing expectations computed from 100 000 draws is also less costly (the storage cost depends only on the dimension, not the number of draws). Others, though, may want draws from the posterior to, for example, compute log_lik values for a subset of observations. @avehtari is working on writing down use cases now.
1.5) The more draws the better. We need to set the bar for reference draws somewhere since there is a computational cost, especially for larger models.
|
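The accuracy claim above follows from standard Monte Carlo error: the standard error of a posterior mean estimate shrinks like 1/sqrt(n), so 100 000 draws give roughly a 3x more accurate mean than 10 000. A minimal sketch with a toy standard-normal "posterior" (for correlated MCMC draws one would use an effective-sample-size correction instead of the raw draw count):

```python
import numpy as np

def mcse_of_mean(draws):
    # Monte Carlo standard error of the mean estimate: sd / sqrt(n).
    draws = np.asarray(draws)
    return draws.std(ddof=1) / np.sqrt(draws.size)

rng = np.random.default_rng(0)
# Placeholder "posterior": independent standard-normal draws.
small = rng.standard_normal(10_000)
large = rng.standard_normal(100_000)

# The larger sample's error is ~sqrt(10) ≈ 3.16x smaller.
print(mcse_of_mean(small), mcse_of_mean(large))
```

This is why the thread treats expectations from the larger sample as strictly preferable: the extra draws reduce the error of the stored summaries without any downside for that use case.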
I probably explained 1. a bit poorly. What I mean is that
As for 2., perhaps we should have a simple and straightforward guideline (10 000 draws preferred) that we can deviate from if there are good reasons to do so. Maybe this is what you also had in mind? |
|
This is what I'm essentially hearing: 10 000 samples will have only a small error compared to 100 000, and thus it's fine to use the smaller sample to compute log_lik etc. Yet 10 000 samples have too big an error to compute expectations, so we need to use the larger sample for that. This sounds like a contradiction. |
10 000 samples is good enough in most situations; 100 000 is better but would take up 10x the space. There is no on/off threshold here. Computing the expectations and covariance from 100 000 draws gives a slightly better estimate with no additional storage cost. Using 1 000 000 would be even better, but we need to draw the line somewhere.
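The storage argument can be made concrete: for a d-dimensional posterior, the summary (mean plus covariance) takes O(d^2) numbers regardless of how many draws it was computed from, while the draws themselves take O(n*d). A sketch using placeholder standard-normal draws (not a real posterior; the dimension 10 mirrors the eight_schools example above: theta has 8 components, plus mu and tau):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 10, 100_000                  # dimension, number of draws
draws = rng.standard_normal((n, d))  # placeholder draws, shape (n, d)

mean = draws.mean(axis=0)            # d numbers
cov = np.cov(draws, rowvar=False)    # d x d, ~d*(d+1)/2 unique numbers

# Summary size is independent of n; the draws themselves are n*d numbers.
print(mean.shape, cov.shape)         # (10,) (10, 10)
```

So computing the reference expectations from 10x more draws improves their accuracy at zero extra storage cost, which is the point being made in the comment above.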
Expectations and Covariance
|
We may be interested in gold standard expectations without actual gold standard draws. Hence we should add them as a separate slot that can be accessed.
This should include:
Mean, variances, and covariance for all parameters, based on 100 000 draws. Hence a new gold standard slot should be created.
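Tying this back to the JSON earlier in the thread, a separate expectations file referenced by `reference_expectations_name` might look something like the following. All field names and values here are hypothetical placeholders for illustration, not an actual posteriordb format:

```json
{
  "name": "eight_schools-eight_schools_noncentered",
  "based_on_draws": 100000,
  "computation_method": "description of how the expectations were calculated",
  "mean": {"theta": [0.0], "mu": 0.0, "tau": 0.0},
  "variance": {"theta": [0.0], "mu": 0.0, "tau": 0.0},
  "covariance": [[0.0]]
}
```

A `computation_method` field of some kind would address the point raised earlier in the thread that the database should record how the expectations were calculated.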