update readme
nikosbosse committed Jan 22, 2021
1 parent 8fd9ae7 commit 574d82c
Showing 6 changed files with 61 additions and 52 deletions.
23 changes: 15 additions & 8 deletions Readme.Rmd
@@ -1,6 +1,7 @@
---
title: 'scoringutils: Utilities for Scoring and Assessing Predictions'
output: github_document
output:
github_document
---

[![R-CMD-check](https://github.com/epiforecasts/scoringutils/workflows/R-CMD-check/badge.svg)](https://github.com/epiforecasts/scoringutils/actions)
@@ -237,7 +238,7 @@ predictive distribution at time t,

$$u_t = F_t (x_t)$$

where $x_t$ is the observed data point at time $t \text{ in } t_1, , t_n$,
where $x_t$ is the observed data point at time $t$ in $t_1, \dots, t_n$,
n being the number of forecasts, and $F_t$ is the (continuous) predictive
cumulative probability distribution at time t. If the true probability
distribution of outcomes at time t is $G_t$ then the forecasts $F_t$ are
@@ -319,7 +320,7 @@ the predictions must be a probability that the true outcome will be 1.
The Brier Score is then computed as the mean squared error between the
probabilistic prediction and the true outcome.

$$\text{Brier_Score} = \frac{1}{N} \sum_{t = 1}^{n} (\text{prediction}_t - \text{outcome}_t)^2$$
<!-- $$\text{BrierScore} = \frac{1}{N}\sum_{t=1}^{n} (\text{prediction}_t-\text{outcome}_t)^2$$ -->


```{r}
@@ -335,10 +336,16 @@ following Gneiting and Raftery (2007). Smaller values are better.

The score is computed as

$$ \text{score} = (\text{upper} - \text{lower}) + \\
\frac{2}{\alpha} \cdot (\text{lower} - \text{true_value}) \cdot 1(\text{true_values} < \text{lower}) + \\
\frac{2}{\alpha} \cdot (\text{true_value} - \text{upper}) \cdot
1(\text{true_value} > \text{upper})$$
![interval_score](man/figures/interval_score.png)


<!-- $$\begin{align*} -->
<!-- \text{score} = & (\text{upper} - \text{lower}) + \\ -->
<!-- & \frac{2}{\alpha} \cdot ( \text{lower} - \text{true_value}) \cdot 1( \text{true_values} < \text{lower}) + \\ -->
<!-- & \frac{2}{\alpha} \cdot ( \text{true_value} - \text{upper}) \cdot 1( \text{true_value} > \text{upper}) -->
<!-- \end{align*}$$ -->




where $1()$ is the indicator function and $\alpha$ is the decimal value that
@@ -350,7 +357,7 @@ prediction interval. Correspondingly, the user would have to provide the
No specific distribution is assumed,
but the range has to be symmetric (i.e you can't use the 0.1 quantile
as the lower bound and the 0.7 quantile as the upper).
Setting `weigh = TRUE` will weigh the score by $\frac{\alpha}{2}$ such that
Setting `weigh = TRUE` will weigh the score by $\alpha/2$ such that
the Interval Score converges to the CRPS for increasing number of quantiles.


90 changes: 46 additions & 44 deletions Readme.md
@@ -416,17 +416,17 @@ can assume values between -1 and 1 and is 0 ideally.
true_values <- rpois(30, lambda = 1:30)
predictions <- replicate(200, rpois(n = 30, lambda = 1:30))
bias(true_values, predictions)
#> [1] -0.095 -0.610 -0.130 -0.005 0.290 0.000 0.205 -0.900 -0.465 0.390
#> [11] 0.610 -0.360 0.795 0.875 1.000 0.930 0.010 0.725 -0.935 -0.325
#> [21] -0.180 -0.705 0.715 0.110 0.025 -0.470 -0.180 0.005 0.625 0.500
#> [1] 0.020 0.830 -0.070 0.850 0.600 0.985 -0.395 -0.540 0.920 -0.705
#> [11] 0.730 -0.340 0.310 0.955 0.235 -0.235 -0.580 -0.835 0.955 -0.460
#> [21] -0.225 0.025 -0.980 -0.860 -0.875 -0.235 -0.035 0.580 0.000 0.985

## continuous forecasts
true_values <- rnorm(30, mean = 1:30)
predictions <- replicate(200, rnorm(30, mean = 1:30))
bias(true_values, predictions)
#> [1] 0.33 -0.12 -0.88 -0.27 -0.45 0.77 -0.91 0.92 -0.31 -0.40 0.47 -0.22
#> [13] 0.82 0.87 0.02 -0.31 0.83 -0.16 0.03 0.41 -0.77 -0.80 0.84 0.66
#> [25] -0.18 -0.93 0.76 -0.56 -0.33 -0.99
#> [1] 0.35 -0.37 -0.10 -0.37 -0.32 -0.14 -0.76 0.21 -1.00 0.73 0.31 0.30
#> [13] 0.40 0.21 0.87 0.83 -0.84 -0.19 0.47 -0.69 -0.18 0.96 0.70 0.27
#> [25] 0.31 -0.14 -0.05 -0.69 -0.72 0.83
```
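
As a rough illustration (an assumed definition for this sketch, not necessarily the package's exact implementation; see `?bias`), a sample-based bias for continuous forecasts can be computed as 1 - 2 \* F(x), with F approximated by the empirical CDF of the predictive samples:

``` r
## hedged sketch of a sample-based bias: 1 - 2 * empirical CDF at the observation
## (an assumed definition; see ?bias for the package's exact implementation)
true_values <- rnorm(30, mean = 1:30)
predictions <- replicate(200, rnorm(30, mean = 1:30))
1 - 2 * rowMeans(predictions <= true_values)
```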

## Sharpness
@@ -442,9 +442,9 @@ median of the predictive samples. For details, see `?stats::mad`
``` r
predictions <- replicate(200, rpois(n = 30, lambda = 1:30))
sharpness(predictions)
#> [1] 1.4826 1.4826 1.4826 1.4826 1.4826 2.9652 1.4826 2.9652 2.9652 2.9652
#> [11] 2.9652 2.9652 4.4478 2.9652 4.4478 4.4478 4.4478 4.4478 4.4478 4.4478
#> [21] 4.4478 4.4478 4.4478 4.4478 5.9304 5.9304 5.9304 5.9304 5.9304 5.1891
#> [1] 1.4826 1.4826 1.4826 2.2239 2.9652 2.9652 2.9652 2.9652 2.9652 2.9652
#> [11] 2.9652 3.7065 4.4478 4.4478 4.4478 2.9652 4.4478 4.4478 2.9652 4.4478
#> [21] 4.4478 4.4478 4.4478 4.4478 5.9304 4.4478 4.4478 4.4478 5.9304 5.9304
```
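
As a quick cross-check, the MAD can also be computed directly. A minimal sketch, under the assumption (suggested by the `?stats::mad` pointer above) that `sharpness()` is simply a row-wise MAD over the predictive samples:

``` r
## hand-rolled sharpness (assumption: row-wise stats::mad over the samples)
predictions <- replicate(200, rpois(n = 30, lambda = 1:30))
apply(predictions, MARGIN = 1, FUN = stats::mad)
## should match sharpness(predictions) on the same matrix if the assumption holds
```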

## Calibration
@@ -459,12 +459,12 @@ predictive distribution at time t,

*u*<sub>*t*</sub> = *F*<sub>*t*</sub>(*x*<sub>*t*</sub>)

where *x*<sub>*t*</sub> is the observed data point at time
*t* in *t*<sub>1</sub>, …, *t*<sub>*n*</sub>, n being the number of
forecasts, and *F*<sub>*t*</sub> is the (continuous) predictive
cumulative probability distribution at time t. If the true probability
distribution of outcomes at time t is *G*<sub>*t*</sub> then the
forecasts *F*<sub>*t*</sub> are said to be ideal if
where *x*<sub>*t*</sub> is the observed data point at time *t* in
*t*<sub>1</sub>, …, *t*<sub>*n*</sub>, n being the number of forecasts,
and *F*<sub>*t*</sub> is the (continuous) predictive cumulative
probability distribution at time t. If the true probability distribution
of outcomes at time t is *G*<sub>*t*</sub> then the forecasts
*F*<sub>*t*</sub> are said to be ideal if
*F*<sub>*t*</sub> = *G*<sub>*t*</sub> at all times *t*. In that case,
the probabilities *u*<sub>*t*</sub> are distributed uniformly.
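
For sample-based forecasts, these PIT values can be approximated empirically. A minimal sketch (an illustration, not necessarily how the package computes it): take the fraction of predictive samples falling below each observation and inspect the histogram for uniformity.

``` r
## empirical PIT sketch: fraction of predictive samples below each observation
true_values <- rnorm(30, mean = 1:30)
predictions <- replicate(200, rnorm(30, mean = 1:30))
pit_values <- rowMeans(predictions <= true_values)
hist(pit_values) # roughly uniform if the forecasts are calibrated
```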

@@ -507,11 +507,11 @@ as integer valued forecasts. Smaller values are better.
true_values <- rpois(30, lambda = 1:30)
predictions <- replicate(200, rpois(n = 30, lambda = 1:30))
crps(true_values, predictions)
#> [1] 0.669925 0.666625 0.396150 1.279300 0.470675 3.410600 0.822175
#> [8] 3.098400 0.861725 1.238325 4.209300 1.174450 0.905975 1.178650
#> [15] 1.122400 0.953425 5.086200 11.136700 1.294500 5.122075 6.680825
#> [22] 1.412150 8.954825 1.454975 4.931100 5.562650 3.688375 1.616650
#> [29] 4.629000 1.368350
#> [1] 0.484725 0.522300 0.754775 1.940750 0.946000 1.540875 0.560800
#> [8] 1.287250 0.773250 6.333950 0.977575 1.600275 1.062800 8.922450
#> [15] 2.441150 0.834700 2.728625 1.004725 1.996200 3.062175 1.136075
#> [22] 2.975050 3.402475 3.283250 3.156225 3.942225 4.146375 10.740000
#> [29] 1.646800 1.751275
```

## Dawid-Sebastiani Score (DSS)
@@ -525,11 +525,10 @@ as integer valued forecasts. Smaller values are better.
true_values <- rpois(30, lambda = 1:30)
predictions <- replicate(200, rpois(n = 30, lambda = 1:30))
dss(true_values, predictions)
#> [1] 0.1177901 0.5822220 1.5066992 3.8373524 1.9597284 1.8531762 3.4175897
#> [8] 2.5434091 3.1641709 3.6931812 7.6703941 3.1402025 4.6353701 2.5584192
#> [15] 4.3530542 4.8442803 3.3773641 4.1063975 3.3360388 4.3296920 3.1801352
#> [22] 3.3891771 3.1501111 4.6833844 6.1560524 3.6387926 3.1623198 6.0614609
#> [29] 4.8645734 3.4459012
#> [1] 3.271605 1.109590 2.296375 1.638642 1.720113 1.891801 1.814442 2.633736
#> [9] 2.315999 3.609963 3.199639 3.208637 2.718436 4.955331 5.141449 4.165855
#> [17] 3.417330 3.390977 3.526383 3.127248 3.334262 3.220303 5.131775 4.164216
#> [25] 4.444760 6.651042 4.790084 3.784817 3.604751 3.972252
```

## Log Score
@@ -545,11 +544,11 @@ defined for discrete values. Smaller values are better.
true_values <- rnorm(30, mean = 1:30)
predictions <- replicate(200, rnorm(n = 30, mean = 1:30))
logs(true_values, predictions)
#> [1] 1.1071445 1.1446195 3.4118736 1.1475530 4.1226770 1.8518763 1.4201647
#> [8] 1.1085001 0.9362716 1.0170988 1.1607530 1.0272924 0.9839865 1.1855521
#> [15] 1.5058463 2.1647430 0.9254600 1.4931984 2.4514677 0.8576165 1.3072919
#> [22] 1.0286997 1.7659242 0.9539723 2.2266546 1.5902868 2.1572616 3.4122593
#> [29] 0.9635253 0.9872547
#> [1] 4.2449388 1.2183110 0.9350496 1.3702498 1.2994017 1.9734176 1.9486463
#> [8] 1.0301476 1.0777401 1.1858366 0.8741934 1.2182398 1.0398304 1.1220978
#> [15] 1.6338811 1.8492573 2.6494550 1.5769399 1.0705151 1.0049739 0.9289143
#> [22] 1.2958066 1.0433506 1.0013816 1.1328087 1.0145965 4.4580586 1.0059555
#> [29] 1.0175473 1.3584801
```

## Brier Score
@@ -561,14 +560,14 @@ predictions must be a probability that the true outcome will be 1.
The Brier Score is then computed as the mean squared error between the
probabilistic prediction and the true outcome.

$$\\text{Brier\_Score} = \\frac{1}{N} \\sum\_{t = 1}^{n} (\\text{prediction}\_t - \\text{outcome}\_t)^2$$
<!-- $$\text{BrierScore} = \frac{1}{N}\sum_{t=1}^{n} (\text{prediction}_t-\text{outcome}_t)^2$$ -->

``` r
true_values <- sample(c(0,1), size = 30, replace = TRUE)
predictions <- runif(n = 30, min = 0, max = 1)

brier_score(true_values, predictions)
#> [1] 0.3962064
#> [1] 0.3132916
```
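
Because the Brier Score is just the mean squared error between the predicted probability and the binary outcome, it can be verified by hand. A minimal sketch, assuming `brier_score()` implements exactly the mean shown in the (commented-out) formula above:

``` r
## hand-rolled Brier Score: mean squared difference between
## predicted probability and observed binary outcome
true_values <- sample(c(0, 1), size = 30, replace = TRUE)
predictions <- runif(n = 30, min = 0, max = 1)
mean((predictions - true_values)^2)
## should equal brier_score(true_values, predictions) on the same inputs
```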

## Interval Score
@@ -579,10 +578,13 @@ better.

The score is computed as

$$ \\text{score} = (\\text{upper} - \\text{lower}) + \\\\
\\frac{2}{\\alpha} \\cdot (\\text{lower} - \\text{true\_value}) \\cdot 1(\\text{true\_values} &lt; \\text{lower}) + \\\\
\\frac{2}{\\alpha} \\cdot (\\text{true\_value} - \\text{upper}) \\cdot
1(\\text{true\_value} &gt; \\text{upper})$$
![interval\_score](man/figures/interval_score.png)

<!-- $$\begin{align*} -->
<!-- \text{score} = & (\text{upper} - \text{lower}) + \\ -->
<!-- & \frac{2}{\alpha} \cdot ( \text{lower} - \text{true_value}) \cdot 1( \text{true_values} < \text{lower}) + \\ -->
<!-- & \frac{2}{\alpha} \cdot ( \text{true_value} - \text{upper}) \cdot 1( \text{true_value} > \text{upper}) -->
<!-- \end{align*}$$ -->

where 1() is the indicator function and *α* is the decimal value that
indicates how much is outside the prediction interval. To improve
@@ -592,9 +594,9 @@ interval. Correspondingly, the user would have to provide the 5% and 95%
quantiles (the corresponding alpha would then be 0.1). No specific
distribution is assumed, but the range has to be symmetric (i.e you
can’t use the 0.1 quantile as the lower bound and the 0.7 quantile as
the upper). Setting `weigh = TRUE` will weigh the score by
$\\frac{\\alpha}{2}$ such that the Interval Score converges to the CRPS
for increasing number of quantiles.
the upper). Setting `weigh = TRUE` will weigh the score by *α*/2 such
that the Interval Score converges to the CRPS for increasing number of
quantiles.

``` r
true_values <- rnorm(30, mean = 1:30)
@@ -607,9 +609,9 @@ interval_score(true_values = true_values,
lower = lower,
upper = upper,
interval_range = interval_range)
#> [1] 0.14538566 0.10477555 0.13379352 0.16998247 0.67282876 0.89278362
#> [7] 1.02201346 0.07840875 0.10679858 0.17350429 0.80954779 0.03572332
#> [13] 0.20285522 0.23010213 0.17141142 0.14013901 0.09723121 0.19868605
#> [19] 0.18237177 0.15799423 0.11898037 0.15083061 1.60384989 0.11411942
#> [25] 0.11221762 0.66365885 0.22814734 0.52722209 0.19030832 0.53886811
#> [1] 0.1221291 0.1313182 0.1663424 0.2354949 0.2466355 0.2170362 1.5143775
#> [8] 0.1810042 0.2064054 1.2614409 0.4002147 1.4409476 0.1685613 0.6510942
#> [15] 0.1086299 0.2612796 0.2429625 0.1377730 0.1912384 0.7589243 0.1609433
#> [22] 1.4352004 0.1790574 0.1040817 0.1621819 0.1988057 0.1803143 0.1420993
#> [29] 0.1225354 0.2043855
```
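
The three-term score in the figure above can also be written out directly. A minimal sketch, under the assumptions that `interval_range = 90` corresponds to *α* = 0.1 (as the text above suggests) and that `interval_score()` implements exactly this sum:

``` r
## hand-rolled (unweighted) interval score for a central 90% interval
alpha <- 0.1
true_values <- rnorm(30, mean = 1:30)
lower <- qnorm(alpha / 2, mean = 1:30)     # 5% quantile of each forecast
upper <- qnorm(1 - alpha / 2, mean = 1:30) # 95% quantile of each forecast
(upper - lower) +
  2 / alpha * (lower - true_values) * (true_values < lower) +
  2 / alpha * (true_values - upper) * (true_values > upper)
## multiplying by alpha / 2 gives the weighted variant (weigh = TRUE)
```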
Binary file added man/figures/interval_score.png
Binary file removed man/figures/unnamed-chunk-2-1.png
Binary file removed man/figures/unnamed-chunk-4-2.png
Binary file removed man/figures/unnamed-chunk-8-1.png
