Skip to content

Commit

Permalink
Merge pull request #57 from bodkan/tree-sequences
Browse files Browse the repository at this point in the history
Tree sequences
  • Loading branch information
bodkan authored Jul 28, 2021
2 parents 215aad7 + f4a2c2c commit 91789be
Show file tree
Hide file tree
Showing 92 changed files with 5,564 additions and 131 deletions.
3 changes: 2 additions & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,8 @@ Imports:
readr,
stringr,
reticulate,
magick
magick,
admixr
Suggests:
testthat (>= 3.0.0),
knitr,
Expand Down
14 changes: 14 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,20 @@ export(script)
export(shrink)
export(slim)
export(subtract)
export(ts_afs)
export(ts_coalesced)
export(ts_eigenstrat)
export(ts_f4ratio)
export(ts_fst)
export(ts_genotypes)
export(ts_individuals)
export(ts_load)
export(ts_mutate)
export(ts_nodes)
export(ts_recapitate)
export(ts_simplify)
export(ts_tajima)
export(ts_vcf)
export(world)
import(data.table)
import(ggplot2)
Expand Down
84 changes: 42 additions & 42 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,37 @@
completely arbitrary shapes of the "continents", geographic barriers
etc.).

- It is now possible to define models in forward or backwards
direction, depending on what is more convenient for the user and the
scenario that is being modeled. The direction of time is
automatically detected and translated to SLiM's units of generations
in the forward direction.

- Spatial interaction distances (translated to `maxDistance` on the
SLiM side) as well as offspring distances from parents (i.e. the
standard deviations of their normal distributions) can now be
specified for each population individually. The users can still
provide default values for all populations (or just those which did
not have their own dedicated parameter values) in the main
`compile()` call. These parameters can also change dynamically over
time by calling the `dispersal()` function.

- Sampling ("remembering") of individuals for tree-sequence recording can
now be scheduled with the `sampling()` function.

- A basic interface to _tskit_ has been implemented: loading, recapitating,
and simplifying tree-sequence data, conversion to genotype file formats
such as EIGENSTRAT (`ts_eigenstrat()`) and VCF (`ts_vcf()`), a couple of
population genetics statistical functions (all with the `ts_*` prefix)
such as `ts_f3()`, etc. have also been implemented. More is to come, but
the goal here is to focus *only* on loading tree-sequences and calculating
population genetics statistics on simulated tree-sequence data, nothing more.
Only the things desirable (and sensible) to do within R and its limitations
will be implemented. All of the *tskit*/*pyslim* functionality is accessed
via the [_reticulate_](https://rstudio.github.io/reticulate/index.html)
package. See a [dedicated vignette](../articles/vignette-05-tree-sequences.html)
for an example of the tree-sequence functionality in *slendr*.

- New set operations for manipulating spatial regions: `join()`,
`overlap()`, `subtract()`. These can be used to build more complex
population boundaries and geographic regions from smaller ones.
Expand All @@ -18,23 +49,11 @@
a time slider. This can be used to inspect the model spatial
dynamics interactively before they are sent over to the SLiM side.

- A new function `distance()` for computing distances between
geographic objects in the package (either between their borders or
centroids) has been implemented.

- A small utility function `dimension()` which calculates the
dimension of the world map in the "real" projected units
(i.e. meters) has been added.

- A new function `area()` has for calculating the geographic area
covered by any object of the class `slendr` has been implemented.

- It is now possible to define models in forward or backwards
direction, depending on what is more convenient for the user and the
scenario that is being modeled. The direction of time is
automatically detected and translated to SLiM's units of generations
in the forward direction. The implementation of this features might
be a bit wonky in some aspects and more testing is required.
- New functions `distance()` (for computing distances between
geographic objects), `dimension()` (which calculates the
dimension of the world map in the "real" projected units), and `area()`
(for calculating the geographic area covered by any spatial object)
have been added.

- Simulations of marine species are now possible. This required adding
a single argument `aquatic = TRUE` to the `population()` call which
Expand All @@ -43,43 +62,24 @@
(subtracting land instead of water, which is done for terrestrial
species).

- Any simulation can now have multiple "ancestral" populations
(i.e. populations without an immediate ancestor created by
`sim.addSubpop()`). Until now, all populations had to trace their
ancestry to a single ancestor (a leftover of a hard-coded
requirement from the very first version of the code).

- A function `resize()` for scheduling changes in population size (in
a single step, or as an exponential growth/shrinking) is
implemented.

- A new `shrink()` function (analogous to `expand()`) is implemented.

- Spatial interaction distances (translated to `maxDistance` on the
SLiM side) as well as offspring distances from parents (i.e. the
standard deviations of their normal distributions) can now be
specified for each population individually. The users can still
provide default values for all populations (or just those which did
not have their own dedicated parameter values) in the main
`compile()` call. These parameters can also change dynamically over
time by calling the `dispersal()` function.

- Two knew vignettes have been
added. [One](../articles/grid_example.html) demonstrates a simple
use case of simulating the history of populations as demes laid out
on a regular abstract grid. The [other
one](../articles/spatial_interactions.html) demonstrates the spatial
interaction dynamics described in the previous point.
- Five new vignettes have been added (available under "Articles" at
the [project website](https://bodkan.net/slendr).

## Nonspatial simulations

- _slendr_ can now simulate "normal" nonspatial models. This is
triggered by setting the `map = FALSE` in the `population()`
triggered by leaving the `map = FALSE` in the `population()`
"constructor" function call. All populations descending from that
population will then be nonspatial, and the whole compiled model
will be run in the nonspatial model on the SLiM side. A short
[example vignette](../articles/nonspatial_models.html) demonstrating
this feature has been added.
[example vignette](../articles/vignette-04-nonspatial-models.html)
demonstrating this feature has been added.

## Changes to the R interface

Expand All @@ -106,6 +106,6 @@

## Other changes

- More than a hundred new unit tests checking the consistency of the R
- Almost two hundred new unit tests checking the consistency of the R
interface (misspecified event times, etc.) as well as the
correctness of the SLiM simulation runs programmed by *slendr*.
12 changes: 12 additions & 0 deletions R/compilation.R
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,18 @@ compile <- function(populations, dir, generation_time, resolution = NULL,
sim_length = NULL, direction = NULL) {
if (inherits(populations, "slendr_pop")) populations <- list(populations)

# make sure that all parents are present
pop_names <- purrr::map_chr(populations, ~ .x$pop[1])
parent_names <- unique(purrr::map_chr(populations, function(pop) {
parent <- attr(pop, "parent")
if (is.character(parent))
return(pop$pop[1])
else
parent$pop[1]
}))
if (!all(parent_names %in% pop_names))
stop("The following parent populations are missing: ", parent_names[!parent_names %in% pop_names], call. = FALSE)

if (length(populations) != length(unique(sapply(populations, `[[`, "pop"))))
stop("All populations must have unique names", call. = FALSE)

Expand Down
9 changes: 6 additions & 3 deletions R/interface.R
Original file line number Diff line number Diff line change
Expand Up @@ -980,8 +980,11 @@ seconds, but if you don't want to wait, you can set `snapshots = N` manually.")

#' Define sampling events at specified times for a given set of populations
#'
#' @param times Integer vector of times (in model time units) at which to schedule remembering of individuals in the tree-sequence
#' @param ... Lists of two elements <\code{slendr_pop} population object>-<number of individuals to sample>, representing from which populations should how many individuals be remembered at times given by \code{times}
#' @param times Integer vector of times (in model time units) at which to
#' schedule remembering of individuals in the tree-sequence
#' @param ... Lists of two elements (\code{slendr_pop} population object-<number
#' of individuals to sample), representing from which populations should how
#' many individuals be remembered at times given by \code{times}
#'
sampling <- function(times, ..., strict = FALSE) {
samples <- list(...)
Expand Down Expand Up @@ -1015,7 +1018,7 @@ sampling <- function(times, ..., strict = FALSE) {
stop("Cannot schedule sampling for '", pop$pop, "' at time ", t,
" because the population will not be present in the simulation",
" at that point. Consider running this function with `strict = FALSE`",
" which will automatically retain only keep valid sampling events.",
" which will automatically retain only valid sampling events.",
call. = FALSE)
})
})
Expand Down
Loading

0 comments on commit 91789be

Please sign in to comment.