[Use case] standardized format for observed and modeled animal movement paths #3

jmlondon · 2019-09-30T20:44:44Z

Use case:

This specific use case focuses on the crawl R package for analysis of animal movement. We have long hoped for a standardized format that represents observed input data (e.g. GPS or Argos locations) and the error associated with each observation. The error component is critically important for our application. With a standardized input, we could then develop a standardized output for predicted tracks from the model fit. The key benefit for us as developers is that we could reduce the 'data cleaning and troubleshooting' overhead. Package users would also benefit from an output product that works with a variety of packages, whether existing or yet to be developed.

Requirements:

The bare minimum requirement would be an ordered (regular or irregular) time series with corresponding coordinates (geographic or projected or simulated space) and a representation of the error associated with each record. Ideally, there would also be support for unique 'animal IDs' and various group-by operations. For the output, we would likely create an extension of the sftraj object to properly capture additional model or prediction metadata and parameters.
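As a rough illustration of this bare minimum, the sketch below stores one record per observation and groups the records into time-ordered tracks per animal. It is written in Python purely for illustration. The names `Observation` and `group_tracks` are hypothetical and are not part of sftraj or crawl, and the single `error_m` field is only a placeholder for whatever error representation is eventually agreed on.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One telemetry record (hypothetical layout, not the sftraj spec)."""
    animal_id: str
    time: datetime
    x: Optional[float]        # None allows a record without coordinates
    y: Optional[float]
    error_m: Optional[float]  # placeholder for the per-record error


def group_tracks(observations):
    """Group records by animal ID and sort each group by time."""
    tracks = defaultdict(list)
    for obs in observations:
        tracks[obs.animal_id].append(obs)
    return {
        animal_id: sorted(recs, key=lambda o: o.time)
        for animal_id, recs in tracks.items()
    }


# Made-up records, deliberately out of order and irregular in time.
records = [
    Observation("seal_02", datetime(2019, 9, 30, 12, 0), 10.0, 5.0, 250.0),
    Observation("seal_01", datetime(2019, 9, 30, 18, 30), 3.0, 4.0, 50.0),
    Observation("seal_01", datetime(2019, 9, 30, 6, 15), 1.0, 2.0, 1500.0),
]
tracks = group_tracks(records)
```

Keeping the error on each record, rather than on the track as a whole, matches the requirement that every observation carries its own error.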

Input:

For the crawl package, the idea would be for users to input an sftraj object. For wide-scale adoption, the community would need an easy method for creating an sftraj object from typical telemetry observation data sources (usually a CSV file). The bio-logging community, various repositories (e.g. MoveBank), and manufacturers (e.g. Wildlife Computers, SMRU) might choose to develop options or APIs that allow export of data as an sftraj.

Output:

One option would be for a predicted track from a crawl model fit to be provided as an sftraj object. But there are a number of model fit parameters that might be useful to include in the output. Depending on the structure of the sftraj object, we might need to create a crawl extension/expansion of sftraj. This might be as simple as specifying a set of data frame columns that are included as part of the sftraj.

Additional information:

Convenience functions for sftraj to sf (points) or sf (linestring) (or multipoint, multilinestring) would be helpful. In some cases, telemetry data may record behavior or environmental observations at different time scales/frequencies than the observed locations. In these cases, it would be useful if sftraj supported NULL coordinates so these observations could be stored within the same sftraj structure.


basille · 2019-10-01T23:51:41Z

Ooooh this one is interesting, thanks Josh! So at first, I didn't think that handling of errors would make it into our first cycle… but it was on our road map for later, and you make a very good case for it. I have a few thoughts about it:

First of all, how do you define 'error' (please bear with me, I am familiar with crawl but have never actually used it): is that a standardized metric? Is this one number in every direction, or maybe one for each of the x/y(/z) axes (thus forming an ellipse)? Is this more like a quality assessment (typically the DOP or something similar)? As far as I am concerned, I would be very reluctant to implement something that is not (or does not define) a standard approach. My initial thought here was to rely on the errors package, which does it very well, but I wonder how it can be adapted to 2D or 3D coordinates…
I wonder if the error (which is, I agree, essential for locations) should be part of the sftraj specification, e.g. directly in the class or as an attribute, or simply left as an additional (potentially constrained) column… Basically, I would be wary of overloading the specification with too much information (right now, that would be {x,y,z,t} at the location level + individual, projection and a few other attributes at higher levels), but it seems to me that errors would be a valid addition at the location level… which is why we had it on our road map to start with.
NULL coordinates should be part of it, as sf already allows it for POINT geometries. See a few more thoughts here.

rociojoo · 2019-10-10T17:41:48Z

I think that Josh is referring to the error in the location estimated by the tracking system, which is usually given by the provider of the devices. I'm just directly copying some text from Johnson et al. 2008, which is the first thing that came to mind: "For each species location data were recorded by the Argos system (system description available online). Along with locations the Argos system rates the quality of observed location with a score from 1 (lowest measurement error; Argos class 3) to 6 (highest measurement error; Argos class B)." For the proposed model, the errors are treated as a covariate, so they need to keep that information at the location level.
Johnson, D. S., London, J. M., Lea, M.-A., & Durban, J. W. (2008). Continuous-Time Correlated Random Walk Model for Animal Telemetry Data. Ecology, 89(5), 1208–1215. https://doi.org/10.1890/07-1032.1

basille · 2019-10-10T19:48:43Z

I've been thinking about that, and it is far from trivial: Argos has one way of classifying errors, GPS comes with various measures of DOP, VHF triangulation would give you an ellipse, path interpolation might give you a confidence interval… There are many different ways to implement an error on a location or path. Hope this clarifies my questions above.

As long as there is no "standard" approach to handle error, it might as well be one column of the data with no particular (think semantic) meaning.

jmlondon · 2019-10-17T19:41:58Z

All ... sorry I haven't had a chance to follow up on this issue until now.

I am thinking of error from a very Argos-centric experience, but it also applies to GPS locations. As @rociojoo mentioned, Argos provides a location quality class for each calculated location (which has some general correspondence to real-world values, e.g., location quality class 3 is within 250 m). In more recent years, Argos has provided parameters for an error ellipse that describes the error for each individual location. This is, by far, the preferred description of error and is supported by crawl (we also support location class error, but mostly for legacy datasets). For GPS-based data, the approach is to follow the same ellipse structure, but treat it as a circle (an ellipse with equal axes) with a constant error of 50 m. There have been some efforts/suggestions that the error radius should be increased depending on the number of satellites used in the GPS calculation.
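To make the ellipse description concrete, here is a minimal sketch (in Python, for illustration only) of how ellipse parameters could be turned into the terms of a 2x2 covariance matrix, with the GPS case handled as a circle of constant 50 m radius. It assumes the semi-axes are treated as standard deviations along each ellipse axis and that the orientation is measured in degrees clockwise from north. Both are assumptions made for this example, and the function names are hypothetical rather than crawl's API.

```python
import math


def ellipse_to_covariance(semi_major, semi_minor, orientation_deg):
    """Return (var_x, var_y, cov_xy) for an error ellipse.

    Assumptions (not taken from crawl or Argos documentation):
    - semi_major and semi_minor act as standard deviations along the
      ellipse axes, in the same length unit as the coordinates;
    - orientation_deg is the direction of the major axis, measured
      clockwise from north (the +y axis), with x pointing east.
    """
    theta = math.radians(orientation_deg)
    s, c = math.sin(theta), math.cos(theta)
    a2, b2 = semi_major ** 2, semi_minor ** 2
    var_x = a2 * s * s + b2 * c * c   # variance along x (east)
    var_y = a2 * c * c + b2 * s * s   # variance along y (north)
    cov_xy = (a2 - b2) * s * c        # zero when the ellipse is a circle
    return var_x, var_y, cov_xy


def gps_covariance(radius_m=50.0):
    """Treat a GPS fix as a circle: an ellipse with equal semi-axes."""
    return ellipse_to_covariance(radius_m, radius_m, 0.0)
```

With equal semi-axes the covariance term vanishes and both variances equal the squared radius, which is why the GPS case fits the same structure as the Argos ellipse.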

The other error to think about is uncertainty associated with predicted tracks/points from a model fit.

My thinking with respect to sftraj is that it would be helpful if there were a standard place (a column or list column) where per-location error is specified, and if that error information were somehow standardized. As @basille points out, the last bit would be the hardest to sort out. I could imagine a workflow whereby sftraj specifies both where the error is stored AND the form of that error specification (would mimicking the Argos ellipse suffice? how could we specify the error distribution, e.g. uniform, bivariate normal, etc.?).

Once that sftraj standard is established, various helper/wrapper functions could be developed to convert all the location errors out in the world to this standard. So, the burden isn't on sftraj to do the conversion, just to have put in some thoughtful effort at describing location error in a way that has broad applicability.

I'm going to cc @dsjohnson in on this discussion as well. Hoping he can chime in on some standardized approaches for describing location error that could be useful.

cheers
josh

basille · 2019-10-25T15:44:48Z

Thanks Josh for the clarification. I need to acknowledge that I am not familiar with crawl (more than simply reading about it), so I will need a few more pointers.

From what I understand, there are a few forms of error:

Categorical: basically a class of error, could be numerical (e.g. 1, 2, 3… for Argos) or anything else (e.g. 1D/2D/3D for GPS).
Circular: a number in a geographic unit, giving the radius of the error (e.g. what is inferred from Argos or GPS parameters).
Ellipse: more parameters to define an ellipse (I guess major/minor axes, direction)
Model: basically a uni- or bivariate distribution defining the error.
Suite of parameters: typically what a full GPS record will show: Status (1D/2D/3D) + number of satellites + various measures of dilution of precision (HDOP, VDOP, PDOP, GDOP, etc.).

Anything else?
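One way to picture the forms listed above is as a small set of tagged record types, one per form, so that a location can carry whichever kind of error its source provides. The sketch below is in Python for illustration only. All class and field names are hypothetical, and `bounding_radius` is just one example of a helper that can answer for some forms and has to decline for others.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class CategoricalError:
    label: str  # e.g. an Argos class, or a 1D/2D/3D GPS status


@dataclass(frozen=True)
class CircularError:
    radius: float  # in the geographic unit of the coordinates


@dataclass(frozen=True)
class EllipseError:
    semi_major: float
    semi_minor: float
    orientation_deg: float


@dataclass(frozen=True)
class ModelError:
    family: str                            # e.g. "bivariate normal"
    params: Tuple[Tuple[str, float], ...]  # (name, value) pairs


@dataclass(frozen=True)
class GpsDiagnostics:
    status: str                   # 1D/2D/3D
    n_satellites: int
    hdop: Optional[float] = None  # other DOP measures could be added


LocationError = Union[
    CategoricalError, CircularError, EllipseError, ModelError, GpsDiagnostics
]


def bounding_radius(error: LocationError) -> Optional[float]:
    """Largest extent of the error shape, where the form defines one.

    Returns None for forms that carry no length by themselves.
    """
    if isinstance(error, CircularError):
        return error.radius
    if isinstance(error, EllipseError):
        return error.semi_major
    return None
```

The point of the sketch is only that the forms differ in what they can answer: a circle or an ellipse yields a length directly, while a class label or a set of DOP values needs an external rule before it can be compared with them.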

I can see in crawl that the Bearded Seal Location Data (beardedSeals) comes with the following data:

error_radius: Argos error radius
error_semimajor_axis: Argos error ellipse major axis length
error_semiminor_axis: Argos error ellipse minor axis length
error_ellipse_orientation: Argos error ellipse degree orientation

How does the radius relate to the other three (which are ellipse parameters)? Is this all we need for an ellipse?

Also, would you have a simple reproducible example that demonstrates the use of errors in crawl, both as input and output? Thanks a lot.

basille · 2022-06-03T12:55:14Z

Closing this issue now that sftrack is developed and on CRAN.

basille added the use-case (Use cases) and crawl (Related to the package 'crawl') labels on Oct 1, 2019

basille closed this as completed Jun 3, 2022
