[Use case] standardized format for observed and modeled animal movement paths #3

Closed
jmlondon opened this issue Sep 30, 2019 · 6 comments
Labels
crawl Related to the package 'crawl' use-case Use cases

Comments

@jmlondon

jmlondon commented Sep 30, 2019

Use case:

This specific use case focusses on the crawl R package for analysis of animal movement. We have long hoped for a standardized format that represents observed input data (e.g. GPS or Argos locations) and the associated error with each observation. The error bit is of critical importance for our application. With a standardized input, we could then develop a standardized output for predicted tracks from the model fit. The key benefit for us as developers is that we could reduce the 'data cleaning and troubleshooting' overhead. And package users would benefit from having an output product that could be used in a variety of packages, whether existing or yet to be developed.

Requirements:

The bare minimum requirement would be an ordered (regular or irregular) time series with corresponding coordinates (geographic or projected or simulated space) and a representation of the error associated with each record. Ideally, there would also be support for unique 'animal IDs' and various group-by operations. For the output, we would likely create an extension of the sftraj object to properly capture additional model or prediction metadata and parameters.
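
A minimal sketch of what that bare-minimum record could look like, assuming a flat per-location structure. This is not the sftraj specification: the `Fix` class, its field names, and the `by_animal` helper are hypothetical, and the error is reduced to a single radius in metres purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import groupby
from typing import Optional, Tuple


@dataclass(frozen=True)
class Fix:
    """One observed location; every field name here is hypothetical."""
    animal_id: str
    time: datetime
    xy: Optional[Tuple[float, float]]  # None when no location was recorded
    error_m: Optional[float]           # per-record error, here a radius in metres


def by_animal(fixes):
    """Group fixes by animal ID, with each group ordered by time."""
    ordered = sorted(fixes, key=lambda f: (f.animal_id, f.time))
    return {k: list(g) for k, g in groupby(ordered, key=lambda f: f.animal_id)}


# Irregular time series for two animals, deliberately given out of order.
fixes = [
    Fix("seal_2", datetime(2019, 9, 30, 12, 0), (-150.1, 60.2), 250.0),
    Fix("seal_1", datetime(2019, 9, 30, 14, 30), (-151.0, 59.9), 50.0),
    Fix("seal_1", datetime(2019, 9, 30, 9, 15), (-151.2, 59.8), 1500.0),
]
tracks = by_animal(fixes)
```

Keeping the error on the record itself, rather than in a separate table, is what lets a group-by operation carry it along for free.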

Input:

For the crawl package, the idea would be for users to input an sftraj object. For wide scale adoption, the community would need an easy method for creating an sftraj object from typical telemetry observation data sources (usually a csv). The bio-logging community, various repositories (e.g. MoveBank), and manufacturers (e.g. Wildlife Computers, SMRU) might choose to develop options or APIs that allow export of data as an sftraj.

Output:

One option would be for a predicted track from a crawl model fit to be provided as an sftraj object. But there are a number of model fit parameters that might be useful to include in the output. Depending on the structure of the sftraj object, we might need to create a crawl extension/expansion of sftraj. This might be as simple as specifying a set of data frame columns that are included as part of the sftraj.

Additional information:

Convenience functions for sftraj to sf (points) or sf (linestring) (or multipoint, multilinestring) would be helpful. In some cases, telemetry data may record behavior or environmental observations at different time scales/frequencies than the observed locations. In these cases, it would be useful if sftraj supported NULL coordinates so these observations could be stored within the same sftraj structure.
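
To illustrate the conversion idea, here is a rough sketch that turns a track with a missing location into WKT points and a WKT linestring. The function names are hypothetical and this is not how sftraj or sf implement it; it only assumes that a record without coordinates is stored as `None` and maps to an empty geometry.

```python
def to_point_wkt(xy):
    """Return WKT for one location; a missing location becomes POINT EMPTY."""
    if xy is None:
        return "POINT EMPTY"
    return f"POINT ({xy[0]} {xy[1]})"


def to_linestring_wkt(coords):
    """Return WKT for the path through all located records, skipping missing ones."""
    located = [xy for xy in coords if xy is not None]
    if len(located) < 2:
        # A line needs at least two vertices.
        return "LINESTRING EMPTY"
    body = ", ".join(f"{x} {y}" for x, y in located)
    return f"LINESTRING ({body})"


# The middle record is, e.g., a behavior observation with no location.
track = [(-151.2, 59.8), None, (-151.0, 59.9)]
points = [to_point_wkt(xy) for xy in track]
line = to_linestring_wkt(track)
```

The point view keeps one geometry per record, so non-location observations stay aligned with their rows, while the line view necessarily drops them.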

@basille basille added use-case Use cases crawl Related to the package 'crawl' labels Oct 1, 2019
@basille
Member

basille commented Oct 1, 2019

Ooooh this one is interesting, thanks Josh! So at first, I didn't think that handling of errors would make it into our first cycle… but it was on our road map for later, and you make a very good case for it. I have a few thoughts about it:

  • First of all, how do you define 'error' (please bear with me, I am familiar with crawl but actually never used it): is that a standardized metric? Is this one number in every direction, or maybe one for each x/y(/z) axis (thus forming an ellipse)? Is this more like a quality assessment (typically the DOP or something similar)? As far as I am concerned, I would be very reluctant to implement something that is not (or does not define) a standard approach. My initial thought here was to rely on the errors package, which does it very well, but I wonder how it can be adapted to 2D or 3D coordinates…
  • I wonder if the error (which is, I agree, essential for locations) should be part of the sftraj specification, e.g. directly in the class or as an attribute, or simply left as an additional (potentially constrained) column… Basically, I would be cautious to supercharge the specification with too much information (right now, that would be {x,y,z,t} at the location level + individual, projection and a few other attributes at higher levels), but it seems to me that errors would be a valid addition at the location level… which is why we had it on our road map to start with.
  • NULL coordinates should be part of it, as sf already allows it for POINT geometries. See a few more thoughts here.

@rociojoo
Member

I think that Josh is referring to the errors to estimate the location by the tracking system, which is usually given by the provider (of the devices). I'm just directly copying some text from Johnson et al. 2008 which is the first thing that came to mind: "For each species location data were recorded by the Argos system (system description available online). Along with locations the Argos system rates the quality of observed location with a score from 1 (lowest measurement error; Argos class 3) to 6 (highest measurement error; Argos class B)." For the proposed model, the errors are treated as a covariate, so they need to keep that information at the location level.
Johnson, D. S., London, J. M., Lea, M.-A., & Durban, J. W. (2008). Continuous-Time Correlated Random Walk Model for Animal Telemetry Data. Ecology, 89(5), 1208–1215. https://doi.org/10.1890/10-1922.1

@basille
Member

basille commented Oct 10, 2019

I've been thinking about that, and it is far from trivial: Argos has one way of classifying errors, GPS comes with various measures of DOP, VHF triangulation would give you an ellipse, path interpolation might give you a confidence interval… There are many different ways to implement an error on a location or path. Hope this clarifies my questions above.

As long as there is no "standard" approach to handle error, it might as well be one column of the data with no particular (think semantic) meaning.

@jmlondon
Author

All ... sorry I haven't had a chance to follow up on this issue until now.

I am thinking of error from a very Argos-centric experience, but it also applies to GPS locations. As @rociojoo mentioned, Argos provides a location quality class for each location calculated (which has some general correspondence to real-world values, e.g., location quality class of 3 is within 250m). In more recent years, Argos has provided parameters for an error ellipse that specifies the specific error for each location. This is, by far, the preferred description of error and is supported by crawl (we also support location class error but, mostly, for legacy datasets). For GPS-based data, the approach is to follow the same ellipse structure, but treat it as a circle/ellipse with a constant error of 50m. There have been some efforts/suggestions that the error radius should be increased depending on the number of satellites used in the GPS calculation.
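
As a rough illustration of how an ellipse and a constant-radius circle can share one representation, the sketch below rotates ellipse axes into an x/y covariance. It is not crawl's conversion. It assumes the semi-axes can be used directly as standard deviations (no scaling factor), that orientation is measured in degrees clockwise from north, and that x points east and y north; the function names are hypothetical.

```python
import math


def ellipse_to_cov(semi_major_m, semi_minor_m, orientation_deg):
    """Return (var_x, var_y, cov_xy) in m^2 for an error ellipse.

    Assumptions (not taken from crawl): the semi-axes are treated as standard
    deviations, orientation is clockwise from north, x is east and y is north.
    """
    theta = math.radians(orientation_deg)
    s, c = math.sin(theta), math.cos(theta)
    a2, b2 = semi_major_m ** 2, semi_minor_m ** 2
    # The major axis points along (sin(theta), cos(theta)) in (east, north).
    var_x = a2 * s * s + b2 * c * c
    var_y = a2 * c * c + b2 * s * s
    cov_xy = (a2 - b2) * s * c
    return var_x, var_y, cov_xy


def gps_circle_cov(radius_m=50.0):
    """GPS case from the comment above: a circle is an ellipse with equal axes."""
    return ellipse_to_cov(radius_m, radius_m, 0.0)


argos_cov = ellipse_to_cov(1000.0, 200.0, 0.0)  # major axis pointing north
gps_cov = gps_circle_cov()
```

With equal axes the orientation drops out and the covariance term vanishes, which is why a GPS fix can reuse the ellipse columns without any special casing.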

The other error to think about is uncertainty associated with predicted tracks/points from a model fit.

My thinking with respect to sftraj is that it would be helpful if there were a standard location (column, list column) where per location error is specified. And, if that error information was somehow standardized. As @basille points out, the last bit would be the hardest to sort out. I might imagine a workflow whereby sftraj specifies the location where error is stored AND the form of that error specification (would mimicking the Argos ellipse suffice? how could we specify the error distribution (e.g. uniform, bivariate normal, etc)).

Once that sftraj standard is established, various helper/wrapper functions could be developed to convert all the location errors out in the world to this standard. So the burden isn't on sftraj to do the conversion, just to have put in some thoughtful effort at describing location error in a way that has broad applicability.

I'm going to cc @dsjohnson in on this discussion as well. Hoping he can chime in on some standardized approaches for describing location error that could be useful.

cheers
josh

@basille
Member

basille commented Oct 25, 2019

Thanks Josh for the clarification. I need to acknowledge that I am not familiar with crawl (more than simply reading about it), so I will need a few more pointers.

From what I understand, there are a few forms of error:

  • Categorical: basically a class of error, could be numerical (e.g. 1, 2, 3… for Argos) or anything else (e.g. 1D/2D/3D for GPS).
  • Circular: a number in a geographic unit, giving the radius of the error (e.g. what is inferred from Argos or GPS parameters).
  • Ellipse: more parameters to define an ellipse (I guess major/minor axes, direction)
  • Model: basically a uni- or bivariate distribution defining the error.
  • Suite of parameters: typically what a full GPS record will show: Status (1D/2D/3D) + number of satellites + various measures of dilution of precision (HDOP, VDOP, PDOP, GDOP, etc.).
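
One way to picture these forms side by side is as tagged record types with a helper that widens them to a common shape where that is unambiguous. This is only a sketch: all class and field names are hypothetical, the model/distribution form is left out for brevity, and categorical or DOP-based errors are deliberately not converted because doing so needs source-specific knowledge.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class CategoricalError:
    scheme: str  # e.g. "argos" or "gps"
    label: str   # e.g. "3", "B", "2D"


@dataclass(frozen=True)
class CircularError:
    radius_m: float


@dataclass(frozen=True)
class EllipseError:
    semi_major_m: float
    semi_minor_m: float
    orientation_deg: float


@dataclass(frozen=True)
class DopError:
    fix_status: str  # "1D", "2D" or "3D"
    n_satellites: int
    hdop: float


LocationError = Union[CategoricalError, CircularError, EllipseError, DopError]


def as_ellipse(err: LocationError) -> Optional[EllipseError]:
    """Widen an error to an ellipse when no extra information is needed."""
    if isinstance(err, EllipseError):
        return err
    if isinstance(err, CircularError):
        # A circle is an ellipse with equal axes; orientation is arbitrary.
        return EllipseError(err.radius_m, err.radius_m, 0.0)
    # Categorical and DOP errors need a source-specific lookup, not guessed here.
    return None


widened = as_ellipse(CircularError(50.0))
unconverted = as_ellipse(CategoricalError("argos", "3"))
```

Returning `None` instead of guessing keeps the conversion burden on source-specific helpers rather than on the common format.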

Anything else?

I can see in crawl that the Bearded Seal Location Data (beardedSeals) comes with the following data:

  • error_radius: Argos error radius
  • error_semimajor_axis: Argos error ellipse major axis length
  • error_semiminor_axis: Argos error ellipse minor axis length
  • error_ellipse_orientation: Argos error ellipse degree orientation

How does the radius relate to the other three (which are ellipse parameters)? Is this all we need for an ellipse?

Also, would you have a simple reproducible example that demonstrates the use of errors in crawl, both as input and output? Thanks a lot.

@basille
Member

basille commented Jun 3, 2022

Closing this issue now that sftrack is developed and on CRAN.

@basille basille closed this as completed Jun 3, 2022