MGnifyR - query samples by depth #17

Open
alexschickele opened this issue Jul 11, 2023 · 1 comment

alexschickele commented Jul 11, 2023

Dear Ben,
As part of the BlueCloud2026 project, I am trying to build a data access function to retrieve species and (probably) KEGG annotations, along with the corresponding reads, from the MGnify database.
The first step would be to get the list of marine samples within a certain depth and time range.
If I understood correctly, the MGnifyR package does not allow multiple filters in one query. That is why I first tried to query by depth, which should be the most discriminating factor for filtering samples.

However, I encountered some issues with the mgnify_query() function. Here are some examples:

library(MGnifyR)
library(tidyverse)
mg <- mgnify_client(usecache = TRUE)

Trying with the metadata_value_gte argument. I guess it refers to "greater than or equal to".

foo <- mgnify_query(mg, "samples", biome_name = "Marine", 
                     metadata_key = "depth", 
                     metadata_value_gte = 100,
                     maxhits = 5,
                     usecache = TRUE)
foo$depth %>% unique()
 [1] "1988.0" "75.0"   "102.0"  "1008.0" "119.0"  "182.0"  "101.0"  "30.0"   "111.0"  "5601.0" "202.0"  "143.0"  "150.0"  "151.0"  "100.0"  "135.0"  "149.0" 
[18] "200.0"  "233.0"  "201.0"  "380.0"  "175.0" 

Trying with the metadata_value_lte argument. I guess it refers to "less than or equal to".

foo <- mgnify_query(mg, "samples", biome_name = "Marine", 
                    metadata_key = "depth", 
                    metadata_value_lte = 100,
                    maxhits = 5,
                    usecache = TRUE)
foo$depth %>% unique()
 [1] "1988.0" "75.0"   "15.0"   "76.0"   "52.0"   "30.0"   "16.0"   "10.0"   "91.0"   "51.0"   "2.0"    "119.0"  "21.0"   "182.0"  "50.0"   "111.0"  "202.0" 
[18] "33.0"   "49.0"   "74.0"   "14.0"   "68.0"   "40.0"   "151.0"  "69.0" 

Neither _gte nor _lte seems to work properly. Therefore I tried querying a single depth level instead, which could then be parallelized to cover all depth levels within our range.

foo <- mgnify_query(mg, "samples", biome_name = "Marine", 
                    metadata_key = "depth", 
                    metadata_value = 100,
                    maxhits = -1,
                    usecache = TRUE)
foo$depth %>% unique()
[1] "100.0"

Querying samples at exactly 100 m depth seems to work. However, with another depth level it no longer does:

foo <- mgnify_query(mg, "samples", biome_name = "Marine", 
                    metadata_key = "depth", 
                    metadata_value = 5,
                    maxhits = -1,
                    usecache = TRUE)
foo$depth %>% unique()
[1] "5.0"  "2.0"  "0.3"  "0.29" "0.33" "0.28"

Therefore, I am wondering whether I am missing something or misusing these functions?
Thank you in advance for your feedback,
Best,

Alexandre

beadyallen pushed a commit that referenced this issue Oct 9, 2023
* up
* Update documents
* update docs
* preserve old behavior and functions
* Remove things causing errors and warnings in build check
* verbose

@TuomasBorman
Collaborator

Thank you for this issue, and sorry for the very late reply. Even though this might not help you anymore, it might be relevant for others. I believe the problem is not in MGnifyR but rather in the API.

For instance, go to this address: https://www.ebi.ac.uk/metagenomics/api/latest/samples?accession=&experiment_type=&biome_name=Marine&metadata_key=depth&metadata_value_lte=0.1

It searches for Marine biome samples that have a depth less than or equal to 0.1. However, the filter does not work; the results include depth values over 0.1. You can set other filters from the "Filters" tab in the browser.
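
To make it easier to reproduce the request outside MGnifyR, here is a minimal sketch in base R that assembles the same query URL from its parameters. The endpoint and parameter names are the ones in the address above; the helper `build_samples_url()` is a hypothetical name used for illustration only and is not part of MGnifyR.

```r
# Endpoint taken from the address quoted above.
base_url <- "https://www.ebi.ac.uk/metagenomics/api/latest/samples"

# Hypothetical helper (not part of MGnifyR): join a named list of
# parameters into a query string appended to the base URL.
build_samples_url <- function(base, params) {
  # Percent-encode each value so special characters survive in the URL
  encoded <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(base, "?", paste(names(params), encoded, sep = "=", collapse = "&"))
}

url_lte <- build_samples_url(base_url, list(
  biome_name = "Marine",
  metadata_key = "depth",
  metadata_value_lte = 0.1
))
print(url_lte)
```

Opening the resulting URL in a browser should show the same behaviour without going through the R package, which may help narrow down where the filter is lost.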

@SandyRogers Do you have any idea what the problem is?
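
Until the range filters behave as expected, one possible workaround is to query with metadata_key = "depth" only and apply the range on the client side. The following is a rough sketch, assuming the result is a data frame with a character `depth` column as in the outputs shown in this thread. The helper `filter_by_depth()` and the accessions in the toy data are made up for illustration.

```r
# Hypothetical helper (not part of MGnifyR): keep rows whose depth
# lies within [min_depth, max_depth].
filter_by_depth <- function(samples, min_depth = -Inf, max_depth = Inf) {
  # Depth comes back as text (e.g. "100.0"), so convert before comparing;
  # values that cannot be parsed become NA and are dropped.
  depth_num <- suppressWarnings(as.numeric(samples$depth))
  keep <- !is.na(depth_num) & depth_num >= min_depth & depth_num <= max_depth
  samples[keep, , drop = FALSE]
}

# Toy data reusing a few depth values reported above; accessions are invented.
foo <- data.frame(
  accession = paste0("SAMPLE", 1:6),
  depth = c("1988.0", "75.0", "102.0", "30.0", "100.0", NA),
  stringsAsFactors = FALSE
)

# Emulates metadata_value_gte = 100 locally
deep <- filter_by_depth(foo, min_depth = 100)
print(deep$depth)
```

This costs more download time because all samples with a depth value are fetched first, but it does not depend on the server-side comparison.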
