Providing correlation matrix from dataset #7
Hi Linus,
Thanks for reaching out! ESCO can simulate data with correlation as
estimated from the dataset, using the parameter "corr". The parameter
"corr" is a list of correlation matrices, which you can define yourself via
#=== if simulating one group
sim <- escoSimulateSingle(nGenes = 100, nCells = 50,
                          lib.loc = 7, withcorr = TRUE, verbose = FALSE,
                          corr = list(cormat))
#=== if simulating two groups
sim <- escoSimulateGroups(nGenes = 200, nCells = 100,
                          group.prob = c(0.6, 0.4), deall.prob = 0.3,
                          de.prob = c(0.3, 0.7),
                          de.facLoc = c(1.9, 2.5), withcorr = TRUE,
                          corr = list(cormat_housekeep, cormat_1, cormat_2),
                          trials = 1, verbose = FALSE)
One just needs to make sure that:
nrow(cormat_housekeep) equals the number of housekeeping genes;
nrow(cormat_1) equals the number of marker genes for group 1;
nrow(cormat_2) equals the number of marker genes for group 2.
The easiest way to ensure this is to simulate data first without
specifying corr, and then check the dimensions of the automatically
generated corr: slot(metadata(sim)$Params, "corr")
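The dimension check described above can be sketched in base R. This is only an illustration: `check_corr_list` and `expected_sizes` are hypothetical names, not part of ESCO, and the toy sizes stand in for the values one would read from a run without corr.

```r
# Hypothetical helper (not part of ESCO): check that each matrix in a corr list
# is a square, symmetric matrix with unit diagonal and the expected size.
check_corr_list <- function(corr, expected_sizes, tol = 1e-8) {
  stopifnot(is.list(corr), length(corr) == length(expected_sizes))
  for (i in seq_along(corr)) {
    m <- corr[[i]]
    stopifnot(
      is.matrix(m),
      nrow(m) == ncol(m),
      nrow(m) == expected_sizes[i],
      isSymmetric(unname(m), tol = tol),
      all(abs(diag(m) - 1) < tol)
    )
  }
  invisible(TRUE)
}

# Toy sizes for illustration. With ESCO they would come from a first run
# without corr (assumption based on the slot access shown above):
# expected_sizes <- sapply(slot(metadata(sim)$Params, "corr"), nrow)
expected_sizes <- c(3, 2)
toy_corr <- list(diag(3), matrix(c(1, 0.5, 0.5, 1), nrow = 2))
ok <- check_corr_list(toy_corr, expected_sizes)
```

The helper stops with an error on the first mismatch, so it can be run on a user-defined list before passing it as corr.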
The randcor function is just a convenience: it uses a realistic
dataset to generate a correlation structure automatically for users who do
not want to specify the correlation structure themselves.
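For users who would rather derive the structure from their own data, a minimal sketch using only base R might look like the following. The toy matrix `expr`, the choice of Spearman correlation, and the commented-out call are assumptions for illustration; the dimensions of the resulting matrix should still be checked against the automatically generated corr.

```r
# Toy expression matrix: rows are genes, columns are cells.
# Simulated counts are used here; replace with your own dataset.
set.seed(1)
expr <- matrix(rpois(5 * 40, lambda = 10), nrow = 5,
               dimnames = list(paste0("gene", 1:5), NULL))

# cor() correlates columns, so transpose to obtain a gene-by-gene matrix.
cormat <- cor(t(expr), method = "spearman")

# Assumed usage, mirroring the single-group call above (requires ESCO;
# nGenes here is an illustrative guess, verify the expected dimensions first):
# sim <- escoSimulateSingle(nGenes = nrow(cormat), nCells = 50,
#                           lib.loc = 7, withcorr = TRUE, verbose = FALSE,
#                           corr = list(cormat))
```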
The vignettes here
<https://github.com/JINJINT/ESCO/blob/bf6d78c653dd06a38e092611265a76286ba6dfef/vignettes/esco.Rmd>
contain more examples.
Best,
Jinjin
On Thu, May 11, 2023 at 8:42 AM Linus Schumacher wrote:
Reading the ESCO paper I was under the impression I could provide a real
dataset and simulate data with correlation (in the copula) as estimated
from the dataset. However it is not clear from the documentation how to do
that, could you clarify?
Looking into the code, I can also see that when `type == "traj"` (
https://github.com/JINJINT/ESCO/blob/bf6d78c653dd06a38e092611265a76286ba6dfef/R/esco-simulate.R#LL1083C10-L1083C22),
the parameter `corr` is either expected to be a scalar (not a correlation
matrix), or estimated using `randcor` for the differentially expressed
genes. Two things that differ from my expectations:
1. One can only specify one correlation matrix, not one per cell type
2. In the randcor function
https://github.com/JINJINT/ESCO/blob/bf6d78c653dd06a38e092611265a76286ba6dfef/R/utils.R#L150,
a "purified gene expression dataset" is used (
https://www.eurekalert.org/pub_releases/2017-11/sfn-nwa111417.php),
rather than a user-defined dataset. Is there a way to change this to a
dataset of my choice?