Extracting genes driving the enrichment and their mean bootstrap expression #76

iscastanho · 2022-09-05T23:23:44Z

I am not sure if these are already features of EWCE (I could not find them in the manuals or tutorials), so I am posting them as questions.

1. Is there a way of extracting which genes are driving the enrichment for each cell type tested?
At the moment I am using the “marker” genes for each population/subpopulation I have (which I identified with Seurat) but I was wondering if there was a way of extracting this directly from EWCE as I can see that it performs differential expression as part of the pipeline (using LIMMA if I noticed it correctly).

2. Can I get the mean bootstrap expression for each gene from EWCE? If so, how?
In Skene and Grant, 2016 (https://doi.org/10.3389/fnins.2016.00016), in Figure 2 C and D, genes from one of the cell types (microglia) are shown, highlighting their expression against the mean bootstrap expression. I would like to have access to this type of information from my data too. How could I extract this?

Thank you.

Al-Murphy · 2022-09-06T09:00:57Z

Is there a way of extracting which genes are driving the enrichment for each cell type tested?
This is essentially the specificity, i.e. how specific the expression of a gene is to a cell type. It is available in the CellTypeData (ctd) object made from your reference scRNA-seq dataset. For example (from the vignette dataset):

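# load the example CellTypeData object used in the vignette
ctd <- ewceData::ctd()
# cell type level 1 specificity for genes (genes in rows, cell types in columns)
ctd[[1]]$specificity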
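
As a rough sketch of how a specificity matrix can be used to see which genes in a list contribute most for one cell type: the toy matrix below stands in for ctd[[1]]$specificity, and it assumes specificity is each gene's share of its total expression across cell types. The gene names, cell type names, values and the `hits` list are all made up for illustration.

```r
# Toy mean-expression matrix (genes x cell types).
# All names and values here are invented for illustration.
mean_exp <- matrix(
  c(9, 1, 0,
    1, 8, 1,
    3, 3, 4,
    0, 1, 9),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"),
                  c("microglia", "astrocytes", "neurons"))
)

# Assumed definition of specificity: each gene's share of its total
# expression per cell type, so every row sums to 1.
specificity <- mean_exp / rowSums(mean_exp)

# Rank the genes of a hypothetical target list by specificity in one cell type.
hits <- c("GeneA", "GeneC", "GeneD")
cell_type <- "microglia"
ranked <- sort(specificity[hits, cell_type], decreasing = TRUE)
print(ranked)
```

With a real ctd, the same ranking could be applied to ctd[[1]]$specificity directly, restricted to the genes in your list that are present in its row names.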

EWCE as I can see that it performs differential expression as part of the pipeline (using LIMMA if I noticed it correctly)
EWCE uses limma as a filtering step to remove uninformative genes; see the function drop_uninformative_genes (https://github.com/NathanSkene/EWCE/blob/0e8dba99c15afe928edcc61c9a44092dbd992018/R/drop_uninformative_genes.r). This removes genes that don't vary across cell types, which helps reduce noise in subsequent steps (it makes for a fairer comparison between your gene list and the randomly sampled background genes). So in this sense, limma isn't used to identify cell-type-specific gene lists but as a preprocessing step to filter out genes.
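
To illustrate the idea of dropping genes that don't vary across cell types, here is a minimal base-R sketch. It uses a per-gene one-way ANOVA as a simple stand-in for limma, so it is not what EWCE runs internally. The expression values, gene names and the 0.05 threshold are assumptions chosen for the example.

```r
# Toy expression matrix: 3 genes x 12 cells (values invented).
cell_types <- factor(rep(c("microglia", "astrocytes", "neurons"), each = 4))
exp_mat <- rbind(
  GeneVar1 = c(10, 11, 9, 10,  2, 3, 1, 2,  2, 1, 3, 2),
  GeneVar2 = c(1, 2, 0, 1,  1, 0, 2, 1,  8, 9, 7, 8),
  GeneFlat = rep(c(4, 5, 5, 6), times = 3)  # same mean in every cell type
)

# Per-gene one-way ANOVA p-value across cell types.
# NOTE: a stand-in for limma, used here only to show the filtering idea.
pvals <- apply(exp_mat, 1, function(x) {
  anova(lm(x ~ cell_types))[["Pr(>F)"]][1]
})
adj <- p.adjust(pvals, method = "BH")

# Keep only genes that vary across cell types (threshold is arbitrary here).
informative <- exp_mat[adj < 0.05, , drop = FALSE]
print(rownames(informative))
```

The filtered matrix keeps the genes whose expression differs between cell types, which is the set a gene list would later be compared against.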

Can I get the mean bootstrap expression for each gene from EWCE? If so, how?
I believe Figure 2C and D essentially show the specificity of genes (perhaps @NathanSkene can confirm?). So you should be able to use the specificity values to get a similar plot.

Thanks

NathanSkene · 2022-09-06T09:15:09Z

EWCE can generate bootstrap plots that show the bootstrap probabilities of each gene. I think the function is something like generate_bootstrap_plots. I don't really find them that useful, but I regularly get asked which genes are driving the enrichments, and this is the clearest way of getting at it. That said, the enrichment is driven by the set of genes, not any particular gene.
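
The comparison behind such a plot can be sketched in base R as follows. This is only an illustration of the idea under stated assumptions, not EWCE's own code: random gene lists of the same size as the target list are drawn from the background, their values in one cell type are sorted, and the mean at each rank is compared with the sorted values of the target list. The specificity values, gene names, `hits` list and number of replicates are all invented.

```r
set.seed(42)

# Toy specificity values for one cell type across 200 background genes.
n_bg <- 200
spec <- setNames(runif(n_bg), paste0("Gene", seq_len(n_bg)))

# Hypothetical target list: here simply the 10 most specific genes.
hits <- names(sort(spec, decreasing = TRUE))[1:10]
reps <- 1000

# For each bootstrap replicate, draw a random list of the same size and sort
# its values, so column k holds the k-th smallest value of every replicate.
boot <- t(replicate(reps, sort(sample(spec, length(hits)))))

# Mean bootstrap value at each rank vs the sorted values of the target list.
mean_boot <- unname(colMeans(boot))
hit_vals <- sort(spec[hits])
comparison <- data.frame(
  gene = names(hit_vals),
  hit = unname(hit_vals),
  mean_bootstrap = mean_boot
)
print(comparison)
```

Genes whose value sits well above the mean bootstrap value at the same rank are the ones that stand out most in this kind of plot, although the enrichment itself is still a property of the whole list.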


bschilder · 2023-03-09T20:16:34Z

Here's the function @iscastanho
https://nathanskene.github.io/EWCE/reference/generate_bootstrap_plots.html

See #77 for some upgrades I'm making to it soon.

bschilder closed this as completed Mar 9, 2023
