Project: ConsensusML
Author: Jenny Smith
The code in this directory provides the methods for:
- Downloading gene expression data from the Genomic Data Commons
  - GDC_Data_Download.Rmd
- Normalization and batch effect investigations
  - Normalization_and_Batch_Effect_Investigation.Rmd
- Differential expression and LASSO logistic regression using differentially expressed genes (DEGs) as variables
  - Differential_Expression_and_Lasso.Rmd
The focus of these investigations is to identify genes associated with high-risk and standard-risk AML, first by differential expression analysis and then by using LASSO regression to select a smaller set of genes from the large list of DEGs. The selected genes are those associated with high-risk and standard-risk clinical features.
This procedure produced a small number of genes that can readily be investigated further as candidate therapeutic targets or biomarkers, or potentially used to predict poor prognosis from diagnostic samples.
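
To illustrate the selection step described above, here is a minimal sketch of L1-penalized (LASSO) logistic regression, fitted by proximal gradient descent. It is not taken from the R Markdown notebooks in this directory, which may use a different solver and a different way of choosing the penalty. The gene names, the toy data, the penalty value `lam`, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal operator)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def lasso_logistic(X, y, lam, step=0.1, n_iter=2000):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    X: list of samples, each a list of expression values (one per gene).
    y: list of 0/1 labels (here 1 = high risk, 0 = standard risk).
    lam: L1 penalty strength; larger values zero out more coefficients.
    Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * p
        for row, label in zip(X, y):
            linear = intercept + sum(w * x for w, x in zip(coefs, row))
            err = sigmoid(linear) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        # Gradient step on the logistic loss; the intercept is not penalized.
        intercept -= step * grad_b
        # Gradient step followed by soft-thresholding for the L1 penalty.
        coefs = [soft_threshold(w - step * g, step * lam)
                 for w, g in zip(coefs, grad_w)]
    return intercept, coefs


# Toy data: 5 hypothetical genes; only the first one tracks the risk group.
random.seed(0)
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
y = [i % 2 for i in range(60)]
X = [[2.0 * label - 1.0 + random.gauss(0, 0.5)]
     + [random.gauss(0, 1) for _ in range(4)]
     for label in y]

intercept, coefs = lasso_logistic(X, y, lam=0.25)

# Genes with a nonzero coefficient are the ones the penalty retained.
selected = [g for g, w in zip(genes, coefs) if w != 0.0]
print(selected)
```

The soft-thresholding step is what sets coefficients to exactly zero, so the genes with nonzero coefficients form the reduced list. In a real analysis the penalty strength would normally be chosen by cross-validation rather than fixed by hand as it is here.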