TracyHIT/eRNA_predict

eRNA_predict

Fitting the data distribution of TPM obtained by CAGE-seq technology

Script/Fit_noise_From_distribution.R

Functions for training and testing the models

Script/Ac_enhancer_FANTON_ActiveP_RF_fivefold.R
Script/Ac_enhancer_FANTON_ActiveP_XGBoost_fivefold.R
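
The script names suggest five-fold cross-validation. The following is a minimal sketch of that general procedure, not the code from the scripts above. The toy data, the column names `x1`, `x2`, and `label`, and the logistic regression model (used as a stand-in for RF or XGBoost so that the example needs only base R) are all assumptions made for illustration.

```r
set.seed(1)  # make the random fold assignment reproducible

# Toy data standing in for the real feature matrix and labels (hypothetical).
n <- 100
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$label <- as.integer(toy$x1 + rnorm(n, sd = 0.5) > 0)

# Assign each row to one of five equally sized folds, in random order.
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))

accuracy <- numeric(k)
for (i in seq_len(k)) {
  train <- toy[fold != i, ]  # four folds for training
  test  <- toy[fold == i, ]  # the held-out fold for testing
  # Logistic regression is a stand-in here, not the repository's RF/XGBoost models.
  fit  <- glm(label ~ x1 + x2, data = train, family = binomial())
  prob <- predict(fit, newdata = test, type = "response")
  accuracy[i] <- mean((prob > 0.5) == (test$label == 1))
}

mean_accuracy <- mean(accuracy)  # average accuracy over the five held-out folds
```

Each row is used for testing exactly once, so the five accuracy values are computed on disjoint subsets of the data.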

Trained models

The RF and XGBoost models perform similarly.
The models provided here are XGBoost models trained on GM12878 data.
Models were trained with different feature combinations.
See Script/make_model_feature_index.R for the index of feature combinations.
Each feature combination was trained five times. The five resulting models can make predictions in parallel, and their predictions are then combined by voting.
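
The voting step can be illustrated with a small sketch. It assumes a simple majority vote over binary class labels; the README does not state the exact voting rule, and the `votes` matrix and the `majority_vote` helper below are hypothetical, not taken from the repository.

```r
# Hypothetical example: class labels (1 = positive, 0 = negative) predicted by
# five models for four regions; one column per model, one row per region.
votes <- cbind(
  model1 = c(1, 0, 1, 0),
  model2 = c(1, 0, 0, 0),
  model3 = c(1, 1, 0, 0),
  model4 = c(0, 0, 1, 1),
  model5 = c(1, 0, 1, 0)
)

# Majority vote (an assumed rule): a region is called positive when more than
# half of the models predict the positive class.
majority_vote <- function(votes) {
  as.integer(rowSums(votes) > ncol(votes) / 2)
}

consensus <- majority_vote(votes)
print(consensus)
```

With five models there can be no ties, which is one practical reason to combine an odd number of predictions.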

Data files

FANTOM_Mappd_to_Hg38_genome_loci.Rdata: the FANTOM5 regions mapped to hg38 (65399 regions).
MPBS/FANTOM_65407_hg38_delsomeUn_nuc.bed: the nucleotide composition of the regions mapped from hg19 (65407 regions) to hg38 (65399 regions).
gene_Ensemble_annotation.Rdata: gene annotation (gene_Ensemble_annotation) with TSS, hg38.
Compare_analysis_data: stores preprocessed data for comparative analysis.
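
The README does not list the object names stored in the .Rdata files. One way to inspect such a file is sketched below, using a temporary toy file in place of the repository data; `toy_regions` and its columns are hypothetical.

```r
# Save a toy object to a temporary .Rdata file (a stand-in for a repository file).
path <- tempfile(fileext = ".Rdata")
toy_regions <- data.frame(
  chr   = c("chr1", "chr2"),
  start = c(100L, 500L),
  end   = c(200L, 650L)
)
save(toy_regions, file = path)

# Load into a separate environment so nothing in the workspace is overwritten.
env <- new.env()
object_names <- load(path, envir = env)  # load() returns the names of the restored objects

# Retrieve the first restored object by name.
regions <- get(object_names[1], envir = env)
str(regions)
```

For a repository file, `path` would point to, for example, FANTOM_Mappd_to_Hg38_genome_loci.Rdata.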