DRAFTS

DNA Regulatory element Analysis by cell-Free Transcription and Sequencing

Code and materials from the paper "Multiplex transcriptional characterizations across diverse bacterial species using cell-free systems", Yim SS*, Johns NI*, Park J, Gomes ALC, McBee RM, Richardson M, Ronda C, Chen SP, Garenne D, Noireaux V, Wang HH. Molecular Systems Biology (2019) 15, e8875. (* denotes equal contribution.)

The full paper and supplementary information are available from the journal under the citation above.

Raw sequencing data can be found at NCBI SRA under PRJNA509603.

dependencies

The following must be installed prior to executing the code in this repository. For Python packages, it may be convenient to obtain these through a distribution such as Anaconda. Installation should only take a few minutes.

  • python 3.6.X, ipython/jupyter
    • biopython
    • pandas
    • numpy
    • scipy
    • matplotlib
    • seaborn
  • bbmerge
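Before running the pipeline, it can help to confirm that the dependencies above are visible from the current environment. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and the executable names `bbmerge.sh` and `bbmerge` are assumptions that may need adjusting to match your BBMerge installation.

```python
import importlib.util
import shutil

# Import names of the Python packages listed above
# (biopython is imported as "Bio").
PYTHON_MODULES = ["Bio", "pandas", "numpy", "scipy", "matplotlib", "seaborn"]


def missing_python_modules(modules=PYTHON_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def find_executable(candidates=("bbmerge.sh", "bbmerge")):
    """Return the path of the first candidate found on PATH, or None.

    The candidate names are assumptions, not taken from the repository.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print("missing Python modules:", missing_python_modules() or "none")
    print("bbmerge found at:", find_executable() or "not found")
```

An empty list of missing modules and a non-empty path for bbmerge suggest the environment is ready; the check does not verify package versions.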

1. processing of raw sequencing data

01_DRAFTS_process_raw.sh

  • expects a NextSeq/MiSeq raw data folder in which each sample folder contains 2 files, R1 and R2 (paired-end reads), and reads sequenced on different flowcell lanes are separated into four folders labeled with _L00n
  • assumes folder names are Samplename_L001_, Samplename_L002_, Samplename_L003_, or Samplename_L004_, where Samplename is the SampleID of each sample in the sample sheet for the Illumina sequencing run
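The naming convention above can be checked before processing. The sketch below is an illustration only and does not reproduce the logic of 01_DRAFTS_process_raw.sh: the regular expression, the function names, and the example folder names are all hypothetical, and it assumes the sample ID is everything before the `_L00n_` lane tag.

```python
import re
from collections import defaultdict

# Hypothetical pattern for the naming described above:
# a sample ID, then _L001_ ... _L004_, then any suffix.
LANE_PATTERN = re.compile(r"^(?P<sample>.+?)_L00(?P<lane>[1-4])_")


def group_by_sample(folder_names):
    """Map each sample ID to the sorted lane numbers found for it."""
    lanes = defaultdict(list)
    for name in folder_names:
        match = LANE_PATTERN.match(name)
        if match is None:
            continue  # not a lane folder, ignore it
        lanes[match.group("sample")].append(int(match.group("lane")))
    return {sample: sorted(found) for sample, found in lanes.items()}


def incomplete_samples(grouped, expected=(1, 2, 3, 4)):
    """Return sample IDs whose lanes differ from the expected set."""
    return sorted(s for s, found in grouped.items() if tuple(found) != tuple(expected))


if __name__ == "__main__":
    # Made-up folder names for illustration.
    names = ["lib1_L001_x", "lib1_L002_x", "lib1_L003_x", "lib1_L004_x", "lib2_L001_x"]
    grouped = group_by_sample(names)
    print(grouped)
    print("incomplete:", incomplete_samples(grouped))
```

A sample reported as incomplete has fewer than four lane folders, or a repeated lane, and is worth inspecting before the lanes are combined.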

run 01_DRAFTS_process_raw.sh to 1) find and combine raw NextSeq data in search_dir, 2) unzip them to out_dir, and 3) assemble paired-end reads

bash 01_DRAFTS_process_raw.sh [search_dir] [out_dir (optional)]

after running 01_DRAFTS_process_raw.sh, group DNA-seq and RNA-seq reads in separate folders for further analysis

2. error filtering and barcode counting

02_DRAFTS_extract_data.py

  • out_dir should contain a folder named 01_bccounts with 2 empty folders inside named [01_dna_bccounts, 02_rna_bccounts]
  • out_dir should also contain a folder named 02_log with 11 empty folders inside named [01_bccounts, 02_lowq, 03_missingadapter, 04_badbc, 05_goodbc_badalign, 06_frag, 07_goodbc_perfectalign, 08_goodbc_goodalign, 09_goodbc_perfectalign_bccounts, 10_goodbc_goodalign_bccounts, 11_log_files]
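The folder layout above can be created in one step rather than by hand. This is a minimal sketch, not part of the repository: `make_output_tree` is a hypothetical helper, and the folder names are copied from the list above.

```python
from pathlib import Path

# Folder names copied from the list above.
BCCOUNT_DIRS = ["01_dna_bccounts", "02_rna_bccounts"]
LOG_DIRS = [
    "01_bccounts", "02_lowq", "03_missingadapter", "04_badbc",
    "05_goodbc_badalign", "06_frag", "07_goodbc_perfectalign",
    "08_goodbc_goodalign", "09_goodbc_perfectalign_bccounts",
    "10_goodbc_goodalign_bccounts", "11_log_files",
]


def make_output_tree(out_dir):
    """Create the empty folders expected under out_dir and return their paths."""
    out = Path(out_dir)
    created = []
    for parent, children in (("01_bccounts", BCCOUNT_DIRS), ("02_log", LOG_DIRS)):
        for child in children:
            path = out / parent / child
            # exist_ok=True makes the call safe to repeat on an existing tree.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Calling `make_output_tree("my_out_dir")` (a placeholder path) before running the script leaves existing folders untouched, so it can be rerun safely.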

run 02_DRAFTS_extract_data.py to 1) filter out errors from oligo library synthesis or sequencing, 2) extract barcode counts, and 3) collect other information for QC and additional analysis

python 02_DRAFTS_extract_data.py [ref_csv] [dna_directory] [rna_directory] [out_dir]

3. calculation of transcription levels

03_DRAFTS_compute_tx.py

  • out_dir should contain a folder named 01_tx

run 03_DRAFTS_compute_tx.py to compute 1) abundances of DNA and RNA barcodes from their counts and 2) transcription levels

python 03_DRAFTS_compute_tx.py [ref_csv] [dna_bc_directory] [rna_bc_directory] [out_dir]
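This README does not state the formula used by 03_DRAFTS_compute_tx.py. The sketch below is therefore an assumption rather than the script's method: it treats abundance as a barcode's share of total counts and transcription level as the ratio of RNA abundance to DNA abundance. The function names, the `min_dna_count` threshold, and the example counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert raw barcode counts to fractions of the total count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no counts to normalize")
    return {barcode: n / total for barcode, n in counts.items()}


def transcription_levels(dna_counts, rna_counts, min_dna_count=1):
    """Return RNA abundance divided by DNA abundance per barcode.

    Assumption: this ratio stands in for the transcription level.
    Barcodes with fewer than min_dna_count DNA reads are skipped,
    which also avoids dividing by zero.
    """
    dna_ab = relative_abundance(dna_counts)
    rna_ab = relative_abundance(rna_counts)
    levels = {}
    for barcode, n in dna_counts.items():
        if n == 0 or n < min_dna_count:
            continue
        # A barcode absent from the RNA counts gets a level of 0.
        levels[barcode] = rna_ab.get(barcode, 0.0) / dna_ab[barcode]
    return levels


if __name__ == "__main__":
    # Made-up counts for illustration.
    dna = {"bc1": 10, "bc2": 30}
    rna = {"bc1": 30, "bc2": 10, "bc3": 10}
    print(transcription_levels(dna, rna))
```

Normalizing each library by its total count before taking the ratio keeps the result comparable across samples with different sequencing depths; barcodes seen only in the RNA library are dropped because their DNA abundance is unknown.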
