Running state-of-the-art RNA-seq abundance quantification software "kallisto" on UPMEM DPU system

gary0828gary/RNA-Abundance-Quantification-on-UPMEM

 
 

RNA-Abundance-Quantification-on-UPMEM

More details can be found in the paper: "RNA-seq Quantification on Processing in memory Architecture: Observation and Characterization"

https://doi.org/10.1109/NVMSA56066.2022.00014

Please cite the paper if you use D_kallisto in your work:

Liang-Chi Chen, Shu-Qi Yu, Chien-Chung Ho, Yuan-Hao Chang, Da-Wei Chang, Wei-Chen Wang, Yu-Ming Chang, "RNA-seq Quantification on Processing in memory Architecture: Observation and Characterization," The 11th IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA), August 23-25, 2022

@inproceedings{chen2022rna,
  title={RNA-seq Quantification on Processing in memory Architecture: Observation and Characterization},
  author={Chen, Liang-Chi and Yu, Shu-Qi and Ho, Chien-Chung and Chang, Yuan-Hao and Chang, Da-Wei and Wang, Wei-Chen and Chang, Yu-Ming},
  booktitle={2022 IEEE 11th Non-Volatile Memory Systems and Applications Symposium (NVMSA)},
  pages={26--32},
  year={2022},
  organization={IEEE}
}

Build

# build htslib first
cd ext/htslib
autoheader
autoconf
make -j16
# build our main program
cd ../..
mkdir -p obj
cd src
make -j16

Usage

./D_kallisto pseudo [fastq file]
      -i [index file]
      -o [output path]
      -t [number of CPU threads]
      -d [number of DPUs]
      --single
      -l [estimated average fragment length (double)]
      -s [estimated standard deviation of fragment length (double)]

Example: testing 100K reads with the 11-mer index on 64*8 DPUs

time ./D_kallisto pseudo -i ~/data/experiment/11-mer.idx -o out --single ~/data/experiment/RNA_read/100K.fastq -l 150 -s 30 -t 8 -d 64
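
To repeat the example run with several DPU counts, a small wrapper can generate the command lines. The sketch below is not part of the repository: the function names `dk_cmd` and `dk_sweep`, the `out_d<N>` output directory naming, and the DPU counts 16, 32 and 64 are assumptions made for illustration. It uses only the flags listed under Usage, with the `-l 150 -s 30` values copied from the example above, and it only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): print one D_kallisto command.
# Arguments: index file, FASTQ file, output directory, CPU threads, DPU count.
# The -l 150 -s 30 values are taken from the example run above.
dk_cmd() {
    printf './D_kallisto pseudo -i %s -o %s --single %s -l 150 -s 30 -t %s -d %s\n' \
        "$1" "$3" "$2" "$4" "$5"
}

# Hypothetical helper: print one command per DPU count given after the
# index file, FASTQ file, and CPU thread count.
dk_sweep() {
    index=$1
    reads=$2
    threads=$3
    shift 3
    for d in "$@"; do
        # Each run gets its own output directory (naming is an assumption).
        dk_cmd "$index" "$reads" "out_d$d" "$threads" "$d"
    done
}

# Example: the DPU counts here are arbitrary illustration values.
dk_sweep 11-mer.idx 100K.fastq 8 16 32 64
```

Run from the directory that contains the D_kallisto binary, the printed lines can be reviewed first and then piped to `sh` to execute them. Paths containing spaces would need extra quoting, which this sketch does not handle.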

More information

The DPU program is in src/dpu_app.
DPU allocation and data transfers between the CPU and the DPUs (in both directions) are in src/ProcessReads.cpp.

Reference

kallisto

https://github.com/pachterlab/kallisto

UPMEM

https://github.com/CMU-SAFARI/prim-benchmarks
https://sdk.upmem.com/2021.3.0/
https://sdk.upmem.com/2021.3.0/CppAPI/index.html

Testing CPU-based kallisto

time ./kallisto pseudo -i ~/data/experiment/11-mer.idx -o out --single ~/data/experiment/RNA_read/100K.fastq -l 150 -s 30 -t 8 

