Rust-backed utilities for bioinformatic data processing.
The fastest way to get started is to download the applications from the Releases section (https://github.com/zachcp/bioinformaticstoolkit/releases). This project aims to demonstrate how the Rust toolchain enables efficient cross-platform support for high-performance applications. With Tauri, you can write the entire frontend in any tool that compiles to HTML + JavaScript; in this case I used Quarto to take advantage of its simple composition (it's mostly Markdown + YAML) as well as its built-in use of the Observable runtime.
Below are screenshots of the native application showing the home page; the guide page; an example RNA secondary structure visualization using rnapkin; statistics of a FASTA file, including a histogram of sequence lengths, using noodles for IO; and DNA translation using the protein_translation crate.
# assuming quarto and cargo are installed and on your path.
git clone https://github.com/zachcp/bioinformaticstoolkit.git
cd bioinformaticstoolkit
# install the tauri cli
cargo install tauri-cli
# add cargo bin dir to the path
export PATH=$PATH:~/.cargo/bin/
# to develop
cargo-tauri dev
# to package. this build is ~8MB.
cargo-tauri build
# to test
cd src-tauri && cargo test
# or verbose
cd src-tauri && cargo test -- --nocapture
FASTX:
- convert fasta to fastq
- basic stats of fasta/fastq
- histogram of read lengths (possibly set max number)
- merge PE reads // split interleaved
- splitting into multiple files ( create directory ?)
- filter-fastx length // quality
- sample the fasta/x files
- plot: length x quality metrics ( optional hexagon plots )
- plot: coverage by location.
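The "basic stats" and "histogram of read lengths" items above could be sketched as follows. This is a minimal, standard-library-only illustration, not the toolkit's implementation (which uses noodles for IO); the names `sequence_lengths`, `summarize`, `length_histogram`, and `FastaStats` are hypothetical, and the parser assumes plain FASTA input with `>` header lines.

```rust
use std::collections::BTreeMap;
use std::io::{self, BufRead};

/// Summary statistics for a set of sequence lengths (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct FastaStats {
    pub count: usize,
    pub total_len: usize,
    pub min_len: usize,
    pub max_len: usize,
}

/// Read FASTA records from `reader` and return the length of each sequence.
/// Assumes plain FASTA: a '>' header line followed by one or more sequence lines.
pub fn sequence_lengths<R: BufRead>(reader: R) -> io::Result<Vec<usize>> {
    let mut lengths = Vec::new();
    let mut current: Option<usize> = None;
    for line in reader.lines() {
        let line = line?;
        let line = line.trim_end();
        if line.starts_with('>') {
            // A header closes the previous record (if any) and opens a new one.
            if let Some(len) = current.take() {
                lengths.push(len);
            }
            current = Some(0);
        } else if let Some(len) = current.as_mut() {
            *len += line.len();
        }
    }
    // Flush the final record.
    if let Some(len) = current {
        lengths.push(len);
    }
    Ok(lengths)
}

/// Count, total, minimum, and maximum length; `None` for empty input.
pub fn summarize(lengths: &[usize]) -> Option<FastaStats> {
    let min_len = *lengths.iter().min()?;
    let max_len = *lengths.iter().max()?;
    Some(FastaStats {
        count: lengths.len(),
        total_len: lengths.iter().sum(),
        min_len,
        max_len,
    })
}

/// Bin lengths into fixed-width buckets keyed by each bucket's lower bound.
pub fn length_histogram(lengths: &[usize], bin_width: usize) -> BTreeMap<usize, usize> {
    let bin_width = bin_width.max(1); // guard against division by zero
    let mut hist = BTreeMap::new();
    for &len in lengths {
        *hist.entry((len / bin_width) * bin_width).or_insert(0) += 1;
    }
    hist
}
```

Any `BufRead` works as input, so the same functions can be fed a `BufReader<File>` or an in-memory byte slice. A `BTreeMap` keeps the bins sorted, which is convenient when handing them to a plotting frontend.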
GFA:
- Utilities from GFATK including filtering
- GFAStats
DNA Analysis:
- Digestability of DNA sequences:
- Search for RE locations
- Other Patterns to Avoid
- Data: Standard RE enzymes
- Plot: Genome View of RE sites.
- Global view of Palettes and coding types
- In silico PCR: https://github.com/dlesl/pcr
- Clonifier: https://github.com/dlesl/clonifier
- Phenogram
- Pangenome TK: https://github.com/GeneDx/pgr-tk (cdep in the build)
- RE digest and assembly calculations
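As a rough illustration of the "Search for RE locations" item, the sketch below scans a sequence for a few well-known restriction sites. It is an assumption-laden example rather than planned code: the function names and the tiny `ENZYMES` table are hypothetical, only exact (non-degenerate) recognition sequences are handled, and only the forward strand is searched, which is sufficient for the palindromic sites listed here.

```rust
/// A tiny illustrative enzyme table (name, recognition site).
/// All three sites are palindromic, so a forward-strand scan finds every site.
pub const ENZYMES: [(&str, &str); 3] = [
    ("EcoRI", "GAATTC"),
    ("BamHI", "GGATCC"),
    ("HindIII", "AAGCTT"),
];

/// Return the 0-based start positions of every (possibly overlapping)
/// occurrence of `site` in `seq`, ignoring ASCII case.
pub fn find_sites(seq: &str, site: &str) -> Vec<usize> {
    let seq = seq.to_ascii_uppercase();
    let site = site.to_ascii_uppercase();
    if site.is_empty() {
        // `windows(0)` would panic, and an empty site is meaningless anyway.
        return Vec::new();
    }
    seq.as_bytes()
        .windows(site.len())
        .enumerate()
        .filter(|(_, window)| *window == site.as_bytes())
        .map(|(i, _)| i)
        .collect()
}

/// Scan `seq` for every enzyme in `ENZYMES` and report the hit positions.
pub fn scan_enzymes(seq: &str) -> Vec<(&'static str, Vec<usize>)> {
    ENZYMES
        .iter()
        .map(|&(name, site)| (name, find_sites(seq, site)))
        .collect()
}
```

Non-palindromic or degenerate sites would additionally need a reverse-complement scan and IUPAC-aware matching, which this sketch deliberately leaves out.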
VCF:
- convert
- concat
- split
RNA Secondary Structure:
- RNApkin https://lib.rs/crates/rnapkin
RNA-seq:
- [ ] gencounts https://github.com/NKI-GCF/gensum
- [ ] rust-lapper https://crates.io/crates/rust-lapper
Taxonomy:
- load and display a tree file
- load and display kraken
- load and display bracken
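For the "load and display kraken" item, a first step might be parsing report rows into a structure the frontend can render. The sketch below assumes the standard six-column, tab-separated Kraken 2 report (percent, clade reads, taxon reads, rank code, NCBI taxid, name indented by two spaces per level); `KrakenRow` and `parse_report_line` are hypothetical names, and reports with extra minimizer columns are not handled.

```rust
/// One row of a six-column Kraken-style report (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct KrakenRow {
    pub percent: f64,
    pub clade_reads: u64,
    pub taxon_reads: u64,
    pub rank: String,
    pub taxid: u64,
    /// Nesting depth, derived from the name's leading spaces (assumes 2 per level).
    pub depth: usize,
    pub name: String,
}

/// Parse one tab-separated report line; returns `None` if the line is malformed.
pub fn parse_report_line(line: &str) -> Option<KrakenRow> {
    let fields: Vec<&str> = line.trim_end_matches(['\r', '\n']).split('\t').collect();
    if fields.len() < 6 {
        return None;
    }
    let raw_name = fields[5];
    // Count leading spaces to recover the position in the taxonomy tree.
    let indent = raw_name.len() - raw_name.trim_start_matches(' ').len();
    Some(KrakenRow {
        percent: fields[0].trim().parse().ok()?,
        clade_reads: fields[1].trim().parse().ok()?,
        taxon_reads: fields[2].trim().parse().ok()?,
        rank: fields[3].trim().to_string(),
        taxid: fields[4].trim().parse().ok()?,
        depth: indent / 2,
        name: raw_name.trim().to_string(),
    })
}
```

Keeping `depth` alongside each row lets the frontend rebuild the tree (or simply indent a table) without a second pass over the file.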
Peptides and Proteomics:
JavaScript:
- SGTK
- ribbon
- [jbrowse](https://jbrowse.org/jb2/docs/quickstart_web/)
- ideogram
- genomegraphviewer
Rust Software:
Miscellaneous:
- Genome Card: e.g viz with global genome statistics.
- Genome name, overview, produces compounds
- Utilities for Codons
- VCF plotein
- ASGArt (cdep in the build)
- UDON
- GFAESTUS (c++ dep )
- BioSeq
- 10x Genomics Rust
- fq parser
- fastats
- fqmerge
- ggcat
- light motif
- liftover with crusmapr
- exon
- phylogeny # not much action
- chemical Reaction networks
- gb-io
- charming - a nice gui library
- met map
- barcode counter
- hpo
- nanopore read assessment: https://lib.rs/crates/nanoq#readme-read-report
- niffler
- OBO Validation
- rustyms
- proteinogenic
- rdkit
- bigwig2bam
- Plasmapr: https://github.com/BradyAJohnston/plasmapR
- flate2: use flate2::read::MultiGzDecoder;
- bio_streams
- BRICK Webapp
- Streaming iterators for bioinformatics data
- quickdna
- Conway-Bromage-Lyndon
- https://github.com/weng-lab/logojs-package
- fiber-seq
This is an Observable Framework project. To start the local preview server, run:
npm run dev
Then visit http://localhost:3000 to preview your project.
For more, see https://observablehq.com/framework/getting-started.
A typical Framework project looks like this:
.
├─ docs
│ ├─ components
│ │ └─ timeline.js # an importable module
│ ├─ data
│ │ ├─ launches.csv.js # a data loader
│ │ └─ events.json # a static data file
│ ├─ example-dashboard.md # a page
│ ├─ example-report.md # another page
│ └─ index.md # the home page
├─ .gitignore
├─ observablehq.config.ts # the project config file
├─ package.json
└─ README.md
docs
- This is the “source root” — where your source files live. Pages go here. Each page is a Markdown file. Observable Framework uses file-based routing, which means that the name of the file controls where the page is served. You can create as many pages as you like. Use folders to organize your pages.
docs/index.md
- This is the home page for your site. You can have as many additional pages as you’d like, but you should always have a home page, too.
docs/data
- You can put data loaders or static data files anywhere in your source root, but we recommend putting them here.
docs/components
- You can put shared JavaScript modules anywhere in your source root, but we recommend putting them here. This helps you pull code out of Markdown files and into JavaScript modules, making it easier to reuse code across pages, write tests and run linters, and even share code with vanilla web applications.
observablehq.config.ts
- This is the project configuration file. It defines things such as the pages and sections in the sidebar navigation, and the project’s title.
| Command | Description |
|---|---|
| npm install | Install or reinstall dependencies |
| npm run dev | Start local preview server |
| npm run build | Build your static site, generating ./dist |
| npm run deploy | Deploy your project to Observable |
| npm run clean | Clear the local data loader cache |
| npm run observable | Run commands like observable help |