The Bioinformatics Toolkit

Rust-backed utilities for bioinformatic data processing.

Get started

The fastest way to get started is to download one of the applications from the Releases section: https://github.com/zachcp/bioinformaticstoolkit/releases. This project aims to demonstrate how the Rust toolchain enables efficient cross-platform support for high-performance applications. With Tauri, you can write the entire frontend in any tool that compiles to HTML + JavaScript; in this case I used Quarto to take advantage of its simple composition (it's mostly Markdown + YAML) as well as its built-in use of the Observable runtime.

Screenshots

Below are screenshots of the native application showing the home page; the guide page; an example RNA secondary structure visualization using rnapkin; statistics of a FASTA file, including a histogram of sequence lengths, using noodles for IO; and DNA translation using the protein_translation crate.
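For readers unfamiliar with the DNA translation step, here is a minimal std-only sketch of the idea. It is not the protein_translation crate's API: the names `translate`, `translate_codon`, and `base_index` are made up for illustration. It assumes the standard genetic code (NCBI translation table 1), reads frame 1 only, and drops a trailing partial codon.

```rust
/// Amino acids of the standard genetic code (NCBI translation table 1),
/// ordered by codon with bases ranked T, C, A, G at each position.
const AMINO_ACIDS: &[u8] =
    b"FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

/// Rank of a base in T, C, A, G order (case-insensitive, U treated as T).
fn base_index(base: u8) -> Option<usize> {
    match base.to_ascii_uppercase() {
        b'T' | b'U' => Some(0),
        b'C' => Some(1),
        b'A' => Some(2),
        b'G' => Some(3),
        _ => None,
    }
}

/// Translate one codon; ambiguous or malformed codons become 'X'.
fn translate_codon(codon: &[u8]) -> char {
    if codon.len() != 3 {
        return 'X';
    }
    match (base_index(codon[0]), base_index(codon[1]), base_index(codon[2])) {
        (Some(a), Some(b), Some(c)) => AMINO_ACIDS[16 * a + 4 * b + c] as char,
        _ => 'X',
    }
}

/// Translate a nucleotide string in frame 1; stop codons are written as '*'.
fn translate(dna: &str) -> String {
    dna.as_bytes().chunks_exact(3).map(translate_codon).collect()
}

fn main() {
    println!("{}", translate("ATGGCCTAA"));
}
```

A real tool would likely also want the other reading frames and alternative genetic codes, which this sketch leaves out.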

Develop

# assuming quarto and cargo are installed and on your path.
git clone https://github.com/zachcp/bioinformaticstoolkit.git
cd bioinformaticstoolkit

# install the tauri cli
cargo install tauri-cli

# add the cargo bin dir to the path
export PATH=$PATH:~/.cargo/bin/

# to develop 
cargo-tauri dev

# to package. this build is ~8MB. 
cargo-tauri build

# to test
cd src-tauri && cargo test
# or verbose
cd src-tauri && cargo test -- --nocapture

Other Ideas/Tools for Rust Incorporation

FASTX:

  • convert fasta to fastq
  • basic stats of fasta/fastq
  • histogram of read lengths (possibly set max number)
  • merge PE reads // split interleaved
  • split into multiple files (create a directory?)
  • filter-fastx length // quality
  • sample the fasta/fastq files
  • plot: length x quality metrics (optional hexagon plots)
  • plot: coverage by location.
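As a rough idea of what the basic stats and read-length histogram items could look like, here is a std-only sketch. The app itself uses noodles for IO; this example parses FASTA text by hand to stay self-contained, and every name in it (`FastaStats`, `sequence_lengths`, `stats`, `length_histogram`) is hypothetical rather than taken from the repository.

```rust
use std::collections::BTreeMap;

/// Summary statistics over a set of sequence lengths.
#[derive(Debug, PartialEq)]
struct FastaStats {
    count: usize,
    total_bases: usize,
    min_len: usize,
    max_len: usize,
    mean_len: f64,
}

/// Return the sequence length of each FASTA record.
/// Sequence lines are summed until the next '>' header.
fn sequence_lengths(fasta: &str) -> Vec<usize> {
    let mut lengths = Vec::new();
    let mut current: Option<usize> = None;
    for line in fasta.lines() {
        let line = line.trim();
        if line.starts_with('>') {
            if let Some(len) = current.take() {
                lengths.push(len);
            }
            current = Some(0);
        } else if let Some(len) = current.as_mut() {
            *len += line.len();
        }
    }
    if let Some(len) = current {
        lengths.push(len);
    }
    lengths
}

/// Compute summary statistics; returns None when there are no records.
fn stats(lengths: &[usize]) -> Option<FastaStats> {
    let min_len = *lengths.iter().min()?;
    let max_len = *lengths.iter().max()?;
    let total_bases: usize = lengths.iter().sum();
    Some(FastaStats {
        count: lengths.len(),
        total_bases,
        min_len,
        max_len,
        mean_len: total_bases as f64 / lengths.len() as f64,
    })
}

/// Bin lengths into fixed-width buckets keyed by each bucket's lower bound.
fn length_histogram(lengths: &[usize], bin_width: usize) -> BTreeMap<usize, usize> {
    let width = bin_width.max(1); // guard against a zero bin width
    let mut hist = BTreeMap::new();
    for &len in lengths {
        *hist.entry((len / width) * width).or_insert(0) += 1;
    }
    hist
}

fn main() {
    let fasta = ">seq1\nACGT\nACGT\n>seq2\nACG\n>seq3\nACGTACGTACGT\n";
    let lengths = sequence_lengths(fasta);
    if let Some(s) = stats(&lengths) {
        println!("{:?}", s);
    }
    for (bin, n) in length_histogram(&lengths, 5) {
        println!("{:>4}..{:<4} {}", bin, bin + 4, "#".repeat(n));
    }
}
```

The `BTreeMap` keeps bins sorted by length, so the histogram can be printed or handed to a plotting frontend in order.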

GFA:

  • Utilities from GFATK including filtering
    • GFAStats

DNA Analysis:

VCF:

  • convert
  • concat
  • split

RNA Secondary Structure:

RNA-seq:

  • gencounts https://github.com/NKI-GCF/gensum
  • rust-lapper https://crates.io/crates/rust-lapper

Taxonomy:

  • load and display a tree file
  • load and display kraken
  • load and display bracken
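A sketch of what loading a Kraken report might involve, using only the standard library. It assumes the six-column, tab-separated Kraken 2 report layout (percent, clade reads, directly assigned reads, rank code, taxid, name) with two spaces of name indentation per tree level; reports with extra minimizer columns are skipped. `KrakenRow`, `parse_report_line`, and `parse_report` are illustrative names, not part of this project.

```rust
/// One row of a Kraken-style report (assumed six-column layout).
#[derive(Debug, PartialEq)]
struct KrakenRow {
    percent: f64,
    clade_reads: u64,
    taxon_reads: u64,
    rank: String,
    taxid: u64,
    depth: usize,
    name: String,
}

/// Parse one tab-separated report line; returns None for malformed lines.
fn parse_report_line(line: &str) -> Option<KrakenRow> {
    let fields: Vec<&str> = line.split('\t').collect();
    if fields.len() != 6 {
        return None;
    }
    let raw_name = fields[5];
    let name = raw_name.trim_start_matches(' ');
    // Assumption: the name is indented by two spaces per taxonomic level.
    let depth = (raw_name.len() - name.len()) / 2;
    Some(KrakenRow {
        percent: fields[0].trim().parse().ok()?,
        clade_reads: fields[1].trim().parse().ok()?,
        taxon_reads: fields[2].trim().parse().ok()?,
        rank: fields[3].trim().to_string(),
        taxid: fields[4].trim().parse().ok()?,
        depth,
        name: name.trim_end().to_string(),
    })
}

/// Parse a whole report, silently skipping lines that do not parse.
fn parse_report(text: &str) -> Vec<KrakenRow> {
    text.lines().filter_map(parse_report_line).collect()
}

fn main() {
    // Made-up example rows, not real classification output.
    let report = "100.00\t200\t0\tR\t1\troot\n 50.00\t100\t10\tD\t2\t  Bacteria\n";
    for row in parse_report(report) {
        println!(
            "{}{} [{}] taxid={} clade={} direct={} ({:.2}%)",
            "  ".repeat(row.depth),
            row.name,
            row.rank,
            row.taxid,
            row.clade_reads,
            row.taxon_reads,
            row.percent
        );
    }
}
```

The `depth` field is enough to rebuild the tree for display, since each row's parent is the nearest preceding row with a smaller depth.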

Peptides and Proteomics:

JavaScript:

  • SGTK
  • ribbon
  • jbrowse https://jbrowse.org/jb2/docs/quickstart_web/
  • ideogram
  • genomegraphviewer

Rust Software:

Miscellaneous:

bioinformatics-toolkit

This is an Observable Framework project. To start the local preview server, run:

npm run dev

Then visit http://localhost:3000 to preview your project.

For more, see https://observablehq.com/framework/getting-started.

Project structure

A typical Framework project looks like this:

.
├─ docs
│  ├─ components
│  │  └─ timeline.js           # an importable module
│  ├─ data
│  │  ├─ launches.csv.js       # a data loader
│  │  └─ events.json           # a static data file
│  ├─ example-dashboard.md     # a page
│  ├─ example-report.md        # another page
│  └─ index.md                 # the home page
├─ .gitignore
├─ observablehq.config.ts      # the project config file
├─ package.json
└─ README.md

docs - This is the “source root” — where your source files live. Pages go here. Each page is a Markdown file. Observable Framework uses file-based routing, which means that the name of the file controls where the page is served. You can create as many pages as you like. Use folders to organize your pages.

docs/index.md - This is the home page for your site. You can have as many additional pages as you’d like, but you should always have a home page, too.

docs/data - You can put data loaders or static data files anywhere in your source root, but we recommend putting them here.

docs/components - You can put shared JavaScript modules anywhere in your source root, but we recommend putting them here. This helps you pull code out of Markdown files and into JavaScript modules, making it easier to reuse code across pages, write tests and run linters, and even share code with vanilla web applications.

observablehq.config.ts - This is the project configuration file, which defines settings such as the pages and sections in the sidebar navigation and the project’s title.

Command reference

Command               Description
npm install           Install or reinstall dependencies
npm run dev           Start local preview server
npm run build         Build your static site, generating ./dist
npm run deploy        Deploy your project to Observable
npm run clean         Clear the local data loader cache
npm run observable    Run commands like observable help