
SAPPORO is a workflow and individual task execution system. It is also useful for continuous testing of workflows.


SAPPORO

Sapporo is a workflow execution system that provides an API server compatible with the GA4GH Workflow Execution Service (WES) standard, together with a web GUI.

SAPPORO - Home

Documentation in Japanese

Features

  • Easy to deploy with docker-compose
    • Suitable for on-premise, local cluster, cloud
  • Workflow language and runner flexibility
    • A wrapper run.sh encapsulates the differences in languages/runners/job schedulers
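
The wrapper idea in the last item can be illustrated with a small sketch. This is not Sapporo's actual run.sh logic: the function name `build_command`, the workflow type labels, and the choice of flags are assumptions made for illustration. It only shows how a single entry point can hide runner-specific command lines.

```python
from typing import Callable, Dict, List


def build_command(workflow_type: str, workflow_file: str, params_file: str) -> List[str]:
    """Return a runner command line for the given workflow language.

    Hypothetical dispatch table: the type labels and flag choices are
    illustrative assumptions, not taken from Sapporo's run.sh.
    """
    builders: Dict[str, Callable[[], List[str]]] = {
        # cwltool takes the workflow document followed by the job (inputs) file.
        "CWL": lambda: ["cwltool", workflow_file, params_file],
        # Nextflow reads parameters from a JSON/YAML file via -params-file.
        "Nextflow": lambda: ["nextflow", "run", workflow_file, "-params-file", params_file],
        # Snakemake needs an explicit core count in recent versions.
        "Snakemake": lambda: [
            "snakemake", "--snakefile", workflow_file,
            "--configfile", params_file, "--cores", "1",
        ],
    }
    if workflow_type not in builders:
        raise ValueError(f"unsupported workflow type: {workflow_type}")
    return builders[workflow_type]()


# Callers use the same interface regardless of the language behind it.
command = build_command("CWL", "workflow.cwl", "job.yml")
```

The returned list could be passed to `subprocess.run`; a job scheduler could be supported the same way by prefixing the list with the scheduler's submit command.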

Features work in progress

  • Compatibility with object storage servers
  • Collect and manage batch job results
  • Verification of the reproducibility of a workflow

Architecture

Sapporo has two components, Web and Service, which enable cloud-friendly deployment. Sapporo-fileserver is also available for I/O testing.

SAPPORO - System Architecture

Users need to register Sapporo-service or another WES implementation with Sapporo-web to submit and manage workflows. Details of the components are in the documentation of each repository.
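
As a rough sketch of what a workflow submission to a WES endpoint involves, the snippet below assembles the form fields for the GA4GH WES `POST /runs` request. The function name `build_run_request`, the host and port, and the example workflow URL are hypothetical; the field names follow the GA4GH WES 1.0 specification. Nothing is sent over the network here.

```python
import json
from typing import Any, Dict, Tuple


def build_run_request(
    wes_base_url: str,
    workflow_url: str,
    workflow_type: str,
    workflow_type_version: str,
    workflow_params: Dict[str, Any],
) -> Tuple[str, Dict[str, str]]:
    """Return the target URL and form fields for a WES run submission.

    WES expects these fields as multipart/form-data, with
    workflow_params encoded as a JSON string.
    """
    url = wes_base_url.rstrip("/") + "/runs"
    fields = {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        "workflow_params": json.dumps(workflow_params, sort_keys=True),
    }
    return url, fields


# Hypothetical endpoint and workflow location, for illustration only.
url, fields = build_run_request(
    "http://localhost:8080/ga4gh/wes/v1/",
    "https://example.org/workflows/trimming.cwl",
    "CWL",
    "v1.0",
    {"threads": 2},
)
```

Any HTTP client that supports multipart/form-data can send `fields` to `url`; the response to a successful submission carries a `run_id` that is used for later status queries.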

Aims and expectations

Individual task execution system

From Wikipedia - Batch Computing:

the scripted running of one or more programs, as directed by Job Control Language, with no human interaction other than,

Batch computing is a common technique in many fields of data engineering and science:

  • Animation rendering
  • Software testing
  • Machine learning
  • Genomic data analysis
  • Simulations

Batch jobs usually have problems with portability and reproducibility, because many are implemented for a specific computing environment such as a local computing cluster. Sapporo aims to support container virtualization (e.g. Docker, Singularity), workflow runners (e.g. Airflow, Luigi), and workflow languages (e.g. Common Workflow Language, Workflow Description Language, Nextflow, Snakemake).

SAPPORO - Batch Job

Continuous testing of workflows

Packaging software in containers and describing processes in workflow languages are powerful methods to improve portability. However, several problems can still prevent a workflow from executing, such as:

  • Server down
  • Network down
  • Other processes occupy resources such as CPU, memory, or storage
  • Unexpected modification of container images

Sapporo aims to bring the continuous integration (CI) / continuous deployment (CD) concept to the management of batch job execution. Testing a batch job through WES shows whether the registered batch job runs correctly or fails at some point.
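
A minimal sketch of such a check is shown below, assuming the run state is read from a WES server (for example via `GET /runs/{run_id}/status`). The helper names `wait_for_run` and `run_passed` are hypothetical, and the state source is injected as a callable so that the logic can be exercised without a server. The state names are those defined by the GA4GH WES specification.

```python
import time
from typing import Callable

# States after which a WES run will not change any more.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def wait_for_run(get_state: Callable[[], str], interval: float = 5.0, max_polls: int = 120) -> str:
    """Poll get_state() until the run reaches a terminal state.

    get_state is any callable returning the current WES state string.
    Raises TimeoutError if no terminal state is seen within max_polls.
    """
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    raise TimeoutError(f"run did not finish within {max_polls} polls")


def run_passed(state: str) -> bool:
    """Treat only COMPLETE as a passing test result."""
    return state == "COMPLETE"


# Stand-in for a real status query, replaying a fixed sequence of states.
fake_states = iter(["QUEUED", "INITIALIZING", "RUNNING", "COMPLETE"])
final_state = wait_for_run(lambda: next(fake_states), interval=0)
passed = run_passed(final_state)
```

Running a check like this on a schedule would surface the failure modes listed above (an unreachable server, exhausted resources, a changed container image) as a non-passing state or a timeout.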

License

SAPPORO is released under the Apache 2.0 license.
