quetric/calculon

Calculon - Co-design for large scale parallel applications

Running

Run Calculon like this:

$> PYTHONPATH=. ./bin/calculon <args>

Calculon has a hierarchical command-line interface. To see the commands it accepts, use --help or -h:

$> PYTHONPATH=. ./bin/calculon -h

To see usage for any specific command, pass --help or -h to that command:

$> PYTHONPATH=. ./bin/calculon llm -h

LLM Example

Run a single calculation for LLM (~1 sec):

$> PYTHONPATH=. ./bin/calculon llm models/megatron-1T.json examples/3072_t4_p64_d12_mbs4_full.json systems/a100_80g.json -
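The example execution file name appears to encode the parallelism split (an assumption based on the naming convention, not stated in this README): t4 for tensor parallelism, p64 for pipeline parallelism, d12 for data parallelism, and mbs4 for micro-batch size. A quick sanity check shows the factors multiply to the leading GPU count:

```python
# Hypothetical reading of the file name 3072_t4_p64_d12_mbs4_full.json:
# tensor, pipeline, and data parallelism factors multiply to the GPU count.
tensor_parallelism = 4
pipeline_parallelism = 64
data_parallelism = 12

total_gpus = tensor_parallelism * pipeline_parallelism * data_parallelism
print(total_gpus)  # 3072, matching the leading number in the file name
```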

Run a system execution optimizer for LLM (~1 min):

$> PYTHONPATH=. ./bin/calculon llm-optimal-execution models/turing-530B.json 5128 2520 float16 systems/a100_80g.json output.json -m

output.json will contain the optimal way to run Turing-530B across 5128 A100 GPUs.

To store the results of every successful run from the same experiment, run the exhaustive variant of the optimizer (~1 min):

$> PYTHONPATH=. ./bin/calculon llm-all-executions models/turing-530B.json 5128 2520 float16 systems/a100_80g.json all_output.csv
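Since all_output.csv collects one row per successful run, it can be post-processed with standard CSV tooling. A minimal sketch, using hypothetical column names (inspect the file's actual header row; these are assumptions for illustration):

```python
import csv
import io

# Hypothetical sample mimicking all_output.csv; the real column names are an
# assumption -- check the header row of the generated file for the actual ones.
sample = io.StringIO(
    "tensor_par,pipeline_par,data_par,time\n"
    "4,64,12,1.23\n"
    "8,32,12,1.45\n"
)

rows = list(csv.DictReader(sample))

# Pick the run with the lowest modeled time.
fastest = min(rows, key=lambda r: float(r["time"]))
print(fastest["tensor_par"])  # tensor parallelism of the fastest sampled run
```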

Testing and validation (optional)

To verify that the current build is working, use

$> make test

To validate Calculon's performance model against Megatron runs on NVIDIA's Selene A100-based supercomputer, with results published in the "Sequence parallelism" paper, use

$> PYTHONPATH=. ./bin/calculon llm-validation

Publications

  • Calculon: A Methodology and Tool for High-Level Co-Design of Systems and Large Language Models
    Mikhail Isaev, Nic McDonald, Larry Dennison, Richard Vuduc
    Paper

  • Scaling Infrastructure to Support Multi-Trillion Parameter LLM Training
    Mikhail Isaev, Nic McDonald, Richard Vuduc
    Paper
