⏱️ dyno

test code against a certain rate of production traffic

Overview

Loops a task function, for a given duration, across multiple threads.

A test is deemed successful if it ends without creating a cycle backlog.

example: benchmark a recursive fibonacci function across 4 threads

// benchmark.js
import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() { 
  // <benchmarked-code>

  function fibonacci(n) {
    return n < 1 ? 0
    : n <= 2 ? 1 : fibonacci(n - 1) + fibonacci(n - 2)
  }

  fibonacci(35)

  // </benchmarked-code>
}, {
  // test parameters
  parameters: { cyclesPerSecond: 100, threads: 4, durationMs: 5 * 1000 },
  
  // log live stats
  onTick: list => {    
    console.clear()
    console.table(list().primary().pick('count'))
    console.table(list().threads().pick('mean'))
  }
})

run it:

node benchmark.js

logs:

cycle stats

┌─────────┬────────┬───────────┬─────────┐
│ uptime  │ issued │ completed │ backlog │
├─────────┼────────┼───────────┼─────────┤
│ 4       │ 100    │ 95        │ 5       │
└─────────┴────────┴───────────┴─────────┘

average timings/durations, in ms

┌─────────┬───────────┬────────┐
│ thread  │ evt_loop  │ cycle  │
├─────────┼───────────┼────────┤
│ '46781' │ 10.47     │ 10.42  │
│ '46782' │ 10.51     │ 10.30  │
│ '46783' │ 10.68     │ 10.55  │
│ '46784' │ 10.47     │ 10.32  │
└─────────┴───────────┴────────┘

Install

npm i @nicholaswmin/dyno

Generate benchmark

npx init

creates a preconfigured sample benchmark.js.

Run it:

node benchmark.js

Configuration

import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() { 

  // add benchmarked task
  // code in this block runs in its own thread

}, {
  parameters: { 
    // add test parameters
  },
  
  onTick: list => {    
    // build logging from the provided measurements
  }
})

Test parameters

| name | type | default | description |
|------|------|---------|-------------|
| cyclesPerSecond | Number | 50 | global cycle issue rate |
| durationMs | Number | 5000 | how long the test should run, in milliseconds |
| threads | Number | auto | number of spawned threads |

auto means the number of available cores is detected; it can be overridden.

these parameters are user-configurable on test startup.

The test process

The primary spawns the benchmarked code as task threads.

Then, it starts issuing cycle commands to each one, in round-robin, at a set rate, for a set duration.

The task threads must execute their tasks faster than the time it takes for their next cycle command to come through, otherwise the test will start accumulating a cycle backlog.

When that happens, the test stops; the configured cycle rate is deemed as the current breaking point of the benchmarked code.
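The round-robin dispatch described above can be pictured with a small sketch. This is not dyno's implementation; `issueCycles` and `threadIds` are hypothetical names used only to show how a fixed number of cycle commands would be spread evenly across threads.

```javascript
// Hypothetical sketch: distribute cycle commands across threads in round-robin.
// `issueCycles` and `threadIds` are illustrative names, not part of dyno.
function issueCycles(threadIds, totalCycles) {
  const received = Object.fromEntries(threadIds.map(id => [id, 0]))

  for (let i = 0; i < totalCycles; i++) {
    // cycle 0 goes to the first thread, cycle 1 to the second, then wrap around
    const target = threadIds[i % threadIds.length]
    received[target]++
  }

  return received
}

console.log(issueCycles(['t1', 't2', 't3', 't4'], 10))
// { t1: 3, t2: 3, t3: 2, t4: 2 }
```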

An example:

A benchmark configured to use threads: 4 & cyclesPerSecond: 4.

Each task thread must execute its own code in < 1 second since this is the rate at which it receives cycle commands.
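The arithmetic behind that example can be written out as a rough sketch. `cycleBudgetMs` is a hypothetical helper, not a dyno export, and it assumes cycles are spread evenly across threads.

```javascript
// Time each task thread has per cycle before a backlog starts forming,
// assuming cycles are spread evenly (round-robin) across threads.
// `cycleBudgetMs` is an illustrative helper, not part of dyno.
function cycleBudgetMs({ cyclesPerSecond, threads }) {
  const perThreadRate = cyclesPerSecond / threads // cycles per second, per thread
  return 1000 / perThreadRate
}

console.log(cycleBudgetMs({ cyclesPerSecond: 4, threads: 4 }))   // 1000
console.log(cycleBudgetMs({ cyclesPerSecond: 100, threads: 4 })) // 40
```

Raising `cyclesPerSecond` or lowering `threads` shrinks the budget each thread has to finish its task.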

Glossary

primary

The main process. Orchestrates the test and the spawned task threads.

task thread

The benchmarked code, running in its own separate process.

Receives cycle commands from the primary, executes its code and records its timings.

task

The benchmarked code

cycle

A command that signals a task thread to execute its code.

cycle rate

The rate at which the primary sends cycle commands to the task threads

cycle timing

Amount of time it takes a task thread to execute its own code

cycle backlog

Count of cycle commands that have been issued but not yet executed.
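As a rough illustration of how a cycle backlog builds up, the sketch below steps through a test one second at a time. `simulateBacklog` and `completedPerSecond` are hypothetical names; the real test measures these values rather than simulating them.

```javascript
// Illustrative simulation: a backlog forms once the task threads
// complete fewer cycles per second than the primary issues.
// `simulateBacklog` and its parameters are not part of dyno.
function simulateBacklog({ cyclesPerSecond, completedPerSecond, seconds }) {
  let issued = 0
  let completed = 0

  for (let s = 0; s < seconds; s++) {
    issued += cyclesPerSecond
    // threads cannot complete more cycles than have been issued
    completed = Math.min(issued, completed + completedPerSecond)
  }

  return { issued, completed, backlog: issued - completed }
}

console.log(simulateBacklog({ cyclesPerSecond: 50, completedPerSecond: 50, seconds: 5 }))
// { issued: 250, completed: 250, backlog: 0 }

console.log(simulateBacklog({ cyclesPerSecond: 50, completedPerSecond: 45, seconds: 5 }))
// { issued: 250, completed: 225, backlog: 25 }
```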

The process model

This is how the process model would look, if sketched out.

// assume `fib()` is the benchmarked code

Primary 0: cycles issued: 100, finished: 93, backlog: 7


├── Thread 1
   └── function fib(n) {
       ├── return n < 1 ? 0
       └── : n <= 2 ? 1 : fib(n - 1) + fib(n - 2)}

├── Thread 2
   └── function fib(n) {
       ├── return n < 1 ? 0
       └── : n <= 2 ? 1 : fib(n - 1) + fib(n - 2)}

└── Thread 3
    └── function fib(n) {
        ├── return n < 1 ? 0
        └── : n <= 2 ? 1 : fib(n - 1) + fib(n - 2)}

Metrics

The benchmarker comes with a statistical measurement system that can be optionally used to diagnose bottlenecks.

Some metrics are recorded by default; others can be recorded by the user within a task thread.

Every recorded value is tracked as a Metric, represented as a histogram with min, mean, max properties.

Histogram

A metric is represented as a histogram with the following properties:

| name | description |
|------|-------------|
| count | number of values/samples |
| min | minimum value |
| mean | mean/average of values |
| max | maximum value |
| stddev | standard deviation between values |
| last | last value |
| snapshots | last 50 states |

Timing metrics are collected in milliseconds.
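To make the histogram properties concrete, here is a minimal sketch that derives them from a plain array of recorded values. `summarize` is a hypothetical helper, and the use of population standard deviation is an assumption; the library may compute `stddev` differently.

```javascript
// Illustrative only: derive histogram-like properties from raw samples.
// `summarize` is not part of dyno; population stddev is an assumption.
function summarize(values) {
  const count = values.length
  const mean = values.reduce((sum, v) => sum + v, 0) / count
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / count

  return {
    count,
    min: Math.min(...values),
    mean,
    max: Math.max(...values),
    stddev: Math.sqrt(variance),
    last: values.at(-1)
  }
}

console.log(summarize([2, 4, 4, 4, 5, 5, 7, 9]))
// { count: 8, min: 2, mean: 5, max: 9, stddev: 2, last: 9 }
```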

Querying metrics

Metrics can be queried from the list argument of the onTick callback.

// ...
onTick: list => {    
  // primary metrics
  console.log(list().primary())

  // task thread metrics
  console.log(list().threads()) 
}

.primary()

get all primary/main metrics

// log all primary metrics
console.log(list().primary())

.threads()

get all metrics, for each task thread

// log all metrics of every task thread
console.log(list().threads())

.pick()

reduce all metrics to a single histogram property

list().threads().pick('min')

// from this: { cycle: { min: 4, max: 5 }, evt_loop: { min: 2, max: 8 } ...
// to this  : { cycle: 4, evt_loop: 2 ...

available: min, mean, max, stddev, snapshots, count, last

  • stddev: standard deviation between recorded values
  • last : last recorded value
  • count : number of recorded values
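The shape change that .pick() performs can be mimicked on plain objects. The sketch below is not the library's code; `pickProperty` is a hypothetical stand-in that only reproduces the before/after shapes shown above.

```javascript
// Illustrative only: reduce each metric's histogram to one property.
// `pickProperty` is a hypothetical helper, not dyno's `.pick()`.
function pickProperty(metrics, property) {
  return Object.fromEntries(
    Object.entries(metrics).map(([name, histogram]) => [name, histogram[property]])
  )
}

const metrics = {
  cycle: { min: 4, max: 5 },
  evt_loop: { min: 2, max: 8 }
}

console.log(pickProperty(metrics, 'min')) // { cycle: 4, evt_loop: 2 }
console.log(pickProperty(metrics, 'max')) // { cycle: 5, evt_loop: 8 }
```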

.of()

reduce metrics that have been pick-ed to an array of histograms (such as snapshots) to an array of single histogram values.

list().primary().pick('snapshots').of('max')
// from this: [{ cycle: [{ ... max: 5 }, { ... max: 3 }, { ... max: 2 } ] } ... 
// to this  : [{ cycle: [5,3,2 ....] } ...

note: only makes sense if it comes after .pick('snapshots')
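The same idea can be sketched on plain data: each metric holds an array of snapshot histograms, and one property is pulled out of every snapshot. `ofProperty` is a hypothetical helper that only reproduces the shapes shown above, not the library's `.of()`.

```javascript
// Illustrative only: map each metric's snapshots to a single property.
// `ofProperty` is a hypothetical helper, not dyno's `.of()`.
function ofProperty(snapshotsByMetric, property) {
  return Object.fromEntries(
    Object.entries(snapshotsByMetric).map(([name, snapshots]) => [
      name,
      snapshots.map(snapshot => snapshot[property])
    ])
  )
}

const pickedSnapshots = {
  cycle: [{ min: 1, max: 5 }, { min: 1, max: 3 }, { min: 1, max: 2 }]
}

console.log(ofProperty(pickedSnapshots, 'max')) // { cycle: [ 5, 3, 2 ] }
```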

.metrics()

get specific metric(s) instead of all of them

const loopMetrics = list().threads().metrics('evt_loop', 'fibonacci')
// only the `evt_loop` and `fibonacci` metrics

.sortBy()

sort by specific metric

const sorted = list().threads().pick('min').sort('cycle', 'desc')
// sort by descending min 'cycle' durations

available: desc, asc
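For intuition, sorting per-thread rows by one metric might look like the sketch below. `sortRows` is a hypothetical helper operating on plain objects; it is not the library's method and makes no claim about how the library sorts internally.

```javascript
// Illustrative only: sort per-thread rows by a metric, ascending or descending.
// `sortRows` is a hypothetical helper, not part of dyno.
function sortRows(rows, metric, direction = 'asc') {
  const sign = direction === 'desc' ? -1 : 1

  // copy first so the input array is left untouched
  return [...rows].sort((a, b) => sign * (a[metric] - b[metric]))
}

const rows = [{ cycle: 10.42 }, { cycle: 10.55 }, { cycle: 10.3 }]

console.log(sortRows(rows, 'cycle', 'desc'))
// [ { cycle: 10.55 }, { cycle: 10.42 }, { cycle: 10.3 } ]
```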

.group()

get result as an Object, like Object.groupBy, with the metric name used as the key.

const obj = list().threads().pick('snapshots').of('mean').group()

Default metrics

The following metrics are collected by default:

primary

| name | description |
|------|-------------|
| issued | count of issued cycles |
| completed | count of completed cycles |
| backlog | size of cycles backlog |
| uptime | seconds since test start |

threads

| name | description |
|------|-------------|
| cycle | cycle timings |
| evt_loop | event loop timings |

any custom metrics will appear here.

Recording custom metrics

Custom metrics can be recorded with either:

  • performance.timerify
  • performance.measure

both of them are native extensions of the User Timing APIs.

The metrics collector records their timings and attaches the tracked Metric histogram to its corresponding task thread.

example: instrumenting a function using performance.timerify:

// performance.timerify example

import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() { 

  performance.timerify(function fibonacci(n) {
    return n < 1 ? 0
      : n <= 2 ? 1
      : fibonacci(n - 1) + fibonacci(n - 2)
  })(30)

}, {
  parameters: { cyclesPerSecond: 20 },
  
  onTick: list => {    
    console.log(list().threads().metrics().pick('mean'))
  }
})

// logs 
// ┌─────────┬───────────┐
// │ cycle   │ fibonacci │
// ├─────────┼───────────┤
// │ 7       │ 7         │
// │ 11      │ 5         │
// │ 11      │ 5         │
// └─────────┴───────────┘

note: the stats collector uses the function name for the metric name, so named functions should be preferred to anonymous arrow-functions.
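The difference comes down to the standard `name` property of JavaScript functions. The short sketch below uses only plain language features, no dyno API, to show what a collector has to work with in each case; `metricNameOf` is a hypothetical helper.

```javascript
// Plain JavaScript: a named function exposes its name,
// an inline anonymous function exposes an empty string.
// `metricNameOf` is a hypothetical helper, not part of dyno.
function metricNameOf(fn) {
  return fn.name === '' ? null : fn.name
}

function fibonacci(n) {
  return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2)
}

console.log(metricNameOf(fibonacci))      // 'fibonacci'
console.log(metricNameOf(() => {}))       // null
console.log(metricNameOf(function () {})) // null
```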

Plotting

Each metric contains up to 50 snapshots of its past states.

This allows plotting them as a timeline, using the console.plot module.

The following example benchmarks 2 sleep functions & plots their timings as an ASCII chart

// Requires: 
// `npm i @nicholaswmin/console-plot --no-save`

import { dyno } from '@nicholaswmin/dyno'
import console from '@nicholaswmin/console-plot'

await dyno(async function cycle() { 

  await performance.timerify(function sleepRandom1(ms) {
    return new Promise(r => setTimeout(r, Math.random() * ms))
  })(Math.random() * 20)
  
  await performance.timerify(function sleepRandom2(ms) {
    return new Promise(r => setTimeout(r, Math.random() * ms))
  })(Math.random() * 20)
  
}, {

  parameters: { cyclesPerSecond: 15, durationMs: 20 * 1000 },

  onTick: list => {  
    console.clear()
    console.plot(list().threads().pick('snapshots').of('mean').group(), {
      title: 'Plot',
      subtitle: 'mean durations (ms)'
    })
  }
})

which logs:

Plot

-- sleepRandom1  -- cycle  -- sleepRandom2  -- evt_loop

11.75 ┤╭╮                                                                                                   
11.28 ┼─────────────────────────────────────────────────────────────────────╮                               
10.82 ┤│╰───╮    ╭╯ ╰╮   │╰╮  ╭─────────╯╰──────────╮ ╭─────────────────╯   ╰───────────╮╭─╮    ╭────────── 
10.35 ┼╯    ╰╮╭╮╭╯   ╰───╯ ╰──╯                     ╰─╯                                 ╰╯ ╰────╯           
 9.88       ╰╯╰╯                                                                                           
 9.42                                                                                                      
 8.95                                                                                                      
 8.49                                                                                                      
 8.02                                                                                                      
 7.55                                                                                                      
 7.09 ┤╭╮                                                                                                   
 6.62 ┼╯╰───╮    ╭─────────╮   ╭──╮                                                                         
 6.16      ╰╮╭──╯         ╰───╯  ╰───────────────────────╮       ╭─────────────────────╮╭───╮   ╭───────── 
 5.69 ┤╭╮    ╰╯                                        ╭───────────╮  ╭╮╭──────╮        ╰╯   ╰──╭╮╭─╮╭───── 
 5.22 ┤│╰╮╭─╮   ╭──╮     ╭───╮╭─╮ ╭────────────────────╯           ╰──╯╰╯      ╰────────────────╯╰╯ ╰╯      
 4.76 ┤│ ╰╯ ╰───╯  ╰─────╯   ╰╯ ╰─╯                                                                         
 4.29 ┼╯                                                                                                    

mean durations (ms)

- last: 100 items
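The snapshot cap mentioned at the start of this section can be pictured as a bounded buffer that drops its oldest entry once full. This is a sketch under that assumption; `createSnapshots` is a hypothetical helper and the library may store snapshots differently.

```javascript
// Illustrative only: keep at most `limit` past states, dropping the oldest.
// `createSnapshots` is a hypothetical helper, not part of dyno.
function createSnapshots(limit = 50) {
  const items = []

  return {
    push(state) {
      items.push(state)
      if (items.length > limit) items.shift() // drop the oldest state
    },
    toArray() {
      return [...items]
    }
  }
}

const snapshots = createSnapshots(50)

for (let i = 1; i <= 60; i++) snapshots.push({ mean: i })

console.log(snapshots.toArray().length)      // 50
console.log(snapshots.toArray()[0].mean)     // 11
console.log(snapshots.toArray().at(-1).mean) // 60
```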

Gotchas

Missing custom metrics

Using lambdas/arrow functions means the metrics collector has no function name to use for the metric. By their own definition, they are anonymous.

Change this:

const foo = () => {
  // test code
}

performance.timerify(foo)()

to this:

function foo() {
  // test code
}

performance.timerify(foo)()

Code running multiple times

The benchmark file forks itself. 👀

This means that any code that exists outside the dyno block will also run in multiple threads.

This is a design tradeoff, made to allow simple, single-file benchmarks, but it can cause issues if you intend to run code after dyno() resolves, or when running this as part of an automated test suite.

In this example, 'done' is logged 3 times instead of 1:

import { dyno } from '@nicholaswmin/dyno'

const result = await dyno(async function cycle() { 
  // task code, expected to run 3 times ...
}, { parameters: { threads: 3 } })

console.log('done')
// 'done'
// 'done'
// 'done'

Using hooks

To work around this, the before/after hooks can be used for setup and teardown, like so:

await dyno(async function cycle() { 
  console.log('task')
}, {
  parameters: { durationMs: 5 * 1000, },

  before: async parameters => {
    console.log('before')
  },

  after: async parameters => {
    console.log('after')
  }
})

// "before"
// ...
// "task"
// "task"
// "task"
// "task"
// ...  
// "after"

Fallback to using a task file

Alternatively, the task function can be extracted to its own file.

// task.js
import { task } from '@nicholaswmin/dyno'

task(async function task(parameters) {
  // task code ...

  // `benchmark.js` test parameters are
  // available here.
})

then referenced as a path in benchmark.js:

// benchmark.js

import { join } from 'node:path'
import { dyno } from '@nicholaswmin/dyno'

const result = await dyno(join(import.meta.dirname, './task.js'), { 
  parameters: { threads: 5 }
})

console.log('done')
// 'done'

This should be the preferred method when running this as part of a test suite.

Not a load-testing tool

This is not a stress-testing tool.
Stress-tests are far more complex and require a near-perfect replication of an actual production environment.

This is a prototyping tool that helps test whether a prototype idea is worth pursuing or whether it has unworkable scalability issues.

Its multi-threaded model is meant to mimic the execution model of horizontally-scalable, share-nothing services.

Its original purpose was benchmarking a module prototype that interacts heavily with a data store over a network.

It's not meant for side-by-side benchmarking of synchronous code; Google's Tachometer is a much better fit for that.

Tests

install deps:

npm ci

unit & integration tests:

npm test

test coverage:

npm run test:coverage

note: the parameter prompt is suppressed when NODE_ENV=test

meta checks:

npm run checks

Misc

generate a sample benchmark:

npx init

generate Heroku-deployable benchmark:

npx init-cloud

Contributors

Todos are available here

Scripts

update README.md code snippets:

npm run examples:update

source examples are located in: /bin/example

Authors

@nicholaswmin

License

MIT-0 License