Skip to content

CAFxX/mlns

Repository files navigation

Minimal Latency N-Sorter (MLNS)

This repo contains MLNS for small values of N, as well as the generator used to generate them.

A N-sorter1 is a circuit that, given N input values, returns them sorted. A 2-sorter, commonly used in sorting networks, is a special case of N-sorter; a 2-sorter implements the expression {o0, o1} = i0 < i1 ? {i0, i1} : {i1, i0}.

A MLNS is an N-sorter that aims at ensuring that the sorting happens with the smallest possible latency. Specifically, the latency virtually does not depend on the number of inputs N2, as all comparisons are done in parallel (instead of in serial stages like they are done in sorting networks).

Generator

To generate the N-sorter for a specifc value of N, run: ruby mlnsgen/mlnsgen.rb <N> (e.g. ruby mlnsgen/mlnsgen.rb 4) or ruby mlnsgen/mlnsgenfast.rb <N>.

mlnsgen

The current algorithm performs an exhaustive search (i.e. uses superlinear time), so it is too slow for values of N much higher than 8 (and furthermore the resulting Verilog code that implements such wide N-sorters becomes impractically large for synthesis tools to handle efficiently).

mlnsgenfast

This generator directly synthesizes the cases. It can be used to generate larger sorters (but keep in mind that sorters larger than 10 are going to result in gigabytes of Verilog code).

Generated sorters

The files nsorter_*.v contain the generated n-sorters.

Just to provide a rough idea, these are the currently available generated n-sorters and their characteristics:

N Comparisons Orderings Gates3 Area (µm²)
2 1 2 447 6906.64
3 3 6 1736 14589.04
4 6 24 3543 27781.64
5 10 120 6366 61711.75
6 15 720 12551 ?
7 21 5040 41387 ?
8 28 40320 250448 ?
9 36 362880 ? ?
... ... ... ... ...
$N$ $\dfrac{N^2-N}{2}$ $N!$ ? ?

SiliconCompiler 0.30.0 yields the following for the smaller n-sorters:

4-sorter synthesized on the skywater130 process 4-sorter synthesized on the skywater130 process 4-sorter synthesized on the skywater130 process 5-sorter synthesized on the skywater130 process

License

MIT

Footnotes

  1. Sometimes also called k-sorter.

  2. Specifically, the number of stages does not depend on the number of inputs, like it does in sorting networks, because there is a single comparison stage and a single selection stage regardless of the number of inputs N, and the latency of the comparison stage is constant regardless of number of inputs N. The selection stage though does contain N-way mux, so the latency of the selection stage is proportional to log2(N).

  3. Yosys 0.33, synth; 64 bit values.

  4. SiliconCompiler 0.30.0, skywater130 process, density 50; 64 bit values. 2 3

  5. SiliconCompiler 0.30.0, skywater130 process, density 40; 64 bit values.