A VM based on the (turing-complete!) Subleq one-instruction set computer implemented in JS and circom.
git clone https://github.com/lucasgleba/zkSubleq.git
cd zkSubleq
npm install
npx mocha # might take 30+ seconds
You will need circom and snarkjs to compile circuits and generate proofs. Check out the circom introduction.
Subleq?
Given three operands A, B, C
- sub: Set
Memory[B] -= Memory[A]
. - leq: If the new value is less than or equal to 0, set
ProgramCounter = C
. OtherwiseProgramCounter += 1
.
There is only one instruction, so no opcodes are needed. Other than the program counter (aka instruction pointer), there are no registers.
To make zk-SNARKs practical, the state of the machine is Merkleized; only the memory slots involved in a given step are passed as input signals.
// Circuit for one VM step I/O
// **** INPUT ****
signal input pcIn; // Program counter value
signal input aAddr; // A operand (address)
signal input bAddr; // B operand (address)
signal input cIn; // C operand (instruction index)
signal input aIn; // Memory[A]
signal input bIn; // Memory[B]
signal input mRoot0; // Memory subtree root
signal input sRoot0; // State tree root
// Merkle proof elements (paths are derived in the circuit)
signal input aAddrPathElements[mLevels]; // aAddr -> mRoot
signal input bAddrPathElements[mLevels]; // bAddr -> mRoot
signal input cInPathElements[mLevels]; // cIn -> mRoot
signal input aInPathElements[mLevels]; // AIn -> mRoot
signal input bInPathElements[mLevels]; // bIn/bOut -> mRoot
// **** OUTPUT ****
// Values after step execution
signal output pcOut; // Program counter value
signal output mRoot1; // Memory subtree root
signal output sRoot1; // State tree root
MEMORY
+-----------+---+-------+-------+-----+---+-------+---+-------+---+
| Address: | … | I+0 | I+1 | I+2 | … | aAddr | … | bAddr | … |
+-----------+---+-------+-------+-----+---+-------+---+-------+---+
| Value: | … | aAddr | bAddr | cIn | … | aIn | … | bIn | … |
+-----------+---+-------+-------+-----+---+-------+---+-------+---+
[_____________________]
instruction
(I=PC*3)
Instruction set too simple
Turing-complete, yes. But using this for real-life programs would be super slow. Given that most of the constraints in the circuit come from validating Merkle proofs, a more sophisticated instruction set would not cost much and would speed things up a lot. It would be good to make it easily transpilable/compilable from a popular ISA/intermediate representation.
One memory slot per signal
Every leaf node in the state tree corresponds to a register or memory slot so values can go as high as circom integers (2^253+), as opposed to conventional computers limited to 8-bit memory slots. It would be better to pack multiple smaller registers/slots into a single value to make the state tree more shallow. e.g., one instruction should correspond to one signal, not three.
Two many long Merkle proofs to validate
The circuit validates 6 long Merkle proofs (memory slot -> memory root): aAddr, bAddr, cIn, aIn, bIn, bOut. A better machine would have a load-store architecture where most of the operations happen between registers (which would be in a much smaller subtree) and at most one memory slot is read from or written to every step (other than the instruction).
I'm making a VM implementing the improvements mentioned above based on a risc-v instruction set. I expect to be able to run transpiled real-world software for less than 75k constrains/step.
Update: I implemented a Risc-V CPU in circom. I can generate a somewhat efficient zk-Proof for a slice of execution of minimally transpiled rv32i binary.
Ping me on twitter if you want to learn more.
One can use more sophisticated zero-knowledge math to make a much more efficient (but complex) zkVM. Most of the efforts in this front are being made by teams working on EVM-compatible zero-knowledge rollups.