The VSD Sky130 RTL Design and Synthesis Workshop was a five-day workshop conducted by VSD-IAT. I learnt about the open source tools used in the VLSI industry, timing libraries (.lib), hierarchical and flat synthesis, optimization of combinational and sequential logic, gate level simulation (GLS), and blocking and non-blocking statements. I briefly explain these topics day by day below, cataloging my journey through the five days.
- Day-2 Timing libs, hierarchical versus flat synthesis, and efficient flop styles
- Day-4 Gate level simulation (GLS), synthesis-simulation mismatch, and blocking versus non-blocking statements
A simulator is a tool for checking the correctness of a design. The RTL design is the implementation of a given specification, and its adherence to that specification is verified by simulating it. iverilog was used to simulate the designs in this workshop. The simulator looks for changes in the input signals: the output is re-evaluated only when an input changes, so if no input changes, the output does not change either.
The design is the Verilog code, or set of Verilog files, whose function is intended to meet the required specification, and it should be synthesizable. A design has one or more primary inputs and primary outputs, depending on the specification.
The testbench is another Verilog file that instantiates the top-level module of the design and applies stimulus to it. It does not need to be synthesizable, and it has no primary inputs or primary outputs of its own.
The above image shows the flow of the iverilog simulator: its inputs are the design files and the testbench, and its output is a value change dump (vcd) file.
In the same image, the vcd file produced by the simulation is passed to gtkwave, a vcd waveform viewer, where we can check the correctness of the design logic, whether the circuit is simple or complex, combinational or sequential.
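A minimal sketch of the design and testbench split described above. The module names, signal names, and stimulus timing are assumptions made for illustration and are not copied from the workshop files.

```verilog
`timescale 1ns / 1ps

// Design: a synthesizable 2:1 mux (hypothetical example module)
module mux2_example (input i0, input i1, input sel, output y);
    assign y = sel ? i1 : i0;
endmodule

// Testbench: has no ports and does not need to be synthesizable
module tb_mux2_example;
    reg  i0, i1, sel;   // stimulus driven by the testbench
    wire y;             // output observed from the design

    // Instantiate the design under test
    mux2_example uut (.i0(i0), .i1(i1), .sel(sel), .y(y));

    initial begin
        // Record every value change in a vcd file for gtkwave
        $dumpfile("tb_mux2_example.vcd");
        $dumpvars(0, tb_mux2_example);
        i0  = 1'b0;
        i1  = 1'b0;
        sel = 1'b0;
        #300 $finish;
    end

    // Stimulus generators: toggle each input at a different rate
    always #75 sel = ~sel;
    always #10 i0  = ~i0;
    always #55 i1  = ~i1;
endmodule
```

Compiling this single file with iverilog and running the resulting a.out should produce tb_mux2_example.vcd, which can then be opened in gtkwave.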
Here I cover how to set up the lab environment to simulate, view waveforms for, and synthesize a design from its specification.
First we create a VLSI directory using mkdir VLSI and then change into it using cd VLSI, as shown below
$ mkdir VLSI
$ cd VLSI
$ ls
Here the ls command lists all the files in the current directory, as shown in the above image.
Before cloning any repository on a Linux platform, we can either use sudo -i to start a root login shell or simply clone the repository into the present folder as a normal user. We also need to check whether the git package is already installed; if not, install it using sudo apt-get install git.
Now that the prerequisites for cloning are in place, we clone the vsdflow repository and the sky130 RTL Design and Synthesis Workshop repository.
The VSDFLOW repository can be cloned using git clone
$ git clone https://github.com/kunalg123/vsdflow.git
$ cd vsdflow
$ ls -ltr
This image shows the contents of vsdflow directory after cloning its repository.
After the command ./opensource_eda_tool_install.sh is executed, the following open source tools are installed:
a) Yosys is an RTL synthesis tool
b) blifFanout is a high fanout net synthesis tool
c) graywolf is a tool used for the placement phase
d) qrouter is a tool for detailed routing of the design
e) magic is a tool used to extract the GDSII layout file and to carry out DRC and antenna checks
f) netgen is a tool for checking Layout vs Schematic (LVS) for a given design
g) OpenTimer and OpenSTA are static timing analysis (STA) tools
The sky130 RTL Design and Synthesis Workshop repository can be cloned by
$ git clone https://github.com/kunalg123/sky130RTLDesignAndSynthesisWorkshop.git
$ cd sky130RTLDesignAndSynthesisWorkshop
$ ls -ltr
This image displays the contents of the sky130 RTL Design and Synthesis Workshop directory in the long listing format.
$ cd my_lib
$ ls
$ cd lib
$ ls
$ cd ../verilog_model
$ ls
These two images display the subfolders of my_lib, the contents of lib, and the Verilog model files in the sky130 RTL Design and Synthesis Workshop directory.
In this lab iverilog was used as the simulator and gtkwave as the vcd waveform viewer.
$ iverilog good_mux.v tb_good_mux.v
$ ls
$ ./a.out
$ gtkwave tb_good_mux.vcd
These images demonstrate how iverilog was used to simulate the design code good_mux.v along with the testbench tb_good_mux.v, and how the resulting vcd file is viewed using gtkwave. The final image shows the design and testbench files opened with the gedit command.
Note: The testbench and the design code can be passed to the iverilog command in any order. In the above example we can run either iverilog good_mux.v tb_good_mux.v or iverilog tb_good_mux.v good_mux.v.
A synthesizer is a tool that converts RTL code to a netlist. Yosys is the synthesizer used in the workshop. Constraints are the guidance given to the synthesizer.
The above image shows the flow of yosys. Yosys takes the RTL code and .lib files as input and produces a netlist as output. The netlist is the representation of the design in terms of standard cells.
From the above image, we use read_verilog to read the design code, read_liberty to read the .lib files, and finally write_verilog to generate the netlist.
To verify the synthesized design, we use the same iverilog flow as in the iverilog lab section, with the netlist in place of the RTL.
When simulating the netlist, the output should be the same as what we observed in the iverilog lab, so we use the same testbench as in the iverilog and gtkwave lab.
The RTL design is the behavioral representation of the required specification.
In logic synthesis, the RTL code is translated into gates, the connections between the gates are made, and the result is written out as a netlist.
The .lib is a collection of logic modules consisting of basic gates such as and, or, not, etc., and it can hold different flavors of the same gate.
In a logic path, the combinational delay determines the maximum speed of operation of the digital logic circuit.
# TCLK > TCQ_A+TCOMBI+TSETUP_B
# THOLD_B< TCQ_A+TCOMBI
The above images describe one clock cycle and how the minimum clock period can be calculated. We need fast cells to meet setup time at a higher clock speed, but we need slow cells to avoid hold time violations. The collection of fast and slow cells forms the .lib.
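As a worked example of these two checks, with assumed delay values chosen only for illustration (they are not taken from the sky130 .lib), let TCQ_A = 0.5 ns, TCOMBI = 2.0 ns, TSETUP_B = 0.3 ns and THOLD_B = 0.2 ns:

$$
\begin{aligned}
T_{CLK} &> T_{CQ\_A} + T_{COMBI} + T_{SETUP\_B} = 0.5 + 2.0 + 0.3 = 2.8\ \text{ns} \\
f_{max} &= \frac{1}{2.8\ \text{ns}} \approx 357\ \text{MHz} \\
T_{HOLD\_B} &= 0.2\ \text{ns} < T_{CQ\_A} + T_{COMBI} = 2.5\ \text{ns}
\end{aligned}
$$

With these numbers both conditions are met. Swapping in faster cells lowers TCOMBI, which raises the maximum frequency but also shrinks the hold margin.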
The output loads in digital circuits are mostly capacitive. The faster the capacitance is charged or discharged, the lower the cell delay, and for this we need transistors capable of sourcing more current.
Faster cells trade off power and area for their lower delay.
Wider transistors have lower delay but consume more power and area. Narrower transistors, on the other hand, have higher delay but consume less power and area.
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog good_mux.v
yosys> synth -top good_mux
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
sky130_fd_sc_hd__o21ai_1 in the netlist visualization has 3 inputs and 1 output: a 2-input OR gate feeds the first input of a 2-input NAND gate. sky130_fd_sc_hd is the high density digital standard cell library from the SkyWater foundry.
sky130_fd_sc_hd__tt_025C_1v80.lib --> tt is the typical process corner, 025C is the temperature parameter and 1v80 is the voltage parameter.
Hierarchy Synthesis
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog multiple_modules.v
yosys> synth -top multiple_modules
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show multiple_modules
yosys> write_verilog -noattr multiple_modules_hier.v
yosys> show
yosys> !gedit multiple_modules_hier.v
yosys> flatten
yosys> write_verilog -noattr multiple_modules_flat.v
yosys> !gedit multiple_modules_flat.v
The above image compares flops with asynchronous reset, asynchronous set, and synchronous reset.
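The difference between these flop styles comes down to what appears in the sensitivity list. The sketch below illustrates the three styles; the module and port names are assumptions and may differ from the dff_*.v files used in the lab.

```verilog
// Asynchronous reset: async_reset is in the sensitivity list,
// so q clears as soon as async_reset rises, independent of clk.
module dff_asyncres_example (input clk, input async_reset, input d, output reg q);
    always @(posedge clk, posedge async_reset) begin
        if (async_reset)
            q <= 1'b0;
        else
            q <= d;
    end
endmodule

// Asynchronous set: same structure, but q is forced to 1.
module dff_async_set_example (input clk, input async_set, input d, output reg q);
    always @(posedge clk, posedge async_set) begin
        if (async_set)
            q <= 1'b1;
        else
            q <= d;
    end
endmodule

// Synchronous reset: only clk is in the sensitivity list,
// so sync_reset takes effect only at the next rising clock edge.
module dff_syncres_example (input clk, input sync_reset, input d, output reg q);
    always @(posedge clk) begin
        if (sync_reset)
            q <= 1'b0;
        else
            q <= d;
    end
endmodule
```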
Using Simulator and Waveform viewer
$ iverilog dff_asyncres.v tb_dff_asyncres.v
$ ./a.out
$ gtkwave tb_dff_asyncres.vcd
$ iverilog dff_async_set.v tb_dff_async_set.v
$ ./a.out
$ gtkwave tb_dff_async_set.vcd
$ iverilog dff_syncres.v tb_dff_syncres.v
$ ./a.out
$ gtkwave tb_dff_syncres.vcd
Using Synthesizer
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog dff_asyncres.v
yosys> synth -top dff_asyncres
yosys> dfflibmap -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
$ gedit mult_*.v
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog mult_2.v
yosys> synth -top mult2
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
The logic is squeezed to obtain the most optimized design in terms of area and power.
- Constant propagation (direct optimization)
Take the example Y = ((AB) + C)'.
If A = 0, then AB = 0 and Y = (0 + C)' = C'. So the whole expression reduces to a single inverter on input C.
- Boolean Logic Optimization
-K map -Quine McCluskey
Here we take the example assign y = a ? (b ? c : (c ? a : 0)) : (!c). This statement uses the ternary operator and describes a chain of muxes. If a=1, y takes the expression b ? c : (c ? a : 0); otherwise (a=0) it takes !c. Within that expression, if b=1 it takes c's value; otherwise (b=0) it takes c ? a : 0. Finally, if c=1 that innermost expression takes a's value; otherwise (c=0) it is 0.
The overall MUX expression simplifies as follows:
Y = ((ac + c'.0).b' + bc).a + a'c'
Y = (acb' + bc).a + a'c' (since c'.0 = 0)
Y = acb' + abc + a'c'
Y = ac(b' + b) + a'c' (since b' + b = 1)
Y = ac + a'c'
i.e. Y = a xnor c
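A quick way to confirm a simplification like this is to compare the original expression against the reduced form for every input combination. The testbench below is a hypothetical sketch (it is not one of the workshop files) that checks the nested ternary expression against the XNOR of a and c, which is what ac + a'c' describes.

```verilog
module tb_ternary_check_example;
    reg a, b, c;

    // Original nested ternary expression
    wire y_mux  = a ? (b ? c : (c ? a : 1'b0)) : (!c);
    // Reduced form: ac + a'c' is the XNOR of a and c
    wire y_xnor = ~(a ^ c);

    integer i;
    integer errors;

    initial begin
        errors = 0;
        // Walk through all 8 combinations of {a, b, c}
        for (i = 0; i < 8; i = i + 1) begin
            {a, b, c} = i[2:0];
            #1; // let the continuous assignments settle
            if (y_mux !== y_xnor) begin
                errors = errors + 1;
                $display("MISMATCH a=%b b=%b c=%b", a, b, c);
            end
        end
        if (errors == 0)
            $display("all 8 input combinations match");
        $finish;
    end
endmodule
```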
- Basic
-Sequential constant propagation
Here in this circuit, the Q value is 0 whether or not reset is applied. So Y is always 1, since a 0 on one input of the NAND gate forces its output to 1.
- Advanced
-State optimization ( optimization of unused states)
-Retiming (Splitting the logic more equally to get a better effective frequency for a sequential circuit)
-Sequential Logic cloning (floor plan aware synthesis)
$ ls *opt*
$ ls *opt_check*
Here we take the first file, the opt_check module, where we have assign y = a ? b : 0. This is a mux: at a=1, y takes b's value, but at a=0, y is 0, so the logic reduces to y = a & b.
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog opt_check.v
yosys> synth -top opt_check
yosys> opt_clean -purge
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
Here we take the second file, the opt_check2 module, where we have assign y = a ? 1 : b. This is a mux: at a=1, y is 1, but at a=0, y takes b's value, so the logic reduces to y = a | b.
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog opt_check2.v
yosys> synth -top opt_check2
yosys> opt_clean -purge
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
$ iverilog dff_const2.v tb_dff_const2.v
$ ./a.out
$ gtkwave tb_dff_const2.vcd
$ ls *dff*const*
$ gedit dff_const1.v dff_const2.v
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog dff_const2.v
yosys> synth -top dff_const2
yosys> opt_clean -purge
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
$ iverilog dff_const1.v tb_dff_const1.v
$ ./a.out
$ gtkwave tb_dff_const1.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog dff_const1.v
yosys> synth -top dff_const1
yosys> opt_clean -purge
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
$ gedit counter_opt.v
$ iverilog counter_opt.v tb_counter_opt.v
$ ./a.out
$ gtkwave tb_counter_opt.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog counter_opt.v
yosys> synth -top counter_opt
yosys> dfflibmap -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
Here we run the testbench with the netlist as the DUT (Design Under Test). The netlist is logically the same as the RTL code, so the same testbench can be used for both. We need GLS to verify the logical correctness of the design after synthesis and to ensure that the timing of the design is met. If the gate level models are delay annotated, GLS can also be used for timing validation.
Synthesis-simulation mismatch can happen because of the conditions listed below:
- Missing sensitivity list
Here we take an example code
Here 'y' changes only when 'sel' changes. The always block is evaluated only when 'sel' changes; it is not sensitive to activity on i0 or i1, so the output y does not follow changes in i0 and i1. In simulation this circuit therefore behaves like a latch rather than a mux.
The solution to this problem is the code shown in the image below.
Here the always block is evaluated for changes on 'sel', 'i0' and 'i1', so we now get the expected mux output.
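The two versions of the mux can be sketched as follows. This is an illustrative reconstruction with assumed module names, not a copy of the lab files; the signal names sel, i0, i1 and y follow the description above.

```verilog
// Mismatch-prone version: only sel wakes up the always block,
// so the simulator ignores changes on i0 and i1 while sel is stable.
module mux_sel_only_example (input i0, input i1, input sel, output reg y);
    always @(sel) begin
        if (sel)
            y = i1;
        else
            y = i0;
    end
endmodule

// Fixed version: @(*) makes the block sensitive to every signal it
// reads (sel, i0 and i1), so simulation follows the synthesized mux.
module mux_all_inputs_example (input i0, input i1, input sel, output reg y);
    always @(*) begin
        if (sel)
            y = i1;
        else
            y = i0;
    end
endmodule
```

Synthesis generally produces the same mux for both versions, which is why the first one shows up as a mismatch between RTL simulation and GLS rather than as a synthesis error.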
- Blocking versus non blocking assignments
Inside always block
- Blocking (=): executes the statements in the order they are written, so the first statement is evaluated before the second statement
- Non-blocking (<=): evaluates all the RHS when the always block is entered and then assigns them to the LHS. In other words, these assignments are evaluated in parallel.
Comparing the code in the above images, the first image shows the correct code. In the second image, the third begin block has q0 = d followed by q = q0 instead of q = q0 followed by q0 = d. By the time q0 is assigned to q, q0 already holds the value of d, so only one flip flop is inferred instead of the two in the first image of this section.
The problem in the second image can be fixed by using non blocking statements as shown below:
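A minimal sketch of a two-flop shift register written with non-blocking assignments. The module name is an assumption; the signals d, q0 and q follow the description above.

```verilog
module shift_reg_example (input clk, input reset, input d, output reg q);
    reg q0;

    always @(posedge clk, posedge reset) begin
        if (reset) begin
            q0 <= 1'b0;
            q  <= 1'b0;
        end
        else begin
            // All right-hand sides are sampled before any update,
            // so q receives the OLD value of q0 regardless of the
            // order in which these two lines are written.
            q0 <= d;
            q  <= q0;
        end
    end
endmodule
```

Because the order of the two assignments no longer matters, two flops are inferred in both simulation and synthesis.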
We take a different example to understand the problem more clearly
In the above image we have y = q0 & c followed by q0 = a | b, so y is evaluated with the previous value of q0, and only then is a | b assigned to q0. In simulation this code mimics a delay, as if q0 were a flip flop output.
The solution to this problem is shown in the image below
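For combinational logic written with blocking assignments, the fix is to compute the intermediate value before it is used. A minimal sketch, with an assumed module name and the signal names from the description above:

```verilog
module blocking_order_example (input a, input b, input c, output reg y);
    reg q0;

    always @(*) begin
        q0 = a | b;   // compute the intermediate value first
        y  = q0 & c;  // then use the freshly computed q0
    end
endmodule
```

With this ordering, y depends only on the current values of a, b and c, so RTL simulation matches the synthesized gates.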
$ iverilog ternary_operator_mux.v tb_ternary_operator_mux.v
$ ./a.out
$ gtkwave tb_ternary_operator_mux.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog ternary_operator_mux.v
yosys> synth -top ternary_operator_mux
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> write_verilog -noattr ternary_operator_mux_net.v
yosys> show
$ iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v ternary_operator_mux_net.v tb_ternary_operator_mux.v
$ ./a.out
$ gtkwave tb_ternary_operator_mux.vcd
$ iverilog bad_mux.v tb_bad_mux.v
$ ./a.out
$ gtkwave tb_bad_mux.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog bad_mux.v
yosys> synth -top bad_mux
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> write_verilog -noattr bad_mux_net.v
yosys> show
$ iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v bad_mux_net.v tb_bad_mux.v
$ ./a.out
$ gtkwave tb_bad_mux.vcd
In the above image we have d = x & c followed by x = a | b, so d is evaluated with the previous value of x, as if x were a flopped output in simulation.
$ gedit blocking_caveat.v
$ iverilog blocking_caveat.v tb_blocking_caveat.v
$ ./a.out
$ gtkwave tb_blocking_caveat.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog blocking_caveat.v
yosys> synth -top blocking_caveat
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> write_verilog -noattr blocking_caveat_net.v
yosys> show
$ iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v blocking_caveat_net.v tb_blocking_caveat.v
$ ./a.out
$ gtkwave tb_blocking_caveat.vcd
If statements are mainly used to create priority logic. These statements are used inside an always block. Variables assigned inside if statements must be declared as reg.
The syntax and structure of the if statement is given below
if <condition 1>
begin
---------
---------
---------
end
else if <condition 2>
begin
---------
---------
---------
end
else if <condition 3>
begin
---------
---------
---------
end
else
begin
---------
---------
---------
end
Depending on the number of if / else if branches in the above structure, these statements map to hardware as a chain of muxes, with the first condition having the highest priority.
Incomplete if statements can cause inferred latches, which is a particularly bad coding style. An example is shown below:
if (condt1)
y=a;
else if (condt2)
y=b;
If we leave the code incomplete like this, then when none of the conditions is satisfied y must hold its previous value, because no else branch was provided. The synthesizer implements this with a latch, which is called an inferred latch.
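The sketch below places the incomplete form next to a completed one. The module names and the extra input c used in the final else branch are assumptions added for illustration.

```verilog
// Incomplete if: when neither condition is true, y is not assigned,
// so a latch is inferred to hold the previous value of y.
module incomplete_if_example (input condt1, input condt2, input a, input b, output reg y);
    always @(*) begin
        if (condt1)
            y = a;
        else if (condt2)
            y = b;
    end
endmodule

// Complete if: the final else assigns y on every path,
// so the logic is purely combinational (a priority mux).
module complete_if_example (input condt1, input condt2, input a, input b, input c, output reg y);
    always @(*) begin
        if (condt1)
            y = a;
        else if (condt2)
            y = b;
        else
            y = c;
    end
endmodule
```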
There are cases where incomplete if statements are used intentionally, for example in counters, where the incomplete if statement looks like this,
always @ (posedge clk, posedge reset)
begin
if(reset)
count<=3'b000;
else if (en)
count<=count+1;
end
If en is low, the counter holds its previous count value for the next clock cycle. Since counters are sequential circuits, this storage is exactly what we want, so this is one of the exceptions where an incomplete if statement is acceptable.
Case statements are used inside an always block. Variables assigned inside case statements must be declared as reg. An example of a case statement is shown below:
reg y;
always @(*)
begin
case(sel)
2'b00: begin
---------
---------
end
2'b01: begin
---------
---------
end
---------------
---------------
---------------
default: -----------
--------------------
endcase
end
Incomplete case statements also cause inferred latches, just as incomplete if statements do. An example is shown below:
reg y;
always @(*)
begin
case(sel)
2'b00: begin
---------
---------
end
2'b01: begin
---------
---------
end
endcase
end
Since we have not specified what is to be done for the sel values after 2'b01, the hardware maps the remaining cases to a latch that retains the previous value.
The solution to this problem is to add a default case to avoid the inferred latch.
reg y;
always @(*)
begin
case(sel)
2'b00: begin
x=a;
y=b;
end
2'b01: begin
x=c;
end
default: begin
x=d;
y=b;
end
endcase
end
Here the 2'b01 branch has a partial assignment: x is assigned 'c' but y is not assigned at all. Because of this a latch is again inferred, so that y retains its previous value when sel is 2'b01.
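One common way to avoid partial assignments is to give every output a default value at the top of the always block, so that each branch only needs to override what differs. The sketch below applies this to the x and y example; the module name and the choice of defaults (x = d, y = b) are assumptions for illustration.

```verilog
module case_full_assign_example (
    input [1:0] sel,
    input a, input b, input c, input d,
    output reg x,
    output reg y
);
    always @(*) begin
        // Defaults: every output is assigned on every path
        x = d;
        y = b;
        case (sel)
            2'b00: begin
                x = a;
                y = b;
            end
            2'b01: begin
                x = c;   // y keeps its default (b), so no latch is inferred
            end
            default: begin
                x = d;
                y = b;
            end
        endcase
    end
endmodule
```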
reg y;
always @(*)
begin
case(sel)
2'b00: begin
---------
---------
end
2'b01: begin
---------
---------
end
2'b10: begin
---------
---------
end
2'b1?: begin
---------
---------
end
endcase
end
Here we get unpredictable output. Case items are checked one after the other, and the last item 2'b1?: says that the MSB is 1 while the LSB can be either 1 or 0, so it overlaps with the 2'b10 item and we cannot predict the exact value of the final output.
This is an overlapping case statement.
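The overlap goes away when each case item names exactly one value of sel. A minimal sketch of a 4:1 mux written this way, with an assumed module name and input names:

```verilog
module case_no_overlap_example (
    input [1:0] sel,
    input i0, input i1, input i2, input i3,
    output reg y
);
    always @(*) begin
        case (sel)
            2'b00: y = i0;
            2'b01: y = i1;
            2'b10: y = i2;
            2'b11: y = i3;   // explicit value instead of the wildcard 2'b1?
        endcase
    end
endmodule
```

Each value of sel matches exactly one item, and all four values of the 2-bit select are covered.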
We will be taking an example, incomp_if file
$ iverilog incomp_if.v tb_incomp_if.v
$ ./a.out
$ gtkwave tb_incomp_if.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog incomp_if.v
yosys> synth -top incomp_if
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
We will be taking another example, incomp_if2
$ iverilog incomp_if2.v tb_incomp_if2.v
$ ./a.out
$ gtkwave tb_incomp_if2.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog incomp_if2.v
yosys> synth -top incomp_if2
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
We take an example of a complete case statement file
$ iverilog comp_case.v tb_comp_case.v
$ ./a.out
$ gtkwave tb_comp_case.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog comp_case.v
yosys> synth -top comp_case
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
We take another example, an incomplete case statement file
$ iverilog incomp_case.v tb_incomp_case.v
$ ./a.out
$ gtkwave tb_incomp_case.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog incomp_case.v
yosys> synth -top incomp_case
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
Looping constructs are of two kinds: for loops and generate for loops.
For loops can be used only inside an always block. They are used for evaluating expressions, not for instantiating hardware. The example we take is a 32:1 mux, as shown below:
integer i;
always @(*)
begin
for(i=0; i<32; i=i+1)
begin
if (i==sel)
y= inp[i]; //Assuming that the inp[31:0] bus is declared in the main module.
end
end
A generate for loop cannot be used inside an always block; it must be placed outside. It is used for instantiating hardware. An example of a generate for loop is shown below
Suppose we need to instantiate and u_and( .a(), .b(), .y()) 20 times. It is not feasible to write and u_and( .a(), .b(), .y()) out that many times, i.e.
and u_and( .a(), .b(), .y())
and u_and( .a(), .b(), .y())
and u_and( .a(), .b(), .y())
..................
..................
..................
..................
Till 20th and gate instantiation
Instead we use generate for loops to instantiate gates and smaller modules in the main module as many times as we want.
genvar i;
generate
for (i=0; i<8; i=i+1) begin : gen_and
and u_and(y[i], a[i], b[i]); // gate primitive ports are positional: output first, then inputs
end
endgenerate
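The same construct replicates module instances as well as gates. The sketch below builds a ripple carry adder from full adder instances; the module names fa_example and rca_example, the WIDTH parameter, and the port names are assumptions and may differ from the fa.v and rca.v files used in the lab.

```verilog
// 1-bit full adder (hypothetical helper module)
module fa_example (input a, input b, input cin, output sum, output cout);
    assign {cout, sum} = a + b + cin;
endmodule

// Ripple carry adder: WIDTH full adders chained through carry[]
module rca_example #(parameter WIDTH = 8) (
    input  [WIDTH-1:0] a,
    input  [WIDTH-1:0] b,
    output [WIDTH:0]   sum
);
    wire [WIDTH:0] carry;
    assign carry[0] = 1'b0;   // no carry into the least significant bit

    genvar i;
    generate
        for (i = 0; i < WIDTH; i = i + 1) begin : gen_fa
            // One full adder per bit; the carry out of bit i
            // feeds the carry in of bit i+1
            fa_example u_fa (
                .a(a[i]),
                .b(b[i]),
                .cin(carry[i]),
                .sum(sum[i]),
                .cout(carry[i+1])
            );
        end
    endgenerate

    // The final carry becomes the most significant bit of the sum
    assign sum[WIDTH] = carry[WIDTH];
endmodule
```

Changing WIDTH resizes the adder without touching the loop body, which is the main advantage over writing each instance by hand.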
We take an example using the mux_generate file.
$ iverilog mux_generate.v tb_mux_generate.v
$ ./a.out
$ gtkwave tb_mux_generate.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog mux_generate.v
yosys> synth -top mux_generate
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> show
We take an example using the rca.v file, which is a ripple carry adder
$ iverilog fa.v rca.v tb_rca.v
$ ./a.out
$ gtkwave tb_rca.vcd
$ yosys
yosys> read_liberty -lib ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> read_verilog fa.v rca.v
yosys> synth -top rca
yosys> abc -liberty ../my_lib/lib/sky130_fd_sc_hd__tt_025C_1v80.lib
yosys> write_verilog -noattr rca_net.v
yosys> show
$ iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v rca_net.v tb_rca.v
$ ./a.out
$ gtkwave tb_rca.vcd
Here we see that the waveform obtained in normal simulation and GLS simulation are the same.
With this we come to the end of the five-day workshop. I have learnt, in theory and in practice, how to simulate a design, display its waveforms, synthesize it to a netlist, run gate level simulation on that netlist, and finally display the waveforms again to check the correctness of the design, starting from small Verilog codes and moving to more complex ones.
- Kunal Ghosh - Co-founder(VSD Corp. Pvt. Ltd)
- Shon Taware - VSD Teaching Assistant
- Chaitanya Bharathi