Configs: update MinimalConfig for FPGA (OpenXiangShan#809)
* Configs: add MinimalFPGAConfig

* TODO: change cache parameters

* Chore: add parameter print

* README: add simulation usage

Currently, XiangShan does not support the NOOP FPGA flow, so the
FPGA-related instructions are removed.

* Configs: limit frontend width in MinimalConfig

* MinimalConfig: limit L1/L2 cache size

* MinimalConfig: limit ptw size, disable L2

* MinimalConfig: limit L3 size

* Sbuffer: force trigger write if sbuffer is full
AugustusWillisWang authored May 12, 2021
1 parent 632fc81 commit 05f23f5
Showing 14 changed files with 124 additions and 102 deletions.
5 changes: 3 additions & 2 deletions Makefile
@@ -8,6 +8,7 @@ MEM_GEN = ./scripts/vlsi_mem_gen

SIMTOP = top.SimTop
IMAGE ?= temp
CONFIG ?= DefaultConfig

# co-simulation with DRAMsim3
ifeq ($(WITH_DRAMSIM3),1)
@@ -30,7 +31,7 @@ help:

$(TOP_V): $(SCALA_FILE)
mkdir -p $(@D)
mill XiangShan.test.runMain $(FPGATOP) -td $(@D) --full-stacktrace --output-file $(@F) --disable-all --remove-assert --infer-rw --repl-seq-mem -c:$(FPGATOP):-o:$(@D)/$(@F).conf $(SIM_ARGS)
mill XiangShan.test.runMain $(FPGATOP) -td $(@D) --config $(CONFIG) --full-stacktrace --output-file $(@F) --disable-all --remove-assert --infer-rw --repl-seq-mem -c:$(FPGATOP):-o:$(@D)/$(@F).conf $(SIM_ARGS)
$(MEM_GEN) $(@D)/$(@F).conf --tsmc28 --output_file $(@D)/tsmc28_sram.v > $(@D)/tsmc28_sram.v.conf
$(MEM_GEN) $(@D)/$(@F).conf --output_file $(@D)/sim_sram.v
# sed -i -e 's/_\(aw\|ar\|w\|r\|b\)_\(\|bits_\)/_\1/g' $@
@@ -58,7 +59,7 @@ $(SIM_TOP_V): $(SCALA_FILE) $(TEST_FILE)
mkdir -p $(@D)
@echo "\n[mill] Generating Verilog files..." > $(TIMELOG)
@date -R | tee -a $(TIMELOG)
$(TIME_CMD) mill XiangShan.test.runMain $(SIMTOP) -td $(@D) --full-stacktrace --output-file $(@F) --infer-rw --repl-seq-mem -c:$(SIMTOP):-o:$(@D)/$(@F).conf $(SIM_ARGS)
$(TIME_CMD) mill XiangShan.test.runMain $(SIMTOP) -td $(@D) --config $(CONFIG) --full-stacktrace --output-file $(@F) --infer-rw --repl-seq-mem -c:$(SIMTOP):-o:$(@D)/$(@F).conf $(SIM_ARGS)
$(MEM_GEN) $(@D)/$(@F).conf --output_file $(@D)/$(@F).sram.v
@git log -n 1 >> .__head__
@git diff >> .__diff__
88 changes: 27 additions & 61 deletions README.md
@@ -1,7 +1,7 @@
# NOOP
# XiangShan

NOOP(NJU Out-of-Order Processor) is a processor targeting super-scalar out-of-order execution.
Currently it only supports riscv32.
XiangShan is a processor targeting super-scalar out-of-order execution.
Currently it supports riscv64GC.

## Compile chisel code

@@ -13,75 +13,41 @@ Currently it only supports riscv32.

## Run programs by simulation

### Prepare environment

* Set a new environment variable `NEMU_HOME` to the **absolute path** of the NEMU project.
* Set a new environment variable `NOOP_HOME` to the **absolute path** of the NOOP project.
* Set a new environment variable `NOOP_HOME` to the **absolute path** of the XiangShan project.
* Clone the [AM project](https://github.com/NJU-ProjectN/nexus-am.git).
* Set a new environment variable `AM_HOME` to the **absolute path** of the AM project.
* Add a new AM `riscv64-noop` in the AM project if it is not provided.
* Run the application in the AM project by `make ARCH=riscv64-noop run`.

## Run on FPGA

### Sub-directories Overview
```
fpga
├── board # supported FPGA boards and files to build a Vivado project
├── boot # PS boot flow of zynq and zynqmp
├── lib # HDL sources shared by different boards
├── Makefile
├── Makefile.check
└── noop.tcl # wrapper of NOOP core in the Vivado project
```

### Build a Vivado project

* Install Vivado 2019.1, and source the setting of Vivado and SDK
* Run the following command to build a Vivado project
```
cd fpga
make PRJ=myproject BOARD=axu3cg
```
Change `axu3cg` to the target board you want. Supported boards are listed under `board/`.
The project will be created under `board/axu3cg/build/myproject-axu3cg`.
* Open the project with Vivado and generate bitstream.
### Verilator simulation

### Prepare SD card
Install verilator:

Refer to the instructions of [fpga/boot/README.md](fpga/boot/README.md).
TBD

NOTE: Remember to put the bitstream into BOOT.BIN, since the guide is going to boot everything from SD card.
Generate verilog files and compile them using verilator:
* From the project root, run `make emu` to build the Verilator simulator. Use `make emu CONFIG=CONFIG_NAME` to choose a differently sized XiangShan configuration (the Makefile variable is the uppercase `CONFIG`).
* To speed up compilation, use `make emu REMOTE=YOUR_REMOTE_SERVER` (if you have a remote server set up).

### Set your board to SD boot mode
Run program generated by verilator:
* If compilation succeeds, you can run the application in the AM project with `make ARCH=riscv64-noop run`.
* Alternatively, run the emulator and select the image manually: `./build/emu -i PROGRAM_IMAGE`.
* Use parameters to control emulator behavior: `./build/emu [-b DUMP_BEGIN_TIME] [-e DUMP_END_TIME] [--force-dump-result] [--dump-wave] -i PROGRAM_IMAGE`.
* Run `./build/emu` for further instructions.

Please refer to the user guide of your board.
* [zedboard](http://www.zedboard.org/sites/default/files/ZedBoard_HW_UG_v1_1.pdf)
* [zcu102](https://www.xilinx.com/support/documentation/boards_and_kits/zcu102/ug1182-zcu102-eval-bd.pdf)
* [sidewinder](http://sidewinder.fidus.com)
* ultraZ (currently not available to the public)
* axu3cg (currently not available to the public)
Example:
```sh
make emu CONFIG=MinimalSimConfig
./build/emu -b 0 -e 0 --force-dump-result -i ./mem.bin
```

### Boot linux in PS
The `debug` directory provides some scripts for Verilator simulation.

Just insert the SD card into the board, open a serial terminal and powerup the board.
### VCS simulation

### Boot NOOP (the RISC-V subsystem)
Make sure you have VCS installed.

To boot the RISC-V subsystem
* Send `fpga/resource/ddr-loader/ddr-loader.c` to PS.
This can be achieved by either copying the file to SD card,
or by sending the file with `scp` if you have your board connected to your host by network.
* Compile the loader by gcc on PS.
```
gcc -O2 -o ddr-loader ddr-loader.c
```
* Send the RISC-V program (bin file, should start at 0x80000000) to PS.
* Open minicom on PS to connect to the UART of NOOP.
Note that you can connect to PS via `ssh` and use `tmux` to get multiple terminals.
```
minicom -D /dev/ttyUL1
```
* Use the loader to load the program to NOOP memory and start running NOOP.
```
./ddr-loader axu3cg bin-file
```
* To shutdown the board, first run `poweroff` in PS.
* Run `make simv` to build the VCS simulator.
* After that, run `./simv`.
4 changes: 2 additions & 2 deletions src/main/scala/system/SoC.scala
@@ -13,12 +13,12 @@ case class SoCParameters
cores: List[XSCoreParameters],
EnableILA: Boolean = false,
extIntrs: Int = 150,
useFakeL3Cache: Boolean = false
useFakeL3Cache: Boolean = false,
L3Size: Int = 4 * 1024 * 1024 // 4MB
){
val PAddrBits = cores.map(_.PAddrBits).reduce((x, y) => if(x > y) x else y)
// L3 configurations
val L3InnerBusWidth = 256
val L3Size = 4 * 1024 * 1024 // 4MB
val L3BlockSize = 64
val L3NBanks = 4
val L3NWays = 8
49 changes: 45 additions & 4 deletions src/main/scala/top/Configs.scala
@@ -20,11 +20,18 @@ class DefaultConfig(n: Int) extends Config((site, here, up) => {
)
})

// TODO: disable L2 and L3
// Synthesizable minimal XiangShan
// * It is still an out-of-order, super-scalar arch
// * L1 cache included
// * L2 cache NOT included
// * L3 cache included
class MinimalConfig(n: Int = 1) extends Config(
new DefaultConfig(n).alter((site, here, up) => {
case SoCParamsKey => up(SoCParamsKey).copy(
cores = up(SoCParamsKey).cores.map(_.copy(
DecodeWidth = 2,
RenameWidth = 2,
FetchWidth = 4,
IssQueSize = 8,
NRPhyRegs = 80,
LoadQueueSize = 16,
@@ -33,6 +40,8 @@ class MinimalConfig(n: Int = 1) extends Config(
BrqSize = 8,
FtqSize = 16,
IBufSize = 16,
StoreBufferSize = 4,
StoreBufferThreshold = 3,
dpParams = DispatchParameters(
IntDqSize = 8,
FpDqSize = 8,
@@ -41,19 +50,51 @@ class MinimalConfig(n: Int = 1) extends Config(
FpDqDeqWidth = 4,
LsDqDeqWidth = 4
),
icacheParameters = ICacheParameters(
nSets = 8, // 4KB ICache
tagECC = Some("parity"),
dataECC = Some("parity"),
replacer = Some("setplru"),
nMissEntries = 2
),
dcacheParameters = DCacheParameters(
nSets = 8, // 4KB DCache
nWays = 4,
tagECC = Some("secded"),
dataECC = Some("secded"),
replacer = Some("setplru"),
nMissEntries = 4,
nProbeEntries = 4,
nReleaseEntries = 4,
nStoreReplayEntries = 4,
),
L2Size = 16 * 1024, // 16KB
L2NWays = 8,
EnableBPD = false, // disable TAGE
EnableLoop = false,
TlbEntrySize = 4,
TlbSPEntrySize = 2,
PtwL1EntrySize = 2,
PtwL2EntrySize = 2,
PtwL3EntrySize = 4,
PtwL2EntrySize = 64,
PtwL3EntrySize = 128,
PtwSPEntrySize = 2,
useFakeL2Cache = true,
)),
L3Size = 32 * 1024, // 32KB
)
})
)

// Non-synthesizable MinimalConfig, for fast simulation only
class MinimalSimConfig(n: Int = 1) extends Config(
new MinimalConfig(n).alter((site, here, up) => {
case SoCParamsKey => up(SoCParamsKey).copy(
cores = up(SoCParamsKey).cores.map(_.copy(
useFakeDCache = true,
useFakePTW = true,
useFakeL1plusCache = true,
)),
useFakeL3Cache = true
)
})
)
)
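
The override pattern used by `MinimalConfig` and `MinimalSimConfig` (copy the SoC parameters, then copy every core inside them) can be sketched in plain Scala without the rocket-chip `Config` machinery. This is only an illustration: `CoreParams`, `SoCParams` and their lowercase field names are hypothetical stand-ins for `XSCoreParameters` and `SoCParameters`, and the `decodeWidth` default is a placeholder. Only the L2/L3 sizes are taken from this commit.

```scala
// Hypothetical, simplified stand-ins for XSCoreParameters / SoCParameters.
final case class CoreParams(
  decodeWidth: Int = 4,             // placeholder default, not taken from the commit
  l2Size: Int = 512 * 1024,         // 512KB, the default in this commit
  useFakeL2Cache: Boolean = false
)

final case class SoCParams(
  cores: List[CoreParams],
  l3Size: Int = 4 * 1024 * 1024     // 4MB, the default in this commit
)

object ConfigOverrideSketch {
  val default: SoCParams = SoCParams(cores = List.fill(1)(CoreParams()))

  // Same shape as MinimalConfig: copy the SoC, and copy each core within it.
  // Fields that are not named in copy(...) keep their previous values.
  val minimal: SoCParams = default.copy(
    cores = default.cores.map(_.copy(
      decodeWidth = 2,
      l2Size = 16 * 1024,           // 16KB
      useFakeL2Cache = true
    )),
    l3Size = 32 * 1024              // 32KB
  )

  def main(args: Array[String]): Unit = {
    println(default)
    println(minimal)
  }
}
```

Moving `L3Size` from the body of `SoCParameters` into its constructor is what makes this kind of `copy(L3Size = ...)` override possible: a `val` in the class body cannot be changed through `copy`.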
4 changes: 2 additions & 2 deletions src/main/scala/xiangshan/PMA.scala
@@ -78,11 +78,11 @@ object AddressSpace {
def MemMapList = SimpleMemMapList

def printMemmap(){
println("-------------------- memory map --------------------")
println("\nMemory map:")
for(i <- MemMapList){
println("[" + i._1._1 + " -> " + i._1._2 + "] Width:" + (if(i._2.get("width").get == "h0") "unlimited" else i._2.get("width").get) + " Description:" + i._2.get("description").get + " [" + i._2.get("mode").get + "]")
}
println("----------------------------------------------------")
println("")
}

def checkMemmap(){
60 changes: 33 additions & 27 deletions src/main/scala/xiangshan/Parameters.scala
@@ -84,6 +84,7 @@ case class XSCoreParameters
LoadPipelineWidth: Int = 2,
StorePipelineWidth: Int = 2,
StoreBufferSize: Int = 16,
StoreBufferThreshold: Int = 7,
RefillSize: Int = 512,
TlbEntrySize: Int = 32,
TlbSPEntrySize: Int = 4,
@@ -92,9 +93,33 @@ case class XSCoreParameters
PtwL1EntrySize: Int = 16,
PtwL2EntrySize: Int = 2048, //(256 * 8)
NumPerfCounters: Int = 16,
icacheParameters: ICacheParameters = ICacheParameters(
tagECC = Some("parity"),
dataECC = Some("parity"),
replacer = Some("setplru"),
nMissEntries = 2
),
l1plusCacheParameters: L1plusCacheParameters = L1plusCacheParameters(
tagECC = Some("secded"),
dataECC = Some("secded"),
replacer = Some("setplru"),
nMissEntries = 8
),
dcacheParameters: DCacheParameters = DCacheParameters(
tagECC = Some("secded"),
dataECC = Some("secded"),
replacer = Some("setplru"),
nMissEntries = 16,
nProbeEntries = 16,
nReleaseEntries = 16,
nStoreReplayEntries = 16
),
L2Size: Int = 512 * 1024, // 512KB
L2NWays: Int = 8,
useFakePTW: Boolean = false,
useFakeDCache: Boolean = false,
useFakeL1plusCache: Boolean = false
useFakeL1plusCache: Boolean = false,
useFakeL2Cache: Boolean = false
){
val loadExuConfigs = Seq.fill(exuParameters.LduCnt)(LdExeUnitCfg)
val storeExuConfigs = Seq.fill(exuParameters.StuCnt)(StExeUnitCfg)
@@ -193,6 +218,7 @@ trait HasXSParameter {
val LoadPipelineWidth = coreParams.LoadPipelineWidth
val StorePipelineWidth = coreParams.StorePipelineWidth
val StoreBufferSize = coreParams.StoreBufferSize
val StoreBufferThreshold = coreParams.StoreBufferThreshold
val RefillSize = coreParams.RefillSize
val DTLBWidth = coreParams.LoadPipelineWidth + coreParams.StorePipelineWidth
val TlbEntrySize = coreParams.TlbEntrySize
@@ -206,29 +232,9 @@ trait HasXSParameter {
val instBytes = if (HasCExtension) 2 else 4
val instOffsetBits = log2Ceil(instBytes)

val icacheParameters = ICacheParameters(
tagECC = Some("parity"),
dataECC = Some("parity"),
replacer = Some("setplru"),
nMissEntries = 2
)

val l1plusCacheParameters = L1plusCacheParameters(
tagECC = Some("secded"),
dataECC = Some("secded"),
replacer = Some("setplru"),
nMissEntries = 8
)

val dcacheParameters = DCacheParameters(
tagECC = Some("secded"),
dataECC = Some("secded"),
replacer = Some("setplru"),
nMissEntries = 16,
nProbeEntries = 16,
nReleaseEntries = 16,
nStoreReplayEntries = 16
)
val icacheParameters = coreParams.icacheParameters
val l1plusCacheParameters = coreParams.l1plusCacheParameters
val dcacheParameters = coreParams.dcacheParameters

val LRSCCycles = 100

@@ -240,11 +246,11 @@ trait HasXSParameter {
val useFakePTW = coreParams.useFakePTW
val useFakeL1plusCache = coreParams.useFakeL1plusCache
// L2 configurations
val useFakeL2Cache = useFakeDCache && useFakePTW && useFakeL1plusCache
val useFakeL2Cache = useFakeDCache && useFakePTW && useFakeL1plusCache || coreParams.useFakeL2Cache
val L1BusWidth = 256
val L2Size = 512 * 1024 // 512KB
val L2Size = coreParams.L2Size
val L2BlockSize = 64
val L2NWays = 8
val L2NWays = coreParams.L2NWays
val L2NSets = L2Size / L2BlockSize / L2NWays

// L3 configurations
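
Since `L2Size` and `L2NWays` now come from `coreParams`, the derived `L2NSets` changes with the selected configuration. A small worked example of that arithmetic, using the formula from this hunk; the object and function names are illustrative only:

```scala
object L2SetsSketch {
  val L2BlockSize = 64 // bytes per block, as in HasXSParameter

  // Same formula as in the diff: L2NSets = L2Size / L2BlockSize / L2NWays
  def l2NSets(l2Size: Int, l2NWays: Int): Int = l2Size / L2BlockSize / l2NWays

  def main(args: Array[String]): Unit = {
    println(l2NSets(512 * 1024, 8)) // XSCoreParameters defaults (512KB, 8 ways)
    println(l2NSets(16 * 1024, 8))  // MinimalConfig (16KB, 8 ways)
  }
}
```

With the defaults this gives 1024 sets; with the `MinimalConfig` values it gives 32 sets.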
2 changes: 2 additions & 0 deletions src/main/scala/xiangshan/backend/ftq/Ftq.scala
@@ -117,6 +117,8 @@ class Ftq(implicit p: Parameters) extends XSModule with HasCircularQueuePtrHelpe
}
})

println("Ftq: size:" + FtqSize)

val headPtr, tailPtr = RegInit(FtqPtr(false.B, 0.U))

val validEntries = distanceBetween(tailPtr, headPtr)
2 changes: 1 addition & 1 deletion src/main/scala/xiangshan/backend/fu/CSR.scala
@@ -360,7 +360,7 @@ class CSR(implicit p: Parameters) extends FunctionUnit with HasCSRConst

// smblockctl: memory block configurations
// bits 0-3: store buffer flush threshold (default: 8 entries)
val smblockctl = RegInit(UInt(XLEN.W), "h7".U)
val smblockctl = RegInit(UInt(XLEN.W), "hf".U & StoreBufferThreshold.U)
csrio.customCtrl.sbuffer_threshold := smblockctl(3, 0)

val srnctl = RegInit(UInt(XLEN.W), "h1".U)
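
The new reset value of `smblockctl` masks `StoreBufferThreshold` with `"hf"`, and only bits 3..0 are forwarded as `sbuffer_threshold`. A plain-`Int` model of that masking (not Chisel, and not the author's code; `resetValue` is a hypothetical helper) shows what happens to a threshold that does not fit in four bits:

```scala
object SbufferThresholdSketch {
  // Plain-Int model of `"hf".U & StoreBufferThreshold.U`: keep only bits 3..0.
  def resetValue(storeBufferThreshold: Int): Int = 0xf & storeBufferThreshold

  def main(args: Array[String]): Unit = {
    println(resetValue(7))  // XSCoreParameters default in this commit
    println(resetValue(3))  // MinimalConfig
    println(resetValue(16)) // does not fit in 4 bits, so the low bits are all zero
  }
}
```

Under this model the default (7) and the `MinimalConfig` value (3) pass through unchanged, while any value of 16 or more is silently truncated, so a configuration would need to keep `StoreBufferThreshold` below 16.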
2 changes: 2 additions & 0 deletions src/main/scala/xiangshan/backend/regfile/Regfile.scala
@@ -33,6 +33,8 @@ class Regfile
val debug_rports = Vec(32, new RfReadPort(len))
})

println("Regfile: size:" + NRPhyRegs + " read: " + numReadPorts + " write: " + numWirtePorts)

val useBlackBox = false
if (!useBlackBox) {
val mem = Reg(Vec(NRPhyRegs, UInt(len.W)))
2 changes: 2 additions & 0 deletions src/main/scala/xiangshan/backend/roq/Roq.scala
@@ -261,6 +261,8 @@ class Roq(numWbPorts: Int)(implicit p: Parameters) extends XSModule with HasCirc
val roqFull = Output(Bool())
})

println("Roq: size:" + RoqSize + " wbports:" + numWbPorts + " commitwidth:" + CommitWidth)

// instvalid field
// val valid = RegInit(VecInit(List.fill(RoqSize)(false.B)))
val valid = Mem(RoqSize, Bool())
2 changes: 0 additions & 2 deletions src/main/scala/xiangshan/cache/ICache.scala
@@ -19,8 +19,6 @@ case class ICacheParameters(
tagECC: Option[String] = None,
dataECC: Option[String] = None,
replacer: Option[String] = Some("random"),
nSDQ: Int = 17,
nRPQ: Int = 16,
nMissEntries: Int = 1,
nMMIOs: Int = 1,
blockBytes: Int = 64