Shakeflow
ShakeFlow: Functional Hardware Description with Latency-Insensitive Interface Combinators (ASPLOS 2023)
This repository contains the artifact for the following paper:
ShakeFlow: Functional Hardware Description with Latency-Insensitive Interface Combinators. Sungsoo Han*, Minseong Jang*, and Jeehoon Kang (*: co-first authors with equal contributions). ASPLOS 2023 (to appear, submission #43 of the Spring cycle).
The paper title has been changed from the following during the review:
ShakeFlow: A Hardware Description Language Supporting Bidirectional Interface Combinators.
Artifact Retrieval
- Option 1: from GitHub:

      $ git clone git@github.com:kaist-cp/shakeflow.git
      $ cd shakeflow

- Option 2: from Zenodo (a link will be provided to the reviewers).
Artifact Contents
This artifact consists of the following directories:
- ./shakeflow-macro: Rust macro for deriving Signal and Interface traits (Section 3)
- ./shakeflow: the ShakeFlow compiler (Section 4)
- ./shakeflow-std: the ShakeFlow standard library (Section 5)
- ./shakeflow-bsg: our port of BaseJump STL to ShakeFlow (Section 5)
- ./shakeflow-corundum: our port of Corundum 100Gbps NIC to ShakeFlow (Section 5)
- ./shakeflow-examples: example ShakeFlow modules including FIR filter (Section 1, 2)
- ./scripts: scripts to build the project, to perform evaluation, and to draw graphs (Section 6)
This artifact aims to achieve the following goals:
- G1: Locating ShakeFlow's core concepts (Section 3) in the development
- G2: Locating the submodules of Corundum's tx_checksum (Figure 9) in the development
- G3: Reproducing Table 1: SLOC of Corundum in Verilog and ShakeFlow
- G4: Reproducing Table 2: Resource Consumption of C_Orig and C_SF
- G5: Reproducing Figure 12: Throughput of NICs for TCP/IP Micro-benchmark (iperf)
- G6: Reproducing Figure 13: Throughput of NICs for Remote File Read Workload (fio)
- G7: Reproducing Figure 14: Throughput of NICs for Web Server and Client Workload
- G8: Reproducing Figure 15: Scalability of NICs for Web Server Workload
G1: Locations of ShakeFlow's Core Concepts (Section 3)
| Paper Section | Concept | Location |
| --- | --- | --- |
| 3.1 Custom Interface Types | Custom Signal Types | Signal trait (shakeflow/src/hir/signal.rs)
| | Custom Channel Types | Interface trait, channel! macro (shakeflow/src/hir/interface.rs)
| | Composite Interface Types | Interface trait (shakeflow/src/hir/interface.rs)
| 3.3 Application-Specific Combinational Logics | Signal Expressions | Expr struct (shakeflow/src/hir/expr.rs)
| 3.4 Custom Interface Combinators | The Generic Combinator | comb_inline method (shakeflow/src/hir/interface.rs)
| | Custom Combinators | ShakeFlow's standard library (shakeflow-std)
| 3.5 Module Combinators | Feedback Loops | loop_feedback method (shakeflow/src/hir/module_composite.rs)
| | Declarations | module_inst method (shakeflow/src/hir/interface.rs)
G2: Locations of the Submodules of Corundum's tx_checksum (Figure 9)
An overview of the tx_checksum module is presented in the figure below.
For each submodule in the figure, corresponding lines in ShakeFlow are as follows.
No. | Submodule (Lines in tx_checksum.rs) | Description
--- | --- | ---
0 | map (L110-113) | Adds always-asserted tlast wire to s_axis_cmd channel
1 | axis_rr_mux (L116) | Muxes s_axis and s_axis_cmd in a round-robin manner
2 | duplicate (L117) | Duplicates the channel into checksum pipeline (submodules 3-6) and the packet buffer (submodules 7-8)
3 | fsm (L132-269) | Calculates a checksum from a command and a packet
4 | map (L270-277) | Serializes checksum info
5 | FIFO (L278) | Adds checksum info to FIFO queue
6 | map (L279-283) | Deserializes checksum info
7 | filter_map (L122-126) | Discards when the command (s_axis_cmd) comes
8 | FIFO (L128) | Adds data info to FIFO queue
9 | axis_rr_mux (L287) | Selects one of csum and data from round-robin mux
10 | fsm (L289-349) | Muxes two FIFO outputs in a round-robin manner
11 | filter_map (L350-354) | Discards when the checksum value is not placed at the packet
12 | buffer_skid (L355) | Adds a buffer
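
To make the round-robin selection used by submodules such as axis_rr_mux concrete, here is a minimal behavioral sketch in plain Rust. It is not ShakeFlow code and does not reproduce the implementation in tx_checksum.rs: the `RoundRobin` type and its `grant` method are hypothetical names introduced for illustration, and the model assumes per-cycle arbitration over simple valid flags, whereas the real module may arbitrate at packet granularity.

```rust
/// Software model of a round-robin arbiter over `n` inputs.
/// (Hypothetical illustration; not part of the ShakeFlow artifact.)
struct RoundRobin {
    last: usize, // index of the most recently granted input
    n: usize,    // number of inputs
}

impl RoundRobin {
    fn new(n: usize) -> Self {
        assert!(n > 0, "arbiter needs at least one input");
        // Start as if the last input was just granted, so input 0 goes first.
        Self { last: n - 1, n }
    }

    /// Given one valid flag per input, returns the granted input index.
    /// The search starts just after the previously granted input, so a
    /// continuously valid input cannot starve the others.
    fn grant(&mut self, valid: &[bool]) -> Option<usize> {
        assert_eq!(valid.len(), self.n);
        for offset in 1..=self.n {
            let idx = (self.last + offset) % self.n;
            if valid[idx] {
                self.last = idx;
                return Some(idx);
            }
        }
        None
    }
}

fn main() {
    let mut arbiter = RoundRobin::new(2);
    // Both inputs valid on every cycle: grants alternate between them.
    for cycle in 0..4 {
        println!("cycle {cycle}: grant {:?}", arbiter.grant(&[true, true]));
    }
}
```

When only one input is valid, it is granted on every cycle; the alternation only appears under contention.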
G3: SLOC of Corundum in Verilog and ShakeFlow (Table 1)
We report the significant lines of code (SLOC, excluding comments and empty lines) of the original and our ShakeFlow port of two IPs: the Corundum 100Gbps NIC and BaseJump STL's dataflow and network-on-chip modules. We use cloc to measure the SLOC of each file.
The LOCs of our ShakeFlow ports reported here are lower than those reported in the accepted version of the paper, as we have further refactored the development since the re-submission.
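
As a rough illustration of what "excluding comments and empty lines" means, the sketch below counts SLOC for sources that use `//` and `/* ... */` comments, which covers both Verilog and Rust. It is a simplified approximation written for this explanation, not cloc itself: the function name `count_sloc` is hypothetical, and the sketch ignores comment markers inside string literals and nested block comments, so its counts may differ from cloc's on some files.

```rust
/// Counts lines that contain at least one character of code,
/// skipping blank lines and lines made up only of comments.
/// (Simplified approximation; not a reimplementation of cloc.)
fn count_sloc(source: &str) -> usize {
    let mut in_block = false; // currently inside a /* ... */ comment
    let mut sloc = 0;
    for line in source.lines() {
        let mut rest = line.trim();
        let mut has_code = false;
        while !rest.is_empty() {
            if in_block {
                match rest.find("*/") {
                    Some(end) => {
                        in_block = false;
                        rest = rest[end + 2..].trim_start();
                    }
                    None => break, // comment continues on the next line
                }
            } else if rest.starts_with("//") {
                break; // remainder of the line is a comment
            } else if rest.starts_with("/*") {
                in_block = true;
                rest = &rest[2..];
            } else {
                has_code = true;
                // Jump to a block comment only if it opens before any `//`.
                match (rest.find("//"), rest.find("/*")) {
                    (Some(l), Some(b)) if b < l => rest = &rest[b..],
                    (None, Some(b)) => rest = &rest[b..],
                    _ => break,
                }
            }
        }
        if has_code {
            sloc += 1;
        }
    }
    sloc
}

fn main() {
    let src = "// header\n\nmodule m;\n  /* multi\n     line */\n  wire a; // trailing\nendmodule\n";
    println!("SLOC = {}", count_sloc(src));
}
```

In the sample input, only the `module m;`, `wire a;`, and `endmodule` lines count as code; the header comment, the blank line, and both lines of the block comment are skipped.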
SLOC of Ported Corundum 100Gbps NIC modules
You can find the ported modules in shakeflow-corundum/src.
No. | Module | LOC (Original) | LOC (ShakeFlow) | LOC (Generated Verilog)
--- | --- | --- | --- | ---
0 | (common types) (ShakeFlow) | | 384 |
1 | cmac_pad (Original, ShakeFlow) | 54 | 20 | 59
2 | event_mux (Original, ShakeFlow) | 128 | 17 | 203
3 | cpl_op_mux (Original, ShakeFlow) | 179 | 57 | 277
4 | desc_op_mux (Original, ShakeFlow) | 293 | 85 | 626
5 | rx_hash (Original, ShakeFlow) | 202 | 183 | 2564
6 | rx_checksum (Original, ShakeFlow) | 109 | 88 | 354
7 | tx_checksum (Original, ShakeFlow) | 424 | 297 | 1466
8 | cpl_write (Original, ShakeFlow) | 377 | 295 | 1090
9 | desc_fetch (Original, ShakeFlow) | 438 | 321 | 1224
10 | rx_engine (Original, ShakeFlow) | 639 | 464 | 1265
11 | tx_engine (Original, ShakeFlow) | 641 | 498 | 1425
12 | queue_manager (ShakeFlow) | | 115 |
13 | fetch_queue_manager (Original, ShakeFlow) | 491 | 219 | 1862
14 | cpl_queue_manager (Original, ShakeFlow) | 512 | 250 | 1984
15 | tx_scheduler_rr (Original, ShakeFlow) | 630 | 498 | 2020
| | (total) | 5117 | 3791 | 16419 |
SLOC of Ported BaseJump STL modules
You can find the ported modules in shakeflow-bsg/src.
No. | Module | LOC (Original) | LOC (ShakeFlow) | LOC (Generated Verilog)
--- | --- | --- | --- | ---
0 | bsg_dataflow (Original, ShakeFlow) | 3720 | 2004 | 19960
1 | bsg_noc (Original, ShakeFlow) | 1703 | 1385 | 11463
Compiling ShakeFlow Modules to Verilog
Software Requirement
- Rust nightly-2022-09-27
Script
To generate the Verilog code for the FIR filter (Section 2):
cargo run --bin shakeflow-examples
To generate the Verilog code for our ShakeFlow port of Corundum (Section 5):
cargo run --bin shakeflow-corundum
To generate the Verilog code for our ShakeFlow port of BaseJump STL's dataflow and network-on-chip modules (Section 5):
cargo run --bin shakeflow-bsg
The generated code is located in build.
Building Corundum
We ported Corundum's core packet processing functionalities, including descriptor and completion queue management, checksum validation and offloading, receive flow hashing, and receive-side scaling, from Verilog to ShakeFlow (Section 5, 6).
