Acceltran

[TCAD'23] AccelTran: A Sparsity-Aware Accelerator for Transformers

Generate Convert Improve

Install / Use

/learn @jha-lab/Acceltran

About this skill

Quality Score

0/100

README

AccelTran: A Sparsity-Aware Monolithic 3D Accelerator for Transformer Architectures at Scale

AccelTran is a tool to simulate a design space of accelerators on diverse flexible and heterogeneous transformer architectures supported by the FlexiBERT 2.0 framework at jha-lab/txf_design-space.

The figure below shows the utilization of different modules in an AccelTran architecture for the BERT-Tiny transformer model.

AccelTran GIF

Environment Setup
- Clone this repository and initialize sub-modules
- Setup python environment
Run synthesis
Run pruning
Run simulator
Developer
Cite this work
License

Environment setup

Clone this repository and initialize sub-modules

git clone https://github.com/JHA-Lab/acceltran.git
cd ./acceltran/
git submodule init
git submodule update

Setup python environment

The python environment setup is based on conda. The script below creates a new environment named txf_design-space:

source env_setup.sh

For pip installation, we are creating a requirements.txt file. Stay tuned!

Run synthesis

Synthesis scripts use Synopsys Design Compiler. All hardware modules are implemented in SystemVerilog in the directory synthesis/top.

To get area and power consumption reports for each module, use the following command:

cd ./synthesis/
dc_shell -f 14nm_sg.tcl -x "set top_module <M>"
cd ..

Here, <M> is the module that is to be synthesized in: mac_lane, ln_forward_<T> (for layer normalization), softmax_<T>, etc. where <T> is the tile size among 8, 16, or 32.

All output resports are stored in synthesis/reports.

To run the synthesis for the DMA module, run the following command instead:

cd ./synthesis/
dc_shell -f dma.tcl

Run pruning

To get the sparsity in activations and weights in an input transformer model and its corresponding performance on the GLUE benchmark, use the dynamic pruning model: DP-BERT.

To test the effect of different sparsity ratios on the model performance on the SST-2 benchmark, use the following script:

cd ./pruning/
python3 run_evaluation.py --task sst2 --max_pruning_threshold 0.1
cd ..

The script uses a weight-pruned model, and so, the weights are not pruned futher. To prune the weights with a pruning_threshold as well, use the flag: --prune_weights.

Run simulator

AccelTran supports a diverse range of accelerator hyperparameters. It also supports all ~10<sup>88</sup> models in the FlexiBERT 2.0 design space.

To specify the configuration of an accelerator's architecture, use a configuration file in simulator/config directory. Example configuration files are given accelerators optimized for BERT-Nano and BERT-Tiny. Accelerator hardware configuration files should conform with the design space specified in the simulator/design_space/design_space.yaml file.

To specify the transformer model parameters, use a model dictionary file in simulator/model_dicts. Model dictionaries for BERT-Nano and BERT-Tiny have already been provided for convenience.

To run AccelTran on the BERT-Tiny model, while plotting utilization and metric curves every 1000 cycles, use the following command:

cd ./simulator/
python3 run_simulator.py --model_dict_path ./model_dicts/bert_tiny.json --config_path ./config/config_tiny.yaml --plot_steps 1000 --debug
cd ..

This will output the accelerator state for every cycle. For more information on the possible inputs to the simulation script, use:

cd ./simulator/
python3 run_simulator.py --help
cd ..

Developer

Shikhar Tuli. For any questions, comments or suggestions, please reach me at stuli@princeton.edu.

Cite this work

Cite our work using the following bitex entry:

@article{tuli2023acceltran,
  title={{AccelTran}: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers},
  author={Tuli, Shikhar and Jha, Niraj K},
  journal={arXiv preprint arXiv:2302.14705},
  year={2023}
}

If you use the AccelTran design space to implement transformer-accelerator co-design, please also cite:

@article{tuli2023transcode,
  title={{TransCODE}: Co-design of Transformers and Accelerators for Efficient Training and Inference},
  author={Tuli, Shikhar and Jha, Niraj K},
  journal={arXiv preprint arXiv:2303.14882},
  year={2023}
}

License

See License file for more details.

Related Skills

node-connect

339.3k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

83.9k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

339.3k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

commit-push-pr

83.9k

Commit, push, and open a PR

jha-lab

View profile

View on GitHub

GitHub Stars59

CategoryDevelopment

Updated4d ago

Forks10

jha-lab/acceltran

Languages

Python

Security Score

100/100

Audited on Mar 24, 2026

No findings