[TCAD'23] AccelTran: A Sparsity-Aware Accelerator for Transformers


AccelTran: A Sparsity-Aware Monolithic 3D Accelerator for Transformer Architectures at Scale


AccelTran is a tool for simulating a design space of accelerators on the diverse, flexible, and heterogeneous transformer architectures supported by the FlexiBERT 2.0 framework at jha-lab/txf_design-space.

The figure below shows the utilization of different modules in an AccelTran architecture for the BERT-Tiny transformer model.

[Figure: AccelTran GIF]

Environment setup

Clone this repository and initialize sub-modules

git clone https://github.com/JHA-Lab/acceltran.git
cd ./acceltran/
git submodule init
git submodule update

Set up the Python environment

The Python environment setup is based on conda. The script below creates a new environment named txf_design-space:

source env_setup.sh

For pip installation, we are creating a requirements.txt file. Stay tuned!

Run synthesis

Synthesis scripts use Synopsys Design Compiler. All hardware modules are implemented in SystemVerilog in the directory synthesis/top.

To get area and power consumption reports for each module, use the following command:

cd ./synthesis/
dc_shell -f 14nm_sg.tcl -x "set top_module <M>"
cd ..

Here, <M> is the module to be synthesized, e.g., mac_lane, ln_forward_<T> (for layer normalization), or softmax_<T>, where <T> is the tile size: 8, 16, or 32.

All output reports are stored in synthesis/reports.
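To synthesize several modules in one pass, the dc_shell invocation above can be generated per module. The following is a minimal sketch, not part of the repository: the helper names top_modules and dc_shell_command are hypothetical, and the module list covers only the names mentioned in this README, not the full set under synthesis/top. It prints the commands rather than running them.

```python
import shlex

# Tile sizes listed in this README.
TILE_SIZES = (8, 16, 32)


def top_modules():
    """Return only the module names this README mentions (not the full set)."""
    names = ["mac_lane"]
    for prefix in ("ln_forward", "softmax"):
        names += [f"{prefix}_{tile}" for tile in TILE_SIZES]
    return names


def dc_shell_command(module):
    """Build the dc_shell argument list for one top module (hypothetical helper)."""
    return ["dc_shell", "-f", "14nm_sg.tcl", "-x", f"set top_module {module}"]


if __name__ == "__main__":
    # Print one shell-quoted command per module; run them from ./synthesis/.
    for module in top_modules():
        print(shlex.join(dc_shell_command(module)))
```

Each printed line can be pasted into a shell from the synthesis directory, or the argument list can be passed to subprocess.run with cwd set to that directory.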

To run the synthesis for the DMA module, run the following command instead:

cd ./synthesis/
dc_shell -f dma.tcl 

Run pruning

To get the sparsity in activations and weights in an input transformer model and its corresponding performance on the GLUE benchmark, use the dynamic pruning model: DP-BERT.

To test the effect of different sparsity ratios on the model performance on the SST-2 benchmark, use the following script:

cd ./pruning/
python3 run_evaluation.py --task sst2 --max_pruning_threshold 0.1
cd ..

The script uses a weight-pruned model, so the weights are not pruned further. To also prune the weights with a pruning_threshold, use the --prune_weights flag.
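To make the relation between a pruning threshold and the resulting sparsity concrete, here is a small sketch. It assumes magnitude-based thresholding, where every value whose absolute value falls below the threshold is set to zero; this is an illustrative assumption and not necessarily how run_evaluation.py or DP-BERT implement pruning. The functions prune and sparsity, and the sample values, are hypothetical.

```python
def prune(values, threshold):
    """Zero every entry whose magnitude is below the threshold (assumed rule)."""
    return [0.0 if abs(v) < threshold else v for v in values]


def sparsity(values):
    """Return the fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


if __name__ == "__main__":
    # Made-up activation values, for illustration only.
    activations = [0.02, -0.5, 0.09, 0.3, -0.01, 0.0, 0.75, -0.11]
    for threshold in (0.0, 0.05, 0.1):
        pruned = prune(activations, threshold)
        print(f"threshold={threshold:.2f} sparsity={sparsity(pruned):.3f}")
```

A larger threshold zeroes more values, which raises sparsity and may lower task accuracy; the evaluation script above measures that trade-off on SST-2.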

Run simulator

AccelTran supports a diverse range of accelerator hyperparameters. It also supports all ~10^88 models in the FlexiBERT 2.0 design space.

To specify the configuration of an accelerator's architecture, use a configuration file in the simulator/config directory. Example configuration files are given for accelerators optimized for BERT-Nano and BERT-Tiny. Accelerator hardware configuration files should conform to the design space specified in the simulator/design_space/design_space.yaml file.

To specify the transformer model parameters, use a model dictionary file in simulator/model_dicts. Model dictionaries for BERT-Nano and BERT-Tiny have already been provided for convenience.

To run AccelTran on the BERT-Tiny model, while plotting utilization and metric curves every 1000 cycles, use the following command:

cd ./simulator/
python3 run_simulator.py --model_dict_path ./model_dicts/bert_tiny.json --config_path ./config/config_tiny.yaml --plot_steps 1000 --debug
cd ..

This will output the accelerator state for every cycle. For more information on the possible inputs to the simulation script, use:

cd ./simulator/
python3 run_simulator.py --help
cd ..
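The BERT-Tiny invocation above can also be assembled programmatically, for example when sweeping over several model dictionaries and configuration files. This is a minimal sketch: the helper simulator_command is hypothetical, it uses only the flags shown in this README, and it assumes that --plot_steps and --debug are optional (check the --help output to confirm).

```python
import shlex


def simulator_command(model_dict_path, config_path, plot_steps=None, debug=False):
    """Build the run_simulator.py argument list (hypothetical helper)."""
    cmd = [
        "python3", "run_simulator.py",
        "--model_dict_path", model_dict_path,
        "--config_path", config_path,
    ]
    if plot_steps is not None:
        # Assumed optional: plotting interval in cycles.
        cmd += ["--plot_steps", str(plot_steps)]
    if debug:
        # Assumed optional: debug flag as used in the example above.
        cmd.append("--debug")
    return cmd


if __name__ == "__main__":
    cmd = simulator_command(
        "./model_dicts/bert_tiny.json",
        "./config/config_tiny.yaml",
        plot_steps=1000,
        debug=True,
    )
    # Print the shell-quoted command; run it from ./simulator/.
    print(shlex.join(cmd))
```

To execute the command instead of printing it, the list can be passed to subprocess.run(cmd, cwd="simulator", check=True) from the repository root.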

Developer

Shikhar Tuli. For any questions, comments or suggestions, please reach me at stuli@princeton.edu.

Cite this work

Cite our work using the following BibTeX entry:

@article{tuli2023acceltran,
  title={{AccelTran}: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers},
  author={Tuli, Shikhar and Jha, Niraj K},
  journal={arXiv preprint arXiv:2302.14705},
  year={2023}
}

If you use the AccelTran design space to implement transformer-accelerator co-design, please also cite:

@article{tuli2023transcode,
  title={{TransCODE}: Co-design of Transformers and Accelerators for Efficient Training and Inference},
  author={Tuli, Shikhar and Jha, Niraj K},
  journal={arXiv preprint arXiv:2303.14882},
  year={2023}
}

License

BSD-3-Clause. Copyright (c) 2022, Shikhar Tuli and Jha Lab. All rights reserved.

See License file for more details.
