TTC
TTC: A high-performance Compiler for Tensor Transpositions
Install / Use
/learn @HPAC/TTCREADME
Tensor Transpose Compiler
The Tensor Transpose Compiler (TTC) generates high-performance parallel and vectorized C++ code for multidimensional tensor transpositions.
TTC supports arbitrarily dimensional, out-of-place tensor transpositions of the general form:

where A and B respectively denote the input and output tensor; <img src=https://github.com/HPAC/TTC/blob/master/misc/pi.png height=16px/> represents the user-specified transposition, and <img src=https://github.com/HPAC/TTC/blob/master/misc/alpha.png height=14px/> and <img src=https://github.com/HPAC/TTC/blob/master/misc/beta.png height=16px/> being scalars (i.e., setting <img src=https://github.com/HPAC/TTC/blob/master/misc/beta.png height=16px/> != 0 enables the user to update the output tensor B).
Please also have a look at TTC-C which provides a C wrapper API for TTC.
Current version: v0.1.1
<span style="color:red">Attention:</span> Please note that TTC is no longer updated since it has been replaced by an equivalent C++ library: HPTT.
Key Features
- Generates parallelized and vectorized code
- Support for multiple instruction sets: AVX, AVX2, AVX512, KNC, CUDA
- Support for all datatypes (i.e., single, double, single-complex and double-complex)
- Support for mixed precision (i.e., different data types for A and B)
- For instance, this feature can be used to generate a mixed-precision BLAS
- Support for multiple leading dimensions in both A and B
- This enables the user to extract (and transpose) a smaller subtensor out of a larger tensor
- TTC allows the user to guide the code generation process:
- E.g., specifying the --maxImplementations=N argument will limited the number of generated implementations to N
Install
-
Clone the repository into a desired directory and change to that location:
git clone https://github.com/HPAC/TTC.git cd TTC
-
Install TTC:
python setup.py install --user
-
Make sure that you export the TTC_ROOT environment variable (add to your .bashrc):
export TTC_ROOT=
pwd -
You might have to add the installed location to your PATH environment variable:
export PATH=$PATH:~/.local/bin
Getting Started
Please run ttc --help to get an overview of TTC's parameters.
Here is one exemplary input to TTC:
ttc --perm=1,0,2 --size=1000,768,16 --dataType=s --alpha=1.0 --beta=1.0 --numThreads=20
Requirements
You must have a working C compiler. I have tested TTC with:
- Python (tested with v2.7.5 and v2.7.9)
- GCC (>= 4.8)
- Intel's ICC (>= 15.0)
Benchmark
TTC provides a benchmark for tensor transpositions.
python benchmark.py <num_threads>
This will generate the input strings for TTC for each of the test-cases within the benchmark. The benchmark uses a default tensor size of 200 MiB (see _sizeMB variable)
Citation
In case you want refer to TTC as part of a research paper, please cite the following article (pdf):
@article{ttc2016a,
author = {Paul Springer and Jeff R. Hammond and Paolo Bientinesi},
title = {{TTC}: A high-performance Compiler for Tensor Transpositions},
archivePrefix = "arXiv",
eprint = {1603.02297},
primaryClass = "quant-ph",
journal = {CoRR},
year = {2016},
issue_date = {March 2016},
url = {http://arxiv.org/abs/1603.02297}
}
Changelog
V0.1.1:
- Improved performance for streaming stores (only applicable if beta=0)
Feedback & Contributions
We are happy for any feedback or feature requests. Please contact springer@aices.rwth-aachen.de.
We also welcome any contributions to the code base.
Related Skills
node-connect
336.9kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
83.0kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
336.9kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
commit-push-pr
83.0kCommit, push, and open a PR
