17 skills found
microsoft / Accera: Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research
anaghasethu / KTU Sem7 Compilerdesign Programs: KTU 7th Semester Compiler Design lab programs along with algorithms
cstjean / Unrolled.jl: Unrolling loops at compile-time
eira-fransham / Crunchy: Crunchy unroller - deterministically unroll constant loops
ZigZag-Project / Zigzag V1: A Fast DNN Accelerator Design Space Exploration Framework.
StephenVavasis / Unroll.jl: Julia macro for unrolling for-loops
CliMA / UnrolledUtilities.jl: A toolkit for optimizing Julia code that uses statically sized iterators.
schneiderfelipe / Unrolled: 🧻 Unroll for-loops at compile-time.
rohanverma94 / Stm32f4xxx SIMD Add: This is a demonstration of SIMD code on an stm32f4xxx microcontroller.
yianan261 / Multi GPU TRAINING OPTIMIZATION: This project accelerates multi-GPU machine learning training using fused gradient buffers, NCCL AllReduce, and CUDA C kernel-level optimizations including memory coalescing, shared memory tiling, loop unrolling, and stream-based communication overlap.
millardjn / Typenum Loops: A Rust library that provides loops which are fully or partially unrolled at compile time.
opalkale / Matrix Multiply Optimization: Used cache blocking, parallelization, loop unrolling, register blocking, loop ordering, and SSE instructions to optimize the multiplication of large matrices to 55 GFLOPS
DinaTaklit / ANN Predict Loop Unrolling Factor: Deep neural network-based model that predicts optimal loop unrolling factors to optimize loops in the Tiramisu compiler.
remcofl / HilbertSFC: Ultra-fast 2D & 3D Hilbert curve kernels in Python. JIT compiled, branchless, L1-cache-friendly lookup tables, loop unrolling, SIMD, and multi-threading.
Michaelangel007 / Apple2 Count Million: One of the fastest ways to count to 1,000,000 on the Apple 2
ianmicheal / SH4 LOOP UNROLL TEST: Target the best amount to unroll loops
jawj / Hextreme: Fast hex and base64 encoding and decoding, string <-> Uint8Array