16 skills found
hahnyuan / LLM Viewer - Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
feifeibear / LLMRoofline - Compare different hardware platforms via the Roofline Model for LLM inference tasks.
psmarter / CUDA Practice - CUDA programming practice project: hands-on CUDA kernels and performance optimization, covering GEMM, FlashAttention, Tensor Cores, CUTLASS, quantization, KV cache, NCCL, and profiling.
ProjectPhysX / PTXprofiler - A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
maestro-project / Frame - FRAME: Fast Roofline Analytical Modeling and Estimation
NicolasDenoyelle / Locality Aware Roofline Model - Instantiate the Cache Aware Roofline Model on single-socket and multi-socket systems.
champ-hub / Carm Roofline - Cross-platform Cache-Aware Roofline Model (CARM) and Application Benchmarking Tool for Intel, AMD, ARM, and RISC-V CPUs, and NVIDIA and AMD GPUs
ekondis / Gpuroofperf Toolkit - A GPU performance prediction toolkit for CUDA programs
caparrov / ERM - Extended Roofline Model: LLVM source tree with additional libraries for the analysis of dynamic execution in the interpreter
giopaglia / Rooflini - A Python script for plotting roofline analyses. Intel Advisor style.
dengls24 / LLM Para - Analyze LLM inference: FLOPs, memory, Roofline model. Supports GQA, MoE, MLA, RoPE, SwiGLU. 19 models × 20+ hardware platforms.
jeewhanchoi / A Roofline Model Of Energy Ubenchmarks - Automatically exported from code.google.com/p/a-roofline-model-of-energy-ubenchmarks
mohamed / Roofline - A simple script to plot the Roofline model for given HW platforms and applications
Techercise / AMD Instruction Roofline Using RocProf Metrics - This repository contains example spreadsheets and scripts to construct instruction roofline models for AMD GPUs using metrics from rocProf
PawseySC / Performance Modelling Tools - This repository hosts configuration files for HPC Toolkit, ROCprof, NVprof and ERT, and scripts to help us create roofline and instruction-based roofline diagrams (performance models) for applications
ebt-hpc / Cca Ebt - Code Comprehension Assistance for Evidence-Based performance Tuning