Pmpp

Complete solutions to Programming Massively Parallel Processors (4th Edition)

Install / Use

/learn @tugot17/Pmpp
Programming Massively Parallel Processors - Complete Solutions

<div align="center"> <img src="image.png" alt="Book Cover" width="300">

Complete solutions to Kirk & Hwu's Programming Massively Parallel Processors (4th Edition)

Theoretical explanations + Working implementations + Performance analysis

</div>

Overview

This repository contains solutions to all exercises in Programming Massively Parallel Processors (4th Edition) by Wen-mei W. Hwu, David B. Kirk, and Izzat El Hajj. Each chapter includes:

  • Detailed exercise solutions with step-by-step explanations
  • Working code implementations in both CUDA C and Python
  • Performance benchmarks comparing different approaches
  • Visual diagrams for complex algorithms

Chapter Organization

Each chapter follows this structure:

├── code/
│   ├── *.cu          # CUDA implementations
│   ├── *.py          # Python alternatives  
│   ├── Makefile      # Build configuration
│   └── ...
└── README.md         # Theory + Exercises + Solutions

Available Chapters

| Chapter | Topic | Focus Areas |
|---------|-------|-------------|
| Chapter 2 | Heterogeneous Data Parallel Computing | Vector operations, thread mapping, CUDA basics |
| Chapter 3 | Multidimensional Grids and Data | Grid organization, thread hierarchy |
| Chapter 4 | Compute Architecture and Scheduling | GPU architecture, warps, occupancy |
| Chapter 5 | Memory Architecture and Data Locality | Memory types, tiling, bandwidth optimization |
| Chapter 6 | Performance Considerations | Memory coalescing, latency hiding |
| Chapter 7 | Convolution | Constant memory, caching, halo cells |
| Chapter 8 | Stencil | 2D/3D stencil computations, register tiling |
| Chapter 9 | Parallel Histogram | Atomic operations, privatization, aggregation |
| Chapter 10 | Reduction | Tree reduction, divergence minimization |
| Chapter 11 | Prefix Sum (Scan) | Work-efficient algorithms, Kogge-Stone, Brent-Kung |
| Chapter 12 | Merge | Co-rank function, circular buffers |
| Chapter 13 | Sorting | Radix sort, merge sort optimization |
| Chapter 14 | Sparse Matrix Computation | SpMV, CSR/ELL/COO formats |
| Chapter 15 | Graph Traversal | BFS algorithms, frontier-based approaches |
| Chapter 16 | Deep Learning | CNN implementation, GEMM formulation |
| Chapter 17 | Iterative MRI Reconstruction | Medical imaging algorithms |
| Chapter 18 | Electrostatic Potential Map | Scatter vs gather, cutoff binning |
| Chapter 19 | Parallel Programming and Computational Thinking | Algorithm selection, problem decomposition |
| Chapter 20 | Heterogeneous Computing Cluster | CUDA streams, MPI integration |
| Chapter 21 | CUDA Dynamic Parallelism | Recursive algorithms, quadtrees |
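
To give a flavor of the "thread mapping" topic listed for Chapter 2, here is a minimal pure-Python sketch that simulates a 1D CUDA launch of a vector-add kernel on the CPU. It is an illustration only, not code from this repository: the function name `launch_vec_add` and the parameter `block_dim` are hypothetical, and the nested loops stand in for threads that a GPU would run in parallel.

```python
def launch_vec_add(a, b, block_dim=256):
    """Simulate a 1D CUDA launch of a vector-add kernel on the CPU.

    Hypothetical helper for illustration; a real kernel runs these
    iterations as parallel threads rather than nested loops.
    """
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    n = len(a)
    out = [0] * n
    # Ceiling division: launch enough blocks to cover all n elements.
    grid_dim = (n + block_dim - 1) // block_dim
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            # Global index, as in blockIdx.x * blockDim.x + threadIdx.x.
            i = block_idx * block_dim + thread_idx
            # Boundary check: the last block may have surplus threads.
            if i < n:
                out[i] = a[i] + b[i]
    return out, grid_dim


if __name__ == "__main__":
    result, blocks = launch_vec_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], block_dim=2)
    print(result, blocks)
```

With five elements and a block size of 2, three blocks are launched and the sixth simulated thread is masked out by the boundary check.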

Quick Start

Prerequisites

  • NVIDIA GPU with CUDA support
  • CUDA Toolkit installed
  • Python 3.11+ (optional, for Python examples)
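
The prerequisites above can be partly checked from Python before building anything. The sketch below is an assumption-laden convenience, not part of the repository: `check_prerequisites` is a hypothetical name, and it only tests whether the `nvcc` compiler from the CUDA Toolkit is on `PATH` and whether the running interpreter is at least Python 3.11. It does not verify that a CUDA-capable GPU is present.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 11)):
    """Return a dict of simple environment checks (hypothetical helper)."""
    return {
        # nvcc ships with the CUDA Toolkit; finding it on PATH suggests
        # the toolkit is installed, but says nothing about the GPU itself.
        "nvcc_on_path": shutil.which("nvcc") is not None,
        # Compare the running interpreter against the minimum version.
        "python_ok": sys.version_info >= min_python,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```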

Setup

# Clone the repository
git clone <repository-url>
cd pmpp

# For Python examples (optional)
conda create -n pmpp python=3.11
conda activate pmpp
pip install -r requirements.txt

Running Examples

CUDA/C Examples:

cd chapter-XX/code
make
./program_name

Python Examples:

cd chapter-XX/code
python script_name.py

Contributing

Found an error? Please open an issue using this template:

Describe the bug

Describe where the problem is and what precisely is wrong.

Proposed solution

Paste your proposed solution here, and include the reasoning behind why you believe it is correct.

Contribution Guidelines

  • Maintain the existing explanation style with clear reasoning
  • Include working code for any new implementations
  • Add performance data where relevant
  • Follow the existing code formatting standards

License

This project is licensed under the MIT License - see the LICENSE file for details.
