SkillAgentSearch skills...

Parrot

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

Install / Use

/learn @NVlabs/Parrot
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

<div align="center"> <img src="docs/_static/logo.png" alt="Sean Parrot" width="300"> <h1><b>Parrot</b></h1> </div> <p align="center"> <a href="https://github.com/nvlabs/parrot/issues" alt="contributions welcome"> <img src="https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat" /></a> <a href="https://en.cppreference.com/w/cpp/compiler_support/11"> <img src="https://img.shields.io/badge/C++%20-20-ff69b4.svg"/></a> <a href="https://developer.nvidia.com/cuda-toolkit" alt="CUDA 13.0+"> <img src="https://img.shields.io/badge/CUDA-13.0-ff69b4.svg"/></a> <a href="https://nvlabs.github.io/parrot/" alt="Documentation"> <img src="https://img.shields.io/badge/docs-latest-blue.svg" /></a> <a href="https://github.com/NVLabs/parrot/actions/workflows/unit-tests.yml" alt="Unit Tests"> <img src="https://github.com/NVLabs/parrot/actions/workflows/unit-tests.yml/badge.svg" /></a> </p>

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

✨ Features

  • Fused Operations - Operations are fused when possible
  • GPU Acceleration - Built on CUDA/Thrust for high performance
  • Chainable API - Clean, expressive syntax for complex operations
  • Header-Only - Simple integration with #include "parrot.hpp"

¹ Lazy-ish means that any operation that can be lazily evaluated is lazily evaluated.

🚀 Quick Start

#include "parrot.hpp"

int main() {
    // Create arrays
    auto A = parrot::array({3, 4, 0, 8, 2});
    auto B = parrot::array({6, 7, 2, 1, 8});
    auto C = parrot::array({2, 5, 7, 4, 3});
    
    // Chain operations
    (B * C + A).print();  // Output: 15 39 14 12 26
}

Advanced Example: Row-wise Softmax

#include "parrot.hpp"

using namespace parrot::literals;

auto softmax(auto matrix) {
    auto cols = matrix.ncols();
    auto z    = matrix - matrix.maxr(2_ic).replicate(cols);
    auto num  = z.exp();
    auto den  = num.sum(2_ic);
    return num / den.replicate(cols);
}

int main() {
    auto matrix = parrot::range(6).as<float>().reshape({2, 3});
    softmax(matrix).print();
}

🏗️ Building

Prerequisites

  • CMake (3.10+)
  • C++ compiler with C++20 support
  • NVIDIA CUDA Toolkit 13.0 or later
  • NVIDIA GPU with compute capability 7.0+

Build Steps

# Clone the repository
git clone <repository-url>
cd parrot

# Create build directory
mkdir build && cd build

# Configure and build
cmake ..
cmake --build . -j$(nproc)

# Run tests
ctest

For detailed build instructions, see BUILDING.md.

📊 Comparisons

Parrot provides significant code reduction compared to other CUDA libraries:

| Library | Code Reduction | | ---------- | ------------------ | | Thrust | ~10x less code |

See detailed comparisons in our documentation.

📚 Documentation

🧪 Examples

The examples/ directory contains:

  • getting_started/ - Simple examples to get started
  • thrust/ - Parrot implementations of Thrust examples

🛠️ Development

Running Tests

# All tests
ctest

# Individual test categories
./test_basic      # Basic operations
./test_sorting    # Sorting algorithms
./test_math       # Mathematical operations
./test_reductions # Reduction operations

Code Quality

# Run clang-tidy
./scripts/run-clang-tidy.sh

# Auto-fix issues
./scripts/run-clang-tidy.sh --fix

Building Documentation

# Install dependencies
uv venv
uv pip install -r requirements.txt

# Build docs
cd scripts && ./build-docs.sh

📁 Repository Structure

parrot/
├── parrot.hpp              # Main header (single-file library)
├── thrustx.hpp             # Extended Thrust utilities
├── examples/               # Example code
│   ├── getting_started/    # Simple getting-started examples
│   ├── thrust/             # Parrot versions of Thrust examples
│   ├── real_world/         # Parrot versions of examples from open source projects
├── tests/                  # Unit tests
├── docs/                   # Documentation source
├── scripts/                # Development scripts

🤝 Contributing

We welcome contributions! Please see our CONTRIBUTING.md guide for details on:

  • Setting up the development environment
  • Code standards and review process
  • Running tests and code quality checks
  • Building documentation

All contributions must comply with the Apache License 2.0.

📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Third-Party Licenses

This project includes third-party software components. See THIRD_PARTY_LICENSES for complete license information and attributions.

🙏 Acknowledgments

Built on top of NVIDIA Thrust and CUDA.


📖 Read the Full Documentation →

View on GitHub
GitHub Stars259
CategoryDevelopment
Updated1d ago
Forks16

Languages

Cuda

Security Score

85/100

Audited on Mar 26, 2026

No findings