Attesor

AI-powered reverse engineering of Rosetta 2 for Linux. Disclaimer: because of the user agreement, I will not touch the code myself. Everything is done by AI, so the implementation may be messy.

Rosetta 2 Reverse Engineering Project

License: MIT

A comprehensive reverse-engineering effort to understand and document Apple's Rosetta 2 binary translation technology.

Table of Contents

  1. Background
  2. What is Rosetta?
  3. What is Rosetta 2?
  4. How Apple Delivers Rosetta 2 in macOS
  5. Technical Architecture
  6. This Project
  7. File Structure
  8. Usage
  9. Progress
  10. References

Background

The Architecture Transition

In November 2020, Apple announced their first Apple Silicon Macs, marking a historic transition from Intel x86_64 processors to their own ARM-based M1 chips. This was Apple's third major architecture transition:

  1. 1994: Motorola 68000 -> PowerPC
  2. 2006: PowerPC -> Intel x86
  3. 2020: Intel x86_64 -> Apple Silicon (ARM64)

Each transition relied on an emulation or binary translation layer to run existing software during the migration period. Rosetta 2 is Apple's most sophisticated binary translation system yet.


What is Rosetta?

Rosetta (2006-2011) was Apple's dynamic binary translation software for the PowerPC-to-Intel transition, enabling PowerPC applications to run on Intel-based Macs.

Key Features:

  • Dynamic Translation: Translated PowerPC code to x86 at runtime
  • OS Integration: Built into Mac OS X 10.4 (Tiger) through 10.6 (Snow Leopard)
  • Transparent Operation: Users launched PowerPC apps normally
  • Performance Overhead: Typically 20-50% slower than native code

Rosetta was removed in Mac OS X 10.7 (Lion), completing the Intel transition.


What is Rosetta 2?

Rosetta 2 is Apple's binary translation technology that enables applications compiled for Intel x86_64 Macs to run on Apple Silicon (ARM64) Macs. It combines ahead-of-time and just-in-time translation.
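A process can ask macOS whether it is currently running under translation. The sketch below follows the `sysctl.proc_translated` query that Apple documents for this purpose. The function name `process_is_translated` and the `-1` "unknown" return value on non-Apple platforms are assumptions made for this example, not part of this project.

```c
#include <errno.h>
#include <stddef.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Returns 1 if the process runs under Rosetta translation,
 * 0 if it runs natively, and -1 if the answer is unknown.
 * Hypothetical helper name; only the sysctl key comes from Apple. */
int process_is_translated(void) {
#ifdef __APPLE__
    int ret = 0;
    size_t size = sizeof(ret);
    if (sysctlbyname("sysctl.proc_translated", &ret, &size, NULL, 0) == -1) {
        /* ENOENT: the key does not exist, so the process is native. */
        return errno == ENOENT ? 0 : -1;
    }
    return ret;
#else
    /* Assumption for this sketch: report "unknown" elsewhere. */
    return -1;
#endif
}
```

A native ARM64 build returns 0 on Apple Silicon, while the same source compiled for x86_64 and launched through Rosetta 2 returns 1.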

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                    User Application (x86_64)                 │
├─────────────────────────────────────────────────────────────┤
│                     Rosetta 2 Layer                          │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  Translator │  │  Runtime    │  │  System Call        │  │
│  │  (AOT/JIT)  │  │  Library    │  │  Translation        │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│                    macOS Kernel (ARM64)                      │
├─────────────────────────────────────────────────────────────┤
│                    Apple Silicon Hardware                    │
└─────────────────────────────────────────────────────────────┘

Key Technologies

  1. Ahead-of-Time (AOT) Translation

    • Translates x86_64 binaries to ARM64 at install time or first launch
    • Stores translated code in a cache for faster subsequent launches
    • Reduces runtime overhead compared to pure JIT translation
  2. Just-in-Time (JIT) Translation

    • Translates code blocks on-demand during execution
    • Handles dynamically loaded code and self-modifying code
    • Maintains translation cache for efficiency
  3. Instruction Set Translation

    • x86_64 -> ARM64 instruction mapping
    • SSE/AVX -> NEON vector instruction translation
    • x86_64 flags -> ARM64 condition codes
  4. System Call Translation

    • Translates x86_64 macOS syscalls to ARM64 equivalents
    • Handles different calling conventions
    • Manages register state across syscall boundaries
  5. Runtime Support

    • CPU feature detection emulation
    • Thread-local storage handling
    • Signal and exception handling
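As an illustration of the flag mapping listed above, here is a minimal sketch of how the flags of an x86_64 `cmp` relate to the ARM64 NZCV flags after the equivalent `cmp` (SUBS). The key difference is the carry flag: x86 sets CF on a borrow, while ARM64 sets C when there is no borrow. The type and function names are hypothetical and are not taken from the Rosetta binaries. The x86 parity and auxiliary-carry flags have no NZCV counterpart and are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only. */
typedef struct { bool cf, zf, sf, of; } X86Flags;   /* subset of RFLAGS */
typedef struct { bool n, z, c, v; } Arm64Nzcv;

/* Flags an x86_64 `cmp a, b` (computes a - b) would produce. */
X86Flags x86_flags_after_cmp(uint64_t a, uint64_t b) {
    uint64_t r = a - b;
    X86Flags f;
    f.cf = a < b;                               /* CF = borrow */
    f.zf = r == 0;
    f.sf = (r >> 63) & 1;
    f.of = (((a ^ b) & (a ^ r)) >> 63) & 1;     /* signed overflow */
    return f;
}

/* NZCV an ARM64 `cmp a, b` (SUBS) would produce. */
Arm64Nzcv arm64_nzcv_after_cmp(uint64_t a, uint64_t b) {
    uint64_t r = a - b;
    Arm64Nzcv f;
    f.n = (r >> 63) & 1;
    f.z = r == 0;
    f.c = a >= b;                               /* C = NOT borrow */
    f.v = (((a ^ b) & (a ^ r)) >> 63) & 1;
    return f;
}

/* View NZCV from a subtraction as x86 flags: only carry is inverted. */
X86Flags x86_flags_from_nzcv_sub(Arm64Nzcv f) {
    X86Flags x = { .cf = !f.c, .zf = f.z, .sf = f.n, .of = f.v };
    return x;
}

/* Self-check: converted flags agree with directly computed x86 flags. */
void check_cmp_flags(uint64_t a, uint64_t b) {
    X86Flags want = x86_flags_after_cmp(a, b);
    X86Flags got  = x86_flags_from_nzcv_sub(arm64_nzcv_after_cmp(a, b));
    assert(got.cf == want.cf && got.zf == want.zf &&
           got.sf == want.sf && got.of == want.of);
}
```

Because of the inverted carry, a translator cannot reuse an x86 condition such as "below" (CF = 1) unchanged. After a subtraction it corresponds to the ARM64 "carry clear" condition.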

How Apple Delivers Rosetta 2 in macOS

Installation Location

Rosetta 2 is located at:

/Library/Apple/usr/libexec/oah/
├── rosetta        # Main translator binary
├── rosettad       # Rosetta daemon
└── librosetta.*   # Runtime libraries

The oah directory stands for "Old Architecture Hardware" - a continuation from the PowerPC transition era.

Automatic Installation

On Apple Silicon Macs, Rosetta 2 is not installed by default. Installation is triggered in one of two ways:

  1. First Launch Prompt

    The "Rosetta" software is not installed on your Mac.
    Rosetta translates apps from Intel-based Macs for use on Apple Silicon Macs.
    
  2. Command-Line Installation

    softwareupdate --install-rosetta --agree-to-license
    

Components Delivered

| Component | Description |
|-----------|-------------|
| RosettaLinux/rosetta | Core ARM64 binary containing translation engine |
| RosettaLinux/rosettad | System daemon managing translation services |
| debugserver -> /usr/libexec/rosetta/debugserver | Debugging support for translated processes |
| libRosettaRuntime | Runtime library linked during translation |
| translate_tool -> /usr/libexec/rosetta/translate_tool | Translation tool for building translated binaries |

Integration with macOS

  1. launchd Integration: Rosetta daemon runs as a system service
  2. Code Signing: Translated binaries are code-signed automatically
  3. Gatekeeper: Rosetta-translated apps pass security checks
  4. System Integrity Protection: Protected from modification

Technical Architecture

Translation Process

┌──────────────────────────────────────────────────────────────────┐
│ Phase 1: Binary Loading                                          │
│ ───────────────────────────────────────────────────────────────  │
│ 1. Load x86_64 Mach-O binary                                    │
│ 2. Parse segments, sections, symbols                            │
│ 3. Validate code signatures                                      │
│ 4. Map into translation context                                  │
└──────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────────┐
│ Phase 2: AOT Translation                                         │
│ ───────────────────────────────────────────────────────────────  │
│ 1. Disassemble x86_64 code sections                              │
│ 2. Translate instructions to ARM64                               │
│ 3. Apply optimizations                                           │
│ 4. Store in translation cache (~/.oah)                          │
└──────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────────┐
│ Phase 3: Runtime Execution                                       │
│ ───────────────────────────────────────────────────────────────  │
│ 1. Load translated ARM64 code                                    │
│ 2. Set up x86_64 emulation context                               │
│ 3. Handle JIT translations for dynamic code                      │
│ 4. Translate syscalls on-the-fly                                 │
└──────────────────────────────────────────────────────────────────┘
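The runtime phase depends on quickly finding the translated code for a given x86_64 address. The following is a minimal sketch of such a lookup, assuming a fixed-size open-addressing hash table keyed by guest address. The names `TranslationCache`, `cache_lookup`, and `cache_insert`, the table size, and the hash function are all assumptions for illustration. They do not describe the data structures actually used by Rosetta 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 1024u   /* hypothetical size; must be a power of two */

typedef struct {
    uint64_t guest_pc;      /* x86_64 address of the block; 0 = empty slot */
    void    *host_code;     /* pointer to the translated ARM64 code */
} CacheEntry;

typedef struct {
    CacheEntry slots[CACHE_SLOTS];
    size_t     used;
} TranslationCache;

/* Mix the address bits (multiplier from MurmurHash3's 64-bit finalizer). */
static uint32_t hash_pc(uint64_t pc) {
    pc ^= pc >> 33;
    pc *= 0xff51afd7ed558ccdULL;
    pc ^= pc >> 33;
    return (uint32_t)pc & (CACHE_SLOTS - 1);
}

/* Returns the translated code for pc, or NULL if it is not cached yet. */
void *cache_lookup(const TranslationCache *c, uint64_t pc) {
    uint32_t i = hash_pc(pc);
    for (uint32_t n = 0; n < CACHE_SLOTS; n++) {
        if (c->slots[i].guest_pc == pc) return c->slots[i].host_code;
        if (c->slots[i].guest_pc == 0)  return NULL;   /* end of probe run */
        i = (i + 1) & (CACHE_SLOTS - 1);               /* linear probing */
    }
    return NULL;
}

/* Records a translation. Returns 0 on success, -1 if the table is full. */
int cache_insert(TranslationCache *c, uint64_t pc, void *host_code) {
    assert(pc != 0);                                   /* 0 marks empty slots */
    if (c->used >= CACHE_SLOTS - 1) return -1;         /* keep one slot free */
    uint32_t i = hash_pc(pc);
    while (c->slots[i].guest_pc != 0 && c->slots[i].guest_pc != pc)
        i = (i + 1) & (CACHE_SLOTS - 1);
    if (c->slots[i].guest_pc == 0) c->used++;
    c->slots[i].guest_pc  = pc;
    c->slots[i].host_code = host_code;
    return 0;
}
```

In this sketch a miss in `cache_lookup` is the point where a JIT would translate the block and then call `cache_insert`, so later executions of the same address go straight to the translated code.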

Key Translation Challenges

  1. Register Mapping

    • x86_64 has 16 GPRs; ARM64 has 31 GPRs
    • x86_64 flags register -> ARM64 NZCV flags
    • RIP (instruction pointer) emulation
  2. Memory Ordering

    • x86_64: Strong memory ordering (TSO)
    • ARM64: Weak memory ordering
    • Requires memory barriers for correctness, or a hardware TSO mode such as the one Apple Silicon provides
  3. Vector Instructions

    • SSE (128-bit) -> NEON (128-bit) direct mapping
    • AVX (256-bit) -> NEON pair emulation
    • Different exception handling for SIMD
  4. Calling Conventions

    • x86_64 (System V ABI): First 6 integer args in registers (RDI, RSI, RDX, RCX, R8, R9)
    • ARM64 (AAPCS64): First 8 integer args in registers (X0-X7)
    • Different stack frame layouts
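The register-mapping and calling-convention points can be sketched together. The example below assumes a hypothetical guest register file `GuestRegs` held in memory and shows how a call from translated code into a native function might move the System V argument registers into ordinary C arguments, which an ARM64 compiler then places in X0-X5. The struct layout and the names `call_host_from_guest` and `weighted_sum6` are illustrative assumptions, not taken from the decompiled code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of the 16 x86_64 general-purpose registers. */
typedef struct {
    uint64_t rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
} GuestRegs;

/* A native function taking up to six integer arguments. */
typedef uint64_t (*HostFn6)(uint64_t, uint64_t, uint64_t,
                            uint64_t, uint64_t, uint64_t);

/* Call a native function on behalf of translated x86_64 code.
 * System V order RDI, RSI, RDX, RCX, R8, R9 becomes C arguments 1..6
 * (X0..X5 under AAPCS64); the result is written back to guest RAX. */
void call_host_from_guest(GuestRegs *g, HostFn6 fn) {
    assert(g != NULL && fn != NULL);
    g->rax = fn(g->rdi, g->rsi, g->rdx, g->rcx, g->r8, g->r9);
}

/* Example host function whose result reveals the argument order. */
uint64_t weighted_sum6(uint64_t a, uint64_t b, uint64_t c,
                       uint64_t d, uint64_t e, uint64_t f) {
    return a + 10 * b + 100 * c + 1000 * d + 10000 * e + 100000 * f;
}
```

With RDI through R9 set to 1 through 6 in System V order, `call_host_from_guest(&g, weighted_sum6)` leaves 654321 in `g.rax`. Arguments beyond the sixth, floating-point arguments, and stack layout are deliberately ignored here.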

This Project

This repository contains reverse-engineered implementations of functions from the Rosetta 2 binaries. Through careful analysis and decompilation, we've identified and documented the semantic purpose of hundreds of functions.

Goals

  1. Educational: Understand how Rosetta 2 works internally
  2. Documentation: Create comprehensive documentation of translation techniques
  3. Implementation: Provide clean, well-documented C implementations
  4. Community: Share knowledge with the reverse-engineering community

What We've Accomplished

  • 828 functions identified and named in the main rosetta binary
  • 612 functions fully implemented with clean C code
  • 66 categories of functionality documented
  • Complete function name mappings with semantic names

Categories of Functions

| Category | Functions | Description |
|----------|-----------|-------------|
| Entry Point | 1 | Rosetta initialization |
| FP/Vector Operations | ~20 | Floating-point and SIMD state management |
| SIMD Memory Operations | ~10 | memchr, memcmp, memcpy with SIMD |
| Vector Operations | ~30 | NEON vector arithmetic, comparison |
| Binary Translation | ~50 | x86_64 -> ARM64 instruction translation |
| Syscall Handlers | ~60 | System call translation and forwarding |
| Memory Management | ~20 | malloc, free, mmap wrappers |
| Hash Functions | ~5 | Address hashing for translation cache |
| String Operations | ~30 | SIMD-optimized string functions |
| Cryptographic Extensions | ~30 | AES, SHA, CRC32 passthrough |
| ELF Parsing | ~15 | Linux binary format support |
| Translation Cache | ~20 | AOT/JIT cache management |


File Structure

Core Files

Rosetta2/
├── README.md                      # This file
├── rosetta_decomp.c               # Original decompilation (74,677 lines)
├── rosettad_decomp.c              # Daemon decompilation (44,064 lines)
├── rosetta_refactored.c           # Minimal wrapper (59 lines) - includes modular headers
├── rosetta_ref