Tetris
A lightweight, high-performance Tetris implementation in pure MASM Assembly for Windows 11. Features 7-bag randomizer, GDI double-buffering, registry-based persistence, and zero external dependencies. Final binary: <20 KB. Demonstrates low-level Win32 API programming with optimized memory management and direct OS integration.
Tetris-Assembly (x86 & x64)

📅 Update 25.01.2026 — x64 Visual Enhancements
The x64 version received significant visual and UX improvements, leveraging modern Windows 11 APIs while maintaining the lightweight assembly approach:
| Feature | Description |
| :--- | :--- |
| Mica Backdrop Effect | Dark mode title bar with Windows 11 Mica material (DWMWA_USE_IMMERSIVE_DARK_MODE + DWMWA_SYSTEMBACKDROP_TYPE) for a sleek, modern appearance |
| Segoe UI Typography | All UI elements now use the Segoe UI font family with proper weight variations for improved readability |
| Green Player Field | Player name input field changes to light green (#E0FFE0) when text is entered, providing visual feedback that the name is set |
| Gold Line Clear Animation | Clearing lines triggers a smooth 300ms fade-out animation from gold (RGB 255,215,0) to black, replacing instant row removal |
| Modern Button Styling | Buttons use smaller, cleaner font styling consistent with Windows 11 design language |
| Resource Files | Added tetris.rc (resource script) and tetris.manifest (application manifest for DPI awareness and visual styles) |
Binary size impact: x64 binary increased from ~15 KB to ~18 KB (+20%) due to DWM API integration and animation code.
Note: These enhancements are exclusive to the x64 version. The x86 version remains unchanged at ~13 KB with classic Windows styling. The x64 implementation demonstrates more advanced Win32/DWM techniques due to the additional complexity already inherent in 64-bit assembly programming.
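The gold line-clear fade from the table above can be sketched as a per-frame color interpolation. This is a minimal illustration in C rather than the project's MASM, and it assumes a linear ramp; the README only states the 300 ms duration and the gold start color, so the `Rgb` type, the `gold_fade` name, and the linear curve are hypothetical.

```c
#include <stdint.h>

/* Values taken from the feature table: 300 ms fade, gold = RGB(255, 215, 0). */
#define FADE_DURATION_MS 300u

/* Hypothetical helper type, not part of the original source. */
typedef struct { uint8_t r, g, b; } Rgb;

/* Assumed linear fade from gold to black.
   elapsed_ms is the time since the line clear started. */
Rgb gold_fade(uint32_t elapsed_ms)
{
    if (elapsed_ms > FADE_DURATION_MS)
        elapsed_ms = FADE_DURATION_MS;          /* clamp once the fade is over */

    uint32_t remaining = FADE_DURATION_MS - elapsed_ms;
    Rgb c;
    c.r = (uint8_t)(255u * remaining / FADE_DURATION_MS);
    c.g = (uint8_t)(215u * remaining / FADE_DURATION_MS);
    c.b = 0;                                    /* gold has no blue component */
    return c;
}
```

Integer arithmetic keeps the routine free of floating point, which is the kind of computation that maps directly onto a handful of `mul`/`div` instructions in assembly.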
A high-performance, lightweight Tetris implementation written in pure Assembly (MASM) utilizing the Windows API. The project focuses on minimal binary footprint, efficient memory management, and direct OS integration without C Runtime (CRT) dependency.
Available in two architectures:
- x86 (32-bit): ~13 KB binary
- x64 (64-bit): ~18 KB binary
Subsystem: Windows (GUI)
🔗 Quick Links
- Download Binaries: tetris.zip (v1.0.0) - Contains both x86 and x64 versions
- Source Repository: GitHub
🏗️ Architecture Comparison: x86 vs x64
Binary & Source Code Metrics
| Metric | x86 (32-bit) | x64 (64-bit) | Notes |
| :--- | :--- | :--- | :--- |
| Final Binary Size | ~13 KB | ~18 KB | +38% size increase due to 64-bit pointers, alignment, and DWM/Mica integration |
| Source Code Lines | ~2,400 LOC | ~2,600 LOC | +8% lines for manual calling convention management |
| Calling Convention | stdcall | Microsoft x64 (fastcall) | Fundamental architectural difference |
Key Technical Differences
1. Calling Convention Complexity
The x86 version benefits from the comfortable stdcall convention with MASM's invoke macro, which abstracts argument passing:
; x86: Simple and readable
invoke MessageBox, hWnd, addr szMessage, addr szTitle, MB_OK
The x64 version requires manual implementation of Microsoft's x64 calling convention (fastcall variant):
; x64: Manual register loading and stack management
mov rcx, hWnd ; 1st argument in RCX
lea rdx, szMessage ; 2nd argument in RDX
lea r8, szTitle ; 3rd argument in R8
mov r9d, MB_OK ; 4th argument in R9
sub rsp, 32 ; Shadow space (mandatory 32 bytes)
call MessageBox
add rsp, 32 ; Clean up shadow space
2. Shadow Space Requirement
x64 mandates 32 bytes (4×8-byte slots) of "shadow space" on the stack for every function call, even if the function takes fewer than 4 parameters. This is a strict ABI requirement for Windows x64 and must be maintained even when not passing arguments via stack.
3. Stack Alignment
x64 requires 16-byte stack alignment (RSP & 0xF == 0) before call instructions. Misalignment causes crashes in many API functions (particularly graphics-related). This requires explicit alignment:
and rsp, -16 ; Align to 16-byte boundary
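To make the arithmetic behind `and rsp, -16` concrete, here is a small sketch in C (not the project's MASM). The function names are hypothetical; the point is only that masking with -16 clears the low four bits, which rounds an address down to the nearest 16-byte boundary.

```c
#include <stdint.h>

/* Round an address down to a 16-byte boundary, as `and rsp, -16` does.
   In two's complement, -16 and ~0xF are the same bit pattern. */
uintptr_t align_down_16(uintptr_t sp)
{
    return sp & ~(uintptr_t)0xF;
}

/* Nonzero when the address satisfies the (sp & 0xF) == 0 rule. */
int is_aligned_16(uintptr_t sp)
{
    return (sp & 0xF) == 0;
}
```

Because the stack grows downward, rounding down can only reserve extra bytes and never overwrites data already on the stack.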
4. Register Usage
- x86: Arguments pushed on stack (right-to-left), return value in EAX
- x64: First 4 integer/pointer arguments in RCX, RDX, R8, R9; additional arguments on stack; return value in RAX
5. Pointer Size Impact
All pointers and handles are 64-bit (8 bytes) in x64, affecting:
- Structure sizes and alignment
- Memory access patterns
- Address arithmetic
Development Challenges: x86 → x64 Migration
The transition from x86 to x64 was a significant undertaking, primarily due to:
- Loss of High-Level Abstractions: The comfortable invoke macro in x86 (which auto-generates push sequences) is unavailable in x64. Every API call requires 5-7 lines of manual register/stack management.
- Shadow Space Management: Unlike x86's simple stack cleanup (add esp, N), x64's shadow space requirement adds cognitive overhead to every function call. Forgetting to allocate or deallocate shadow space leads to stack corruption.
- Stack Alignment Debugging: Crashes due to misaligned stacks are notoriously difficult to debug. A single misalignment early in the call chain can cause failures deep in GDI/Win32 APIs, far from the actual error.
- Increased Code Verbosity: Simple operations in x86 (1 line with invoke) expand to 6+ lines in x64, reducing code readability and increasing maintenance burden.
- No invoke Safety Net: The x86 invoke macro performs type checking and automatic stack cleanup. x64 requires manual verification of argument types, counts, and calling conventions for every API.
Despite these challenges, the x64 version preserves the gameplay of the x86 version, differing only in the x64-exclusive visual enhancements described in the update notes. This demonstrates that low-level assembly can achieve platform parity with careful attention to ABI details.
🛠 Technical Specifications & Features
1. Core Engine
- Zero-Dependency: No external libraries beyond standard Windows system DLLs (user32, gdi32, kernel32, advapi32, shell32).
- Memory Footprint: Highly optimized data structures. The entire game state is encapsulated in a single GAME_STATE structure.
- 7-Bag Randomizer: Implements the modern Tetris Guideline "Random Generator" (7-bag) algorithm using Fisher-Yates shuffle. This ensures a uniform distribution of pieces and prevents long droughts of specific shapes by shuffling a "bag" of all 7 tetrominoes.
- Fixed Timestep: Game logic is driven by a high-frequency loop tuned for 60 FPS (~16ms delta), ensuring smooth input response and movement.
- SRS-inspired Rotation: Super Rotation System with wall kick tables for both standard pieces and I-piece, allowing rotation near walls and floors.
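The 7-bag randomizer from the list above can be sketched as follows. This is an illustrative C version, not the project's MASM code: the `Bag` structure, the function names, and the xorshift32 generator are all assumptions made for the example, since the README does not say which random source the game uses.

```c
#include <stdint.h>

#define BAG_SIZE 7

/* Hypothetical bag state; the real game keeps its state in GAME_STATE. */
typedef struct {
    uint8_t  pieces[BAG_SIZE];  /* piece ids 0..6 in dealing order */
    int      next;              /* index of the next piece to deal */
    uint32_t rng;               /* xorshift32 state, must be nonzero */
} Bag;

/* xorshift32: a tiny PRNG chosen for illustration only. */
static uint32_t rng_next(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fill the bag with all 7 pieces, then Fisher-Yates shuffle it in place. */
void bag_refill(Bag *b)
{
    for (int i = 0; i < BAG_SIZE; i++)
        b->pieces[i] = (uint8_t)i;

    for (int i = BAG_SIZE - 1; i > 0; i--) {
        /* Modulo introduces a negligible bias for such small ranges. */
        int j = (int)(rng_next(&b->rng) % (uint32_t)(i + 1));
        uint8_t tmp = b->pieces[i];
        b->pieces[i] = b->pieces[j];
        b->pieces[j] = tmp;
    }
    b->next = 0;
}

void bag_init(Bag *b, uint32_t seed)
{
    b->rng = seed ? seed : 1u;  /* xorshift state may not be zero */
    bag_refill(b);
}

/* Deal the next piece, reshuffling a fresh bag when the current one is empty. */
int bag_draw(Bag *b)
{
    if (b->next == BAG_SIZE)
        bag_refill(b);
    return b->pieces[b->next++];
}
```

Every run of seven consecutive draws that starts on a bag boundary contains each tetromino exactly once, which bounds how long a player waits for any given piece: at most 12 other pieces can come between two draws of the same shape.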
2. Graphics & Rendering
- GDI Double Buffering: Implementation of a backbuffer system using CreateCompatibleDC and CreateCompatibleBitmap to eliminate flickering during high-frequency screen invalidation.
- Ghost Piece Preview: Toggleable semi-transparent hatch pattern overlay showing the landing position of the current piece, rendered using CreateHatchBrush with HS_DIAGCROSS pattern.
- Animated UI Elements: Pulsing "PAUSED" text with sine-wave brightness modulation (127-255 range) at 60 FPS for smooth visual feedback.
- Vector-like Tetromino Definition: Shapes are defined as coordinate offsets in SHAPE_TEMPLATES, allowing for efficient rotation and collision calculations via iterative offset addition.
- Dynamic UI: Integration of standard Win32 controls (Edit boxes, Buttons) with custom GDI-rendered game area.
- Color-Coded Interface: Next piece preview and record holder name displayed in matching piece colors for visual consistency.
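The offset-based shape definition mentioned above lends itself to a very short collision test. The sketch below is in C rather than MASM and rests on several assumptions: the `Offset` type, the `T_PIECE` template values, and the flat row-major board layout are hypothetical stand-ins for whatever SHAPE_TEMPLATES and the board buffer actually look like. It also assumes that cells above the top row are allowed, so pieces can spawn partly off-screen.

```c
#include <stdint.h>

#define BOARD_W 10
#define BOARD_H 20

/* Hypothetical offset pair relative to the piece origin. */
typedef struct { int8_t dx, dy; } Offset;

/* Assumed template for a T piece; the real SHAPE_TEMPLATES values may differ. */
const Offset T_PIECE[4] = { {0, 0}, {-1, 0}, {1, 0}, {0, -1} };

/* Returns 1 if a piece with origin (x, y) would hit a wall, the floor,
   or an occupied cell; board is row-major, nonzero means occupied. */
int collides(const uint8_t *board, const Offset shape[4], int x, int y)
{
    for (int i = 0; i < 4; i++) {
        int cx = x + shape[i].dx;   /* iterative offset addition */
        int cy = y + shape[i].dy;

        if (cx < 0 || cx >= BOARD_W || cy >= BOARD_H)
            return 1;               /* left wall, right wall, or floor */
        if (cy >= 0 && board[cy * BOARD_W + cx] != 0)
            return 1;               /* overlaps the stack */
    }
    return 0;
}
```

A movement routine would call `collides` with the candidate position first and commit the move only when it returns 0.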
3. Data Persistence (Registry-based)
Unlike traditional implementations using .ini or .cfg files, this project utilizes the Windows Registry for state persistence:
- Path: HKEY_CURRENT_USER\Software\Tetris
- Stored Keys:
  - PlayerName (REG_SZ / Unicode): Last active player identity.
  - HighScore (REG_DWORD): Maximum score achieved.
  - HighScoreName (REG_SZ / Unicode): Name of the record holder.
- Encoding: Full Unicode support for player names via RegQueryValueExW and RegSetValueExW.
- Clear Record Feature: One-click registry cleanup via Alt+C or dedicated button with confirmation dialog.
4. Collision & Logic
- AABB-style Collision: Piece-to-wall and piece-to-stack collision detection implemented through boundary checking and bitmask-like array lookups in the 10x20 board buffer.
- Line Clearing: Optimized scanline algorithm that identifies full rows and performs a memory-shift operation to drop the remaining blocks. Supports simultaneous multi-line clears.
- Progressive Difficulty: Gravity speed increases with level (every 10 lines cleared), calculated using fixed-point arithmetic with 1/10000 precision for smooth acceleration.
- Scoring System: Quadratic scaling (lines² × 100 × level) rewards multi-line clears and higher levels.
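The line-clearing pass and the scoring formula from the list above can be sketched together. This is an illustrative C version, not the project's MASM: the function names and the flat row-major board are assumptions, and the two-cursor compaction shown here is one common way to implement the "memory shift" the README describes.

```c
#include <stdint.h>
#include <string.h>

#define BOARD_W 10
#define BOARD_H 20

/* Remove every full row and shift the rows above it down.
   board is row-major with row 0 at the top; nonzero means occupied.
   Returns the number of rows cleared (0..4 in normal play). */
int clear_lines(uint8_t *board)
{
    int cleared = 0;
    int dst = BOARD_H - 1;                      /* next row to write, from the bottom */

    for (int src = BOARD_H - 1; src >= 0; src--) {
        int full = 1;
        for (int x = 0; x < BOARD_W; x++) {
            if (board[src * BOARD_W + x] == 0) { full = 0; break; }
        }
        if (full) {
            cleared++;                          /* skip the row: it is removed */
            continue;
        }
        if (dst != src)
            memcpy(board + dst * BOARD_W, board + src * BOARD_W, BOARD_W);
        dst--;
    }

    /* Rows vacated at the top become empty. */
    for (; dst >= 0; dst--)
        memset(board + dst * BOARD_W, 0, BOARD_W);

    return cleared;
}

/* Quadratic scoring as stated in the README: lines^2 * 100 * level. */
uint32_t score_for(uint32_t lines, uint32_t level)
{
    return lines * lines * 100u * level;
}
```

With this formula a single line at level 1 is worth 100 points, while a four-line clear at level 2 is worth 3200, so one four-line clear outscores four separate single clears by a factor of four.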
5. User Experience
- Keyboard Shortcuts: Full accelerator table support (P/Alt+P, P/Alt+R, Alt+C) for pause, resume, and clear operations.
- Customizable Icon: Dynamic icon loading from shell32.dll via ExtractIcon API (configurable index).
- Real-time Name Persistence: Player name auto-saves on text change via EN_CHANGE notification.
- Anonymous Fallback: Automatically assigns "Anonymous" to high scores when no player name is set.
📂 Project Structure
The repository contains separate implementations for both architectures in dedicated directories:
Tetris_asm/
├── x86/ # 32-bit implementation
│ ├── main.asm # Entry point, WndProc, message loop
│ ├── game.asm # Core game logic
│ ├── render.asm # GDI rendering engine
│ ├── registry.asm # Registry persistence layer
│ ├── data.inc # Structure
