Steve
Cursor for Minecraft
Steve AI - Autonomous AI Agent for Minecraft
We built Cursor for Minecraft. Instead of AI that helps you write code, you get AI agents that actually play the game with you.
https://github.com/user-attachments/assets/23f0ccdd-7a7a-4d49-9dd9-215ebf67265a
What It Does
Steve acts as an agent, or a team of agents if you spawn several. As with a coding assistant like Cursor, you describe what you want, and the agent understands the context and executes. The difference is that instead of editing code, you get embodied Steves that operate in your Minecraft world.
The interface is simple: press K to open a panel, type what you need. The agents handle the interpretation, planning, and execution. Say "mine some iron" and the agent reasons about where iron spawns, navigates to the appropriate depth, locates ore veins, and extracts the resources. Ask for a house and it considers the available materials, generates an appropriate structure, and builds it block by block.
What makes this interesting is the multi-agent coordination. When multiple Steves work on the same task, they don't just execute independently; they actively coordinate to avoid conflicts and distribute the workload. Tell three agents to build a castle and they'll automatically partition the structure, divide sections among themselves, and parallelize the construction.
The agents aren't following predefined scripts. They operate off natural language instructions, which means they can handle:
- Resource extraction where agents determine optimal mining locations and strategies
- Autonomous building with agents planning layouts and material usage
- Combat and defense where agents assess threats and coordinate responses
- Exploration and gathering with pathfinding and resource location
- Collaborative execution with automatic workload balancing and conflict resolution
Quick Start
You need:
- Minecraft 1.20.1 with Forge
- Java 17
- An OpenAI API key (or Groq/Gemini if you prefer)
Installation:
- Download the JAR from releases
- Put it in your mods folder
- Launch Minecraft
- Copy config/steve-common.toml.example to config/steve-common.toml
- Add your API key to the config
Config example:
[openai]
apiKey = "your-api-key-here"
model = "gpt-3.5-turbo"
maxTokens = 1000
temperature = 0.7
Then spawn a Steve with /steve spawn Bob and press K to start giving commands.
Usage Examples
"mine 20 iron ore"
"build a house near me"
"help Alex with the tower"
"defend me from zombies"
"follow me"
"gather wood from that forest"
"make a cobblestone platform here"
"attack that creeper"
The agents are pretty good at figuring out what you mean. You don't need to be super specific.
Technical Architecture
System Overview
Each Steve runs an autonomous agent loop that processes natural language commands through an LLM, converts them into structured actions, and executes them using Minecraft's game mechanics. The system uses a direct action execution model optimized for real-time gameplay rather than a traditional ReAct framework.
Core execution flow:
- User input captured via GUI (press K)
- Task sent to TaskPlanner with conversation context
- LLM (Groq/OpenAI/Gemini) generates structured action plan
- ResponseParser extracts actions from LLM response
- ActionExecutor processes actions through specialized action classes
- Actions execute tick-by-tick to avoid freezing the game
- Results fed back into conversation memory for context
Core Components
LLM Integration (com.steve.ai.llm)
- GeminiClient, GroqClient, OpenAIClient: Pluggable LLM providers for agent reasoning
- TaskPlanner: Orchestrates LLM calls with context (conversation history, world state, Steve capabilities)
- PromptBuilder: Constructs prompts with available actions, examples, and formatting instructions
- ResponseParser: Extracts structured action sequences from LLM responses
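To make the ResponseParser role concrete, here is a minimal sketch of turning LLM output into a list of structured actions. The README does not show the response format the LLM is prompted to produce, so this assumes a hypothetical line-based format (one action per line, such as "mine block=iron_ore count=20"). ParsedAction and SimpleResponseParser are illustrative names, not classes from the mod.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical data model: one parsed action and its key=value parameters.
final class ParsedAction {
    final String name;
    final Map<String, String> params;

    ParsedAction(String name, Map<String, String> params) {
        this.name = name;
        this.params = params;
    }
}

// Hypothetical parser for an ASSUMED format: "<action> key=value key=value", one per line.
final class SimpleResponseParser {
    static List<ParsedAction> parse(String llmResponse) {
        List<ParsedAction> actions = new ArrayList<>();
        for (String line : llmResponse.split("\\R")) {
            String trimmed = line.trim();
            // Skip blank lines and lines the model used for commentary.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) continue;
            String[] tokens = trimmed.split("\\s+");
            Map<String, String> params = new LinkedHashMap<>();
            for (int i = 1; i < tokens.length; i++) {
                int eq = tokens[i].indexOf('=');
                // Tokens without a key=value shape are ignored.
                if (eq > 0) params.put(tokens[i].substring(0, eq), tokens[i].substring(eq + 1));
            }
            actions.add(new ParsedAction(tokens[0].toLowerCase(Locale.ROOT), params));
        }
        return actions;
    }
}
```

Whatever the real format is, the important property is the same: the parser should tolerate extra commentary from the model and return only well-formed actions for the executor.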
Action System (com.steve.ai.action)
- ActionExecutor: Tick-based action execution engine (prevents game freezing)
- BaseAction: Abstract class for all actions (mine, build, move, combat, etc.)
- Task: Data model for action parameters and metadata
- Available Actions:
- MineBlockAction: Intelligent ore/block mining with pathfinding
- BuildStructureAction: Procedural and template-based building
- PlaceBlockAction: Single block placement with validation
- MoveToAction: Pathfinding-based movement
- AttackAction: Combat with target selection
- FollowAction: Player/entity following
- WaitAction: Controlled delays and synchronization
Structure Generation (com.steve.ai.structure)
- StructureGenerators: Procedural generation algorithms (houses, castles, towers, barns)
- StructureTemplateLoader: NBT file loading from resources
- BlockPlacement: Shared data structure for block positioning
Multi-Agent Collaboration (com.steve.ai.action)
- CollaborativeBuildManager: Server-side coordination for parallel building
- Spatial partitioning: Automatically divides structures into non-overlapping sections
- Work distribution: Assigns sections to available Steves
- Conflict prevention: Atomic block placement with position tracking
- Dynamic rebalancing: Reassigns work when agents finish early
Memory & Context (com.steve.ai.memory)
- SteveMemory: Per-agent conversation history and task context
- WorldKnowledge: Tracks discovered resources, landmarks, and spatial data
- StructureRegistry: Catalogs built structures for reference and avoidance
Code Execution (com.steve.ai.execution)
- CodeExecutionEngine: GraalVM JavaScript engine for LLM-generated scripts
- SteveAPI: Safe API bridge exposing Minecraft actions to scripts
- Sandboxing: Restricted environment preventing harmful operations
Key Design Decisions
Tick-Based Execution
Actions run incrementally across multiple game ticks rather than blocking. This prevents server freezes and maintains responsiveness. Each action's tick() method does a small amount of work per tick and tracks its progress internally.
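The tick-based model can be sketched as below. This is a simplified illustration, not the mod's actual ActionExecutor: TickAction stands in for BaseAction (whose real signature may differ), and MineSketchAction and TickExecutor are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for the mod's BaseAction.
abstract class TickAction {
    abstract void tick();          // perform one small slice of work
    abstract boolean isComplete(); // report whether the action has finished
}

// Example action: "mines" a fixed number of blocks, one block per tick.
final class MineSketchAction extends TickAction {
    private final int target;
    private int mined = 0;

    MineSketchAction(int target) { this.target = target; }

    @Override void tick() { if (mined < target) mined++; }
    @Override boolean isComplete() { return mined >= target; }
    int mined() { return mined; }
}

// Runs queued actions one at a time, advancing the current action once per game tick.
final class TickExecutor {
    private final Queue<TickAction> queue = new ArrayDeque<>();
    private TickAction current;

    void enqueue(TickAction action) { queue.add(action); }

    // Intended to be called once per server tick. It never loops until an
    // action completes, so a single call always returns quickly.
    void onTick() {
        if (current == null || current.isComplete()) current = queue.poll();
        if (current != null) current.tick();
    }

    boolean isIdle() {
        return (current == null || current.isComplete()) && queue.isEmpty();
    }
}
```

Mining 20 blocks this way takes 20 calls to onTick() instead of one long blocking loop, which is what keeps the server responsive.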
Direct Action Execution (Not Traditional ReAct)
While inspired by ReAct, we use direct action execution for real-time gameplay. The LLM generates complete action sequences upfront rather than running iterative observe-think-act cycles. This reduces API calls and latency, both critical for game responsiveness.
Multi-Agent Coordination
Collaborative builds use deterministic spatial partitioning. Structures are divided into rectangular sections based on agent count. Each Steve claims a section atomically, preventing conflicts. The manager runs entirely server-side and uses ConcurrentHashMap for thread safety.
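A minimal sketch of that idea, assuming the footprint is split into strips along one axis and that claims are keyed by section index. Section and BuildPartitioner are hypothetical names; the real CollaborativeBuildManager may partition and track claims differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// A rectangular slice of the build footprint: x in [minX, maxX), full depth.
final class Section {
    final int index, minX, maxX;

    Section(int index, int minX, int maxX) {
        this.index = index;
        this.minX = minX;
        this.maxX = maxX;
    }
}

final class BuildPartitioner {
    private final List<Section> sections = new ArrayList<>();
    private final ConcurrentHashMap<Integer, String> owners = new ConcurrentHashMap<>();

    // Split [0, width) into agentCount contiguous strips whose sizes differ
    // by at most one block. Assumes agentCount > 0.
    BuildPartitioner(int width, int agentCount) {
        int base = width / agentCount;
        int extra = width % agentCount;
        int start = 0;
        for (int i = 0; i < agentCount; i++) {
            int size = base + (i < extra ? 1 : 0);
            sections.add(new Section(i, start, start + size));
            start += size;
        }
    }

    // Claim the first unowned section. putIfAbsent is atomic, so two agents
    // racing for the same section cannot both win it.
    Optional<Section> claim(String agentName) {
        for (Section s : sections) {
            if (owners.putIfAbsent(s.index, agentName) == null) return Optional.of(s);
        }
        return Optional.empty();
    }

    List<Section> sections() { return sections; }
}
```

Because the partition depends only on the width and the agent count, every run with the same inputs produces the same sections, which is what makes the scheme deterministic.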
Memory Management
Context windows are managed by pruning old messages while keeping recent exchanges and critical world state. Each LLM call includes: conversation history (last 10 exchanges), current task details, Steve's position/inventory, and known world features.
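The pruning rule can be illustrated with a bounded queue of exchanges. This is a sketch only: ConversationWindow is a hypothetical name, the "STATE/USER/STEVE/TASK" prefixes are made up for illustration, and the real SteveMemory likely stores richer data than plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for per-agent conversation memory.
final class ConversationWindow {
    private final int maxExchanges;
    // Each entry is {userMessage, agentReply}.
    private final Deque<String[]> exchanges = new ArrayDeque<>();

    ConversationWindow(int maxExchanges) { this.maxExchanges = maxExchanges; }

    void record(String userMessage, String agentReply) {
        exchanges.addLast(new String[] {userMessage, agentReply});
        // Drop the oldest exchanges once the window is full.
        while (exchanges.size() > maxExchanges) exchanges.removeFirst();
    }

    // Assemble the context for one LLM call: world state first, then the
    // retained history in chronological order, then the new task.
    List<String> buildContext(String worldState, String task) {
        List<String> lines = new ArrayList<>();
        lines.add("STATE: " + worldState);
        for (String[] e : exchanges) {
            lines.add("USER: " + e[0]);
            lines.add("STEVE: " + e[1]);
        }
        lines.add("TASK: " + task);
        return lines;
    }

    int size() { return exchanges.size(); }
}
```

World state is passed in fresh on every call rather than stored in the window, so pruning history never discards the agent's current position or inventory.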
Integration with Minecraft
Entity Registration
Steves are a custom EntityType registered via Forge's deferred registry system. They extend PathfinderMob for vanilla pathfinding integration and implement custom goals for AI behavior.
Event Hooks
- ServerStarting: Initialize collaborative build manager
- ServerStopping: Cleanup active tasks and save state
- ClientTick: GUI rendering and input handling
GUI Implementation
A custom overlay GUI is activated with the K key. It uses Minecraft's Screen class with custom rendering. Text input is forwarded to TaskPlanner on submission.
Building from Source
Standard Gradle workflow:
git clone https://github.com/YuvDwi/Steve.git
cd Steve
./gradlew build
Output JAR will be in build/libs/. To test in development:
./gradlew runClient
Project Structure:
src/main/java/com/steve/ai/
├── entity/ # Steve entity, spawning, lifecycle
├── llm/ # LLM clients, prompt building, response parsing
├── action/ # Action classes and collaborative build manager
├── structure/ # Procedural generation and template loading
├── memory/ # Context management and world knowledge
├── execution/ # JavaScript code execution engine
├── client/ # GUI overlay
└── command/ # Minecraft commands (/steve spawn, etc)
Contributing
We welcome contributions! Here's how to get started:
Reporting Bugs
- Check existing issues first
- Include:
- Minecraft/Forge/Steve AI versions
- Steps to reproduce
- Expected vs actual behavior
- Logs from logs/latest.log
Submitting Code
- Fork and clone:
  git clone https://github.com/YourUsername/Steve.git
  cd Steve
- Create a feature branch:
  git checkout -b feature/your-feature-name
- Make changes
  - Follow code style (4-space indent, JavaDoc for public APIs)
  - Test with ./gradlew build && ./gradlew runClient
- Submit PR
  - Clear commit messages
  - Describe changes and reasoning
  - Link related issues
Code Style
- Classes: PascalCase
- Methods/Variables: camelCase
- Constants: UPPER_SNAKE_CASE
- Indentation: 4 spaces
- Line length: Max 120 characters
- Comments: JavaDoc for public methods
Adding New Actions:
- Extend BaseAction in com.steve.ai.action.actions
- Implement tick(), isComplete(), onCancel()
- Update PromptBuilder.java to inform the LLM about the new action
- Add example usage in the prompt template
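As a rough illustration of the first two steps, here is what a new action might look like. SketchBaseAction is a hypothetical stand-in reduced to the three methods named above; the real BaseAction will have access to the Steve entity and the world, and its constructor and method signatures may differ. PlaceTorchesAction is an invented example, not an action that ships with the mod.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the mod's BaseAction.
abstract class SketchBaseAction {
    abstract void tick();
    abstract boolean isComplete();
    abstract void onCancel();
}

// Hypothetical new action: places a torch every `spacing` blocks along a line.
final class PlaceTorchesAction extends SketchBaseAction {
    private final int length;
    private final int spacing;
    private int cursor = 0;
    private boolean cancelled = false;
    // Stands in for real world edits so the sketch runs without Minecraft.
    private final List<Integer> placedAt = new ArrayList<>();

    PlaceTorchesAction(int length, int spacing) {
        if (spacing <= 0) throw new IllegalArgumentException("spacing must be positive");
        this.length = length;
        this.spacing = spacing;
    }

    // One placement per tick keeps each tick cheap.
    @Override void tick() {
        if (isComplete()) return;
        placedAt.add(cursor);
        cursor += spacing;
    }

    @Override boolean isComplete() { return cancelled || cursor >= length; }

    // Stop cleanly; torches already placed are left where they are.
    @Override void onCancel() { cancelled = true; }

    List<Integer> placedAt() { return placedAt; }
}
```

The action only becomes usable by agents once the remaining steps are done, since the LLM can only plan with actions that the prompt describes.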
Configuration
Edit config/steve-common.toml:
[llm]
provider = "groq" # Options: openai, groq, gemini
[openai]
apiKey = "sk-..."
model = "gpt-3.5-turbo"
maxTokens = 1000
temperature = 0.7
[groq]
apiKey = "gsk_..."
model = "llama3-70b-8192"
maxTokens = 1000
[gemini]
apiKey = "AI..."
model = "gemini-1.5-flash"
maxTokens = 1000
Performance Tips:
- Use Groq for fastest inference (
