# Autograder

<div align="center">
  <img width="397" height="300" alt="image" src="https://github.com/user-attachments/assets/1e07d48e-08ac-4491-be92-569a9610e44d" />

  An educational-standards-driven autograding tool that transforms assignment grading into an engaging learning experience.

  Features • Architecture • Quick Start • Templates • Pipeline • API • GitHub Action
</div>

> [!IMPORTANT]
> The Autograder is in active development. New features are being added continuously, and we welcome contributions from the community. We would love to hear your suggestions or feature requests, so don't hesitate to open an issue on GitHub.
## Overview

The Autograder is an educational tool designed to grade student submissions efficiently and accurately against real pedagogical standards. What sets it apart is a grading methodology that follows teacher-configured rubrics and generates comprehensive, student-friendly feedback reports.

## Why Autograder?
- Teacher-Controlled Grading: Complete control over evaluation criteria with tree-structured rubrics
- Educational Standards: Implements proper scoring categories (base, bonus, penalty) with weighted subjects
- Multiple Assignment Types: Native support for Web Development, APIs, Command-Line Programs, and Custom Templates
- Secure Code Execution: Isolated sandbox environments for safe remote code execution
- Proven Engagement: Students treat assignments as iterative learning challenges
- High Performance: Warm container pools and pipeline architecture enable rapid grading at scale
- Intelligent Feedback: Focus-based feedback generation that highlights the most impactful improvements
## Try It Now!

Want to see it in action? Run the interactive demo:

```sh
make examples-demo
```

Then open http://localhost:8080 in your browser to:

- Create grading configurations with visual tree builders
- Submit code examples in Python, Java, JavaScript, or C++
- View real-time grading results and score breakdowns
- Explore all API endpoints interactively

Note: the demo requires the API server to be running. Start it with:

```sh
make start-autograder
```
## The Grading Pipeline
Every submission flows through a sophisticated pipeline:

Each step is designed to maintain educational standards while providing maximum flexibility.
## Features

### For Educators

- Flexible Grading Rubrics: Create complex, tree-structured grading criteria with unlimited nesting
  - Base requirements, bonus points, and penalty deductions
  - Subject grouping with custom weights
  - Hierarchical test organization
- One-Time Configuration: Configure an assignment once, reuse it for all submissions
  - Store grading configurations as reusable packages
  - Version control for grading criteria
  - Template library for common assignment types
- Customizable Feedback: Control how students receive feedback
  - Default mode: structured reports with test results
  - AI mode: intelligent, conversational feedback
  - Focus-based feedback highlighting high-impact improvements
### For Students
- Detailed Reports: Understand exactly why you received a certain score
- Actionable Feedback: Get specific guidance on what to improve
- Iterative Learning: Use feedback to improve and resubmit
- Transparent Grading: See the breakdown of scores across all criteria
### For Developers
- REST API: Modern FastAPI-based web service
- GitHub Action: Seamless integration with GitHub Classroom
- Extensible Architecture: Pipeline-based design for easy customization
- Multiple Languages: Python, Java, JavaScript/Node.js, C++ support
- Custom Templates: Upload your own grading logic for specialized contexts
## Architecture
The Autograder uses a pipeline architecture that processes submissions through choreographed steps, providing flexibility and excellent performance.
### Core Components

#### Pipeline Pattern

The system is built around `AutograderPipeline`, a stateless, reusable grading workflow.

```python
# Build a pipeline (configuration-driven)
pipeline = build_pipeline(
    template_name="input_output",
    include_feedback=True,
    grading_criteria=criteria_config,
    feedback_config=feedback_settings,
    setup_config={"required_files": ["main.py"]},
    feedback_mode="ai",
)

# Execute the pipeline (reusable for any submission)
result = pipeline.run(submission)
```
#### Criteria Tree

Grading criteria are represented as a tree structure mirroring educational rubrics:

```
CriteriaTree
├── Base (weight: 100)
│   ├── Subject: Functionality (weight: 60)
│   │   ├── Test: Correct Output (weight: 100)
│   │   └── Test: Edge Cases (weight: 100)
│   └── Subject: Code Quality (weight: 40)
│       ├── Test: Proper Syntax (weight: 50)
│       └── Test: Good Practices (weight: 50)
├── Bonus (weight: 10)
│   └── Test: Extra Features
└── Penalty (weight: -20)
    └── Test: Late Submission
```
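To make the weighting concrete, here is a minimal sketch of how a tree like the one above could reduce to a final grade. This is an illustration, not the project's actual scoring code: each subject's score is the weighted mean of its tests, the base score is the weighted mean of its subjects, and bonus/penalty adjust the result.

```python
def weighted_mean(scores_weights):
    """Average (score, weight) pairs by weight."""
    total_w = sum(w for _, w in scores_weights)
    return sum(s * w for s, w in scores_weights) / total_w

# Hypothetical test results (0-100) for the tree shown above
functionality = weighted_mean([(100, 100), (50, 100)])   # Correct Output, Edge Cases -> 75.0
code_quality = weighted_mean([(100, 50), (100, 50)])     # Proper Syntax, Good Practices -> 100.0
base = weighted_mean([(functionality, 60), (code_quality, 40)])  # -> 85.0

bonus = 10 * (100 / 100)     # Extra Features passed: +10
penalty = -20 * (0 / 100)    # Submission was on time: -0

final = min(100, base + bonus + penalty)
print(final)  # 95.0
```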
#### Sandbox Management

The `SandboxManager` provides secure, isolated execution environments:
- Container Pooling: Pre-started warm containers ready to execute
- Multi-Language: Python, Java, JavaScript, and C++ support
- Automatic Lifecycle: TTL management, health checks, and cleanup
- Resource Control: Memory limits, timeouts, and isolation
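The container-pooling idea can be illustrated in a few lines (a sketch only; the real `SandboxManager` manages Docker containers with TTLs, health checks, and resource limits):

```python
import queue

class WarmPool:
    """Keep pre-started workers ready so acquisition avoids cold starts."""

    def __init__(self, language, size, start_worker):
        self.language = language
        self.start_worker = start_worker      # factory: () -> worker
        self.pool = queue.Queue()
        for _ in range(size):                 # warm up at startup
            self.pool.put(start_worker())

    def acquire(self):
        try:
            return self.pool.get_nowait()     # reuse a warm worker
        except queue.Empty:
            return self.start_worker()        # cold-start fallback

    def release(self, worker):
        self.pool.put(worker)                 # return for reuse

# Toy usage: "workers" are just integers standing in for containers
made = []
pool = WarmPool("python", size=2, start_worker=lambda: made.append(1) or len(made))
w = pool.acquire()    # served from the warm pool, no cold start
pool.release(w)
```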
#### Template System
Templates provide test functions for different assignment contexts:
- WebDev: HTML, CSS, JavaScript validation
- API Testing: HTTP request validation
- Input/Output: Command-line program testing
- Custom: Upload your own test logic
## Pipeline Workflow

The pipeline executes these steps in sequence:

1. Load Template - Select test functions from the template library
2. Build Tree - Construct the grading rubric hierarchy
3. Sandbox - Acquire a secure environment and prepare the initial workspace
4. Pre-Flight - Validate requirements and run setup/compilation commands
5. Grade - Execute tests and calculate weighted scores
6. Focus - Identify high-impact failed tests
7. Feedback - Generate student-friendly reports
8. Export - Send results to external systems (optional)
Each step receives a `PipelineExecution` object, performs its operation, and passes the result to the next step.
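The hand-off between steps can be sketched as follows (step and field names here are illustrative stand-ins, not the project's actual classes):

```python
from dataclasses import dataclass, field

@dataclass
class Execution:              # stand-in for PipelineExecution
    submission: str
    data: dict = field(default_factory=dict)

def build_tree(ex):
    ex.data["tree"] = ["test_a", "test_b"]
    return ex

def grade(ex):
    ex.data["scores"] = {t: 100 for t in ex.data["tree"]}
    return ex

def feedback(ex):
    ex.data["report"] = f"{len(ex.data['scores'])} tests passed"
    return ex

STEPS = [build_tree, grade, feedback]

def run(submission):
    ex = Execution(submission)
    for step in STEPS:        # each step enriches and forwards the execution
        ex = step(ex)
    return ex

result = run("main.py")
print(result.data["report"])  # 2 tests passed
```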
## Grading Templates

### Native Templates

#### 1. Input/Output Template

Tests command-line programs by providing inputs and validating outputs.
| Test Name | Description | Key Parameters |
|-----------|-------------|----------------|
| expect_output | Execute the program with given inputs and verify its output | inputs, expected_output, program_command |
| dont_fail | Validate that the program doesn't crash on a given input | inputs, program_command |
| forbidden_import | Check a file for imports of forbidden libraries | forbidden_imports |
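Conceptually, expect_output behaves like the sketch below: run the program, feed it stdin, and compare stdout. This is a simplification for illustration; the real test executes inside the sandbox.

```python
import subprocess
import sys

def expect_output(program_command, inputs, expected_output, timeout=5):
    """Run the program, feed it stdin lines, and compare trimmed stdout."""
    proc = subprocess.run(
        program_command,
        input="\n".join(inputs),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    passed = proc.stdout.strip() == expected_output.strip()
    return 100 if passed else 0

# Grade a one-liner that doubles the number it reads from stdin
score = expect_output(
    [sys.executable, "-c", "print(int(input()) * 2)"],
    inputs=["21"],
    expected_output="42",
)
print(score)  # 100
```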
#### 2. API Testing Template

Makes HTTP requests to student APIs and validates responses.
| Test Name | Description | Key Parameters |
|-----------|-------------|----------------|
| health_check | Verify endpoint returns 200 OK | endpoint |
| check_response_json | Validate JSON response structure | endpoint, expected_key, expected_value |
| check_status_code | Test specific HTTP status codes | endpoint, method, expected_status |
| validate_headers | Check response headers | endpoint, expected_headers |
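As an illustration, a criteria configuration might wire these tests into a weighted subject along these lines. The field names below are hypothetical; consult the template documentation for the exact schema.

```json
{
  "base": {
    "subjects": [
      {
        "name": "Endpoints",
        "weight": 100,
        "tests": [
          { "name": "health_check", "weight": 30,
            "params": { "endpoint": "/health" } },
          { "name": "check_status_code", "weight": 70,
            "params": { "endpoint": "/users", "method": "GET", "expected_status": 200 } }
        ]
      }
    ]
  }
}
```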
#### 3. Web Development Template

Validates HTML, CSS, and JavaScript files.
| Test Name | Description | Key Parameters |
|-----------|-------------|----------------|
| has_tag | Check for HTML tags | tag, required_count |
| has_class | Validate CSS classes (supports wildcards like col-*) | class_names, required_count |
| check_bootstrap_linked | Verify framework inclusion | framework |
| has_attribute | Check element attributes | tag, attribute, required_count |
| check_css_property | Validate CSS rules | selector, property, expected_value |
And much more! Check the WebDev Template Documentation for the full list of tests.
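To show what a has_tag-style check involves, here is a sketch built on Python's standard `html.parser`. It is not the template's actual implementation, just the idea: count occurrences of a tag and pass if the required count is met.

```python
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Count occurrences of a specific tag in an HTML document."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.count += 1

def has_tag(html, tag, required_count=1):
    counter = TagCounter(tag)
    counter.feed(html)
    return 100 if counter.count >= required_count else 0

submission = "<html><body><h1>Hi</h1><p>one</p><p>two</p></body></html>"
print(has_tag(submission, "p", required_count=2))  # 100
print(has_tag(submission, "img"))                  # 0
```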
#### 4. Custom Templates

Upload your own test functions for specialized grading contexts:
```python
from autograder.models.abstract.test_function import TestFunction
from autograder.models.dataclass.test_result import TestResult

class MyCustomTest(TestFunction):
    @property
    def name(self):
        return "my_custom_test"

    def execute(self, files, sandbox, **kwargs) -> TestResult:
        # Your custom grading logic; 'condition' is a placeholder
        # for whatever pass/fail check your test performs
        score = 100 if condition else 0
        return TestResult(
            test_name=self.name,
            score=score,
            report="Test passed!" if score == 100 else "Test failed",
        )
```
## Quick Start

### Prerequisites

- Python 3.9+
- Docker and Docker Compose

### Installation

1. Clone the repository

   ```sh
   git clone https://github.com/yourusername/autograder.git
   cd autograder
   ```

2. Install dependencies

   ```sh
   pip install -r requirements.txt
   ```

3. Configure sandbox pools (edit `sandbox_config.yml`)

   ```yaml
   general:
     # Number of sandboxes to create for each language at startup
     # Development: 2-3, Produ
   ```