# Autograder

<div align="center">
  <img width="397" height="300" alt="image" src="https://github.com/user-attachments/assets/1e07d48e-08ac-4491-be92-569a9610e44d" />

  An educational-standards-driven autograding tool that transforms assignment grading into an engaging learning experience.

  Features • Architecture • Quick Start • Templates • Pipeline • API • GitHub Action
</div>

> [!IMPORTANT]
> The Autograder is in active development. New features are being added continuously, and we welcome contributions from the community. We would love to hear your suggestions or feature requests, so don't hesitate to open an issue on GitHub.
## Overview

The Autograder is an educational tool designed to grade student submissions efficiently and accurately against real pedagogical standards. What sets it apart is a grading methodology that follows teacher-configured rubrics and generates comprehensive, student-friendly feedback reports.

## Why Autograder?
- Teacher-Controlled Grading: Complete control over evaluation criteria with tree-structured rubrics
- Educational Standards: Implements proper scoring categories (base, bonus, penalty) with weighted subjects
- Multiple Assignment Types: Native support for Web Development, APIs, Command-Line Programs, and Custom Templates
- Secure Code Execution: Isolated sandbox environments for safe remote code execution
- Proven Engagement: Students treat assignments as iterative learning challenges
- High Performance: Warm container pools and pipeline architecture enable rapid grading at scale
- Intelligent Feedback: Focus-based feedback generation that highlights the most impactful improvements
## Try It Now!

Want to see it in action? Run the interactive demo:

```sh
make examples-demo
```

Then open http://localhost:8080 in your browser to:

- Create grading configurations with visual tree builders
- Submit code examples in Python, Java, JavaScript, or C++
- View real-time grading results and score breakdowns
- Explore all API endpoints interactively

Note: the demo requires the API server to be running. Start it with:

```sh
make start-autograder
```
## The Grading Pipeline
Every submission flows through a sophisticated pipeline:

Each step is designed to maintain educational standards while providing maximum flexibility.
## Features

### For Educators

- Flexible Grading Rubrics: Create complex, tree-structured grading criteria with unlimited nesting
  - Base requirements, bonus points, and penalty deductions
  - Subject grouping with custom weights
  - Hierarchical test organization
- One-Time Configuration: Configure an assignment once, reuse it for all submissions
  - Store grading configurations as reusable packages
  - Version control for grading criteria
  - Template library for common assignment types
- Customizable Feedback: Control how students receive feedback
  - Default mode: structured reports with test results
  - AI mode: intelligent, conversational feedback
  - Focus-based feedback highlighting high-impact improvements
### For Students
- Detailed Reports: Understand exactly why you received a certain score
- Actionable Feedback: Get specific guidance on what to improve
- Iterative Learning: Use feedback to improve and resubmit
- Transparent Grading: See the breakdown of scores across all criteria
### For Developers
- REST API: Modern FastAPI-based web service
- GitHub Action: Seamless integration with GitHub Classroom
- Extensible Architecture: Pipeline-based design for easy customization
- Multiple Languages: Python, Java, JavaScript/Node.js, C++ support
- Custom Templates: Upload your own grading logic for specialized contexts
## Architecture
The Autograder uses a pipeline architecture that processes submissions through choreographed steps, providing flexibility and excellent performance.
### Core Components

#### Pipeline Pattern

The system is built around `AutograderPipeline`, a stateless, reusable grading workflow.

```python
# Build a pipeline (configuration-driven)
pipeline = build_pipeline(
    template_name="input_output",
    include_feedback=True,
    grading_criteria=criteria_config,
    feedback_config=feedback_settings,
    setup_config={"required_files": ["main.py"]},
    feedback_mode="ai",
)

# Execute the pipeline (reusable for any submission)
result = pipeline.run(submission)
```
#### Criteria Tree

Grading criteria are represented as a tree structure mirroring educational rubrics:

```
CriteriaTree
├── Base (weight: 100)
│   ├── Subject: Functionality (weight: 60)
│   │   ├── Test: Correct Output (weight: 100)
│   │   └── Test: Edge Cases (weight: 100)
│   └── Subject: Code Quality (weight: 40)
│       ├── Test: Proper Syntax (weight: 50)
│       └── Test: Good Practices (weight: 50)
├── Bonus (weight: 10)
│   └── Test: Extra Features
└── Penalty (weight: -20)
    └── Test: Late Submission
```
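To make the weighting concrete, here is a minimal sketch of how a tree like the one above could reduce to a final grade. This is an illustration, not the project's actual scoring code: each subject's score is the weighted mean of its tests, the base score is the weighted mean of its subjects, and bonus/penalty adjust the result.

```python
def weighted_mean(scores_weights):
    """Average (score, weight) pairs by weight."""
    total_w = sum(w for _, w in scores_weights)
    return sum(s * w for s, w in scores_weights) / total_w

# Hypothetical test results (0-100) for the tree shown above
functionality = weighted_mean([(100, 100), (50, 100)])   # Correct Output, Edge Cases -> 75.0
code_quality = weighted_mean([(100, 50), (100, 50)])     # Proper Syntax, Good Practices -> 100.0
base = weighted_mean([(functionality, 60), (code_quality, 40)])  # -> 85.0

bonus = 10 * (100 / 100)     # Extra Features passed: +10
penalty = -20 * (0 / 100)    # Submission was on time: -0

final = min(100, base + bonus + penalty)
print(final)  # 95.0
```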
#### Sandbox Management

The `SandboxManager` provides secure, isolated execution environments:
- Container Pooling: Pre-started warm containers ready to execute
- Multi-Language: Python, Java, JavaScript, and C++ support
- Automatic Lifecycle: TTL management, health checks, and cleanup
- Resource Control: Memory limits, timeouts, and isolation
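The container-pooling idea can be illustrated in a few lines (a sketch only; the real `SandboxManager` manages Docker containers with TTLs, health checks, and resource limits):

```python
import queue

class WarmPool:
    """Keep pre-started workers ready so acquisition avoids cold starts."""

    def __init__(self, language, size, start_worker):
        self.language = language
        self.start_worker = start_worker      # factory: () -> worker
        self.pool = queue.Queue()
        for _ in range(size):                 # warm up at startup
            self.pool.put(start_worker())

    def acquire(self):
        try:
            return self.pool.get_nowait()     # reuse a warm worker
        except queue.Empty:
            return self.start_worker()        # cold-start fallback

    def release(self, worker):
        self.pool.put(worker)                 # return for reuse

# Toy usage: "workers" are just integers standing in for containers
made = []
pool = WarmPool("python", size=2, start_worker=lambda: made.append(1) or len(made))
w = pool.acquire()    # served from the warm pool, no cold start
pool.release(w)
```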
#### Template System
Templates provide test functions for different assignment contexts:
- WebDev: HTML, CSS, JavaScript validation
- API Testing: HTTP request validation
- Input/Output: Command-line program testing
- Custom: Upload your own test logic
## Pipeline Workflow

The pipeline executes these steps in sequence:

1. Load Template - Select test functions from the template library
2. Build Tree - Construct the grading rubric hierarchy
3. Sandbox - Acquire a secure environment and prepare the initial workspace
4. Pre-Flight - Validate requirements and run setup/compilation commands
5. Grade - Execute tests and calculate weighted scores
6. Focus - Identify high-impact failed tests
7. Feedback - Generate student-friendly reports
8. Export - Send results to external systems (optional)
Each step receives a `PipelineExecution` object, performs its operation, and passes the result to the next step.
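The hand-off between steps can be sketched as follows (step and field names here are illustrative stand-ins, not the project's actual classes):

```python
from dataclasses import dataclass, field

@dataclass
class Execution:              # stand-in for PipelineExecution
    submission: str
    data: dict = field(default_factory=dict)

def build_tree(ex):
    ex.data["tree"] = ["test_a", "test_b"]
    return ex

def grade(ex):
    ex.data["scores"] = {t: 100 for t in ex.data["tree"]}
    return ex

def feedback(ex):
    ex.data["report"] = f"{len(ex.data['scores'])} tests passed"
    return ex

STEPS = [build_tree, grade, feedback]

def run(submission):
    ex = Execution(submission)
    for step in STEPS:        # each step enriches and forwards the execution
        ex = step(ex)
    return ex

result = run("main.py")
print(result.data["report"])  # 2 tests passed
```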
## Grading Templates

### Native Templates

#### 1. Input/Output Template

Tests command-line programs by providing inputs and validating outputs.
| Test Name | Description | Key Parameters |
|-----------|-------------|----------------|
| expect_output | Execute the program with given inputs and verify its output | inputs, expected_output, program_command |
| dont_fail | Validate that the program doesn't crash on a given input | inputs, program_command |
| forbidden_import | Check a file for imports of forbidden libraries | forbidden_imports |
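Conceptually, expect_output behaves like the sketch below: run the program, feed it stdin, and compare stdout. This is a simplification for illustration; the real test executes inside the sandbox.

```python
import subprocess
import sys

def expect_output(program_command, inputs, expected_output, timeout=5):
    """Run the program, feed it stdin lines, and compare trimmed stdout."""
    proc = subprocess.run(
        program_command,
        input="\n".join(inputs),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    passed = proc.stdout.strip() == expected_output.strip()
    return 100 if passed else 0

# Grade a one-liner that doubles the number it reads from stdin
score = expect_output(
    [sys.executable, "-c", "print(int(input()) * 2)"],
    inputs=["21"],
    expected_output="42",
)
print(score)  # 100
```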
#### 2. API Testing Template

Makes HTTP requests to student APIs and validates responses.
| Test Name | Description | Key Parameters |
|-----------|-------------|----------------|
| health_check | Verify endpoint returns 200 OK | endpoint |
| check_response_json | Validate JSON response structure | endpoint, expected_key, expected_value |
| check_status_code | Test specific HTTP status codes | endpoint, method, expected_status |
| validate_headers | Check response headers | endpoint, expected_headers |
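As an illustration, a criteria configuration might wire these tests into a weighted subject along these lines. The field names below are hypothetical; consult the template documentation for the exact schema.

```json
{
  "base": {
    "subjects": [
      {
        "name": "Endpoints",
        "weight": 100,
        "tests": [
          { "name": "health_check", "weight": 30,
            "params": { "endpoint": "/health" } },
          { "name": "check_status_code", "weight": 70,
            "params": { "endpoint": "/users", "method": "GET", "expected_status": 200 } }
        ]
      }
    ]
  }
}
```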
#### 3. Web Development Template

Validates HTML, CSS, and JavaScript files.
| Test Name | Description | Key Parameters |
|-----------|-------------|----------------|
| has_tag | Check for HTML tags | tag, required_count |
| has_class | Validate CSS classes (supports wildcards like col-*) | class_names, required_count |
| check_bootstrap_linked | Verify framework inclusion | framework |
| has_attribute | Check element attributes | tag, attribute, required_count |
| check_css_property | Validate CSS rules | selector, property, expected_value |
And much more! Check the WebDev Template Documentation for the full list of tests.
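To show what a has_tag-style check involves, here is a sketch built on Python's standard `html.parser`. It is not the template's actual implementation, just the idea: count occurrences of a tag and pass if the required count is met.

```python
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Count occurrences of a specific tag in an HTML document."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.count += 1

def has_tag(html, tag, required_count=1):
    counter = TagCounter(tag)
    counter.feed(html)
    return 100 if counter.count >= required_count else 0

submission = "<html><body><h1>Hi</h1><p>one</p><p>two</p></body></html>"
print(has_tag(submission, "p", required_count=2))  # 100
print(has_tag(submission, "img"))                  # 0
```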
#### 4. Custom Templates

Upload your own test functions for specialized grading contexts:
```python
from autograder.models.abstract.test_function import TestFunction
from autograder.models.dataclass.test_result import TestResult

class MyCustomTest(TestFunction):
    @property
    def name(self):
        return "my_custom_test"

    def execute(self, files, sandbox, **kwargs) -> TestResult:
        # Your custom grading logic; 'condition' is a placeholder
        # for whatever pass/fail check your test performs
        score = 100 if condition else 0
        return TestResult(
            test_name=self.name,
            score=score,
            report="Test passed!" if score == 100 else "Test failed",
        )
```
## Quick Start

### Prerequisites

- Python 3.9+
- Docker and Docker Compose

### Installation

1. Clone the repository

   ```sh
   git clone https://github.com/yourusername/autograder.git
   cd autograder
   ```

2. Install dependencies

   ```sh
   pip install -r requirements.txt
   ```

3. Configure sandbox pools (edit `sandbox_config.yml`)

   ```yaml
   general:
     # Number of sandboxes to create for each language at startup
     # Development: 2-3, Produ
   ```