<div align="center"> <picture> <source media="(prefers-color-scheme: dark)" srcset="./assets/logo-light.png" /> <img alt="DeepFabric logo" src="./assets/logo-light.png" style="width:40%;max-width:40%;height:auto;display:block;margin:0 auto;" /> </picture> <h3>Training Model Behavior in Agentic Systems</h3>  <p> <a href="https://github.com/always-further/deepfabric/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22"> <img src="https://img.shields.io/badge/Contribute-Good%20First%20Issues-green?style=for-the-badge&logo=github" alt="Good First Issues"/> </a>   <a href="https://discord.gg/pPcjYzGvbS"> <img src="https://img.shields.io/badge/Chat-Join%20Discord-7289da?style=for-the-badge&logo=discord&logoColor=white" alt="Join Discord"/> </a> </p>  <p> <a href="https://opensource.org/licenses/Apache-2.0"> <img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License"/> </a> <a href="https://github.com/always-further/deepfabric/actions/workflows/test.yml"> <img src="https://github.com/always-further/deepfabric/actions/workflows/test.yml/badge.svg" alt="CI Status"/> </a> <a href="https://pypi.org/project/deepfabric/"> <img src="https://img.shields.io/pypi/v/deepfabric.svg" alt="PyPI Version"/> </a> <a href="https://pepy.tech/project/deepfabric"> <img src="https://static.pepy.tech/badge/deepfabric" alt="Downloads"/> </a> <a href="https://discord.gg/pPcjYzGvbS"> <img src="https://img.shields.io/discord/1384081906773131274?color=7289da&label=Discord&logo=discord&logoColor=white" alt="Discord"/> </a> <a href="https://www.reddit.com/r/deepfabric/"> <img src="https://img.shields.io/badge/Reddit-r%2Fdeepfabric-FF4500?logo=reddit&logoColor=white" alt="Reddit"/> </a> </p> </div>

DeepFabric generates synthetic training data for language models and agent evaluations. By combining reasoning traces with tool-calling patterns, it creates high-quality, domain-specific datasets that teach models to think, plan, and act effectively, call tools correctly, and conform to strict schema structures.

What sets DeepFabric apart from other dataset generation tools is its ability to ensure high diversity yet domain-anchored relevance through unique topic graph generation algorithms. This guides sample creation to cover all necessary subtopics while avoiding redundancy, which is where other tools often fall short, resulting in model overfit.

Constrained decoding and response validation, along with real tool executions within isolated webassembly environments, ensure that generated samples strictly adhere to structured schema, variable constraints, and execution correctness, ensuring datasets have exact syntax and structure for use in model training pipelines. Tool definations can be either directly imported from MCP (Model Context Protocol) server schemas and automatically mocked, real life interfaces along with a standard set of common tools (list_files(), 'read_file() etc)

Once your dataset is generated, it can be automatically uploaded to Hugging Face and directly imported into popular training frameworks like TRL, Unsloth, and Axolotl.

Post-training, DeepFabric's built-in evaluation engine assesses model performance, whereby models prove their capabilities on unseen tasks derived from training splits—covering evaluation-only questions, answers, and tool traces.

Quickstart

DeepFabric can be used in several ways, as a library, CLI tool, or via YAML configuration. Here's a quick example using the CLI:

pip install deepfabric

export OPENAI_API_KEY="your-api-key"

deepfabric generate \
  --topic-prompt "Python programming fundamentals" \
  --generation-system-prompt "You are a Python expert" \
  --mode graph \
  --depth 3 \
  --degree 3 \
  --num-samples 9 \
  --batch-size 3 \
  --provider openai \
  --model gpt-4o \
  --output-save-as dataset.jsonl

This generates a topic graph and creates 27 unique nodes, then generates 27 training samples saved to dataset.jsonl, giving you 100% topic coverage.

Configuration

DeepFabric also uses YAML configuration with three main sections and optional shared LLM defaults

[!NOTE]
The following uses mocked tool execution, so will require a runing Spin service, which we provide in a docker image:

docker run -d -p 3000:3000 ghcr.io/always-further/deepfabric/tools-sdk:latest`

Save the following as config.yaml:

# Optional: Shared LLM defaults (inherited by topics and generation)
llm:
  provider: "openai"
  model: "gpt-4o"
  temperature: 0.7

# TOPICS: Generate the topic tree/graph
topics:
  prompt: "Building production-ready REST APIs with Python"
  mode: tree                    # tree | graph
  depth: 3
  degree: 3
  save_as: "topics.jsonl"
  # Optional: Override shared LLM settings
  llm:
    model: "gpt-4o-mini"        # Use cheaper model for topics

# GENERATION: Create training samples from topics
generation:
  system_prompt: |
    You are an expert Python backend developer specializing in REST API design.
    Create practical, production-ready code examples with clear explanations.
    Include error handling, type hints, and follow PEP 8 conventions.
    Use the following tools to read, write, and list files in the virtual filesystem:
    - read_file
    - write_file
    - list_files

  # Additional instructions for sample generation
  instructions: |
    Focus on real-world scenarios developers encounter daily when building REST APIs with Python.
    Include both happy path and edge case handling.
    Provide context on when and why to use specific patterns or libraries.
    Ensure code is modular, testable, and maintainable.

  # Agent mode is implicit when tools are configured
  conversation:
    type: cot      # basic | cot
    reasoning_style: agent      # freetext | agent (for cot)

  # Tool configuration (enables agent mode automatically)
  tools:
    spin_endpoint: "http://localhost:3000"  # Spin service for tool execution
    components:                 # Map component name to tool names
      builtin:                  # Routes to /vfs/execute
        - read_file
        - write_file
        - list_files
    max_per_query: 3            # Maximum tools per query
    max_agent_steps: 5          # Max ReAct reasoning iterations

  # Optional: Seed initial files into the spin before generation, used for tool calling
    scenario_seed:
      files:
        "Dockerfile": |
          FROM python:3.13
          WORKDIR /usr/local/app

          # Install the application dependencies
          COPY requirements.txt ./
          RUN pip install --no-cache-dir -r requirements.txt

          # Copy in the source code
          COPY src ./src
          EXPOSE 8080

          # Setup an app user so the container doesn't run as the root user
          RUN useradd app
          USER app

          CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080"]
        "main.py": |
          def greet(name):
              return f"Hello, {name}!"

          if __name__ == "__main__":
              print(greet("World"))
        "config.json": |
          {
            "version": "1.0.0",
            "debug": true,
            "max_retries": 3
          }

  # Generation control and retry settings
  max_retries: 3                # Retries for failed generations
  sample_retries: 2             # Retries for validation failures
  max_tokens: 2000              # Max tokens per generation

  # Optional: Override shared LLM settings
  llm:
    temperature: 0.3            # Lower temp for consistent code

# OUTPUT: Final dataset configuration
output:
  # System prompt that goes INTO the training data
  # This is what the trained model will see as its system message
  system_prompt: |
    You are a helpful Python programming assistant specialized in REST API
    development. You provide clear, production-ready code with explanations.
    Always consider security, error handling, and best practices.

  include_system_message: true  # Whether to include system message in output
  num_samples: 4                 # Total training samples to generate
  batch_size: 3                 # Parallel generation batch size
  save_as: "api-dataset.jsonl"

 Optional: Upload to Hugging Face
 huggingface:
   repository: "your-username/api-dataset-training-name"
   tags: ["python", "programming"]

Run generation by sourcing the config.yaml:

deepfabric generate config.yaml

Generate, Train, Evaluate

DeepFabric returns standard HuggingFace datasets, making it easy to integrate with any training framework.

Colab Notebooks:

A quick way of seeing DeepFabric in action is via our notebooks in the notebooks/ folder or on Google Colab:

Qwen4b Blender MCP:

1. Generate Dataset

deepfabric generate config.yaml --output-save-as dataset.jsonl

Or upload to HuggingFace Hub:

deepfabric upload-hf dataset.jsonl --repo your-username/my-dataset

2. Load and Split for Training

from datasets import load_dataset
from transformers import AutoTokenizer

# Load from Hub
dataset = load_dataset("alwaysfurther/deepfabric-generic-tools", split="train")

# Split into train/eval
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds = splits["train"]
eval_ds = splits["test"]

# Format using your tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

def format_example(example):
    messages = [{k: v for k, v in msg.items() if v is not None}
                for msg in example["messages"]]
    return {"text": tokenizer.apply_chat_temp

Deepfabric

Install / Use

README