<p align="center"> <img src="docs/assets/Dingir.png" alt="TiamaT Logo" width="100"/> </p>

🐉 TiamaT – Toolkit for Integrated Annotation and Machine-learning Assisted Training

TiamaT is a complete, modular pipeline that helps transform raw, unstructured images into fully annotated, machine learning–ready datasets.

Designed for flexibility and reusability, it supports every step of the computer vision training workflow:

  • 📌 Manual annotation with Label Studio
  • 🔧 Dataset formatting and transformation
  • 🧠 Model training and inference using YOLO
  • ✅ Evaluation and correction through human-in-the-loop cycles

Originally built for historical document analysis, TiamaT fits any project where annotations are built incrementally or interactively.

The name is a nod to Tiamat, the Mesopotamian goddess of the ocean and chaos — an appropriate symbol for turning raw data into structured knowledge.


Table of Contents

  • 🌍 Workflow Overview
  • 🧿 Pipeline Steps Overview
  • 🧱 Project Folder Structure
  • 🚀 Running TiamaT
  • 🌈 Shared Configuration Variables
  • 🧩 Installation
  • 👾 Requirements

🌍 Workflow Overview

The TiamaT pipeline covers the full annotation-to-training lifecycle. It is organized as a modular sequence of stages, which can be executed in two different modes depending on your needs:

💫 Execution Modes (Notebook & Script)

You can use TiamaT either as:

  • 📓 Interactive notebooks – ideal for exploration, development, or adjusting specific parameters step by step.
  • ⚙️ Command-line scripts – ideal for automation, production, or batch execution.

Each stage of the pipeline exists in both formats. The scripts are located in src/scripts/ and replicate the logic of the Jupyter notebooks in src/notebooks/.

You can mix both modes depending on your workflow: for instance, prototype in a notebook, then automate with scripts.


🧿 Pipeline Steps Overview

> [!NOTE]
> For a detailed description of each stage (inputs, outputs, scripts, tips), see docs/pipeline_overview.md.

| Stage | Description | Notebook | Script |
|-------|-------------|----------|--------|
| 0 | Launch Label Studio (optional) | 0_Launching_LS.ipynb | Use label-studio CLI |
| 1 | Extract annotated training data | 1_Get_training_data.ipynb | extract_training_data.py |
| 2 | Compute dataset statistics | 2_Statistics_for_training_data.ipynb | analyze_dataset.py |
| 3 | Prepare data and train model | 3_Data_preparation_and_training.ipynb | train_model.py |
| 4 | Predict on new images | 4_Predicting_and_checking_YOLO_results.ipynb | predict.py |
| 5 | Evaluate model & review corrections | 5_Model_evaluation.ipynb | evaluate_model.py |
| 6 | Generate updated ground truth | 6_Generate_new_ground_truth.ipynb | generate_ground_truth.py |

📝 Notebooks are recommended for first-time users and experimentation.

🖥️ Scripts are better suited for iterative workflows and large-scale runs.
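The stage-to-script mapping above can be sketched as a small Python driver. The script names come straight from the table; the helper function, its signature, and the idea of invoking the scripts via subprocess are assumptions for illustration (each script's actual CLI arguments are not shown here):

```python
import sys
from pathlib import Path

# Stage number -> CLI script, as listed in the pipeline table.
# Stage 0 (Label Studio) is launched via the label-studio CLI instead.
PIPELINE_SCRIPTS = {
    1: "extract_training_data.py",
    2: "analyze_dataset.py",
    3: "train_model.py",
    4: "predict.py",
    5: "evaluate_model.py",
    6: "generate_ground_truth.py",
}

def build_command(stage: int, scripts_dir: str = "src/scripts") -> list[str]:
    """Build the argv for one pipeline stage (hypothetical helper)."""
    return [sys.executable, str(Path(scripts_dir) / PIPELINE_SCRIPTS[stage])]

# Stages 1..6 could then be run in order, e.g. with
# subprocess.run(build_command(stage), check=True) for each stage.
```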


🧱 Project Folder Structure

To ensure smooth execution, the TiamaT pipeline expects a specific folder organization. This structure separates raw inputs, manual annotations, training datasets, and model outputs in a modular and reproducible way.

```
TiamaT/
├── data/                         # Final YOLO-formatted training datasets (images, labels, labels.txt)
├── project/                      # Raw images and annotations (excluded from Git except structure)
│   ├── image_inputs/             # Source images used in the pipeline
│   │   ├── ground_truth_images/       # Manually annotated images used for training
│   │   └── eval_images/               # Images used for inference and manual correction
│   ├── annotations/             # Annotations exported or corrected via Label Studio
│   │   ├── ground_truth/              # Ground truth annotations manually created in LS
│   │   └── prediction_corrections/    # Corrections made after model predictions
├── output/                       # Model training and prediction results
│   └── runs/
│       ├── train/                     # YOLO training runs (auto-generated folders: exp1, exp2, ...)
│       └── predict/                   # Inference outputs (e.g., predicted labels)
├── src/                          # Core source code
│   ├── notebooks/                   # Step-by-step Jupyter notebooks for the full pipeline
│   ├── scripts/                     # Python scripts for CLI-based execution
│   ├── modules/                     # Reusable Python modules (transforms, utils, etc.)
│   └── config.py                    # Shared configuration and path definitions
├── requirements/                 # Installation requirements (pip or conda)
│   ├── tiamat.txt                   # Main pipeline dependencies (YOLO, OpenCV, etc.)
│   └── label_studio.txt             # Label Studio annotation environment
├── .env.example                 # Template for custom environment variables
└── README.md                    # Main project documentation
```

⭐️ All notebooks and scripts rely on this layout to locate and process data automatically.

> [!WARNING]
> Only the project folder itself (project/ by default, referenced as project_folder) can be freely renamed; all other folder names must be preserved for the code to function correctly.
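As a sketch, the expected layout can be scaffolded with pathlib. The directory names are taken from the tree above; the scaffold helper itself is not part of TiamaT:

```python
import tempfile
from pathlib import Path

# Directories the pipeline expects, relative to the project root
# (taken from the folder tree above).
EXPECTED_DIRS = [
    "data",
    "project/image_inputs/ground_truth_images",
    "project/image_inputs/eval_images",
    "project/annotations/ground_truth",
    "project/annotations/prediction_corrections",
    "output/runs/train",
    "output/runs/predict",
    "src/notebooks",
    "src/scripts",
    "src/modules",
    "requirements",
]

def scaffold(root: Path) -> None:
    """Create the folder skeleton so notebooks and scripts find their paths."""
    for rel in EXPECTED_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)

# Demo in a throwaway directory so nothing is written into a real project.
demo_root = Path(tempfile.mkdtemp())
scaffold(demo_root)
```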


🚀 Running TiamaT

Once your environments are set up and the project structure is ready, you can execute the pipeline either via Jupyter notebooks or Python scripts (see Workflow Overview for the full step mapping).

🐦‍🔥 Iterative Workflow

TiamaT is built around a human-in-the-loop workflow. The model is not evaluated on a classic "test set", but rather through manual correction of its predictions. This makes the pipeline especially useful when no gold-standard ground truth exists upfront.

```
        ┌────────────────────────────┐
        │ 0. Launch Label Studio     │
        └─────────────┬──────────────┘
                      ▼
        ┌────────────────────────────┐
        │ 1. Extract Training Data   │
        └─────────────┬──────────────┘
                      ▼
        ┌────────────────────────────┐
        │ 2. Dataset Statistics      │ ◄─────┐
        └─────────────┬──────────────┘       │
                      ▼                      │
        ┌────────────────────────────┐       │
        │ 3. Train YOLO Model        │       │
        └─────────────┬──────────────┘       │
                      ▼                      │
        ┌────────────────────────────┐       │
        │ 4. Predict with Model      │       │
        └─────────────┬──────────────┘       │
                      ▼                      │
        ┌────────────────────────────┐       │
        │ 5. Review + Correction     │       │
        └─────────────┬──────────────┘       │
                      ▼                      │
        ┌────────────────────────────┐       │
        │6. Generate New Ground Truth│───────┘
        └────────────────────────────┘

           ⤷ Loop back to step 3 to retrain
```

You can repeat steps 3 to 6 as many times as needed to improve the model’s performance, especially when working with complex or evolving datasets.

📌 You may skip any step if you're starting from partially prepared data or existing corrections.
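In code terms, the loop above amounts to repeating stages 3 to 6 until the model is good enough. A minimal sketch with placeholder functions (the real logic lives in the notebooks and scripts; every function body here is a stand-in):

```python
# Placeholder stand-ins for stages 3-6 of the human-in-the-loop cycle.
def train(ground_truth):               # stage 3: train YOLO on current ground truth
    return {"trained_on": len(ground_truth)}

def predict(model, eval_images):       # stage 4: run inference on new images
    return [f"pred:{img}" for img in eval_images]

def review(predictions):               # stage 5: human correction in Label Studio
    return [p.replace("pred:", "gt:") for p in predictions]

def merge(ground_truth, corrections):  # stage 6: generate new ground truth
    return ground_truth + corrections

ground_truth = ["gt:img_000"]
eval_batches = [["img_001"], ["img_002", "img_003"]]

for batch in eval_batches:             # each pass = one retraining iteration
    model = train(ground_truth)
    predictions = predict(model, batch)
    corrections = review(predictions)
    ground_truth = merge(ground_truth, corrections)
```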


🌈 Shared Configuration Variables

Some key variables are shared across both notebooks and scripts.
They define core paths, session parameters, and model references, and are typically loaded from a .env file using the python-dotenv package.

| Variable | Description |
|----------|-------------|
| project_folder | Absolute path to the folder containing your dataset. By default, this is project/, but it can be renamed freely. |
| model_folder | Path to the YOLOv8 model directory used for inference or evaluation (e.g., output/runs/train/model_name/) |
| pretrained_model | Path to a pre-trained YOLOv8 model (e.g., best.pt) if you want to fine-tune instead of training from scratch |

📌 project_folder is the central directory used throughout the pipeline.
Make sure the structure inside follows the Project Folder Structure for the notebooks to run correctly.

These variables are typically defined in your .env file and automatically loaded into both notebooks and scripts at runtime.
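A minimal sketch of a .env file and how these variables reach Python. In the pipeline itself, python-dotenv's load_dotenv() populates os.environ; a tiny hand parser is used here so the example is self-contained, and the paths are placeholders:

```python
from pathlib import Path

# Example .env contents (placeholder paths; variable names from the table above).
ENV_TEXT = """\
project_folder=/home/user/TiamaT/project
model_folder=output/runs/train/model_name
pretrained_model=output/runs/train/model_name/weights/best.pt
"""

# Stand-in for python-dotenv's load_dotenv(): parse KEY=VALUE lines.
env = {}
for line in ENV_TEXT.splitlines():
    key, _, value = line.partition("=")
    env[key] = value

project_folder = Path(env["project_folder"])
model_folder = Path(env["model_folder"])
pretrained_model = Path(env["pretrained_model"])
```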


🧩 Installation

TiamaT uses two separate environments:

  • 🖌️ One for annotation and project setup using Label Studio
  • 🐲 One for model training, inference, and evaluation using YOLO

You can install both environments using either Conda or Python virtual environments (venv).


👾 Requirements

Before installing, make sure you have:

  • Python 3.10+
  • Either conda or venv (choose your preferred environment manager)
  • Jupyter Notebook or JupyterLab (only needed for notebook users)
  • Internet access to fetch packages

📦 Dependencies are listed in the requirements/ folder.
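A quick pre-install sanity check can be sketched in Python; the requirement file names match the requirements/ listing above, while the check itself is just an illustration:

```python
import sys
from pathlib import Path

# TiamaT requires Python 3.10 or newer.
python_ok = sys.version_info >= (3, 10)

# One requirements file per environment.
REQUIREMENT_FILES = [
    Path("requirements/tiamat.txt"),        # YOLO pipeline environment
    Path("requirements/label_studio.txt"),  # Label Studio environment
]
missing = [p for p in REQUIREMENT_FILES if not p.exists()]
```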
