AgenticData

agentic data generation（under refactor！！！）

Generate Convert Improve

Install / Use

/learn @Max-VibeCoding/AgenticData

About this skill

Quality Score

0/100

README

SeekingData

Professional SFT Data Generation & Harbor Task Management Platform

中文文档

</div>

⚠️ Work in Progress: This project is undergoing a complete refactor with many new features being added. It is not recommended for production use at this time. Please wait for a stable release.

Overview

SeekingData is a cross-platform desktop application that integrates SFT (Supervised Fine-Tuning) data generation and Harbor task management, featuring Material Design 3 for a modern user experience.

Features

SFT Data Generation

| Feature | Description | |---------|-------------| | Single Processing | File upload (PDF/DOCX/TXT), URL extraction, AI-powered generation | | Batch Processing | Bulk URL processing with real-time progress tracking | | Format Converter | Alpaca ↔ OpenAI bidirectional conversion | | CoT Generator | Chain of Thought reasoning data generation | | Image Dataset | Automatic image description generation | | Video Dataset | Video understanding data processing | | Dataset Sharing | One-click upload to HuggingFace |

Harbor Task Management

| Feature | Description | |---------|-------------| | GitHub Task Generator | Auto-generate tasks from GitHub repositories | | Visual Task Builder | Drag-and-drop editing with Monaco Editor | | Task Manager | List, search, view details, export tasks | | Task Validation | Integrated Harbor validation tools |

Tech Stack

Frontend

Framework: React 18 + Vite 5
UI Design: Material Design 3
Styling: TailwindCSS 3.4
State Management: Zustand
Routing: React Router DOM 7
Code Editor: Monaco Editor
Flow Editor: React Flow

Backend

Framework: FastAPI 0.115+
Language: Python 3.12
Validation: Pydantic 2.10+
LLM Integration: LiteLLM 1.40+
Document Processing: Docling 2.0+
Agent Framework: Camel AI 0.2.89
Task Framework: Harbor 0.1.45

Desktop Application

Framework: Electron 33
Packaging: Electron Builder
Platforms: macOS, Windows, Linux

Quick Start

Prerequisites

Node.js 18+
Python 3.12+
uv (Python package manager)
yarn (Node package manager)

Installation

# Clone repository
git clone https://github.com/yourusername/SeekingData.git
cd SeekingData

# Install frontend dependencies
yarn install

# Install backend dependencies
cd backend
uv venv .venv --python 3.12
source .venv/bin/activate  # macOS/Linux
# or .venv\Scripts\activate  # Windows
uv pip install -r requirements.txt

Development

# Terminal 1: Start backend
cd backend
source .venv/bin/activate
uvicorn main:app --reload --port 5001

# Terminal 2: Start frontend
yarn dev

Access the application at: http://localhost:3002

Production Build

# macOS
yarn build:mac

# Windows
yarn build:win

# Linux
yarn build:linux

Configuration

Backend Environment (backend/.env)

# LLM API Configuration
OPENAI_API_KEY=sk-xxx

# GitHub Token (optional, for GitHub task generation)
GITHUB_TOKEN=ghp_xxx

# Application
APP_NAME=SeekingData
APP_VERSION=0.1.0
DEBUG=true

Frontend Settings

Configure via the Settings page in the application:

API Base URL: LLM provider endpoint
API Key: Your secret API key
Model: Model identifier (e.g., qwen/qwen3.5-plus)
Suggestions Count: Number of suggestions per request (1-10)

Project Structure

SeekingData/
├── src/                    # React frontend
│   ├── components/
│   │   ├── sft/           # SFT data generation
│   │   ├── harbor/        # Harbor task management
│   │   ├── ui/            # Material Design 3 components
│   │   └── layout/        # Layout components
│   ├── lib/               # Utilities and stores
│   └── pages/             # Page components
├── backend/               # FastAPI backend
│   ├── agents/           # AI agents (GitHub, etc.)
│   ├── api/routes/       # API endpoints
│   ├── models/           # Pydantic models
│   ├── services/         # Business logic
│   └── tasks/            # Harbor task storage
├── electron/             # Electron main process
├── scripts/              # Build scripts
└── docs/                 # Documentation

API Endpoints

| Method | Endpoint | Description | |--------|----------|-------------| | GET | /api/sft/config | Get current configuration | | POST | /api/sft/config | Save configuration | | POST | /api/sft/generate | Generate SFT data | | POST | /api/sft/batch | Batch URL processing | | POST | /api/sft/convert | Format conversion | | GET | /api/harbor/tasks | List all tasks | | POST | /api/harbor/tasks | Create new task | | GET | /api/harbor/tasks/{id} | Get task details | | POST | /api/harbor/github/generate | Generate from GitHub |

Supported Models

The application supports any LiteLLM-compatible model:

| Provider | Model Examples | |----------|---------------| | OpenAI | gpt-4, gpt-4o, gpt-3.5-turbo | | Qwen | qwen/qwen3.5-plus, qwen/qwen-max | | Moonshot | moonshot/kimi-k2.5 | | Zhipu | zhipu/glm-5, zhipu/glm-4 | | MiniMax | minimax/MiniMax-M2.5 | | DeepSeek | openai/deepseek-v3.2 |

Documentation

Contributing

Contributions are welcome! Please read our contributing guidelines before submitting a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Harbor - Agent task framework
Camel AI - AI agent framework
FastAPI - Modern web framework
React - UI library
Electron - Cross-platform desktop apps
LiteLLM - Unified LLM interface

Made with ❤️ by SeekingX-AILab

</div>

Related Skills

bluebubbles

347.6k

Use when you need to send or manage iMessages via BlueBubbles (recommended iMessage integration). Calls go through the generic message tool with channel="bluebubbles".

node-connect

347.6k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

slack

347.6k

Use when you need to control Slack from OpenClaw via the slack tool, including reacting to messages or pinning/unpinning items in Slack channels or DMs.

claude-opus-4-5-migration

108.4k

Migrate prompts and code from Claude Sonnet 4.0, Sonnet 4.5, or Opus 4.1 to Opus 4.5