
Tork

Tork is a lightweight, distributed workflow engine that runs tasks as simple scripts within Docker containers.



Tork is a highly-scalable, general-purpose workflow engine. It lets you define jobs consisting of multiple tasks, each running inside its own container. You can run Tork on a single machine (standalone mode) or set it up in a distributed environment with multiple workers.

Features

(Screenshot: the tork CLI)

  • REST API – Submit jobs, query status, cancel/restart
  • Horizontally scalable – Add workers to handle more tasks
  • Task isolation – Tasks run in containers for isolation, idempotency, and resource limits
  • Automatic recovery – Tasks are recovered if a worker crashes
  • Stand-alone and distributed – Run all-in-one or distributed with Coordinator + Workers
  • Retry failed tasks – Configurable retry with backoff
  • Middleware – HTTP, Job, Task, and Node middleware for auth, logging, metrics
  • No single point of failure – Stateless, leaderless coordinators
  • Task timeout – Per-task timeout
  • Full-text search – Search jobs via the API
  • Runtime agnostic – Docker, Podman, Shell
  • Webhooks – Notify on job/task state changes
  • Pre/Post tasks – Setup/teardown tasks around a task
  • Expression language – Expressions for conditionals and dynamic values
  • Conditional tasks – Run tasks based on if conditions
  • Parallel tasks – Run groups of tasks concurrently
  • Each task – Loop over a list of items
  • Sub-job task – Spawn a child job from a task
  • Task priority – Priority levels 0–9
  • Secrets – Sensitive values with auto-redaction
  • Scheduled jobs – Cron-based scheduling
  • Web UI – Tork Web for viewing and submitting jobs
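As an illustration of the parallel-task feature, a minimal sketch of a job fragment (property names follow Tork's parallel task syntax; treat the details as assumptions):

```yaml
# A job fragment: two container tasks started concurrently.
tasks:
  - name: run two things at once
    parallel:
      tasks:
        - image: ubuntu:mantic
          run: sleep 2
        - image: ubuntu:mantic
          run: sleep 1
```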

Quick Start

Requirements

  1. A recent version of Docker.
  2. The Tork binary from the releases page.

Set up PostgreSQL

Start a PostgreSQL container:

Note: For production, consider a managed PostgreSQL service for better reliability and maintenance.

docker run -d \
  --name tork-postgres \
  -p 5432:5432 \
  -e POSTGRES_PASSWORD=tork \
  -e POSTGRES_USER=tork \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -e POSTGRES_DB=tork \
  postgres:15.3

Run the migration to create the database schema:

TORK_DATASTORE_TYPE=postgres ./tork migration
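If your PostgreSQL instance differs from the container defaults above, the datastore can also be configured in config.toml instead of via environment variables; a sketch (the dsn value here is an assumption mirroring the container flags above — adjust for your own instance):

```toml
# config.toml
[datastore]
type = "postgres"

[datastore.postgres]
# Matches the tork-postgres container started above.
dsn = "host=localhost user=tork password=tork dbname=tork port=5432 sslmode=disable"
```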

Hello World

Start Tork in standalone mode:

./tork run standalone

Create hello.yaml:

# hello.yaml
---
name: hello job
tasks:
  - name: say hello
    image: ubuntu:mantic
    run: |
      echo -n hello world
  - name: say goodbye
    image: alpine:latest
    run: |
      echo -n bye world

Submit the job:

JOB_ID=$(curl -s -X POST --data-binary @hello.yaml \
  -H "Content-type: text/yaml" http://localhost:8000/jobs | jq -r .id)

Check status:

curl -s http://localhost:8000/jobs/$JOB_ID
{
  "id": "ed0dba93d262492b8cf26e6c1c4f1c98",
  "state": "COMPLETED",
  ...
}
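A job moves through states until it reaches a terminal one such as COMPLETED or FAILED; the state field can be pulled out of the response with jq. A self-contained sketch (the sample JSON mirrors the response above; against a live server you would pipe curl output instead):

```shell
# Sample response body, as returned by GET /jobs/<id> (truncated).
response='{"id":"ed0dba93d262492b8cf26e6c1c4f1c98","state":"COMPLETED"}'

# Extract the job state. With a running server this would be:
#   curl -s http://localhost:8000/jobs/$JOB_ID | jq -r .state
state=$(echo "$response" | jq -r .state)
echo "$state"
```

Looping on this value with a short sleep gives a simple way to wait for a job to finish.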

Running in distributed mode

In distributed mode, the Coordinator schedules work and Workers execute tasks. A message broker (e.g. RabbitMQ) moves tasks between them.

Start RabbitMQ:

docker run \
  -d -p 5672:5672 -p 15672:15672 \
  --name=tork-rabbitmq \
  rabbitmq:3-management

Note: For production, consider a dedicated RabbitMQ service.

Run the coordinator:

TORK_DATASTORE_TYPE=postgres TORK_BROKER_TYPE=rabbitmq ./tork run coordinator

Run one or more workers:

TORK_BROKER_TYPE=rabbitmq ./tork run worker

Submit the same job as before; the coordinator and workers will process it.

Adding external storage

Tasks are ephemeral; container filesystems are lost when a task ends. To share data between tasks, use an external store (e.g. MinIO/S3).

Start MinIO:

docker run --name=tork-minio \
  -d -p 9000:9000 -p 9001:9001 \
  -e MINIO_ROOT_USER=minioadmin \
  -e MINIO_ROOT_PASSWORD=minioadmin \
  minio/minio server /data \
  --console-address ":9001"

Example job with two tasks (write to MinIO, then read back):

name: stateful example
inputs:
  minio_endpoint: http://host.docker.internal:9000
secrets:
  minio_user: minioadmin
  minio_password: minioadmin
tasks:
  - name: write data to object store
    image: amazon/aws-cli:latest
    env:
      AWS_ACCESS_KEY_ID: "{{ secrets.minio_user }}"
      AWS_SECRET_ACCESS_KEY: "{{ secrets.minio_password }}"
      AWS_ENDPOINT_URL: "{{ inputs.minio_endpoint }}"
      AWS_DEFAULT_REGION: us-east-1
    run: |
      echo "Hello from Tork!" > /tmp/data.txt
      aws s3 mb s3://mybucket
      aws s3 cp /tmp/data.txt s3://mybucket/data.txt

  - name: read data from object store
    image: amazon/aws-cli:latest
    env:
      AWS_ACCESS_KEY_ID: "{{ secrets.minio_user }}"
      AWS_SECRET_ACCESS_KEY: "{{ secrets.minio_password }}"
      AWS_ENDPOINT_URL: "{{ inputs.minio_endpoint }}"
      AWS_DEFAULT_REGION: us-east-1
    run: |
      aws s3 cp s3://mybucket/data.txt /tmp/retrieved.txt
      echo "Contents of retrieved file:"
      cat /tmp/retrieved.txt

Installation

Download the Tork binary for your system from the releases page.

Create a directory and unpack:

mkdir ~/tork
cd ~/tork
tar xzvf ~/Downloads/tork_0.1.66_darwin_arm64.tgz
./tork

You should see the Tork banner and help. On macOS you may need to allow the binary in Security & Privacy settings.

PostgreSQL and migration

See Quick Start – Set up PostgreSQL and run:

TORK_DATASTORE_TYPE=postgres ./tork migration

Standalone mode

./tork run standalone

Distributed mode

Configure the broker (e.g. in config.toml):

# config.toml
[broker]
type = "rabbitmq"

[broker.rabbitmq]
url = "amqp://guest:guest@localhost:5672/"

Start RabbitMQ, then:

./tork run coordinator
./tork run worker

Queues

Tasks go to the default queue unless overridden. Workers subscribe to queues; you can run multiple consumers per queue:

# config.toml
[worker.queues]
default = 5
video = 2

[broker]
type = "rabbitmq"

Route a task to a specific queue:

name: transcode a video
queue: video
image: jrottenberg/ffmpeg:3.4-alpine
run: |
  ffmpeg -i https://example.com/some/video.mov output.mp4

Architecture

A workflow is a job: a series of tasks (steps) run in order. Jobs are usually defined in YAML:

---
name: hello job
tasks:
  - name: say hello
    image: ubuntu:mantic
    run: echo -n hello world
  - name: say goodbye
    image: ubuntu:mantic
    run: echo -n bye world

Components:

  • Coordinator – Tracks jobs, dispatches work to workers, handles retries and failures. Stateless and leaderless; does not run tasks.
  • Worker – Runs tasks via a runtime (usually Docker).
  • Broker – Routes messages between Coordinator and Workers.
  • Datastore – Persists job and task state.
  • Runtime – Execution environment for tasks (Docker, Podman, Shell).

Jobs

A job is a list of tasks executed in order.

Simple example

name: hello job
tasks:
  - name: say hello
    var: task1
    image: ubuntu:mantic
    run: |
      echo -n hello world > $TORK_OUTPUT
  - name: say goodbye
    image: ubuntu:mantic
    run: |
      echo -n bye world
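The var property above stores whatever the task writes to $TORK_OUTPUT in the job context under that name. A sketch of a follow-up task that reads it back, assuming the tasks.<var> expression form:

```yaml
  - name: use the output
    image: ubuntu:mantic
    env:
      GREETING: "{{ tasks.task1 }}"   # output of the task tagged var: task1
    run: |
      echo "got: $GREETING"
```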

Submit:

curl -s -X POST --data-binary @job.yaml \
  -H "Content-type: text/yaml" \
  http://localhost:8000/jobs

Inputs

name: mov to mp4
inputs:
  source: https://example.com/path/to/video.mov
tasks:
  - name: convert the video to mp4
    image: jrottenberg/ffmpeg:3.4-alpine
    env:
      SOURCE_URL: '{{ inputs.source }}'
    run: |
      ffmpeg -i $SOURCE_URL /tmp/output.mp4

Secrets

Use the secrets block for sensitive values (redacted in API responses):

name: my job
secrets:
  api_key: 1111-1111-1111-1111
tasks:
  - name: my task
    image: alpine:latest
    run: curl -X POST -H "API_KEY: $API_KEY" http://example.com
    env:
      API_KEY: '{{secrets.api_key}}'

Defaults

Set defaults for all tasks:

name: my job
defaults:
  retry:
    limit: 2
  limits:
    cpus: 1
    memory: 500m
  timeout: 10m
  queue: highcpu
  priority: 3
tasks:
  - name: my task
    image: alpine:latest
    run: echo hello world

Auto Delete

name: my job
autoDelete:
  after: 6h
tasks:
  - name: my task
    image: alpine:latest
    run: echo hello world

Webhooks

name: my job
webhooks:
  - url: http://example.com/my/webhook
    event: job.StateChange   # or task.StateChange
    headers:
      my-header: somevalue
    if: "{{ job.State == 'COMPLETED' }}"
tasks:
  - name: my task
    image: alpine:latest
    run: echo hello world

Permissions

