Tork
Tork is a lightweight, distributed workflow engine that runs tasks as simple scripts within Docker containers.
Tork is a highly-scalable, general-purpose workflow engine. It lets you define jobs consisting of multiple tasks, each running inside its own container. You can run Tork on a single machine (standalone mode) or set it up in a distributed environment with multiple workers.
Features

- REST API – Submit jobs, query status, cancel/restart
- Horizontally scalable – Add workers to handle more tasks
- Task isolation – Tasks run in containers for isolation, idempotency, and resource limits
- Automatic recovery – Tasks are recovered if a worker crashes
- Stand-alone and distributed – Run all-in-one or distributed with Coordinator + Workers
- Retry failed tasks – Configurable retry with backoff
- Middleware – HTTP, Job, Task, Node middleware for auth, logging, metrics
- No single point of failure – Stateless, leaderless coordinators
- Task timeout – Timeout per task
- Full-text search – Search jobs via the API
- Runtime agnostic – Docker, Podman, Shell
- Webhooks – Notify on job/task state changes
- Pre/Post tasks – Pre/Post tasks for setup/teardown
- Expression language – Expressions for conditionals and dynamic values
- Conditional tasks – Run tasks based on `if` conditions
- Parallel tasks – Run groups of tasks concurrently
- Each task – Loop over a list, running a task per item
- Subjob task – Launch a nested job from a parent job
- Task priority – Set task priority (0–9)
- Secrets – Secrets with auto-redaction
- Scheduled jobs – Scheduled jobs with cron
- Web UI – Tork Web for viewing and submitting jobs
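Several of the composition features above appear directly in job definitions as special task fields. As a quick taste, a sketch combining a conditional task and a parallel task (field names such as `parallel.tasks` are assumptions; see the sections below for verified examples):

```yaml
name: composition example
inputs:
  env: production
tasks:
  # conditional: runs only when the expression evaluates to true
  - name: notify
    if: "{{ inputs.env == 'production' }}"
    image: alpine:latest
    run: echo "deploying to production"
  # parallel: child tasks run concurrently
  - name: fan out
    parallel:
      tasks:
        - image: alpine:latest
          run: echo task A
        - image: alpine:latest
          run: echo task B
```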
Quick Start
Requirements
- A running Docker daemon (the default task runtime)
Set up PostgreSQL
Start a PostgreSQL container:
Note: For production, consider a managed PostgreSQL service for better reliability and maintenance.
docker run -d \
--name tork-postgres \
-p 5432:5432 \
-e POSTGRES_PASSWORD=tork \
-e POSTGRES_USER=tork \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-e POSTGRES_DB=tork postgres:15.3
Run the migration to create the database schema:
TORK_DATASTORE_TYPE=postgres ./tork migration
Hello World
Start Tork in standalone mode:
./tork run standalone
Create hello.yaml:
# hello.yaml
---
name: hello job
tasks:
  - name: say hello
    image: ubuntu:mantic
    run: |
      echo -n hello world
  - name: say goodbye
    image: alpine:latest
    run: |
      echo -n bye world
Submit the job:
JOB_ID=$(curl -s -X POST --data-binary @hello.yaml \
-H "Content-type: text/yaml" http://localhost:8000/jobs | jq -r .id)
Check status:
curl -s http://localhost:8000/jobs/$JOB_ID
{
"id": "ed0dba93d262492b8cf26e6c1c4f1c98",
"state": "COMPLETED",
...
}
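Fields of the response can be pulled out with `jq`, just as the submit step captured `.id`. For example, checking the state directly (using the sample response above, reduced to the fields shown):

```shell
# Sample response from the status endpoint, reduced to the fields shown above.
RESPONSE='{"id":"ed0dba93d262492b8cf26e6c1c4f1c98","state":"COMPLETED"}'

# Pull out just the job state with jq (already used above to capture the id).
STATE=$(echo "$RESPONSE" | jq -r .state)
echo "$STATE"
```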
Running in distributed mode
In distributed mode, the Coordinator schedules work and Workers execute tasks. A message broker (e.g. RabbitMQ) moves tasks between them.
Start RabbitMQ:
docker run \
-d -p 5672:5672 -p 15672:15672 \
--name=tork-rabbitmq \
rabbitmq:3-management
Note: For production, consider a dedicated RabbitMQ service.
Run the coordinator:
TORK_DATASTORE_TYPE=postgres TORK_BROKER_TYPE=rabbitmq ./tork run coordinator
Run one or more workers:
TORK_BROKER_TYPE=rabbitmq ./tork run worker
Submit the same job as before; the coordinator and workers will process it.
Adding external storage
Tasks are ephemeral; container filesystems are lost when a task ends. To share data between tasks, use an external store (e.g. MinIO/S3).
Start MinIO:
docker run --name=tork-minio \
-d -p 9000:9000 -p 9001:9001 \
-e MINIO_ROOT_USER=minioadmin \
-e MINIO_ROOT_PASSWORD=minioadmin \
minio/minio server /data \
--console-address ":9001"
Example job with two tasks (write to MinIO, then read back):
name: stateful example
inputs:
  minio_endpoint: http://host.docker.internal:9000
secrets:
  minio_user: minioadmin
  minio_password: minioadmin
tasks:
  - name: write data to object store
    image: amazon/aws-cli:latest
    env:
      AWS_ACCESS_KEY_ID: "{{ secrets.minio_user }}"
      AWS_SECRET_ACCESS_KEY: "{{ secrets.minio_password }}"
      AWS_ENDPOINT_URL: "{{ inputs.minio_endpoint }}"
      AWS_DEFAULT_REGION: us-east-1
    run: |
      echo "Hello from Tork!" > /tmp/data.txt
      aws s3 mb s3://mybucket
      aws s3 cp /tmp/data.txt s3://mybucket/data.txt
  - name: read data from object store
    image: amazon/aws-cli:latest
    env:
      AWS_ACCESS_KEY_ID: "{{ secrets.minio_user }}"
      AWS_SECRET_ACCESS_KEY: "{{ secrets.minio_password }}"
      AWS_ENDPOINT_URL: "{{ inputs.minio_endpoint }}"
      AWS_DEFAULT_REGION: us-east-1
    run: |
      aws s3 cp s3://mybucket/data.txt /tmp/retrieved.txt
      echo "Contents of retrieved file:"
      cat /tmp/retrieved.txt
Installation
Download the Tork binary for your system from the releases page.
Create a directory and unpack:
mkdir ~/tork
cd ~/tork
tar xzvf ~/Downloads/tork_0.1.66_darwin_arm64.tgz
./tork
You should see the Tork banner and help. On macOS you may need to allow the binary in Security & Privacy settings.
PostgreSQL and migration
See Quick Start – Set up PostgreSQL and run:
TORK_DATASTORE_TYPE=postgres ./tork migration
Standalone mode
./tork run standalone
Distributed mode
Configure the broker (e.g. in config.toml):
# config.toml
[broker]
type = "rabbitmq"
[broker.rabbitmq]
url = "amqp://guest:guest@localhost:5672/"
Start RabbitMQ, then:
./tork run coordinator
./tork run worker
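A worker on a separate machine only needs to reach the broker. Its config.toml can reuse the broker section above with the coordinator machine's hostname substituted (`broker-host` here is a placeholder):

```toml
# config.toml (on the worker machine)
[broker]
type = "rabbitmq"

[broker.rabbitmq]
url = "amqp://guest:guest@broker-host:5672/"
```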
Queues
Tasks go to the default queue unless overridden. Workers subscribe to queues; you can run multiple consumers per queue:
# config.toml
[worker.queues]
default = 5
video = 2
[broker]
type = "rabbitmq"
Route a task to a specific queue:
name: transcode a video
queue: video
image: jrottenberg/ffmpeg:3.4-alpine
run: |
  ffmpeg -i https://example.com/some/video.mov output.mp4
Architecture
A workflow is a job: a series of tasks (steps) run in order. Jobs are usually defined in YAML:
---
name: hello job
tasks:
  - name: say hello
    image: ubuntu:mantic
    run: echo -n hello world
  - name: say goodbye
    image: ubuntu:mantic
    run: echo -n bye world
Components:
- Coordinator – Tracks jobs, dispatches work to workers, handles retries and failures. Stateless and leaderless; does not run tasks.
- Worker – Runs tasks via a runtime (usually Docker).
- Broker – Routes messages between Coordinator and Workers.
- Datastore – Persists job and task state.
- Runtime – Execution environment for tasks (Docker, Podman, Shell).
Jobs
A job is a list of tasks executed in order.
Simple example
name: hello job
tasks:
  - name: say hello
    var: task1
    image: ubuntu:mantic
    run: |
      echo -n hello world > $TORK_OUTPUT
  - name: say goodbye
    image: ubuntu:mantic
    run: |
      echo -n bye world
Submit:
curl -s -X POST --data-binary @job.yaml \
-H "Content-type: text/yaml" \
http://localhost:8000/jobs
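The `var: task1` declaration above captures whatever the task writes to `$TORK_OUTPUT`. A later task in the same job can then reference that value through the expression language (a sketch assuming a `{{ tasks.<var> }}` lookup syntax):

```yaml
name: hello job
tasks:
  - name: say hello
    var: task1
    image: ubuntu:mantic
    run: |
      echo -n hello world > $TORK_OUTPUT
  - name: echo the captured output
    image: ubuntu:mantic
    env:
      GREETING: "{{ tasks.task1 }}"
    run: |
      echo "first task said: $GREETING"
```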
Inputs
name: mov to mp4
inputs:
  source: https://example.com/path/to/video.mov
tasks:
  - name: convert the video to mp4
    image: jrottenberg/ffmpeg:3.4-alpine
    env:
      SOURCE_URL: '{{ inputs.source }}'
    run: |
      ffmpeg -i $SOURCE_URL /tmp/output.mp4
Secrets
Use the secrets block for sensitive values (redacted in API responses):
name: my job
secrets:
  api_key: 1111-1111-1111-1111
tasks:
  - name: my task
    image: alpine:latest
    run: curl -X POST -H "API_KEY: $API_KEY" http://example.com
    env:
      API_KEY: '{{secrets.api_key}}'
Defaults
Set defaults for all tasks:
name: my job
defaults:
  retry:
    limit: 2
  limits:
    cpus: 1
    memory: 500m
  timeout: 10m
  queue: highcpu
  priority: 3
tasks:
  - name: my task
    image: alpine:latest
    run: echo hello world
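A setting specified directly on a task would reasonably take precedence over the job-level default (an assumption based on the name "defaults"; verify against the Tork documentation). A sketch:

```yaml
name: my job
defaults:
  timeout: 10m
tasks:
  - name: quick task            # inherits the 10m default timeout
    image: alpine:latest
    run: echo hello world
  - name: slow task
    image: alpine:latest
    timeout: 1h                 # assumed to override the default for this task
    run: sleep 30
```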
Auto Delete
name: my job
autoDelete:
  after: 6h
tasks:
  - name: my task
    image: alpine:latest
    run: echo hello world
Webhooks
name: my job
webhooks:
  - url: http://example.com/my/webhook
    event: job.StateChange # or task.StateChange
    headers:
      my-header: somevalue
    if: "{{ job.State == 'COMPLETED' }}"
tasks:
  - name: my task
    image: alpine:latest
    run: echo hello world
Permissions
