# jobd
jobd is a simple job queue daemon that works with persistent queue storage. It uses a MySQL table as a storage backend (for queue input and output).
Currently, MySQL is the only supported storage, though other backends could easily be added.
It is by design that jobd never adds nor deletes jobs from storage. It only reads (when a certain request arrives) and updates them (during execution, when job status changes). Succeeded or failed, your jobs are never lost.
jobd consists of two parts:

- **jobd** is a "worker" daemon that reads jobs from the database, enqueues them and launches them. There may be multiple instances of jobd running on multiple hosts. Each jobd instance may have an unlimited number of queues (called "targets"), each with its own concurrency limit.
- **jobd-master** is a "master" or "central" daemon that simplifies control over many jobd instances. There should be only one instance of jobd-master running. jobd-master is not required for jobd workers to work (they can work without it), but it's very useful.
In addition, there is a command line utility called jobctl.
Originally, jobd was created as a saner alternative to Gearman. It has been used in production with a large PHP web application on multiple servers for quite some time and has proven stable and efficient. We also monitored the memory usage of all our jobd instances for two months and can confirm there are no leaks.
## Table of Contents
- How it works
- Integration example
- Installation
- Usage
- Configuration
- Clients
- Protocol
- License
## How it works

### Targets
Every jobd instance has its own set of queues, called targets. The name of a
target is an arbitrary string, whose length is limited only by the size of the
`target` field in the MySQL table.

Each target has its own concurrency limit (the maximum number of jobs that may
be executed simultaneously). Targets are loaded from the config at startup, and
may also be added or removed at runtime via the
`add-target(target: string, concurrency: int)`
and `remove-target(target: string)` requests.
The purpose of targets is to logically separate jobs of different kinds by putting them in different queues. For instance, targets can be used to simulate job priorities:
```ini
[targets]
low = 5
normal = 5
high = 5
```
The config above defines three targets (or three queues), each with a concurrency
limit of 5.
Or, let's imagine a scenario where you have two kinds of jobs: heavy, resource-consuming, long-running jobs (like video processing) and light, quick jobs (like sending emails). In this case, you could define two targets, like so:
```ini
[targets]
heavy = 3
quick = 20
```
This config would allow running at most 3 heavy and up to 20 quick jobs simultaneously.
> :thought_balloon: In the author's opinion, having different targets (queues) for different kinds of jobs is better than having a single queue where each job carries a "priority".
>
> Imagine you had a single queue with the maximum number of simultaneously running jobs set to, say, 20. What would happen if you added a new job, even with the highest possible priority, when 20 slow jobs are already running? No matter how high the new job's priority is, it would have to wait.
>
> By defining different targets, jobd lets you create dedicated queues for such jobs, making sure there is always room for high-priority tasks to start as early as possible.
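The point about per-target limits can be illustrated with a toy model. The numbers and the `can_start` helper below are purely illustrative, not part of jobd itself (the real scheduler lives inside the daemon):

```python
# Toy model of per-target concurrency limits (illustrative only).
limits = {"heavy": 3, "quick": 20}   # from the [targets] config above
running = {"heavy": 3, "quick": 5}   # jobs currently executing per target

def can_start(target: str) -> bool:
    # A job may start only when its own target has a free slot.
    return running[target] < limits[target]

print(can_start("heavy"))  # False: all 3 "heavy" slots are busy
print(can_start("quick"))  # True: "quick" jobs are unaffected by heavy ones
```

Even with every heavy slot occupied, a quick job starts immediately, which is exactly what a single shared queue with priorities cannot guarantee.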
### Creating jobs
Each job is described by one record in the MySQL table. Here is a table schema with the minimal required set of fields:
```sql
CREATE TABLE `jobs` (
  `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `target` char(16) NOT NULL,
  `time_created` int(10) UNSIGNED NOT NULL,
  `time_started` int(10) UNSIGNED NOT NULL DEFAULT 0,
  `time_finished` int(10) UNSIGNED NOT NULL DEFAULT 0,
  `status` enum('waiting','manual','accepted','running','done','ignored') NOT NULL DEFAULT 'waiting',
  `result` enum('ok','fail') DEFAULT NULL,
  `return_code` tinyint(3) UNSIGNED DEFAULT NULL,
  `sig` char(10) DEFAULT NULL,
  `stdout` mediumtext DEFAULT NULL,
  `stderr` mediumtext DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `status_target_idx` (`status`, `target`, `id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```
As you can see:

- Each job has a unique ID. You don't need to care about assigning IDs because `AUTO_INCREMENT` is used.
- Each job is associated with some target, or, in other words, is put into some queue. More about targets in the Targets section.
- There are `time_created`, `time_started` and `time_finished` fields, and it's not hard to guess their meaning. When creating a job, you should fill the `time_created` field with a UNIX timestamp. jobd will update the other two fields while executing the job.
- Each job has a `status`.
  - A job must be created with status set to `waiting` or `manual`.
  - The status becomes `accepted` when jobd reads the job from the table and puts it into a queue, or it may become `ignored` in case of some error, like an invalid `target`, or an invalid `status` when processing a `run-manual(ids: int[])` request.
  - Right before a job is started, its status becomes `running`.
  - Finally, when it's done, it is set to `done`.
- The `result` field indicates whether the job completed successfully or not.
  - It is set to `ok` if the return code of the launched command was `0`.
  - Otherwise, it is set to `fail`.
- The `return_code` field is filled with the actual return code.
- If the job process was killed by a POSIX signal, the signal name is written to the `sig` field.
- stdout and stderr of the process are written to the `stdout` and `stderr` fields, accordingly.
> :warning: In the real world, you'll want a few more fields, like `job_name` or `job_data`.<br> Check out the integration example.
To create a new job, add a row to the table. As mentioned earlier, adding or removing rows is by design outside jobd's area of responsibility: a user must add jobs to the table manually.
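Creating a job therefore amounts to a single `INSERT` with `status` set to `waiting` (or `manual`) and `time_created` set to the current UNIX timestamp. The sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL, with a trimmed-down version of the schema; in production you would use a MySQL client and the full table:

```python
import sqlite3
import time

# sqlite3 stands in for MySQL here; only the fields relevant to
# job creation are kept from the schema above.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        target TEXT NOT NULL,
        time_created INTEGER NOT NULL,
        status TEXT NOT NULL DEFAULT 'waiting'
    )
""")

# Enqueue a background job for the "quick" target.
db.execute(
    "INSERT INTO jobs (target, time_created, status) VALUES (?, ?, ?)",
    ("quick", int(time.time()), "waiting"),
)
db.commit()

row = db.execute("SELECT target, status FROM jobs").fetchone()
print(row)  # ('quick', 'waiting')
```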
There are two kinds of jobs, in terms of how they are executed: background and manual (or foreground).
- Background jobs are created with the `waiting` status. When jobd fetches new jobs from the table (which happens upon receiving a `poll(targets: string[])` request; this process is described in detail in the launching background jobs section), such jobs are added to their queues and executed at some point, depending on the current queue state and concurrency limit. The user has no control over the execution flow; the only feedback it gets is the fields in the table, which are updated before, during and after execution. At some point, `status` becomes `done`, `result` and other fields get filled too, and that's it.
- Manual, or foreground, jobs are a different story. They must be created with `status` set to `manual`. These jobs are processed only upon a `run-manual(ids: int[])` request. When jobd receives such a request, it reads and launches the specified jobs, waits for the results and sends them back to the client in a response. Learn more about it in the launching manual jobs section.
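Conceptually, polling picks up only `waiting` jobs for the requested targets, leaving `manual` jobs (and other targets) untouched until they are named explicitly. The `sqlite3` sketch below (again a stand-in for MySQL) shows a selection of that shape; the exact query jobd runs internally may differ:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " target TEXT, status TEXT)"
)
db.executemany(
    "INSERT INTO jobs (target, status) VALUES (?, ?)",
    [("quick", "waiting"), ("quick", "manual"), ("heavy", "waiting")],
)

# A poll for the "quick" target only picks up *waiting* jobs of that
# target; the manual job and the "heavy" job are left alone.
picked = db.execute(
    "SELECT id FROM jobs WHERE status = 'waiting' AND target = ? ORDER BY id",
    ("quick",),
).fetchall()
print(picked)  # [(1,)] -- only the first job qualifies
```

Note how the `status_target_idx` index from the schema above covers exactly this kind of lookup.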
### Launching jobs
Launching (or executing) a job means running the command specified in
the config as the `launcher`, replacing the `{id}` template with the current job id.
For example, if you have this in the config:

```ini
launcher = php /home/user/job-launcher.php {id}
```

and jobd is currently executing the job with id 123, it will launch
`php /home/user/job-launcher.php 123`.
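The substitution itself is plain string templating. This small sketch assumes a simple `{id}` replacement, using the PHP command from the example above:

```python
# Sketch of the launcher template substitution described above.
launcher = "php /home/user/job-launcher.php {id}"
job_id = 123

# jobd replaces the {id} placeholder with the current job id.
command = launcher.replace("{id}", str(job_id))
print(command)  # php /home/user/job-launcher.php 123
```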
### Launching background jobs
After jobs have been added to storage, jobd must be notified about it
