Workerpool
Offload tasks to a pool of workers on node.js and in the browser
workerpool
workerpool offers an easy way to create a pool of workers, both for dynamically offloading computations and for managing a pool of dedicated workers. workerpool implements a thread pool pattern: there is a pool of workers to execute tasks, and new tasks are put in a queue. A worker executes one task at a time and, once finished, picks a new task from the queue. Workers can be accessed via a natural, promise-based proxy, as if they were available directly in the main application.
workerpool runs on Node.js and in the browser.
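The queueing behaviour described above can be illustrated with a minimal sketch. This is not workerpool's implementation: TinyPool and its members are hypothetical names, and the tasks run in the main thread rather than in real workers, so the sketch only shows how a fixed number of slots drains a task queue.

```javascript
// Minimal in-process illustration of the pool-and-queue pattern.
// TinyPool and its members are hypothetical names, not part of workerpool.
class TinyPool {
  constructor(maxWorkers) {
    this.maxWorkers = maxWorkers; // upper bound on tasks running at once
    this.running = 0;             // number of tasks currently executing
    this.queue = [];              // tasks waiting for a free slot
  }

  // Queue a task (a function returning a value or a promise) and
  // return a promise for its result.
  exec(task) {
    return new Promise((resolve, reject) => {
      this.queue.push({ task, resolve, reject });
      this._next();
    });
  }

  // Start queued tasks for as long as a slot is free.
  _next() {
    while (this.running < this.maxWorkers && this.queue.length > 0) {
      const { task, resolve, reject } = this.queue.shift();
      this.running++;
      Promise.resolve()
        .then(task)
        .then(resolve, reject)
        .finally(() => {
          this.running--;
          this._next(); // a slot became free: pick the next queued task
        });
    }
  }
}

const tinyPool = new TinyPool(2);
const results = Promise.all([1, 2, 3].map((n) => tinyPool.exec(() => n * 2)));
results.then((values) => console.log(values)); // logs [ 2, 4, 6 ]
```

With two slots and three tasks, the third task waits in the queue until one of the first two finishes.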
Features
- Easy to use
- Runs in the browser and on node.js
- Dynamically offload functions to a worker
- Access workers via a proxy
- Cancel running tasks
- Set a timeout on tasks
- Handles crashed workers
- Small: 9 kB minified and gzipped
- Supports transferable objects (only for web workers and worker_threads)
Why
JavaScript is based upon a single event loop which handles one event at a time. Jeremy Epstein explains this clearly:
In Node.js everything runs in parallel, except your code. What this means is that all I/O code that you write in Node.js is non-blocking, while (conversely) all non-I/O code that you write in Node.js is blocking.
This means that CPU heavy tasks will block other tasks from being executed. In case of a browser environment, the browser will not react to user events like a mouse click while executing a CPU intensive task (the browser "hangs"). In case of a node.js server, the server will not respond to any new request while executing a single, heavy request.
For front-end processes, this is not a desired situation. Therefore, CPU intensive tasks should be offloaded from the main event loop onto dedicated workers. In a browser environment, Web Workers can be used. In node.js, child processes and worker_threads are available. An application should be split in separate, decoupled parts, which can run independent of each other in a parallelized way. Effectively, this results in an architecture which achieves concurrency by means of isolated processes and message passing.
Install
Install via npm:
npm install workerpool
Load
To load workerpool in a node.js application (both main application as well as workers):
const workerpool = require('workerpool');
To load workerpool in the browser:
<script src="workerpool.js"></script>
To load workerpool in a web worker in the browser:
importScripts('workerpool.js');
Setting up the workerpool with React or webpack5 requires additional configuration steps, as outlined in the webpack5 section.
Use
Offload functions dynamically
In the following example there is a function add, which is offloaded dynamically to a worker to be executed for a given set of arguments.
myApp.js
const workerpool = require('workerpool');
const pool = workerpool.pool();

function add(a, b) {
  return a + b;
}

pool
  .exec(add, [3, 4])
  .then(function (result) {
    console.log('result', result); // outputs 7
  })
  .catch(function (err) {
    console.error(err);
  })
  .then(function () {
    pool.terminate(); // terminate all workers when done
  });
Note that both function and arguments must be static and stringifiable, as they need to be sent to the worker in a serialized form. In case of large functions or function arguments, the overhead of sending the data to the worker can be significant.
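To see why the function must be static and stringifiable, consider what a serialize-and-rebuild round trip does to it. The sketch below is an illustration only: roundTrip is a hypothetical helper, and workerpool's internal mechanism may differ in detail. It shows that a function using only its own parameters survives the trip, while a function that depends on a variable from an enclosing scope does not.

```javascript
// Hypothetical helper mimicking a serialize-and-rebuild round trip:
// the function is turned into source text and re-created from that text.
function roundTrip(fn) {
  const source = fn.toString();
  return new Function('return (' + source + ')')();
}

// Self-contained: only uses its own parameters, so it survives the trip.
function add(a, b) {
  return a + b;
}
console.log(roundTrip(add)(3, 4)); // logs 7

// Depends on `offset` from an enclosing scope, which the source text
// does not carry along.
function makeAdder(offset) {
  return function addOffset(a) {
    return a + offset;
  };
}
const addTen = makeAdder(10);
console.log(addTen(1)); // logs 11 in the original scope

try {
  roundTrip(addTen)(1);
} catch (err) {
  console.log(err.name); // logs ReferenceError
}
```

Passing such values explicitly as arguments, rather than capturing them in a closure, avoids the problem.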
Dedicated workers
A dedicated worker can be created in a separate script, and then used via a worker pool.
myWorker.js
const workerpool = require('workerpool');

// a deliberately inefficient implementation of the fibonacci sequence
function fibonacci(n) {
  if (n < 2) return n;
  return fibonacci(n - 2) + fibonacci(n - 1);
}

// create a worker and register public functions
workerpool.worker({
  fibonacci: fibonacci,
});
This worker can be used by a worker pool:
myApp.js
const workerpool = require('workerpool');

// create a worker pool using an external worker script
const pool = workerpool.pool(__dirname + '/myWorker.js');

// run registered functions on the worker via exec
pool
  .exec('fibonacci', [10])
  .then(function (result) {
    console.log('Result: ' + result); // outputs 55
  })
  .catch(function (err) {
    console.error(err);
  })
  .then(function () {
    pool.terminate(); // terminate all workers when done
  });

// or run registered functions on the worker via a proxy:
pool
  .proxy()
  .then(function (worker) {
    return worker.fibonacci(10);
  })
  .then(function (result) {
    console.log('Result: ' + result); // outputs 55
  })
  .catch(function (err) {
    console.error(err);
  })
  .then(function () {
    pool.terminate(); // terminate all workers when done
  });
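The relationship between exec and the proxy can be pictured as follows. This is a conceptual sketch, not workerpool's code: createProxy, fakeExec, and registered are hypothetical names, and the "worker" functions run in-process. It shows how an object with one method per registered function can forward each call to an exec-style function that returns a promise.

```javascript
// Hypothetical helper: given an exec(name, args) function and a list of
// method names, build an object whose methods forward to exec.
function createProxy(exec, methodNames) {
  const proxy = {};
  for (const name of methodNames) {
    proxy[name] = (...args) => exec(name, args);
  }
  return proxy;
}

// Stand-in for the functions a worker script would register.
const registered = {
  fibonacci: function fibonacci(n) {
    if (n < 2) return n;
    return fibonacci(n - 2) + fibonacci(n - 1);
  },
};

// Stand-in for a pool's exec: looks up the function by name and
// returns a promise for its result.
const fakeExec = (name, args) => Promise.resolve(registered[name](...args));

const worker = createProxy(fakeExec, Object.keys(registered));
worker.fibonacci(10).then((result) => console.log(result)); // logs 55
```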
A worker can also be initialized asynchronously:
myAsyncWorker.js
define(['workerpool/dist/workerpool'], function (workerpool) {
  // a deliberately inefficient implementation of the fibonacci sequence
  function fibonacci(n) {
    if (n < 2) return n;
    return fibonacci(n - 2) + fibonacci(n - 1);
  }

  // create a worker and register public functions
  workerpool.worker({
    fibonacci: fibonacci,
  });
});
Examples
Examples are available in the examples directory:
https://github.com/josdejong/workerpool/tree/master/examples
API
The API of workerpool consists of two parts: a function workerpool.pool to create a worker pool, and a function workerpool.worker to create a worker.
pool
A workerpool can be created using the function workerpool.pool:
workerpool.pool([script: string] [, options: Object]) : Pool
When a script argument is provided, the provided script will be started as a dedicated worker. When no script argument is provided, a default worker is started which can be used to offload functions dynamically via Pool.exec. Note that on node.js, script must be an absolute file path like __dirname + '/myWorker.js'. In a browser environment, script can also be a data URL like 'data:application/javascript;base64,...'. This allows embedding the bundled code of a worker in your main application. See examples/embeddedWorker for a demo.
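As a rough illustration of the data URL form mentioned above, the following sketch encodes a piece of worker source as a base64 data URL. The names toDataUrl and workerSource are hypothetical, and the source string is a placeholder rather than a real workerpool worker; in practice it would be the bundled worker code. The sketch uses the Node.js Buffer API for the encoding.

```javascript
// Placeholder worker source (an assumption for illustration only).
const workerSource =
  'self.onmessage = function (e) { self.postMessage(e.data); };';

// Encode source text as a base64 data URL with a JavaScript MIME type.
function toDataUrl(source) {
  const base64 = Buffer.from(source, 'utf8').toString('base64');
  return 'data:application/javascript;base64,' + base64;
}

const dataUrl = toDataUrl(workerSource);
console.log(dataUrl.startsWith('data:application/javascript;base64,')); // logs true
```

In a browser environment, a string of this shape is what would be passed as the script argument of workerpool.pool.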
The following options are available:
- minWorkers: number | 'max'. The minimum number of workers that must be initialized and kept available. Setting this to 'max' will create maxWorkers default workers (see below).
- maxWorkers: number. The default number of maxWorkers is the number of CPUs minus one. When the number of CPUs could not be determined (for example in older browsers), maxWorkers is set to 3.
- maxQueueSize: number. The maximum number of tasks allowed to be queued. Can be used to prevent running out of memory. If the maximum is exceeded, adding a new task will throw an error. The default value is Infinity.
- workerType: 'auto' | 'web' | 'process' | 'thread'.
  - In case of 'auto' (default), workerpool will automatically pick a suitable type of worker: when in a browser environment, 'web' will be used. When in a node.js environment, worker_threads will be used if available (Node.js >= 11.7.0), else child_process will be used.
  - In case of 'web', a Web Worker will be used. Only available in a browser environment.
  - In case of 'process', child_process will be used. Only available in a node.js environment.
  - In case of 'thread', worker_threads will be used. If worker_threads are not available, an error is thrown. Only available in a node.js environment.
- workerTerminateTimeout: number. The timeout in milliseconds to wait for a worker to clean up its resources on termination before stopping it forcefully. Default value is 1000.
- abortListenerTimeout: number. The timeout in milliseconds to wait for abort listeners before stopping it forcefully, triggering cleanup. Default value is 1000.
- forkArgs: String[]. For the 'process' worker type. An array passed as args to child_process.fork.
- forkOpts: Object. For the 'process' worker type. An object passed as options to child_process.fork. See nodejs documentation for available options.
- workerOpts: Object. For the 'web' worker type. An object passed to the constructor of the web worker. See WorkerOptions specification for available options.
- workerThreadOpts: Object. For the 'thread' worker type. An object passed to worker_threads.options. See nodejs documentation for available options.
- onCreateWorker: Function. A callback that is called whenever a worker is being created. It can be used to allocate resources for e
