Pwrake
Parallel Workflow extension for Rake, runs on multicores, clusters, clouds.
- Author: Masahiro Tanaka
README in Japanese, GitHub Repository, RubyGems
Features
- Pwrake executes a workflow written in Rakefile in parallel.
- The Rakefile specification is the same as in Rake.
- The tasks which do not have mutual dependencies are automatically executed in parallel.
- Rake's multitask, a parallel task definition, is no longer necessary.
- Parallel and distributed execution is possible using a computer cluster which consists of multiple compute nodes.
- Cluster settings: SSH login (or MPI), and the directory sharing using a shared filesystem, e.g., NFS, Gfarm.
- Pwrake automatically connects to remote hosts using SSH. You do not need to start a daemon.
- Remote host names and the number of cores to use are provided in a hostfile.
- Gfarm file system utilizes storage of compute nodes. It provides the high-performance parallel I/O.
- Parallel I/O access to local storage of compute nodes enables scalable increase in the I/O performance.
- Gfarm chooses a compute node and stores each output file on that node's local storage.
- Pwrake schedules each task on a compute node where its input files are stored.
- Other supports for Gfarm: Automatic mount of the Gfarm file system, etc.
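The first feature above, running tasks that have no mutual dependencies in parallel, can be pictured with a small Ruby sketch. It is only an illustration of the idea, not Pwrake's scheduler: the dependency table and the helper name parallel_waves are hypothetical, and Pwrake itself decides dynamically which task to start next.

```ruby
# Hypothetical helper (not part of Pwrake): group targets into "waves".
# Every target in a wave depends only on plain source files or on targets
# from earlier waves, so the members of one wave could run in parallel.
def parallel_waves(deps)
  done = []
  remaining = deps.keys
  waves = []
  until remaining.empty?
    ready = remaining.select do |target|
      deps[target].all? { |pre| done.include?(pre) || !deps.key?(pre) }
    end
    raise ArgumentError, "circular dependency" if ready.empty?
    waves << ready
    done.concat(ready)
    remaining -= ready
  end
  waves
end

# Hypothetical dependency table: target => prerequisites.
deps = {
  "a.o"  => ["a.c"],
  "b.o"  => ["b.c"],
  "prog" => ["a.o", "b.o"],
}
p parallel_waves(deps)   # prints [["a.o", "b.o"], ["prog"]]
```

Here a.o and b.o do not depend on each other, so they land in the same wave, while prog has to wait for both.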
Requirement
- Ruby version 2.2.3 or later
- UNIX-like OS
- For distributed processing using multiple computers:
- SSH command
- distributed file system (NFS, Gfarm, etc.)
Installation
Install with RubyGems:
$ gem install pwrake
Or download source tgz/zip and expand, cd to subdirectory and install:
$ ruby setup.rb
If you use rbenv, your system may fail to find the pwrake command after installation:
-bash: pwrake: command not found
In this case, you need to rehash the command paths:
$ rbenv rehash
Usage
Parallel execution using 4 cores at localhost:
$ pwrake -j 4
Parallel execution using all cores at localhost:
$ pwrake -j
Parallel execution using a total of 2*2 cores on 2 remote hosts:
- Share your directory among remote hosts via a distributed file system such as NFS or Gfarm.
- Allow passphrase-less access via SSH in either way:
  - Add passphrase-less key generated by ssh-keygen. (Be careful)
  - Add passphrase using ssh-add.
- Make a hosts file in which remote host names and the number of cores are listed:
  $ cat hosts
  host1 2
  host2 2
- Run pwrake with the option --hostfile or -F:
  $ pwrake -F hosts
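The hostfile format shown above (a host name followed by a core count on each line) is simple enough to check before a run. The Ruby sketch below parses such text and sums the cores. parse_hostfile is a hypothetical helper and not Pwrake's own parser; rejecting lines that lack a core count is an assumption of this sketch, since the README does not say how Pwrake treats them.

```ruby
# Hypothetical helper (not part of Pwrake): parse "hostname ncores" lines
# into a Hash of host name => number of cores.
def parse_hostfile(text)
  hosts = {}
  text.each_line do |line|
    fields = line.split
    next if fields.empty?                      # skip blank lines
    unless fields.size == 2 && fields[1] =~ /\A\d+\z/
      raise ArgumentError, "expected 'hostname ncores', got: #{line.strip}"
    end
    hosts[fields[0]] = fields[1].to_i
  end
  hosts
end

hosts = parse_hostfile("host1 2\nhost2 2\n")
total = hosts.values.inject(0, :+)
puts "#{hosts.size} hosts, #{total} cores in total"   # 2 hosts, 4 cores in total
```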
Substitute MPI for SSH to start remote workers (Experimental)
- Set up MPI on your cluster.
- Install MPipe gem. (requires mpicc)
- Run pwrake-mpi command.
  $ pwrake-mpi -F hosts
Options
Pwrake command line options (in addition to Rake options)
-F, --hostfile FILE [Pw] Read hostnames from FILE
-j, --jobs [N] [Pw] Number of threads at localhost (default: # of processors)
-L, --log, --log-dir [DIRECTORY] [Pw] Write log to DIRECTORY
--ssh-opt, --ssh-option OPTION
[Pw] Option passed to SSH
--filesystem FILESYSTEM [Pw] Specify FILESYSTEM (nfs|gfarm2fs)
--gfarm [Pw] (obsolete; Start pwrake on Gfarm FS)
-A, --disable-affinity [Pw] Turn OFF affinity (AFFINITY=off)
-S, --disable-steal [Pw] Turn OFF task steal
-d, --debug [Pw] Output Debug messages
--pwrake-conf [FILE] [Pw] Pwrake configuration file in YAML
--show-conf, --show-config [Pw] Show Pwrake configuration options
--report LOGDIR [Pw] Generate `report.html' (Report of workflow statistics) in LOGDIR and exit.
--report-image IMAGE_TYPE [Pw] Gnuplot output format (png,jpg,svg etc.) in report.html.
--clear-gfarm2fs [Pw] Clear gfarm2fs mountpoints left after failure.
pwrake_conf.yaml
- If pwrake_conf.yaml exists in the current directory, Pwrake reads options from it.
- Example (in YAML form):
  HOSTFILE: hosts
  LOG_DIR: true
  DISABLE_AFFINITY: true
  DISABLE_STEAL: true
  FAILED_TARGET: delete
  PASS_ENV :
   - ENV1
   - ENV2
- Option list:
  HOSTFILE, HOSTS         nil(default, localhost)|filename
  LOG_DIR, LOG            nil(default, No log output)|true(dirname="Pwrake%Y%m%d-%H%M%S")|dirname
  LOG_FILE                default="pwrake.log"
  TASK_CSV_FILE           default="task.csv"
  COMMAND_CSV_FILE        default="command.csv"
  GC_LOG_FILE             default="gc.log"
  WORK_DIR                default=$PWD
  FILESYSTEM              default(autodetect)|gfarm
  SSH_OPTION              SSH option
  PASS_ENV                (Array) Environment variables passed to SSH
  HEARTBEAT               default=240 - Heartbeat interval in seconds
  RETRY                   default=1 - The number of task retries
  HOST_FAILURE            default=2 - The number of allowed consecutive host failures (since v2.3)
  FAILED_TARGET           rename(default)|delete|leave - Treatment of failed target files
  FAILURE_TERMINATION     wait(default)|kill|continue - Behavior of other tasks when a task fails
  QUEUE_PRIORITY          LIFO(default)|FIFO|LIHR(LIfo&Highest-Rank-first; obsolete)
  DISABLE_RANK_PRIORITY   false(default)|true - Disable rank-aware task scheduling (since v2.3)
  RESERVE_NODE            false(default)|true - Reserve a node for tasks with ncore>1 (since v2.3)
  NOACTION_QUEUE_PRIORITY FIFO(default)|LIFO|RAND
  SHELL_START_INTERVAL    default=0.012 (sec)
  GRAPH_PARTITION         false(default)|true
  REPORT_IMAGE            default=png
- Options for Gfarm system:
  DISABLE_AFFINITY        default=false
  DISABLE_STEAL           default=false
  GFARM_BASEDIR           default="/tmp"
  GFARM_PREFIX            default="pwrake_$USER"
  GFARM_SUBDIR            default='/'
  MAX_GFWHERE_WORKER      default=8
  GFARM2FS_COMMAND        default='gfarm2fs'
  GFARM2FS_OPTION         default=""
  GFARM2FS_DEBUG          default=false
  GFARM2FS_DEBUG_WAIT     default=1
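Because several of these options accept only a fixed set of values, a configuration file can be sanity-checked with a few lines of Ruby before starting a long run. The sketch below is not part of Pwrake: invalid_options is a hypothetical helper, the allowed values are copied from the option list above, and reading the LOG_DIR directory-name pattern as a strftime format is an assumption.

```ruby
require "yaml"

# Hypothetical checker (not part of Pwrake): return the names of options
# whose value is outside the choices documented in the option list.
def invalid_options(conf)
  allowed = {
    "FAILED_TARGET"       => %w[rename delete leave],
    "FAILURE_TERMINATION" => %w[wait kill continue],
    "QUEUE_PRIORITY"      => %w[LIFO FIFO LIHR],
  }
  allowed.select do |key, choices|
    conf.key?(key) && !choices.include?(conf[key].to_s)
  end.keys
end

yaml_text = <<~YAML
  HOSTFILE: hosts
  LOG_DIR: true
  FAILED_TARGET: delete
  PASS_ENV:
    - ENV1
    - ENV2
YAML

conf = YAML.safe_load(yaml_text)
p invalid_options(conf)   # prints []
p conf["PASS_ENV"]        # prints ["ENV1", "ENV2"]

# Directory name that LOG_DIR: true would give for a fixed example time,
# assuming the documented pattern is a strftime format.
puts Time.new(2024, 1, 2, 3, 4, 5).strftime("Pwrake%Y%m%d-%H%M%S")  # Pwrake20240102-030405
```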
Task Properties
- Task properties are specified in desc strings placed above task definitions in the Rakefile.
Example of Rakefile:
desc "ncore=4 allow=ourhost*" # desc has no effect on rule in original Rake, but it is used for task property in Pwrake.
rule ".o" => ".c" do
  sh "..."
end

(1..n).each do |i|
  desc "ncore=2 steal=no" # desc should be inside of loop because it is effective only for the next task.
  file "task#{i}" do
    sh "..."
  end
end
Properties (The leftmost item is default):
ncore=integer|rational - The number of cores used by this task.
exclusive=no|yes - Exclusively execute this task in a single node.
reserve=no|yes - Gives higher priority to this task if ncore>1. (reserve a host)
allow=hostname - Allow this host to execute this task. (accepts wild card)
deny=hostname - Deny this host to execute this task. (accepts wild card)
order=deny,allow|allow,deny - The order of evaluation.
steal=yes|no - Allow task stealing for this task.
retry=integer - The number of retries for this task.
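To make the key=value form of these properties concrete, the following Ruby sketch splits a desc string into a Hash and tests a host name against an allow pattern. Both helpers are hypothetical and are not Pwrake's own parser; in particular, matching the wild card with File.fnmatch (shell-style globbing) is an assumption about how the pattern behaves.

```ruby
# Hypothetical parser (not Pwrake's own): "ncore=4 allow=ourhost*"
# becomes a Hash of property name => string value.
def parse_properties(desc)
  desc.split.each_with_object({}) do |item, props|
    key, value = item.split("=", 2)
    props[key] = value if value      # ignore words without "="
  end
end

# Hypothetical check: a host passes if there is no allow= property or if
# it matches the allow= pattern (shell-style wild card, an assumption).
def host_allowed?(props, host)
  pattern = props["allow"]
  pattern.nil? || File.fnmatch(pattern, host)
end

props = parse_properties("ncore=4 allow=ourhost*")
puts Rational(props["ncore"])             # 4/1 (ncore may be integer or rational)
puts host_allowed?(props, "ourhost01")    # true
puts host_allowed?(props, "otherhost")    # false
```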
Note for Gfarm
- Gfarm file-affinity scheduling is achieved by the gfwhere-pipe script bundled in the Pwrake package. This script accesses libgfarm.so.1 through Fiddle (a Ruby standard module) since Pwrake ver. 2.2.7. Please set the environment variable LD_LIBRARY_PATH correctly so that libgfarm.so.1 can be found.
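Whether the dynamic loader can find libgfarm.so.1 with the current LD_LIBRARY_PATH can be checked with Fiddle itself. This is a minimal diagnostic sketch, not code taken from gfwhere-pipe; library_loadable? is a hypothetical helper name.

```ruby
require "fiddle"

# Hypothetical helper (not part of Pwrake): true if the dynamic loader
# can find and open the named shared library, false otherwise.
def library_loadable?(name)
  Fiddle.dlopen(name)
  true
rescue Fiddle::DLError
  false
end

if library_loadable?("libgfarm.so.1")
  puts "libgfarm.so.1 found"
else
  puts "libgfarm.so.1 not found; LD_LIBRARY_PATH=#{ENV['LD_LIBRARY_PATH'].inspect}"
end
```

Running this on each compute node, in the same environment that SSH sessions get, shows whether the variable is set where the worker actually starts.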
Scheduling with Graph Partitioning
- Compile and install METIS 5.1.0 (http://www.cs.umn.edu/~metis/). This requires CMake.
- Install RbMetis (https://github.com/masa16/rbmetis) by
  gem install rbmetis -- \
    --with-metis-include=/usr/local/include \
    --with-metis-lib=/usr/local/lib
- Option (pwrake_conf.yaml):
  GRAPH_PARTITION: true
- See publication: M. Tanaka and O. Tatebe, “Workflow Scheduling to Minimize Data Movement Using Multi-constraint Graph Partitioning,” in CCGrid 2012
Publications
Acknowledgment
This work is supported by:
- JST CREST, research themes:
- MEXT Promotion of Research for Next Generation IT Infrastructure "Resources Linkage for e-Science (RENKEI)."
