Slxc
SLURM in Linux Containers
A set of scripts for easily deploying a SLURM cluster on a single machine using Linux Containers. The main goal is SLURM development. Any other ideas or use cases? :)
Prerequisites: the screen tool.
- Install Linux Containers (LXC)
- On Linux Mint (and probably Ubuntu) the following packages are needed:
lxc-dev lxc-utils
- Configure LXC (the following is Ubuntu/Mint specific; for other distributions, check their manuals for the proper paths and configuration file names):
- Set up LXC networking in /etc/default/lxc-net:
USE_LXC_BRIDGE="true"
LXC_DHCP_CONFILE=/etc/lxc/dnsmasq.conf
LXC_DOMAIN="lxc"
- Edit /etc/lxc/dnsmasq.conf, adding the following line:
conf-file=$SLXC_PATH/build/dnsmasq.conf
- If you run into problems, check https://github.com/lxc/lxc/pull/285/files (look at /etc/apparmor.d/abstractions/lxc/start-container)
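The LXC networking settings listed above are easy to mistype, so a quick check can help. The following is a minimal sketch, not part of slxc: `check_lxc_net` is a hypothetical helper, and it assumes each setting sits on its own line, written exactly as shown in this README.

```shell
#!/bin/sh
# Sketch: report which expected lxc-net settings are missing from a file.
# check_lxc_net is a hypothetical helper, not part of slxc or LXC.
# Assumption: each setting is on its own line, spelled exactly as below.
check_lxc_net() {
    conf_file=$1
    status=0
    for setting in \
        'USE_LXC_BRIDGE="true"' \
        'LXC_DHCP_CONFILE=/etc/lxc/dnsmasq.conf' \
        'LXC_DOMAIN="lxc"'
    do
        # -F: fixed string, -x: match the whole line, -q: no output
        if ! grep -Fxq -- "$setting" "$conf_file"; then
            echo "missing: $setting"
            status=1
        fi
    done
    return "$status"
}

# Intended use (not run here):
#   check_lxc_net /etc/default/lxc-net
```

The function returns a non-zero status when at least one setting is missing, so it can be used in a condition before restarting lxc-net.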
- Install Munge in MUNGE_PATH (under someuser). NOTE: munge-0.5.11 has problems with installation into a user-defined prefix (see https://code.google.com/p/munge/issues/detail?id=34 for details). The issue report includes a patch that temporarily fixes the problem; alternatively, use a more recent version in which it is fixed.
- [Optional] If the SLURM_USER is not root and you plan to submit jobs as user USER1 != SLURM_USER:
- Apply the patch from the SLURM directory:
patch -p1 < <slxc_path>/patch/start_from_user.patch
- Install SLURM in SLURM_PATH (under someuser). Create additional directories in SLURM's prefix:
mkdir $SLURM_PATH/var $SLURM_PATH/etc
- Configure SLURM and put its configuration in $SLURM_PATH/etc/slurm.conf. While configuring, choose hostnames for the frontend and compute nodes. Here we will use frontend and cnX.
- Put the SLURM and Munge installation paths in $SLXC_PATH/slxc.conf.
- Set SLURM_USER to someuser in $SLXC_PATH/slxc.conf.
- Create the cluster machines with slxc-new-node.sh. Its only argument is the machine hostname. NOTE: you must use the same frontend and compute node names as in $SLURM_PATH/etc/slurm.conf.
- Create the frontend first (let's call it "frontend", for example):
$SLXC_PATH/slxc-new-node.sh frontend
- Create node machines (cn1, cn2, ..., cnN):
$ for i in $(seq 1 N); do $SLXC_PATH/slxc-new-node.sh cn$i; done
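Because the container hostnames must match the names in slurm.conf, it can help to derive both from one node count. The sketch below is only an illustration: `node_names` and `slurm_conf_nodes` are hypothetical helpers, the partition name `debug` is an arbitrary choice, and the printed lines are just the node-related fragment of a slurm.conf, not a complete configuration. It assumes at least two compute nodes.

```shell
#!/bin/sh
# Sketch: derive node names and a matching slurm.conf fragment from one count.
# node_names and slurm_conf_nodes are hypothetical helpers, not part of slxc.

# Print "frontend" followed by cn1..cnN, one name per line.
node_names() {
    echo frontend
    i=1
    while [ "$i" -le "$1" ]; do
        echo "cn$i"
        i=$((i + 1))
    done
}

# Print the node-related slurm.conf lines for N compute nodes (assumes N >= 2).
# SlurmctldHost is the current parameter name; older SLURM releases use
# ControlMachine instead. "debug" is an arbitrary partition name.
slurm_conf_nodes() {
    echo "SlurmctldHost=frontend"
    echo "NodeName=cn[1-$1] State=UNKNOWN"
    echo "PartitionName=debug Nodes=cn[1-$1] Default=YES State=UP"
}

# Dry run: print the commands that would create a frontend and 2 compute nodes.
for name in $(node_names 2); do
    printf '%s %s\n' '$SLXC_PATH/slxc-new-node.sh' "$name"
done
```

The loop only prints the commands; remove the `printf` wrapper to actually call slxc-new-node.sh once the output looks right.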
- [Optional] Add the Munge and SLURM installation paths to your PATH environment variable. Also set SLURM_CONF so that sinfo, sbatch and other tools know how to reach slurmctld:
export SLURM_CONF=$SLURM_PATH/etc/slurm.conf
- Restart the lxc-net service (for Ubuntu/Mint):
$ sudo service lxc-net restart
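The optional PATH and SLURM_CONF setup from the previous step could live in a small file that you source from your shell. This is a sketch under assumptions: the two default prefixes are placeholders (use the paths you put in slxc.conf), and it relies on the usual prefix layout with user tools in bin/ and daemons in sbin/.

```shell
#!/bin/sh
# Sketch: environment for the SLURM and Munge tools.
# The two default prefixes are placeholder values (assumptions); use the same
# paths you put in $SLXC_PATH/slxc.conf.
MUNGE_PATH=${MUNGE_PATH:-$HOME/opt/munge}
SLURM_PATH=${SLURM_PATH:-$HOME/opt/slurm}

# A standard prefix install puts user tools in bin/ and daemons in sbin/.
PATH=$SLURM_PATH/bin:$SLURM_PATH/sbin:$MUNGE_PATH/bin:$MUNGE_PATH/sbin:$PATH

# Tell sinfo, sbatch and friends where the cluster configuration lives.
SLURM_CONF=$SLURM_PATH/etc/slurm.conf
export PATH SLURM_CONF
```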
- [Optional] If the SLURM_USER is not root and you plan to submit jobs as user USER1 != SLURM_USER:
- Set up SLURM capabilities:
$ sudo ./slurm-set-capabilities.sh
- Start your cluster:
$ sudo ./slxc-run-cluster.sh
- Verify that everything is OK (both tools should show all your virtual "machines" running):
$ sudo screen -ls
$ sudo lxc-ls --active
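With more than a few nodes, comparing the active container list against the expected names by eye gets tedious. The following sketch automates that comparison; `missing_nodes` is a hypothetical helper, not part of slxc or LXC, and it assumes the container names arrive on stdin separated by whitespace.

```shell
#!/bin/sh
# Sketch: compare the containers reported as active with the nodes you expect.
# missing_nodes is a hypothetical helper, not part of slxc or LXC.
# It reads container names from stdin (whitespace separated) and prints every
# expected name, given as an argument, that is not among them.
missing_nodes() {
    active=$(cat)
    status=0
    for expected in "$@"; do
        found=no
        for name in $active; do
            [ "$name" = "$expected" ] && found=yes
        done
        if [ "$found" = no ]; then
            echo "not running: $expected"
            status=1
        fi
    done
    return "$status"
}

# Intended use (not run here):
#   sudo lxc-ls --active | missing_nodes frontend cn1 cn2
```

The helper prints nothing and returns 0 when every expected node is active.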
- Now you can attach to any machine with
$ sudo lxc-attach -n $nodename
- [Optional] If you plan to use the PMIx plugin, you must set PMIx's temporary directory through the SLURM_PMIX_TMPDIR environment variable. This path must not be a directory shared between the containers. The variable has to be set before srun is used.
- Set PMIx tmp dir:
$ export SLURM_PMIX_TMPDIR=$SLURM_PATH/var/spool
- To shut down your cluster, use:
$ ./slxc-stop-cluster.sh
- NOTE: this may take a while. You can speed up the process by setting LXC_SHUTDOWN_TIMEOUT in /etc/default/lxc (for Ubuntu and Mint).
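Changing the shutdown timeout mentioned above amounts to editing one assignment in the defaults file. A minimal sketch follows; `set_shutdown_timeout` is a hypothetical helper, and the value 10 in the usage comment is an arbitrary example, not a recommendation.

```shell
#!/bin/sh
# Sketch: set LXC_SHUTDOWN_TIMEOUT in a defaults file, replacing an existing
# assignment or appending one if none is present.
# set_shutdown_timeout is a hypothetical helper, not part of slxc or LXC.
set_shutdown_timeout() {
    conf_file=$1
    timeout=$2
    if grep -q '^LXC_SHUTDOWN_TIMEOUT=' "$conf_file"; then
        # Rewrite through a temporary copy (portable, avoids sed -i) and
        # write back with cat so the file keeps its owner and permissions.
        tmp_file=$(mktemp) || return 1
        sed "s/^LXC_SHUTDOWN_TIMEOUT=.*/LXC_SHUTDOWN_TIMEOUT=$timeout/" \
            "$conf_file" > "$tmp_file" && cat "$tmp_file" > "$conf_file"
        rc=$?
        rm -f "$tmp_file"
        return "$rc"
    else
        echo "LXC_SHUTDOWN_TIMEOUT=$timeout" >> "$conf_file"
    fi
}

# Intended use (not run here; needs root, 10 is an arbitrary example value):
#   set_shutdown_timeout /etc/default/lxc 10
```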
That seems to be all. Enjoy!
