Envkernel
Run jupyter kernels in different environments (conda, virtualenv, docker, singularity, Lmod)
Switch environments before running Jupyter kernels
Sometimes, one needs to execute Jupyter kernels in a different environment. Say you want to execute the kernel in a conda environment (pointing the kernel at that environment's Python is easy, but doing only that misses setting certain environment variables), or run it inside a Docker container. One could manually adjust the kernelspec files to set environment variables or run commands before starting the kernel, but envkernel automates this process.
envkernel is usable both by end users (on their own systems or clusters) who want easy access to environments in Jupyter, and by sysadmins deploying this access on systems they administer.
In general, there are two passes. First, install the kernel, e.g.:
envkernel virtualenv --name=my-venv /path/to/venv. This parses some
options and writes a kernelspec file with the --name you
specify. Second, when Jupyter tries to start this kernel, the kernelspec
file re-executes envkernel in run mode, which does whatever is
needed to set up the environment (in this case, sets PATH to the
/path/to/venv/bin/ that is needed). Then it starts the normal
IPython kernel.
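The run phase for the virtualenv case can be pictured with a minimal Python sketch. This is not envkernel's actual code: the helper name `venv_run_environment` is hypothetical, and prepending `bin/` to the existing PATH (rather than replacing PATH outright) is an assumption made for illustration.

```python
import os


def venv_run_environment(venv_path, base_env=None):
    """Return a copy of the environment with the venv's bin/ first on PATH.

    Hypothetical helper for illustration; envkernel's real logic may differ.
    """
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = os.path.join(venv_path, "bin")
    old_path = env.get("PATH", "")
    # Assumption: bin/ is prepended so the venv's `python` is found first.
    env["PATH"] = bin_dir + (os.pathsep + old_path if old_path else "")
    return env


env = venv_run_environment("/path/to/venv", {"PATH": "/usr/bin"})
print(env["PATH"])
```

The kernel command would then be started with this environment, for example through `subprocess` or `os.execvpe`, so that the relative name python resolves inside the virtualenv.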
Available modes:
- conda: Activate a conda environment first.
- virtualenv: Activate a virtualenv first.
- docker: Run the kernel in a Docker container.
- singularity: Run the kernel in a singularity container.
- Lmod: Activate Lmod modules first.
Installation
Available on PyPI: pip install envkernel.
Or you can install the latest version from GitHub in the usual way: pip install https://github.com/NordicHPC/envkernel/archive/master.zip
This is a single-file script and can be copied directly and added to
PATH as well. By design, there are no dependencies except the basic
Jupyter client (not notebook or any UI), and that is only needed at
kernel-setup time, not at kernel-runtime. The script must be
available both when a kernel is set up, and
each time the kernel is started (and currently assumes they are in the
same location).
General usage and common arguments
General invocation:
envkernel [mode] [envkernel options] [mode-specific-options]
General arguments usable by all classes during the setup phase:
These options directly map to normal Jupyter kernel install options:
- mode: singularity, docker, lmod, or whatever mode is desired.
- --name $name: Name of kernel to install (required).
- --user: Install kernel into user directory.
- --sys-prefix: Install to the current Python's sys.prefix (the Python which is running envkernel).
- --prefix: Same as normal kernel install option.
- --display-name NAME: Human-readable name.
- --replace: Replace existing kernel (Jupyter option, unsure what this means).
- --language: What language to tag this kernel (default python).
These are envkernel-specific options:
- --verbose, -v: Print more debugging information when installing the kernel. It is always in verbose mode when actually running the kernel.
- --python: Python interpreter to use when invoking inside the environment. (Default python. Unlike other kernels, this defaults to a relative path because the point of envkernel is to set up PATH properly.) If this is the special value SELF, this will be replaced with the value of sys.executable of the Python running envkernel.
- --kernel=NAME: Auto-set --language and --kernel-cmd to that needed for these well-known kernels. Options include ipykernel (the default), ir, or imatlab. But all of these hard-code a kernel command line and could possibly be wrong some day.
- --kernel-cmd: A string which is the kernel to start: space separated, no shell quoting, it will be split when saving. The default is python -m ipykernel_launcher -f {connection_file}, which is suitable for IPython. For example, to start an R kernel in the environment use R --slave -e IRkernel::main() --args {connection_file} as the value to this, being careful with quoting the spaces only once. To find what the strings should be, copy from some existing kernels. --kernel=NAME includes shortcuts for some popular kernels.
- --kernel-template: An already-installed kernel name which is used as a template for the new envkernel. This is searched using the normal Jupyter search paths. This kernel json file is loaded and used as a template for all kernel options (--language, --kernel-cmd, etc). Also, any other files in this directory (such as logos) are copied to the new kernel (like kernel.js in irkernel).
- --kernel-make-path-relative: Removes an absolute path from the kernel command (mainly useful with --kernel-template). This would be useful, for example, where you are setting up an lmod install and the absolute path of the module might change, but you want it to always run Python relative to that module anyway.
- --env=NAME=VALUE: Set these environment variables when running the kernel. These are actually just saved in the kernel.json file under the env key, which is used by Jupyter itself. So this is just a shorthand for adding variables there; it is not used at the envkernel stage at all.
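How a --kernel-cmd string and --env values end up in kernel.json can be sketched roughly as follows. The helper `build_kernel_json` is hypothetical, the variable OMP_NUM_THREADS is only an example, and the sketch leaves out the envkernel wrapper that the real kernelspec re-executes; it shows only the whitespace split and the env key.

```python
import json


def build_kernel_json(kernel_cmd, display_name, language="python", env_options=()):
    """Assemble a kernel.json-style dict (hypothetical helper, for illustration)."""
    spec = {
        # Split on whitespace only: no shell quoting is interpreted.
        "argv": kernel_cmd.split(),
        "display_name": display_name,
        "language": language,
    }
    env = {}
    for item in env_options:
        # Each --env value has the form NAME=VALUE; split at the first "=".
        name, _, value = item.partition("=")
        env[name] = value
    if env:
        # Jupyter itself applies the "env" key when it launches the kernel.
        spec["env"] = env
    return spec


spec = build_kernel_json(
    "python -m ipykernel_launcher -f {connection_file}",
    display_name="my-venv",
    env_options=["OMP_NUM_THREADS=4"],  # example variable, not from envkernel
)
print(json.dumps(spec, indent=2))
```

Because the split is purely on whitespace, an argument that itself contains spaces cannot be expressed this way, which is why the quoting advice above matters.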
Order of precedence of options (later in the list overrides earlier):
--kernel-template, --kernel, --kernel-cmd, --language,
--python, --display-name.
Conda
The Conda envkernel will activate Conda environments (set the PATH,
CPATH, LD_LIBRARY_PATH, and LIBRARY_PATH environment variables).
This is done manually; if anyone knows a better way to do this, please
inform us.
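A manual activation of this kind might look like the sketch below. The four variable names come from the description above; which subdirectory feeds each variable (bin, include, lib) and the helper name `conda_environment` are assumptions, not taken from envkernel's source.

```python
import os


def conda_environment(conda_env_path, base_env=None):
    """Return a copy of the environment with conda-style search paths added.

    Hypothetical helper; the variable-to-subdirectory mapping is an assumption.
    """
    env = dict(os.environ if base_env is None else base_env)
    additions = {
        "PATH": "bin",
        "CPATH": "include",
        "LD_LIBRARY_PATH": "lib",
        "LIBRARY_PATH": "lib",
    }
    for var, subdir in additions.items():
        new_dir = os.path.join(conda_env_path, subdir)
        old = env.get(var)
        # Prepend so the environment's files take priority over system ones.
        env[var] = new_dir if not old else new_dir + os.pathsep + old
    return env


env = conda_environment("/path/to/anaconda3", {"PATH": "/usr/bin"})
for var in ("PATH", "CPATH", "LD_LIBRARY_PATH", "LIBRARY_PATH"):
    print(var, "=", env[var])
```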
Conda example
This will load the anaconda environment before invoking an IPython
kernel using the name python, which will presumably be the one
inside the anaconda3 environment.
envkernel conda --name=conda-anaconda3 /path/to/anaconda3
Conda mode arguments
General invocation:
envkernel conda --name=NAME [envkernel options] conda-env-full-path
conda-env-full-path: Full path to the conda environment to load.
Virtualenv
This operates identically to conda mode, but uses the mode name
virtualenv and works on virtualenvs.
Virtualenv example
envkernel virtualenv --name=my-venv /path/to/venv
Docker
Docker is a containerization system that runs as a system service.
Note: docker has not been fully tested, but has been reported to work.
Docker example
envkernel docker --name=NAME --pwd --bind /m/jh/coursedata/:/coursedata /path/to/image.simg
Docker mode arguments
General invocation:
envkernel docker --name=NAME [envkernel options] [docker options] [image]
- image: Required positional argument: name of docker image to run.
- --pwd: Bind-mount the current working directory and use it as the current working directory inside the notebook. This is usually useful.
- A few more yet-undocumented and untested arguments...
Any unknown argument is passed directly to the docker run call, and
thus can be any normal Docker argument. If ,copy is included in the
--mount command options, the directory will be copied before
mounting. This may be useful if the directory is on a network mount
which the root docker can't access. It is recommended to always use
the form of options with =, such as --option=X, rather than
separating them with a space, to avoid problems with argument/option
detection.
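The reason for preferring the = form can be illustrated with Python's argparse, which passes unknown arguments through via parse_known_args. Whether envkernel parses its command line exactly this way is an assumption; the parser below is a stripped-down stand-in with one known option and one positional image, and my-image is a placeholder name.

```python
import argparse


def split_args(argv):
    """Separate known options from pass-through arguments (illustrative only)."""
    parser = argparse.ArgumentParser(prog="envkernel-sketch")
    parser.add_argument("--name", required=True)
    parser.add_argument("image")
    return parser.parse_known_args(argv)


# Space-separated: the unknown option's value is taken for the positional image.
ns, extra = split_args(["--name=NAME", "--volume", "/data:/data", "my-image"])
print("image:", ns.image, "passed through:", extra)

# With "=": option and value stay together as a single pass-through argument.
ns, extra = split_args(["--name=NAME", "--volume=/data:/data", "my-image"])
print("image:", ns.image, "passed through:", extra)
```

In the first call the parser cannot know that the unknown option takes a value, so the value is mistaken for the image; in the second call the image is identified correctly.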
Singularity
Singularity is a containerization system somewhat similar to Docker, but designed for user-mode usage without root, and with a mindset of using user software instead of system services.
Singularity example
envkernel singularity --name=NAME --contain --bind /m/jh/coursedata/:/coursedata /path/to/image.simg
Singularity mode arguments
General invocation:
envkernel singularity --name=NAME [envkernel options] [singularity options] [image]
- image: Required positional argument: name of singularity image to run.
- --pwd: Bind-mount the current working directory and use it as the current working directory inside the notebook. This may happen by default if you don't --contain.
Any unknown argument is passed directly to the singularity exec
call, and thus can be any normal Singularity argument. It is
recommended to always use the form of options with =, such as
--bind=X, rather than separating them with a space, to avoid
problems with argument/option detection. The most useful Singularity
options are (nothing envkernel specific here):
- --contain or -c: Don't share any filesystems by default.
- --bind src:dest[:ro]: Bind mount src from the host to dest in the container. :ro is optional, and defaults to rw.
- --cleanenv: Clean all environment before executing.
- --net or -n: Run in new network namespace. This does NOT work with Jupyter kernels, because localhost must currently be shared. So don't use this unless we create a proper net gateway.
Lmod
The Lmod envkernel will load/unload Lmod modules before running a normal IPython kernel.
Using envkernel is better than the naive (but functional) method of modifying a kernel to invoke a particular Python binary, because that will invoke the right Python interpreter but not set relevant other environment variables (so, for example, subprocesses won't be in the right environment).
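The difference can be seen in how programs are looked up by name. The sketch below assumes a POSIX system; the temporary directory stands in for a module's bin directory and envkernel-demo-tool is a made-up tool name. Lookup through an unchanged PATH (the naive method) does not find the tool, while lookup through a PATH that includes the module directory does.

```python
import os
import shutil
import tempfile


def lookup(tool_name, extra_dir=None):
    """Resolve tool_name via PATH, as a subprocess launch would (illustrative)."""
    search_path = os.environ.get("PATH", "")
    if extra_dir is not None:
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which(tool_name, path=search_path)


with tempfile.TemporaryDirectory() as module_bin:
    # Stand-in for a tool that only exists inside the module's bin directory.
    tool = os.path.join(module_bin, "envkernel-demo-tool")
    with open(tool, "w") as handle:
        handle.write("#!/bin/sh\nexit 0\n")
    os.chmod(tool, 0o755)

    naive = lookup("envkernel-demo-tool")                 # PATH left untouched
    with_env = lookup("envkernel-demo-tool", module_bin)  # PATH set up first

print("naive lookup:", naive)
print("lookup with environment set up:", with_env)
```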
Lmod example
This will run module purge and then module load anaconda3 before
invoking an IPython kernel using the name python, which will
presumably be the one inside the anaconda3 environment.
envkernel lmod --name=anaconda3 --purge anaconda3
Lmod mode arguments
General invocation:
envkernel lmod --name=N
