alcove
alcove is:
- a control plane for system processes
- an interface for system programming
- a library for building containerized services
alcove runs as a stand-alone Unix process that communicates with the Erlang VM over standard input and output using Erlang ports. prx provides a higher-level library that maps the alcove Unix processes to Erlang processes.
Build
rebar3 compile
Linux: Statically Linking Using musl
To build a statically linked executable:
sudo apt install musl-dev musl-tools
# clone the kernel headers somewhere
export MUSL_INCLUDE=/tmp
git clone https://github.com/sabotage-linux/kernel-headers.git $MUSL_INCLUDE/kernel-headers
# then compile
./musl-wrapper rebar3 do clean, compile
Tests
To run tests, see Setting Up Privileges.
rebar3 do clean, compile, ct
Generate Code
To regenerate Erlang and C code:
make gen
Overview
When alcove is started, it enters an event loop:
{ok, Drv} = alcove_drv:start().
Similar to a shell, alcove waits for a command. For example, alcove can be requested to fork(2):
{ok, Child1} = alcove:fork(Drv, []).
A new process is created in a parent/child relationship:
beam.smp
  |-erl_child_setup
  |   `-alcove
  |       `-alcove
Processes are arranged in a pipeline:
- a pipeline is a list of 0 or more integers representing the process IDs
  By default, pipelines are limited to a length of 16 processes. The current limit can be read using getopt/3 and increased using setopt/4, up to the system limits.
- unlike in a shell, each successive process in the pipeline is forked from the previous process
- like a shell pipeline, the stdout of a process is connected to the stdin of the next process in the pipeline using a FIFO
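The shape of a pipeline described in the list above can be modeled with a small sketch that does not depend on alcove. The module name `pipeline_sketch` and the function `valid/1,2` are hypothetical and not part of the alcove API; alcove enforces the limit itself. The default of 16 is the value stated above.

```erlang
-module(pipeline_sketch).
-export([valid/1, valid/2]).

%% Default maximum pipeline length, as stated in the text above.
-define(DEFAULT_MAX_DEPTH, 16).

%% Check a pipeline against the default limit.
valid(Pipeline) ->
    valid(Pipeline, ?DEFAULT_MAX_DEPTH).

%% A pipeline is a proper list of positive integers (system PIDs) that is
%% no longer than MaxDepth. The empty list is valid and addresses the
%% port process. If Pipeline is not a proper list, length/1 fails in the
%% guard and the second clause returns false.
valid(Pipeline, MaxDepth) when is_integer(MaxDepth), length(Pipeline) =< MaxDepth ->
    lists:all(fun(Pid) -> is_integer(Pid) andalso Pid > 0 end, Pipeline);
valid(_, _) ->
    false.
```

For example, `pipeline_sketch:valid([])` and `pipeline_sketch:valid([31831, 31861])` both return `true`, while a list of 17 PIDs is rejected under the default limit.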
The child process is addressed via the pipeline using a list of PIDs:
{ok, Child2} = alcove:fork(Drv, [Child1]),
Child2 = alcove:getpid(Drv, [Child1, Child2]).
An empty pipeline refers to the port process:
{ok, Child3} = alcove:fork(Drv, []).
Finally, we can replace the event loop with a system executable by calling exec(3):
ok = alcove:execvp(Drv, [Child1, Child2], "/bin/cat", ["/bin/cat"]).
The process tree now looks like:
beam.smp
  |-erl_child_setup
  |   `-alcove
  |       |-alcove
  |       |   `-cat
  |       `-alcove
We can interact with the process via stdin, stdout and stderr:
alcove:stdin(Drv, [Child1, Child2], "hello process\n"),
[<<"hello process\n">>] = alcove:stdout(Drv, [Child1, Child2]).
Setting Up Privileges
- sudo
sudo visudo -f /etc/sudoers.d/99_alcove
<user> ALL = NOPASSWD: /path/to/alcove/priv/alcove
Defaults!/path/to/alcove/priv/alcove !requiretty
When starting alcove, pass in the exec option:
{ok, Drv} = alcove_drv:start([{exec, "sudo -n"}]).
- setuid
chown root:root priv/alcove
chmod u+s priv/alcove
- Linux: file capabilities
  See capabilities(7) and setcap(8).
Creating a chroot
The standard Unix way of sandboxing a process is to call chroot(2). Setting up the chroot involves:
- running as root
- setting process limits
- changing the root directory to limit the process view of the filesystem
- changing to an unprivileged user
- running the sandboxed code
See chrootex.erl.
We'll create a chroot using an interface like:
-spec sandbox(port(), [iodata()]) -> non_neg_integer().
sandbox(Drv, ["/bin/sh", "-i"]).
The function returns the system PID of the child process. The call above creates an interactive shell that we access through standard I/O.
The port will need root privileges to call chroot(2).
{ok, Drv} = alcove_drv:start([{exec, "sudo -n"}]).
We'll use setrlimit(2) to set some process limits.
% The alcove_rlimit record is defined in alcove/include/alcove.hrl
setlimits(Drv, Child) ->
    % Disable writing to files
    ok = alcove:setrlimit(
        Drv,
        [Child],
        rlimit_fsize,
        #alcove_rlimit{cur = 0, max = 0}
    ),
    % Limit to one process
    ok = alcove:setrlimit(
        Drv,
        [Child],
        rlimit_nproc,
        #alcove_rlimit{cur = 1, max = 1}
    ),
    % Disable opening new file descriptors
    {ok, NFD} = alcove:getrlimit(Drv, [Child], rlimit_nofile),
    ok = alcove:setrlimit(Drv, [Child], rlimit_nofile, #alcove_rlimit{
        cur = NFD#alcove_rlimit.cur, max = NFD#alcove_rlimit.cur
    }).
Next we call chroot(2), drop root privileges and set the user and group to a random, high UID/GID that is unlikely to conflict with an existing system user:
chroot(Drv, Child, Path) ->
    ok = alcove:chroot(Drv, [Child], Path),
    ok = alcove:chdir(Drv, [Child], "/").

drop_privs(Drv, Child, Id) ->
    ok = alcove:setgid(Drv, [Child], Id),
    ok = alcove:setuid(Drv, [Child], Id).

id() ->
    16#f0000000 + rand:uniform(16#ffff).
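To make the "random, high UID/GID" concrete: `rand:uniform(N)` returns an integer from 1 to N, so `id/0` yields a value between 16#f0000001 and 16#f000ffff. That is below 2^32 - 1, so it fits in an unsigned 32-bit ID, assuming the platform's uid_t and gid_t are 32 bits wide as on Linux. The sketch below checks the range; the module name `id_sketch` and the helper `in_range/1` are hypothetical.

```erlang
-module(id_sketch).
-export([id/0, in_range/1]).

%% Same expression as id/0 in the text above.
id() ->
    16#f0000000 + rand:uniform(16#ffff).

%% rand:uniform(16#ffff) is in 1..65535, so the result lies in
%% 16#f0000001..16#f000ffff (4026531841..4026597375).
in_range(Id) ->
    Id >= 16#f0000001 andalso Id =< 16#f000ffff.
```

The same ID is used for both the group and the user, and `drop_privs/3` sets the GID before the UID because the process can no longer change its group once it has given up root.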
Tying it all together:
% The default is to run the cat command using a statically linked
% executable because the chroot limits the view of the host filesystem.
sandbox(Drv) ->
    sandbox(Drv, ["/bin/busybox", "cat"]).

sandbox(Drv, Argv) ->
    {Path, Arg0, Args} = argv(Argv),
    {ok, Child} = alcove:fork(Drv, []),
    setlimits(Drv, Child),
    chroot(Drv, Child, Path),
    drop_privs(Drv, Child, id()),
    ok = alcove:execvp(Drv, [Child], Arg0, [Arg0, Args]),
    Child.

% Set the program path for the chroot
argv([Arg0, Args]) ->
    Path = filename:dirname(Arg0),
    Progname = filename:join(["/", filename:basename(Arg0)]),
    {Path, Progname, Args}.
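Note that `argv/1` above only matches a two-element list: the program path and exactly one argument. A possible variant that accepts any number of arguments is sketched below. The module name `argv_sketch` is hypothetical and this is not the code in chrootex.erl. With this variant, the caller would pass `[Arg0 | Args]` rather than `[Arg0, Args]` as the argument list to `alcove:execvp/4`.

```erlang
-module(argv_sketch).
-export([argv/1]).

%% Split an argument vector into:
%%   Path     - the directory that becomes the chroot
%%   Progname - the path of the program as seen from inside the chroot
%%   Args     - the remaining arguments, as a (possibly empty) list
argv([Arg0 | Args]) ->
    Path = filename:dirname(Arg0),
    Progname = filename:join(["/", filename:basename(Arg0)]),
    {Path, Progname, Args}.
```

For example, `argv_sketch:argv(["/bin/busybox", "sh", "-i"])` returns `{"/bin", "/busybox", ["sh", "-i"]}`: the chroot is `/bin`, so the executable is found at `/busybox` once the root directory has changed.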
Compile and run the example:
make eg
rebar3 shell
1> {ok, Drv} = chrootex:start().
{ok,<0.229.0>}
2> Cat = chrootex:sandbox(Drv).
31831
3> alcove:stdin(Drv, [Cat], "test test\n").
ok
4> alcove:stdout(Drv, [Cat]).
[<<"test test\n">>]
We can test the limits of the sandbox by using a shell instead of herding cats:
5> Sh = chrootex:sandbox(Drv, ["/bin/busybox", "sh"]).
31861
% Test the shell is working
6> alcove:stdin(Drv, [Sh], "echo hello\n").
ok
7> alcove:stdout(Drv, [Sh]).
[<<"hello\n">>]
% Attempt to create a file
8> alcove:stdin(Drv, [Sh], "> foo\n").
ok
9> alcove:stderr(Drv, [Sh]).
[<<"sh: can't create foo: Too many open files\n">>]
% Try to fork a new process
10> alcove:stdin(Drv, [Sh], "ls\n").
ok
11> alcove:stderr(Drv, [Sh]).
[<<"sh: can't fork\n">>]
Creating a Container Using Linux Namespaces
Namespaces are the basis for Linux containers. New namespaces are created using clone(2). We'll rewrite the chroot example to run inside a namespace and use another Linux feature, cgroups, to limit the system resources available to the process.
See nsex.erl.
- set process limits using cgroups (see cpuset(7))
When the port is started, we'll create a new cgroup just for our application and, whenever a sandboxed process is forked, we'll add it to this cgroup.
start() ->
    {ok, Drv} = alcove_drv:start([{exec, "sudo -n"}]),

    % Create a new cgroup for our processes
    ok = alcove_cgroup:create(Drv, [], <<"alcove">>),

    % Set the CPUs these processes are allowed to run on. For example,
    % if there are 4 available CPUs, any process in this cgroup will only
    % be able to run on CPU 0
    {ok, 1} = alcove_cgroup:set(
        Drv,
        [],
        <<"cpuset">>,
        <<"alcove">>,
        <<"cpuset.cpus">>,
        <<"0">>
    ),
    {ok, 1} = alcove_cgroup:set(
        Drv,
        [],
        <<"cpuset">>,
        <<"alcove">>,
        <<"cpuset.mems">>,
        <<"0">>
    ),

    % Set the amount of memory available to the process

    % Total memory, including swap. We allow this to fail, because some
    % systems may not have a swap partition/file
    alcove_cgroup:set(
        Drv,
        [],
        <<"memory">>,
        <<"alcove">>,
        <<"memory.memsw.limit_in_bytes">>,
        <<"16m">>
    ),

    % Total memory
    {ok, 3} = alcove_cgroup:set(
        Drv,
        [],
        <<"memory">>,
        <<"alcove">>,
        <<"memory.limit_in_bytes">>,
        <<"16m">>
    ),

    Drv.

setlimits(Drv, Child) ->
    % Add our process to the "alcove" cgroup
    {ok, _} = alcove_cgroup:set(
        Drv,
        [],
        <<>>,
        <<"alcove">>,
        <<"tasks">>,
        integer_to_list(Child)
    ).
sandbox(Drv, Argv) ->
    {Path, Arg0, Args} = argv(Argv),

    {ok, Child} = alcove:clone(Drv, [], [
        % IPC
        clone_newipc,
        % network
        clone_newnet,
        % mounts
        clone_newns,
        % PID, Child is PID 1 in the namespace
        clone_newpid,
        % hostname
        clone_newuts
    ]),

    setlimits(Drv, Child),
    chroot(Drv, Child, Path),
    drop_privs(Drv, Child, id()),

    ok = alcove:execvp(Drv, [Child], Arg0, [Arg0, Args]),

    Child.
Operating System Support
Functions marked as operating system specific
