# HKube
🐟 High Performance Computing over Kubernetes - Core Repo 🎣
HKube is a cloud-native, open-source framework for running distributed pipelines of algorithms on Kubernetes.
HKube optimally utilizes a pipeline's resources, based on user priorities and heuristics.
## Features <!-- omit in toc -->
- **Distributed pipeline of algorithms**
  - Receives a DAG as input and automatically parallelizes your algorithms over the cluster.
  - Manages the complications of distributed processing, keeping your code simple (even single-threaded).
- **Language agnostic** - as a container-based framework, HKube is designed to let you write algorithms in any language.
- **Batch algorithms** - run many instances of the same algorithm as a batch in order to accelerate running time.
- **Optimized hardware utilization**
  - Containers are automatically placed based on their resource requirements and other constraints, without sacrificing availability.
  - Mixes critical and best-effort workloads in order to drive up utilization and save resources.
  - Efficient execution and clustering via heuristics that combine pipeline and algorithm metrics with user requirements.
- **Build API** - just upload your code; you don't have to worry about building containers or integrating them with the HKube API.
- **Cluster debugging**
  - Debug part of a pipeline based on previous results.
  - Debug a single algorithm in your IDE while the rest of the algorithms run in the cluster.
- **Jupyter integration** - scale your Jupyter tasks with HKube.
## User Guide <!-- omit in toc -->
<!-- TOC -->
- [Installation](#installation)
- [APIs](#apis)
- [API Usage Example](#api-usage-example)
## Installation

### Dependencies

HKube runs on top of Kubernetes, so in order to run HKube we first have to install its prerequisites:
- Kubernetes - install Kubernetes, Minikube, or microk8s.
- Helm - the HKube installation uses Helm; follow its installation guide.
### Helm
- Add the HKube Helm repository:

  ```console
  helm repo add hkube http://hkube.io/helm/
  ```

- Configure a Docker registry for builds: create a `values.yaml` file for custom Helm values:

  ```yaml
  build_secret:
    # pull secret is only needed if Docker Hub is not accessible
    pull:
      registry: ''
      namespace: ''
      username: ''
      password: ''
    # enter your Docker Hub / other registry credentials
    push:
      registry: '' # can be left empty for Docker Hub
      namespace: '' # registry namespace - usually your username
      username: ''
      password: ''
  ```

- Install the HKube chart:

  ```console
  helm install hkube/hkube -f ./values.yaml --name my-release
  ```

  This command installs HKube in a minimal configuration for development. For production, see production-deployment.
## APIs
There are three ways to communicate with HKube: Dashboard, REST API and CLI.
### UI Dashboard

The Dashboard is a web-based HKube user interface that supports every functionality HKube has to offer.

### REST API

HKube exposes its functionality through a REST API.

- API Spec
- Swagger-UI - available at `{yourDomain}/hkube/api-server/swagger-ui`
### CLI

`hkubectl` is the HKube command-line tool.

```console
hkubectl [type] [command] [name]

# More information
hkubectl --help
```

Download the latest `hkubectl` version:
```console
curl -Lo hkubectl https://github.com/kube-HPC/hkubectl/releases/latest/download/hkubectl-linux \
&& chmod +x hkubectl \
&& sudo mv hkubectl /usr/local/bin/
```

For Mac, replace with `hkubectl-macos`.
For Windows, download `hkubectl-win.exe`.
Configure `hkubectl` to work with your running Kubernetes cluster:

```console
hkubectl config set endpoint ${KUBERNETES-MASTER-IP}
hkubectl config set rejectUnauthorized false
```

Make sure `kubectl` is configured to your cluster.

HKube requires certain pods to run with privileged security permissions; consult your Kubernetes installation to see how this is done.
## API Usage Example

### The Problem

We want to solve the following problem for a given input and desired output:

- Input: two numbers N, k.
- Desired output: a number M such that:

<div style="text-align:center"><img src="https://latex.codecogs.com/svg.latex?M&space;=&space;\sum_{i=1}^N&space;k\cdot&space;i" title="M = \sum_{i=1}^N k\cdot i" /></div>

For example, N=5, k=2 will result in:

<div style="text-align:center"><img src="https://latex.codecogs.com/svg.latex?2\cdot1+2\cdot&space;2&space;+&space;2\cdot&space;3&space;+&space;2\cdot&space;4&space;+&space;2\cdot&space;5&space;=&space;2&space;+&space;4&space;+6+8+10&space;=&space;30&space;=&space;M" title="2\cdot1+2\cdot 2 + 2\cdot 3 + 2\cdot 4 + 2\cdot 5 = 2 + 4 +6+8+10 = 30 = M" /></div>
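As a quick sanity check of the arithmetic, the sum can be computed directly in plain Python (independent of HKube):

```python
def weighted_sum(n: int, k: int) -> int:
    # M = k*1 + k*2 + ... + k*N
    return sum(k * i for i in range(1, n + 1))

print(weighted_sum(5, 2))  # -> 30
```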
### Solution

We will solve the problem by running a distributed pipeline of three algorithms: Range, Multiply, and Reduce.

#### Range Algorithm

Creates an array of length N.

```console
N = 5
5 -> [1,2,3,4,5]
```

#### Multiply Algorithm

Multiplies each element received from the Range Algorithm by k.

```console
k = 2
[1,2,3,4,5] * (2) -> [2,4,6,8,10]
```

#### Reduce Algorithm

Waits until all instances of the Multiply Algorithm finish, then sums the received data.

```console
[2,4,6,8,10] -> 30
```
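The three steps above can be sketched as plain Python functions. This is only an illustration of the dataflow; a real HKube algorithm wraps such logic in a worker entry point and receives its input from the pipeline, and the function names here are hypothetical:

```python
def range_alg(n):
    # Range: 5 -> [1, 2, 3, 4, 5]
    return list(range(1, n + 1))

def multiply_alg(item, k):
    # Multiply: in HKube this runs as a batch,
    # one instance per element of Range's output
    return item * k

def reduce_alg(items):
    # Reduce: waits for all Multiply results, then sums them
    return sum(items)

batch = [multiply_alg(i, 2) for i in range_alg(5)]
print(reduce_alg(batch))  # -> 30
```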
### Building a Pipeline

We will implement the algorithms in various languages and use HKube to construct a pipeline from them.

#### Pipeline Descriptor

The pipeline descriptor is a JSON object that describes the pipeline and defines the links between the nodes via the dependencies between them.
```json
{
  "name": "numbers",
  "nodes": [
    {
      "nodeName": "Range",
      "algorithmName": "range",
      "input": ["@flowInput.data"]
    },
    {
      "nodeName": "Multiply",
      "algorithmName": "multiply",
      "input": ["#@Range", "@flowInput.mul"]
    },
    {
      "nodeName": "Reduce",
      "algorithmName": "reduce",
      "input": ["@Multiply"]
    }
  ],
  "flowInput": {
    "data": 5,
    "mul": 2
  }
}
```
Note the `flowInput`: `data` = N = 5, `mul` = k = 2.
#### Node Dependencies

HKube supports special signs in node inputs for defining the pipeline execution flow. In our case we used:

- `@` - references an input parameter for the algorithm.
- `#` - executes nodes in parallel and reduces the results into a single node.
- `#@` - by combining `#` and `@` we create batch processing over a node's results.

#### JSON Breakdown

We created a pipeline named `numbers`:

```json
"name": "numbers"
```
The pipeline is defined by three nodes:

```json
"nodes": [
  {
    "nodeName": "Range",
    "algorithmName": "range",
    "input": ["@flowInput.data"]
  },
  {
    "nodeName": "Multiply",
    "algorithmName": "multiply",
    "input": ["#@Range", "@flowInput.mul"]
  },
  {
    "nodeName": "Reduce",
    "algorithmName": "reduce",
    "input": ["@Multiply"]
  }
]
```
In HKube, the linkage between nodes is done by defining the algorithm inputs: Multiply will run after the Range algorithm because of the input dependency between them.
Keep in mind that HKube transports results between nodes automatically. To do this, HKube currently supports two types of transportation layers: object storage and file system.
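Once stored, a pipeline like `numbers` can be triggered over the REST API. Below is a minimal sketch using only the Python standard library; the base URL and the `/exec/stored` route are assumptions based on a typical deployment, so check the Swagger-UI of your cluster for the exact paths:

```python
import json
import urllib.request

# Assumption: default ingress path of the api-server; adjust for your cluster.
API_BASE = "http://{yourDomain}/hkube/api-server/api/v1"

def build_exec_body(name: str, flow_input: dict) -> bytes:
    # Request body for running a stored pipeline with a given flowInput.
    return json.dumps({"name": name, "flowInput": flow_input}).encode()

def run_stored_pipeline(base: str, name: str, flow_input: dict) -> dict:
    req = urllib.request.Request(
        base + "/exec/stored",
        data=build_exec_body(name, flow_input),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The response typically identifies the execution (e.g. a jobId).
        return json.load(resp)

# Example (requires a running cluster):
# run_stored_pipeline(API_BASE, "numbers", {"data": 5, "mul": 2})
```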

