Volcano
A Cloud Native Batch System (Project under CNCF)
Volcano is a Kubernetes-native batch scheduling system, extending and enhancing the capabilities of the standard kube-scheduler. It provides a comprehensive set of features specifically designed to manage and optimize various batch and elastic workloads, including Artificial Intelligence (AI) / machine learning (ML) / deep learning (DL), bioinformatics / genomics, and other "Big Data" applications.
These workloads commonly leverage AI, Big Data, and HPC frameworks such as Spark, Flink, Ray, TensorFlow, PyTorch, Argo, MindSpore, PaddlePaddle, Kubeflow, MPI, Horovod, MXNet, KubeGene, and others, with which Volcano offers robust integration.
Volcano incorporates over fifteen years of collective experience in operating diverse high-performance workloads at scale across multiple systems and platforms. It combines proven best practices and innovative concepts from the open-source community to deliver a powerful and flexible scheduling solution.
As of 2025, Volcano has seen widespread adoption across numerous industries globally, including Internet/Cloud, Finance, Manufacturing, and Medical sectors. Many organizations and institutions are not only end-users but also active contributors to the project. Hundreds of contributors actively participate in code commits, pull request reviews, issue discussions, documentation updates, and design proposals. We encourage your participation in the ongoing development and growth of the Volcano project.
[!NOTE] The scheduler is built on kube-batch; refer to #241 and #288 for more details.

Volcano is an incubating project of the Cloud Native Computing Foundation (CNCF). Please consider joining the CNCF if you are an organization that wants to take an active role in supporting the growth and evolution of the cloud native ecosystem.
Overall Architecture

Talks
- Intro: Kubernetes Batch Scheduling @ KubeCon 2019 EU
- Volcano: Practices for Running High-Performance Jobs on Kubernetes @ ArchSummit 2019
- Volcano: A Cloud Native High-Density Computing Solution @ Huawei Connection 2019
- Improving Performance of Deep Learning Workloads With Volcano @ KubeCon 2019 NA
- Batch Capability of Kubernetes Intro @ KubeCon 2019 NA
- Optimizing Knowledge Distillation Training With Volcano @ KubeCon 2021 EU
- Exploration About Mixing Technology of Online Services and Offline Jobs Based On Volcano @ KubeCon 2021 China
- Volcano - Cloud Native Batch System for AI, Big Data and HPC @ KubeCon 2022 EU
- How to Leverage Volcano to Improve the Resource Utilization of AI Pharmaceuticals, Autonomous Driving, and Smart Buildings @ KubeCon 2023 EU
- Run Your AI Workloads and Microservices on Kubernetes More Easily and Efficiently @ KubeCon 2023 China
- Optimize LLM Workflows with Smart Infrastructure Enhanced by Volcano @ KubeCon 2024 China
- How Volcano Enable Next Wave of Intelligent Applications @ KubeCon 2024 China
- Leverage Topology Modeling and Topology-Aware Scheduling to Accelerate LLM Training @ KubeCon 2024 China
Ecosystem
- Spark Operator
- Native Spark
- Flink
- KubeRay
- PyTorch
- TensorFlow
- kubeflow/training-operator
- kubeflow/arena
- MPI
- Horovod
- PaddlePaddle
- Cromwell
- MindSpore
- MXNet
- Argo
- KubeGene
Use Cases
- Why Spark chooses Volcano as built-in batch scheduler on Kubernetes?
- ING Bank: How Volcano empowers its big data analytics platform
- Using Volcano as a custom scheduler for Apache Spark on Amazon EMR on EKS
- Deploy Azure Machine Learning extension on AKS or Arc Kubernetes cluster
- Practical Tips for Preventing GPU Fragmentation for Volcano Scheduler
- Using Volcano in Large-Scale, Distributed Offline Computing
- OpenI-Octopus: How to Avoid Resource Preemption in Kubernetes Clusters
- How Does Volcano Empower a Content Recommendation Engine in Xiaohongshu
- How Ruitian Used Volcano to Run Large-Scale Offline HPC Jobs
- Integrating Volcano into the Leinao Cloud OS
- HPC on Volcano: How Containers Support HPC Applications in the Meteorological Industry
- iQIYI: Volcano-based Cloud Native Migration Practices
- PaddlePaddle Distributed Training on Volcano
Quick Start Guide
Prerequisites
- Kubernetes 1.12+ with CRD support
You can try Volcano in one of the following two ways.
[!NOTE]
- For Kubernetes v1.17 and above, use CRDs under config/crd/bases (recommended)
- For Kubernetes v1.16 and below, use CRDs under config/crd/v1beta1 (deprecated)
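The version rule in the note above can be sketched as a small shell helper. This is only an illustration: `crd_dir_for_minor` is a hypothetical name that is not part of Volcano, and it assumes a Kubernetes 1.x cluster whose minor version you have already looked up.

```shell
# Hypothetical helper (not shipped with Volcano): map a Kubernetes 1.x minor
# version to the CRD directory named in the note above.
crd_dir_for_minor() {
  # Keep digits only, so a value such as "21+" is treated as 21.
  minor=$(printf '%s' "$1" | tr -cd '0-9')
  if [ -z "$minor" ]; then
    echo "usage: crd_dir_for_minor <minor-version>" >&2
    return 1
  fi
  if [ "$minor" -ge 17 ]; then
    echo "config/crd/bases"      # v1.17 and above (recommended)
  else
    echo "config/crd/v1beta1"    # v1.16 and below (deprecated)
  fi
}
```

For example, `crd_dir_for_minor 18` prints `config/crd/bases`. The minor version has to come from your own cluster (for instance from the output of `kubectl version`); extracting it is left out of this sketch.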
Install with YAML files
Install Volcano on an existing Kubernetes cluster. This method works on both x86_64 and arm64 architectures.
kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/master/installer/volcano-development.yaml
Enjoy! Volcano will create the following resources in the volcano-system namespace.
NAME                                       READY   STATUS    RESTARTS   AGE
pod/volcano-admission-5bd5756f79-dnr4l     1/1     Running   0          96s
pod/volcano-controllers-687948d9c8-nw4b4   1/1     Running   0          96s
pod/volcano-scheduler-94998fc64-4z8kh      1/1     Running   0          96s

NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/volcano-admission-service   ClusterIP   10.98.152.108   <none>        443/TCP   96s

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/volcano-admission     1/1     1            1           96s
deployment.apps/volcano-controllers   1/1     1            1           96s
deployment.apps/volcano-scheduler     1/1     1            1           96s
NAME DESIRED CURRENT READY AGE
re
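To check an installation against a listing like the one above, a sketch along the following lines may help. It assumes the namespace and the three deployment names shown in that listing; the function name `wait_for_volcano` and the 120-second timeout are illustrative choices, not something Volcano defines.

```shell
# Namespace and deployment names are taken from the listing above.
VOLCANO_NS="volcano-system"
VOLCANO_DEPLOYMENTS="volcano-admission volcano-controllers volcano-scheduler"

# Hypothetical helper: wait for each deployment to finish rolling out,
# then list everything in the namespace.
wait_for_volcano() {
  for d in $VOLCANO_DEPLOYMENTS; do
    # The timeout value is an arbitrary choice for this sketch.
    kubectl rollout status "deployment/$d" -n "$VOLCANO_NS" --timeout=120s || return 1
  done
  kubectl get all -n "$VOLCANO_NS"
}
```

Calling `wait_for_volcano` after the `kubectl apply` step returns a non-zero status if any deployment does not become ready within the timeout.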
