HeavyDB (formerly OmniSciDB)
HeavyDB is an open source SQL-based, relational, columnar database engine that leverages the full performance and parallelism of modern hardware (both CPUs and GPUs) to enable querying of multi-billion row datasets in milliseconds, without the need for indexing, pre-aggregation, or downsampling. HeavyDB can be run on hybrid CPU/GPU systems (Nvidia GPUs are currently supported), as well as on CPU-only systems featuring X86, Power, and ARM (experimental support) architectures. To achieve maximum performance, HeavyDB features multi-tiered caching of data between storage, CPU memory, and GPU memory, and an innovative Just-In-Time (JIT) query compilation framework.
For usage info, see the product documentation, and for more details about the system's internal architecture, check out the developer documentation. Further technical discussion can be found on the HEAVY.AI Community Forum.
The repository includes a number of third-party packages provided under separate licenses. Details about these packages and their respective licenses are in ThirdParty/licenses/index.md.
Downloads and Installation Instructions
HEAVY.AI provides pre-built binaries for Linux for stable releases of the project:
| Distro | Package type | CPU/GPU | Repository | Docs |
| --- | --- | --- | --- | --- |
| CentOS | RPM | CPU | https://releases.heavy.ai/os/yum/stable/cpu | https://docs.heavy.ai/installation-and-configuration/installation/installing-on-centos/centos-yum-gpu-ee |
| CentOS | RPM | GPU | https://releases.heavy.ai/os/yum/stable/cuda | https://docs.heavy.ai/installation-and-configuration/installation/installing-on-centos/centos-yum-gpu-ee |
| Ubuntu | DEB | CPU | https://releases.heavy.ai/os/apt/dists/stable/cpu | https://docs.heavy.ai/installation-and-configuration/installation/installing-on-ubuntu/centos-yum-gpu-ee |
| Ubuntu | DEB | GPU | https://releases.heavy.ai/os/apt/dists/stable/cuda | https://docs.heavy.ai/installation-and-configuration/installation/installing-on-ubuntu/centos-yum-gpu-ee |
| * | tarball | CPU | https://releases.heavy.ai/os/tar/heavyai-os-latest-Linux-x86_64-cpu.tar.gz | |
| * | tarball | GPU | https://releases.heavy.ai/os/tar/heavyai-os-latest-Linux-x86_64.tar.gz | |
Developing HeavyDB: Table of Contents
Links
- Developer Documentation
- Doxygen-generated Documentation
- Product Documentation
- Release Notes
- Community Forum
- HEAVY.AI Homepage
- HEAVY.AI Blog
- HEAVY.AI Downloads
License
This project is licensed under the Apache License, Version 2.0.
The repository includes a number of third-party packages provided under separate licenses. Details about these packages and their respective licenses are in ThirdParty/licenses/index.md.
Contributing
In order to clarify the intellectual property license granted with Contributions from any person or entity, HEAVY.AI must have a Contributor License Agreement ("CLA") on file that has been signed by each Contributor, indicating agreement to the Contributor License Agreement. After making a pull request, a bot will notify you if a signed CLA is required and provide instructions for how to sign it. Please read the agreement carefully before signing and keep a copy for your records.
Building
If this is your first time building HeavyDB, install the dependencies mentioned in the Dependencies section below.
HeavyDB uses CMake for its build system.
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=debug ..
make -j 4
The following cmake/ccmake options can enable/disable different features:
- -DCMAKE_BUILD_TYPE=release: Build type and compiler options to use. Options are Debug, Release, RelWithDebInfo, MinSizeRel, and unset.
- -DENABLE_ASAN=off: Enable address sanitizer. Default is off.
- -DENABLE_AWS_S3=on: Enable AWS S3 support, if available. Default is on.
- -DENABLE_CUDA=off: Disable CUDA. Default is on.
- -DENABLE_CUDA_KERNEL_DEBUG=off: Enable debugging symbols for CUDA kernels. Will dramatically reduce kernel performance. Default is off.
- -DENABLE_DECODERS_BOUNDS_CHECKING=off: Enable bounds checking for column decoding. Default is off.
- -DENABLE_IWYU=off: Enable include-what-you-use. Default is off.
- -DENABLE_JIT_DEBUG=off: Enable debugging symbols for the JIT. Default is off.
- -DENABLE_ONLY_ONE_ARCH=off: Compile GPU code only for the host machine's architecture, speeding up compilation. Default is off.
- -DENABLE_PROFILER=off: Enable google perftools. Default is off.
- -DENABLE_STANDALONE_CALCITE=off: Require standalone Calcite server. Default is off.
- -DENABLE_TESTS=on: Build unit tests. Default is on.
- -DENABLE_TSAN=off: Enable thread sanitizer. Default is off.
- -DENABLE_CODE_COVERAGE=off: Enable code coverage symbols (clang only). Default is off.
- -DPREFER_STATIC_LIBS=off: Static link dependencies, if available. Default is off. Only works on CentOS.
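Several of these options are usually combined in one configure step. The sketch below assembles a CPU-only release configuration and prints the resulting command for review. The option names come from the list above; the particular combination and the variable names CMAKE_FLAGS and CONFIGURE_CMD are illustrative assumptions, not a recommended configuration.

```shell
# Illustrative combination: optimized build, no GPU code, unit tests still built.
# Option names are from the list above; the variable names are hypothetical.
CMAKE_FLAGS="-DCMAKE_BUILD_TYPE=release -DENABLE_CUDA=off -DENABLE_TESTS=on"

# Assemble the full configure command and print it so it can be checked first.
CONFIGURE_CMD="cmake $CMAKE_FLAGS .."
echo "$CONFIGURE_CMD"

# To configure for real, run the printed command from inside the build directory.
```

Printing the command before running it makes it easy to confirm which features a build directory was configured with.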
Testing
HeavyDB uses Google Test as its main testing framework. Tests reside under the Tests directory.
The sanity_tests target runs the most common tests. If using Makefiles to build, the tests may be run using:
make sanity_tests
AddressSanitizer
AddressSanitizer can be activated by setting the ENABLE_ASAN CMake flag in a fresh build directory. At this time CUDA must also be disabled. In an empty build directory run CMake and compile:
mkdir build && cd build
cmake -DENABLE_ASAN=on -DENABLE_CUDA=off ..
make -j 4
Finally run the tests:
export ASAN_OPTIONS=alloc_dealloc_mismatch=0:handle_segv=0
make sanity_tests
ThreadSanitizer
ThreadSanitizer can be activated by setting the ENABLE_TSAN CMake flag in a fresh build directory. At this time CUDA must also be disabled. In an empty build directory run CMake and compile:
mkdir build && cd build
cmake -DENABLE_TSAN=on -DENABLE_CUDA=off ..
make -j 4
We use a TSAN suppressions file to ignore warnings in third party libraries. Source the suppressions file by adding it to your TSAN_OPTIONS env:
export TSAN_OPTIONS="suppressions=/path/to/heavydb/config/tsan.suppressions"
Finally run the tests:
make sanity_tests
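Rather than hard-coding the suppressions path, it can be derived from the location of the checkout. This is a minimal sketch, assuming the build directory is a direct subdirectory of the heavydb repository; HEAVYDB_SRC is a hypothetical variable introduced here for illustration and is not something the build system reads.

```shell
# Assumption: the current directory is the build directory inside the repository,
# so the repository root is its parent. Set HEAVYDB_SRC beforehand to override.
HEAVYDB_SRC="${HEAVYDB_SRC:-$(cd .. && pwd)}"

# Point ThreadSanitizer at the suppressions file shipped in the repository.
export TSAN_OPTIONS="suppressions=${HEAVYDB_SRC}/config/tsan.suppressions"

# Warn if the file is not where this sketch expects it to be.
if [ ! -f "${HEAVYDB_SRC}/config/tsan.suppressions" ]; then
  echo "warning: ${HEAVYDB_SRC}/config/tsan.suppressions not found" >&2
fi
echo "$TSAN_OPTIONS"
```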
Generating Packages
HeavyDB uses CPack to generate packages for distribution. Packages generated on CentOS with static linking enabled can be used on most other recent Linux distributions.
To generate packages on CentOS (starting from the top level of the heavydb repository):
mkdir build-package && cd build-package
cmake -DPREFER_STATIC_LIBS=on -DCMAKE_BUILD_TYPE=release ..
make -j 4
cpack -G TGZ
The first command creates a fresh build directory, to ensure there is nothing left over from a previous build.
The second command configures the build to prefer linking to the dependencies' static libraries instead of the (default) shared libraries, and to build using CMake's release configuration (enables compiler optimizations). Linking to the static versions of the libraries reduces the number of dependencies that must be installed on target systems.
The last command generates a .tar.gz package. The TGZ can be replaced with, for example, RPM or DEB to generate a .rpm or .deb, respectively.
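To produce all three package formats in one pass, the generator names mentioned above can be looped over. This is a sketch only: the package_all function is hypothetical, it assumes it is run from the build directory after make has finished, and it assumes CMake has written a CPackConfig.cmake file there. If cpack or that file is missing, it prints the commands instead of running them. Individual generators may also need extra tools on the host, such as rpmbuild for RPM.

```shell
# Hypothetical helper: run cpack once per generator named in the text above.
package_all() {
  for gen in TGZ RPM DEB; do
    if command -v cpack >/dev/null 2>&1 && [ -f CPackConfig.cmake ]; then
      # Keep going if one generator fails, but report it.
      cpack -G "$gen" || echo "cpack -G $gen failed" >&2
    else
      # Dry run when cpack or its generated configuration is not available.
      echo "would run: cpack -G $gen"
    fi
  done
}

package_all
```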
Using
The startheavy wrapper script may be used to start HeavyDB in a testing environment. This script performs the following tasks:
- initializes the data storage directory via initdb, if required
- starts the main HeavyDB server, heavydb
- offers to download and import a sample dataset, using the insert_sample_data script
Assuming you are in the build directory, and it is a subdirectory of the heavydb repository, startheavy may be run by:
../startheavy
Starting Manually
It is assumed that the following commands are run from inside the build directory.
Initialize the data storage directory. This command only needs to be run once.
mkdir data && ./bin/initdb data
Start the HeavyDB server:
./bin/heavydb
If desired, insert a sample dataset by running the insert_sample_data script in a new terminal:
../insert_sample_data
You can now start using the database. The heavysql utility may be used to interact with the database from the command line:
./bin/heavysql -p HyperInteractive
where HyperInteractive is the default password. The default user admin is assumed if not provided.
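heavysql can also be driven non-interactively by piping a statement to it on standard input. The sketch below assumes the server is already running and that a sample table named flights_2008_10k was imported; that table name is an assumption, so substitute any table that exists in your database.

```shell
# Assumption: flights_2008_10k exists; replace it with one of your own tables.
QUERY="SELECT COUNT(*) FROM flights_2008_10k;"

if [ -x ./bin/heavysql ]; then
  # Send the statement on stdin, using the default password from the text above.
  echo "$QUERY" | ./bin/heavysql -p HyperInteractive
else
  echo "heavysql not found; run this from the build directory" >&2
fi
```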
Code Style
Contributed code should compile without generating warnings by recent compilers on most Linux distributions. Changes to the code should follow the C++ Core Guidelines.
clang-format
A .clang-format style configuration, based on the Chromium style guide, is provided at the top level of the repository. Please format your code using a recent version (8.0+ preferred) of ClangFormat before submitting.
To use:
clan
