Vek

SIMD Accelerated vector functions for Go

vek | SIMD Vector Functions

vek is a collection of SIMD accelerated vector functions for Go.

Most modern CPUs have SIMD (Single Instruction, Multiple Data) instructions that process several data elements in parallel, but there is currently no way to use them in a pure Go program. vek implements a large number of common vector operations in SIMD-accelerated assembly code and wraps them in a simple Go API. vek supports most modern x86 CPUs and falls back to a pure Go implementation on unsupported platforms.

Features

Fast, average speedups of 10x for float32 vectors
Fallback to pure Go on unsupported platforms
Support for float64, float32 and bool vectors
Zero allocation variations of each function

Installation

go get -u github.com/viterin/vek

Getting Started

Simple Arithmetic Example

Vectors are represented as plain floating point slices; there are no special data types in vek. All operations on float64 vectors reside in the vek package. It contains all the basic arithmetic operations:

package main

import (
	"fmt"
	"github.com/viterin/vek"
)

func main() {
	x := []float64{0, 1, 2, 3, 4}

	// Multiply a vector by itself element-wise
	y := vek.Mul(x, x)
	fmt.Println(x, y) // [0 1 2 3 4] [0 1 4 9 16]

	// Multiply each element by a number
	y = vek.MulNumber(x, 2)
	fmt.Println(x, y) // [0 1 2 3 4] [0 2 4 6 8]
}
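
For readers who want to check results by hand, the two calls above can be mirrored with plain loops. This is only a scalar sketch of the expected output, not vek's implementation; the names mulScalar and mulNumberScalar are hypothetical, and the sketch assumes both input slices have the same length (how vek itself treats mismatched lengths is not covered here).

```go
package main

import "fmt"

// mulScalar is a hypothetical scalar reference for element-wise
// multiplication. It assumes len(x) == len(y).
func mulScalar(x, y []float64) []float64 {
	out := make([]float64, len(x))
	for i := range x {
		out[i] = x[i] * y[i]
	}
	return out
}

// mulNumberScalar is a hypothetical scalar reference for multiplying
// each element by a single number.
func mulNumberScalar(x []float64, a float64) []float64 {
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = v * a
	}
	return out
}

func main() {
	x := []float64{0, 1, 2, 3, 4}
	fmt.Println(mulScalar(x, x))       // [0 1 4 9 16]
	fmt.Println(mulNumberScalar(x, 2)) // [0 2 4 6 8]
}
```

Both helpers allocate a new slice and leave x untouched, which matches the printed values of x in the example above.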

Working With 32-Bit Vectors

The vek32 package contains float32 versions of each operation:

package main

import (
	"fmt"
	"github.com/viterin/vek/vek32"
)

func main() {
	// Add a float32 number to each element
	x := []float32{0, 1, 2, 3, 4}
	y := vek32.AddNumber(x, 2)

	fmt.Println(x, y) // [0 1 2 3 4] [2 3 4 5 6]
}

Comparisons and Selections

Floating point vectors can be compared to other vectors or numbers. The result is a bool vector indicating where the comparison holds true. bool vectors can be used to select matching elements, count matches and more:

package main

import (
	"fmt"
	"github.com/viterin/vek"
)

func main() {
	x := []float64{0, 1, 2, 3, 4, 5}
	y := []float64{5, 4, 3, 2, 1, 0}

	// []bool indicating where x < y (less than)
	m := vek.Lt(x, y)
	fmt.Println(m)            // [true true true false false false]
	fmt.Println(vek.Count(m)) // 3

	// []bool indicating where x >= 2 (greater than or equal)
	m = vek.GteNumber(x, 2)
	fmt.Println(m)          // [false false true true true true]
	fmt.Println(vek.Any(m)) // true

	// Selection of non-zero elements less than y
	z := vek.Select(x,
		vek.And(
			vek.Lt(x, y),
			vek.NeqNumber(x, 0),
		),
	)
	fmt.Println(z) // [1 2]
}
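
The mask-and-select pattern above may be easier to follow as plain loops. The sketch below reproduces the final selection with hypothetical helpers (lt, neqNumber, and, selectWhere); it illustrates the semantics shown in the example output and is not vek's code. It assumes all slices have equal length.

```go
package main

import "fmt"

// lt returns a mask that is true where x[i] < y[i].
func lt(x, y []float64) []bool {
	m := make([]bool, len(x))
	for i := range x {
		m[i] = x[i] < y[i]
	}
	return m
}

// neqNumber returns a mask that is true where x[i] != a.
func neqNumber(x []float64, a float64) []bool {
	m := make([]bool, len(x))
	for i, v := range x {
		m[i] = v != a
	}
	return m
}

// and combines two masks element-wise.
func and(a, b []bool) []bool {
	m := make([]bool, len(a))
	for i := range a {
		m[i] = a[i] && b[i]
	}
	return m
}

// selectWhere keeps the elements of x whose mask entry is true.
func selectWhere(x []float64, mask []bool) []float64 {
	out := make([]float64, 0, len(x))
	for i, v := range x {
		if mask[i] {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	x := []float64{0, 1, 2, 3, 4, 5}
	y := []float64{5, 4, 3, 2, 1, 0}
	z := selectWhere(x, and(lt(x, y), neqNumber(x, 0)))
	fmt.Println(z) // [1 2]
}
```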

Creating and Converting Vectors

vek has a number of functions to construct new vectors and convert between vector types efficiently:

package main

import (
	"fmt"
	"github.com/viterin/vek"
	"github.com/viterin/vek/vek32"
)

func main() {
	// Vector with number repeated n times
	x := vek.Repeat(2, 5)
	fmt.Println(x) // [2 2 2 2 2]

	// Vector ranging from a to b (excl.) in steps of 1
	x = vek.Range(-2, 3)
	fmt.Println(x) // [-2 -1 0 1 2]

	// Conversion from float64 to int32
	xi32 := vek.ToInt32(x)
	fmt.Println(xi32) // [-2 -1 0 1 2]

	// Conversion from int32 to float32
	x32 := vek32.FromInt32(xi32)
	fmt.Println(x32) // [-2 -1 0 1 2]
}

Avoiding Allocations

By default, functions allocate a new slice to store the result. Append _Inplace to a function name to perform the operation in place, overwriting the data of the first argument slice with the result. Append _Into to write the result into a target slice.

package main

import (
	"fmt"
	"github.com/viterin/vek"
)

func main() {
	x := []float64{0, 1, 2, 3, 4}
	vek.AddNumber_Inplace(x, 2)

	y := make([]float64, len(x))
	vek.AddNumber_Into(y, x, 2)

	fmt.Println(x, y) // [2 3 4 5 6] [4 5 6 7 8]
}
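
The relationship between the three variants can be sketched in plain Go. The helper names below (addNumber, addNumberInplace, addNumberInto) are hypothetical stand-ins that only mimic the argument order used above; the sketch assumes the target slice is at least as long as the input and says nothing about how vek handles other cases.

```go
package main

import "fmt"

// addNumberInto writes x[i] + a into dst and allocates nothing.
// Assumes len(dst) >= len(x).
func addNumberInto(dst, x []float64, a float64) []float64 {
	for i, v := range x {
		dst[i] = v + a
	}
	return dst[:len(x)]
}

// addNumberInplace overwrites x with the result.
func addNumberInplace(x []float64, a float64) {
	addNumberInto(x, x, a)
}

// addNumber allocates a fresh slice for the result.
func addNumber(x []float64, a float64) []float64 {
	return addNumberInto(make([]float64, len(x)), x, a)
}

func main() {
	x := []float64{0, 1, 2, 3, 4}
	addNumberInplace(x, 2)

	y := make([]float64, len(x))
	addNumberInto(y, x, 2)

	fmt.Println(x, y) // [2 3 4 5 6] [4 5 6 7 8]
}
```

Writing into a caller-provided slice is what makes it possible to reuse one buffer across many calls, for example inside a hot loop.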

SIMD Acceleration

SIMD acceleration is enabled by default on supported platforms: any x86/amd64 CPU with the AVX2 and FMA extensions. Use vek.Info() to check whether hardware acceleration is enabled, and vek.SetAcceleration() to turn it off or on. Acceleration is currently disabled by default on macOS, as I have no machine to test it on.

package main

import (
	"fmt"
	"github.com/viterin/vek"
)

func main() {
	fmt.Printf("%+v", vek.Info())
	// {CPUArchitecture:amd64 CPUFeatures:[AVX2 FMA ..] Acceleration:true}
}
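
As a rough mental model of an on/off acceleration switch, the sketch below dispatches between two implementations of the same function based on a flag. Everything in it is an assumption made for illustration: the accelerated variable, the sumUnrolled stand-in (ordinary Go, not assembly), and the check on runtime.GOARCH alone are not taken from vek, which also looks at CPU features such as AVX2 and FMA.

```go
package main

import (
	"fmt"
	"runtime"
)

// accelerated is a hypothetical switch; here it only checks the
// architecture, not individual CPU features.
var accelerated = runtime.GOARCH == "amd64"

// sumPlain is the straightforward fallback.
func sumPlain(x []float64) float64 {
	var s float64
	for _, v := range x {
		s += v
	}
	return s
}

// sumUnrolled stands in for an accelerated kernel: it keeps four
// independent partial sums, similar in spirit to SIMD lanes.
func sumUnrolled(x []float64) float64 {
	var s0, s1, s2, s3 float64
	i := 0
	for ; i+4 <= len(x); i += 4 {
		s0 += x[i]
		s1 += x[i+1]
		s2 += x[i+2]
		s3 += x[i+3]
	}
	s := (s0 + s1) + (s2 + s3)
	for ; i < len(x); i++ {
		s += x[i] // leftover elements
	}
	return s
}

// sum picks an implementation at call time.
func sum(x []float64) float64 {
	if accelerated {
		return sumUnrolled(x)
	}
	return sumPlain(x)
}

func main() {
	x := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	fmt.Println(accelerated, sum(x))
}
```

Because the two paths add elements in a different order, results on general floating point data can differ in the last bits; for the small integers used here both paths give the same value.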

API

| function                        |                                   description |
|:--------------------------------|----------------------------------------------:|
| Arithmetic                      |                                               |
| vek.Add(x, y)                   |                         element-wise addition |
| vek.AddNumber(x, a)             |                    add number to each element |
| vek.Sub(x, y)                   |                      element-wise subtraction |
| vek.SubNumber(x, a)             |             subtract number from each element |
| vek.Mul(x, y)                   |                   element-wise multiplication |
| vek.MulNumber(x, a)             |               multiply each element by number |
| vek.Div(x, y)                   |                         element-wise division |
| vek.DivNumber(x, a)             |                 divide each element by number |
| vek.Abs(x)                      |                               absolute values |
| vek.Neg(x)                      |                             additive inverses |
| vek.Inv(x)                      |                       multiplicative inverses |
| Aggregates                      |                                               |
| vek.Sum(x)                      |                               sum of elements |
| vek.CumSum(x)                   |                                cumulative sum |
| vek.Prod(x)                     |                           product of elements |
| vek.CumProd(x)                  |                            cumulative product |
| vek.Mean(x)                     |                                          mean |
| vek.Median(x)                   |                                        median |
| vek.Quantile(x, q)              |                    q-th quantile, 0 <= q <= 1 |
| Distance                        |                                               |
| vek.Dot(x, y)                   |                                   dot product |
| vek.Norm(x)                     |                       euclidean norm (length) |
| vek.Distance(x, y)              |                            euclidean distance |
| vek.ManhattanNorm(x)            |                        sum of absolute values |
| vek.ManhattanDistance(x, y)     |                   sum of absolute differences |
| vek.CosineSimilarity(x, y)      |                             cosine similarity |
| Matrices                        |                                               |
| vek.MatMul(x, y, n)             | multiply m-by-n and n-by-p matrix (row-major) |
| vek.Mat4Mul(x, y)               |            specialization for 4 by 4 matrices |
| Special                         |                                               |
| vek.Sqrt(x)                     |                   square root of each element |
| vek.Pow(x, y)                   |                            element-wise power |
| vek.Round(x), Floor(x), Ceil(x) |   round to nearest, lesser or greater integer |
| Special (32-bit only)           |                                               |
| vek32.Sin(x)                    |                          sine of each element |
| vek32.Cos(x)                    |                        cosine of each element |
| vek32.Exp(x)                    |                          exponential function |
| vek32.Log(x), Log2(x), Log10(x) |        natural, base 2 and base 10 logarithms |
| Comparison                      |                                               |
| vek.Min(x)                      |                                 minimum value |
| vek.ArgMin(x)                   |              first index of the minimum value |
| vek.Minimum(x, y)               |                   element-wise minimum values |
| vek.MinimumNumber(x, a)         |            minimum of each element and number |
| vek.Max(x)                      |                                 maximum value |
| vek.ArgMax(x)                   |              first index of the maximum value |
| vek.Maximum(x, y)               |                   element-wise maximum values |
| vek.MaximumNumber(x, a)         |            maximum of each element and number |
| vek.Find(x, a)                  |        first index of number, -1 if not found |
| vek.Lt(x, y)                    |                        element-wise less than |
| vek.LtNumber(x, a)              |                              less than number |
| vek.Lte(x, y)                   |               element-wise less than or equal |
| vek.LteNumber(x, a)             |                  less than or equal to number |
| vek.Gt(x, y)                    |                     element-wise greater than |
| vek.GtNumber(x, a)              |                           greater than number |
| vek.Gte(x, y)                   |            element-wise greater than or equal |
| vek.GteNumber(x, a)             |               greater than or equal to number |
| vek.Eq(x, y)                    |                         element-wise equality |
| vek.EqNumber(x, a)              |                               equal to number |
| vek.Neq(x, y)                   |                     element-wise non-equality |
| vek.NeqNumber(x, a)             |                           not equal to number |
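
The MatMul row in the table is terse, so here is a scalar sketch of what multiplying a row-major m-by-n matrix with a row-major n-by-p matrix looks like when both are stored as flat slices and n is the shared inner dimension. The function name matMulScalar is hypothetical, and deriving m and p from the slice lengths is an assumption of this sketch rather than a statement about vek's internals.

```go
package main

import "fmt"

// matMulScalar multiplies a row-major m-by-n matrix x with a row-major
// n-by-p matrix y. m and p are derived from the slice lengths, which is
// an assumption of this sketch.
func matMulScalar(x, y []float64, n int) []float64 {
	m := len(x) / n
	p := len(y) / n
	out := make([]float64, m*p)
	for i := 0; i < m; i++ {
		for k := 0; k < n; k++ {
			a := x[i*n+k] // element (i, k) of x
			for j := 0; j < p; j++ {
				// element (i, j) of out accumulates a * y(k, j)
				out[i*p+j] += a * y[k*p+j]
			}
		}
	}
	return out
}

func main() {
	// x is 2-by-3, y is 3-by-2, so n = 3.
	x := []float64{
		1, 2, 3,
		4, 5, 6,
	}
	y := []float64{
		7, 8,
		9, 10,
		11, 12,
	}
	fmt.Println(matMulScalar(x, y, 3)) // [58 64 139 154]
}
```

The output is again a flat row-major slice, here a 2-by-2 matrix with rows [58 64] and [139 154].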
