Float

80-bit IEEE 754 extended double precision floating-point library for Go

The float package is a software implementation of floating-point arithmetic that conforms to the 80-bit IEEE 754 extended double precision floating-point format.

This package is derived from the original SoftFloat package and was implemented as a basis for a Motorola M68881/M68882 FPU emulation in pure Go.

Installation

go get github.com/jenska/float@v1.0.0

Requirements

Go 1.22 or later

Development

This project includes a Makefile for common development tasks:

# Show all available commands
make help

# Development workflow (format, vet, test)
make dev

# Run tests with coverage report
make coverage

# Run benchmarks
make bench

# Clean build artifacts
make clean

Available Make Targets

make all - Run fmt, vet, and test
make build - Verify the project compiles
make test - Run all tests
make bench - Run benchmarks
make coverage - Generate coverage report
make fmt - Format code
make vet - Run go vet
make clean - Clean artifacts
make dev - Development workflow
make ci - CI workflow

CI/CD

This project uses GitHub Actions for continuous integration and deployment:

Workflows

CI (.github/workflows/ci.yml): Runs on every push and PR
- Tests on multiple Go versions (1.21, 1.22, 1.23)
- Tests on multiple platforms (Linux, macOS, Windows)
- Runs linting and static analysis
- Generates and uploads coverage reports
- Validates builds
Release (.github/workflows/release.yml): Runs on version tags
- Creates GitHub releases
- Generates release artifacts
- Publishes coverage reports
CodeQL (.github/workflows/codeql.yml): Security analysis
- Runs weekly and on pushes/PRs
- Performs security and quality analysis
Dependabot (.github/dependabot.yml): Automated dependency updates
- Weekly Go module updates
- Weekly GitHub Actions updates

Status Badges

Add these badges to your README:

[![CI](https://github.com/jenska/float/actions/workflows/ci.yml/badge.svg)](https://github.com/jenska/float/actions/workflows/ci.yml)
[![Go Report Card](https://goreportcard.com/badge/github.com/jenska/float)](https://goreportcard.com/report/github.com/jenska/float)
[![codecov](https://codecov.io/gh/jenska/float/branch/main/graph/badge.svg)](https://codecov.io/gh/jenska/float)
[![Go Reference](https://pkg.go.dev/badge/github.com/jenska/float.svg)](https://pkg.go.dev/github.com/jenska/float)

package main

import (
    "fmt"
    "github.com/jenska/float"
)

func main() {
    // Create extended precision values
    a := float.X80Pi
    b := float.NewFromFloat64(2.0)

    // Perform calculations with higher precision
    result := a.Mul(b)
    fmt.Printf("2π = %s\n", result.String())

    // Use in mathematical computations
    sqrt2 := float.X80Sqrt2
    computation := sqrt2.Mul(sqrt2).Sub(float.X80One)
    fmt.Printf("sqrt(2)² - 1 = %s\n", computation.String())
}

Features

Full IEEE 754 Compliance: Proper handling of 80-bit extended precision
Complete Arithmetic Operations: Add, Sub, Mul, Div, Rem, Sqrt, Ln, Atan, Sin, Cos, Tan
Type Conversions: To/from int32, int64, float32, float64
String Formatting: Binary, decimal, and hexadecimal representations
Exception Handling: IEEE 754 exception flags with customizable handlers
High Performance: Optimized bit-level operations
Thread Safe: Safe for concurrent use (with proper exception handling)

Example

package float_test

import (
    "fmt"
    "github.com/jenska/float"
)

func ExampleX80() {
    pi := float.X80Pi
    pi2 := pi.Add(pi)
    sqrtpi2 := pi2.Sqrt()
    epsilon := sqrtpi2.Mul(sqrtpi2).Sub(pi2)
    fmt.Println(epsilon)
    // Output: -0.000000000000000000433680868994
}

func ExampleSetExceptionHandler() {
    // Set up exception handling
    float.SetExceptionHandler(func(exc int) {
        fmt.Printf("Exception raised: %x\n", exc)
    })

    // This will raise an exception
    result := float.X80Zero.Ln()
    fmt.Printf("Result: %v\n", result)

    // Check what exceptions occurred
    if float.HasException(float.ExceptionDivbyzero) {
        fmt.Println("Division by zero occurred")
    }

    // Clear exceptions
    float.ClearExceptions()
}

API Reference

Types

X80

The main type representing an 80-bit extended precision floating-point number.

type X80 struct {
    high uint16  // Sign (1 bit) + Exponent (15 bits)
    low  uint64  // Integer bit (1 bit) + Fraction (63 bits)
}
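The field comments above can be turned into a small decoder. This is a minimal sketch, not the package's code: `x80Fields` and `decodeX80` are hypothetical names, and the exponent bias of 16383 comes from the IEEE 754 extended format rather than from this README. It takes the two words as plain arguments because the struct fields are unexported.

```go
package main

import "fmt"

// x80Fields is a hypothetical helper type for this sketch; it is not part of
// the float package API.
type x80Fields struct {
	Negative   bool
	Exponent   int    // unbiased exponent (bias 16383)
	IntegerBit bool   // explicit leading bit of the significand
	Fraction   uint64 // remaining 63 bits of the significand
}

// decodeX80 splits the 16-bit and 64-bit words into their fields.
// Biased exponents 0 and 0x7FFF (zero, denormals, infinities, NaNs)
// are not given special treatment here.
func decodeX80(high uint16, low uint64) x80Fields {
	return x80Fields{
		Negative:   high&0x8000 != 0,
		Exponent:   int(high&0x7FFF) - 16383,
		IntegerBit: low>>63 == 1,
		Fraction:   low & 0x7FFFFFFFFFFFFFFF,
	}
}

func main() {
	// 1.0: sign 0, biased exponent 0x3FFF, integer bit set, zero fraction.
	fmt.Printf("%+v\n", decodeX80(0x3FFF, 0x8000000000000000))
	// -1.5: sign 1, same exponent, top fraction bit set.
	fmt.Printf("%+v\n", decodeX80(0xBFFF, 0xC000000000000000))
}
```

Unlike float32 and float64, the integer bit is stored explicitly, which is why the low word of 1.0 is 0x8000000000000000 rather than zero.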

Constants

Predefined Values

X80Zero - Zero
X80One - One
X80MinusOne - Negative one
X80Pi - π (3.1415926535897932384626433832795...)
X80E - e (2.7182818284590452353602874713526...)
X80Ln2 - ln(2)
X80Log2E - log₂(e)
X80Sqrt2 - √2
X80InfPos - Positive infinity
X80InfNeg - Negative infinity
X80NaN - Not a number

Exception Flags

ExceptionInvalid - Invalid operation
ExceptionDenormal - Denormalized number
ExceptionDivbyzero - Division by zero
ExceptionOverflow - Result too large
ExceptionUnderflow - Result too small
ExceptionInexact - Inexact result

Rounding Modes

RoundNearestEven - Round to nearest, ties to even
RoundToZero - Round toward zero
RoundDown - Round toward negative infinity
RoundUp - Round toward positive infinity
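
To see how the four modes differ, the sketch below uses the standard library's `math/big` as a stand-in; it does not call this package, and how a rounding mode is selected here is not shown in this README. It assumes the correspondence RoundNearestEven ↔ `big.ToNearestEven`, RoundToZero ↔ `big.ToZero`, RoundDown ↔ `big.ToNegativeInf`, RoundUp ↔ `big.ToPositiveInf`. `allModes` is a hypothetical helper, and the 3-bit significand is chosen only to make the rounding visible.

```go
package main

import (
	"fmt"
	"math/big"
)

// allModes returns x rounded to a 3-bit significand under each of the four
// modes, in the order nearest-even, toward zero, down, up.
func allModes(x float64) [4]float64 {
	modes := [4]big.RoundingMode{
		big.ToNearestEven,
		big.ToZero,
		big.ToNegativeInf,
		big.ToPositiveInf,
	}
	var out [4]float64
	for i, m := range modes {
		f := new(big.Float).SetPrec(3).SetMode(m)
		f.SetFloat64(x) // rounds to 3 bits using mode m
		out[i], _ = f.Float64()
	}
	return out
}

func main() {
	// 5.5 lies exactly halfway between 5 (101b) and 6 (110b).
	fmt.Println(allModes(5.5))  // [6 5 5 6]
	fmt.Println(allModes(-5.5)) // [-6 -5 -6 -5]
}
```

The negative input is what separates RoundToZero from RoundDown: both give 5 for 5.5, but -5 and -6 for -5.5.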

Methods

Arithmetic Operations

Add(b X80) X80 - Addition
Sub(b X80) X80 - Subtraction
Mul(b X80) X80 - Multiplication
Div(b X80) X80 - Division
Rem(b X80) X80 - Remainder
Sqrt() X80 - Square root
Ln() X80 - Natural logarithm
Atan() X80 - Arctangent
Sin() X80 - Sine
Cos() X80 - Cosine
Tan() X80 - Tangent

Comparison Operations

Eq(b X80) bool - Equal
Lt(b X80) bool - Less than
Le(b X80) bool - Less than or equal
Gt(b X80) bool - Greater than
Ge(b X80) bool - Greater than or equal

Conversion Operations

ToInt32() int32 - Convert to 32-bit integer
ToInt32RoundZero() int32 - Convert to 32-bit integer with round-toward-zero semantics
ToInt64() int64 - Convert to 64-bit integer
ToInt64RoundZero() int64 - Convert to 64-bit integer with round-toward-zero semantics
ToFloat32() float32 - Convert to 32-bit float
ToFloat64() float64 - Convert to 64-bit float
String() string - Convert to decimal string
Format(fmt byte, prec int) string - Formatted string
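
The difference between the plain and `RoundZero` integer conversions can be illustrated with float64 stand-ins. This sketch assumes that `ToInt32` and `ToInt64` follow the active rounding mode (nearest-even by default), as in SoftFloat; the README does not state this. The helper names are hypothetical, and out-of-range inputs are not handled.

```go
package main

import (
	"fmt"
	"math"
)

// toInt32Nearest mimics a conversion under round-to-nearest-even.
func toInt32Nearest(x float64) int32 { return int32(math.RoundToEven(x)) }

// toInt32RoundZero mimics a conversion that always truncates toward zero.
func toInt32RoundZero(x float64) int32 { return int32(math.Trunc(x)) }

func main() {
	for _, x := range []float64{2.5, 3.5, -2.7} {
		fmt.Printf("%5.1f  nearest-even=%d  toward-zero=%d\n",
			x, toInt32Nearest(x), toInt32RoundZero(x))
	}
}
```

Truncation matches what Go's own `int32(f)` conversion does, so the `RoundZero` variants are the natural choice when emulating that behavior.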

Utility Methods

IsNaN() bool - Check if NaN
IsInf() bool - Check if infinity
IsSignalingNaN() bool - Check if signaling NaN

Functions

Creation Functions

NewFromFloat64(f float64) X80 - Create from float64
NewFromBytes(b []byte, order binary.ByteOrder) X80 - Create from bytes
Int32ToFloatX80(i int32) X80 - Create from int32
Int64ToFloatX80(i int64) X80 - Create from int64
Float32ToFloatX80(f float32) X80 - Create from float32
Float64ToFloatX80(f float64) X80 - Create from float64
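
Widening a float64 to the 80-bit layout is exact, because both the exponent range and the significand grow. The following is a minimal sketch of that step for normal numbers only; `encodeNormal` is a hypothetical helper, not the package's implementation, and it returns the two raw words instead of an `X80`.

```go
package main

import (
	"fmt"
	"math"
)

// encodeNormal converts a normal float64 into the sign/exponent word and the
// 64-bit significand word of the extended format. It reports ok=false for
// zero, subnormals, infinities, and NaNs, which this sketch does not handle.
func encodeNormal(f float64) (high uint16, low uint64, ok bool) {
	bits := math.Float64bits(f)
	sign := uint16(bits>>63) << 15
	exp := int((bits >> 52) & 0x7FF)
	frac := bits & (1<<52 - 1)
	if exp == 0 || exp == 0x7FF {
		return 0, 0, false
	}
	// Rebias the exponent from 1023 to 16383.
	high = sign | uint16(exp-1023+16383)
	// Make the implicit leading 1 explicit and left-align the 52-bit fraction.
	low = 1<<63 | frac<<11
	return high, low, true
}

func main() {
	for _, f := range []float64{1.0, -2.0, 1.5} {
		high, low, _ := encodeNormal(f)
		fmt.Printf("%4.1f -> high=%#04x low=%#016x\n", f, high, low)
	}
}
```

The lowest 11 bits of the significand word are always zero after this conversion, which is the extra precision that extended arithmetic can then use.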

Exception Handling

SetExceptionHandler(handler ExceptionHandler) - Set exception callback
GetExceptionHandler() ExceptionHandler - Get current handler
GetExceptions() int - Get current exception flags
HasException(flag int) bool - Check specific exception
HasAnyException() bool - Check if any exceptions
ClearExceptions() - Clear all exceptions
ClearException(flag int) - Clear specific exception

Supported Operations

Basic arithmetic: Add, Sub, Mul, Div, Rem
Rounding: RoundToInt
Square root: Sqrt
Logarithm: Ln (natural logarithm)
Arctangent: Atan
Trigonometric functions: Sin, Cos, Tan
Comparisons: Eq, Lt, Le, Gt, Ge
Conversions: to/from int32, int64, float32, float64
Formatting: String formatting with various bases

Performance & Accuracy

Accuracy

This library implements IEEE 754 compliant 80-bit extended precision arithmetic. The transcendental functions (Ln, Atan) use series expansions with sufficient terms to achieve high accuracy:

Ln: Accurate to within 1 ULP (Unit in the Last Place) for most inputs
Atan: Accurate to within 1 ULP for most inputs
Sqrt: Bit-exact results for exact squares
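
To put "1 ULP" in concrete terms, the sketch below computes the spacing between adjacent values of a format with a 64-bit significand. `ulpX80` is a hypothetical helper that uses float64 arithmetic only to express the result; it does not call this package and ignores zero, infinities, and NaNs.

```go
package main

import (
	"fmt"
	"math"
)

// ulpX80 returns the gap between neighboring 64-bit-significand values
// around x. Frexp gives |x| = frac × 2^exp with frac in [0.5, 1), so the
// last significand bit has weight 2^(exp-64).
func ulpX80(x float64) float64 {
	_, exp := math.Frexp(math.Abs(x))
	return math.Ldexp(1, exp-64)
}

func main() {
	fmt.Printf("ULP near 1:  %.3g\n", ulpX80(1))           // 2^-63
	fmt.Printf("ULP near 2π: %.3g\n", ulpX80(2*math.Pi))   // 2^-61
}
```

The value near 2π, 2^-61 ≈ 4.34e-19, has the same magnitude as the epsilon printed by ExampleX80 above, so that round trip through Sqrt and Mul is off by one unit in the last place.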

Performance Characteristics

Arithmetic operations are optimized for speed while maintaining accuracy
Series expansions are tuned for convergence speed vs precision trade-offs
Memory layout is optimized for 64-bit architectures
No dynamic memory allocation during computation

Benchmarks

Run benchmarks with:

go test -bench=.

Typical performance on modern hardware:

Basic arithmetic: ~10-20 ns per operation
Transcendental functions: ~50-200 ns per operation
Conversions: ~20-50 ns per operation

Advanced Usage

Custom Exception Handling

package main

import (
    "fmt"
    "github.com/jenska/float"
)

func customHandler(exc int) {
    if exc & float.ExceptionOverflow != 0 {
        fmt.Println("Overflow detected!")
    }
    if exc & float.ExceptionUnderflow != 0 {
        fmt.Println("Underflow detected!")
    }
}

func main() {
    // Set custom exception handler
    float.SetExceptionHandler(customHandler)
    
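    // 1e308 * 1e308 = 1e616 still fits in the 80-bit format (largest finite
    // value ≈ 1.19e4932), so square repeatedly until the product exceeds it.
    x := float.NewFromFloat64(1e308)
    for i := 0; i < 5; i++ {
        x = x.Mul(x) // 1e616, 1e1232, 1e2464, 1e4928, then overflow
    }

    fmt.Printf("Result: %s\n", x.String())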
}

Working with Raw Bytes

package main

import (
    "encoding/binary"
    "fmt"
    "git
