80-bit IEEE 754 extended double precision floating-point library for Go

The float package is a software implementation of floating-point arithmetic that conforms to the 80-bit IEEE 754 extended double precision floating-point format.

This package is derived from the original SoftFloat package and was implemented as the basis for a Motorola M68881/M68882 FPU emulation in pure Go.

Installation

go get github.com/jenska/float@v1.0.0

Requirements

  • Go 1.22 or later

Development

This project includes a Makefile for common development tasks:

# Show all available commands
make help

# Development workflow (format, vet, test)
make dev

# Run tests with coverage report
make coverage

# Run benchmarks
make bench

# Clean build artifacts
make clean

Available Make Targets

  • make all - Run fmt, vet, and test
  • make build - Verify the project compiles
  • make test - Run all tests
  • make bench - Run benchmarks
  • make coverage - Generate coverage report
  • make fmt - Format code
  • make vet - Run go vet
  • make clean - Clean artifacts
  • make dev - Development workflow
  • make ci - CI workflow

CI/CD

This project uses GitHub Actions for continuous integration and deployment:

Workflows

  • CI (.github/workflows/ci.yml): Runs on every push and PR

    • Tests on multiple Go versions (1.21, 1.22, 1.23)
    • Tests on multiple platforms (Linux, macOS, Windows)
    • Runs linting and static analysis
    • Generates and uploads coverage reports
    • Validates builds
  • Release (.github/workflows/release.yml): Runs on version tags

    • Creates GitHub releases
    • Generates release artifacts
    • Publishes coverage reports
  • CodeQL (.github/workflows/codeql.yml): Security analysis

    • Runs weekly and on pushes/PRs
    • Performs security and quality analysis
  • Dependabot (.github/dependabot.yml): Automated dependency updates

    • Weekly Go module updates
    • Weekly GitHub Actions updates

Status Badges

Add these badges to your README:

[![CI](https://github.com/jenska/float/actions/workflows/ci.yml/badge.svg)](https://github.com/jenska/float/actions/workflows/ci.yml)
[![Go Report Card](https://goreportcard.com/badge/github.com/jenska/float)](https://goreportcard.com/report/github.com/jenska/float)
[![codecov](https://codecov.io/gh/jenska/float/branch/main/graph/badge.svg)](https://codecov.io/gh/jenska/float)
[![Go Reference](https://pkg.go.dev/badge/github.com/jenska/float.svg)](https://pkg.go.dev/github.com/jenska/float)

Usage

package main

import (
    "fmt"
    "github.com/jenska/float"
)

func main() {
    // Create extended precision values
    a := float.X80Pi
    b := float.NewFromFloat64(2.0)

    // Perform calculations with higher precision
    result := a.Mul(b)
    fmt.Printf("2π = %s\n", result.String())

    // Use in mathematical computations
    sqrt2 := float.X80Sqrt2
    computation := sqrt2.Mul(sqrt2).Sub(float.X80One)
    fmt.Printf("sqrt(2)² - 1 = %s\n", computation.String())
}

Features

  • Full IEEE 754 Compliance: Proper handling of 80-bit extended precision
  • Complete Arithmetic Operations: Add, Sub, Mul, Div, Rem, Sqrt, Ln, Atan, Sin, Cos, Tan
  • Type Conversions: To/from int32, int64, float32, float64
  • String Formatting: Binary, decimal, and hexadecimal representations
  • Exception Handling: IEEE 754 exception flags with customizable handlers
  • High Performance: Optimized bit-level operations
  • Thread Safe: Safe for concurrent use (with proper exception handling)

Example

package float_test

import (
    "fmt"
    "github.com/jenska/float"
)

func ExampleX80() {
    pi := float.X80Pi
    pi2 := pi.Add(pi)
    sqrtpi2 := pi2.Sqrt()
    epsilon := sqrtpi2.Mul(sqrtpi2).Sub(pi2)
    fmt.Println(epsilon)
    // Output: -0.000000000000000000433680868994
}

func ExampleSetExceptionHandler() {
    // Set up exception handling
    float.SetExceptionHandler(func(exc int) {
        fmt.Printf("Exception raised: %x\n", exc)
    })

    // This will raise an exception
    result := float.X80Zero.Ln()
    fmt.Printf("Result: %v\n", result)

    // Check what exceptions occurred
    if float.HasException(float.ExceptionDivbyzero) {
        fmt.Println("Division by zero occurred")
    }

    // Clear exceptions
    float.ClearExceptions()
}

API Reference

Types

X80

The main type representing an 80-bit extended precision floating-point number.

type X80 struct {
    high uint16  // Sign (1 bit) + Exponent (15 bits)
    low  uint64  // Integer bit (1 bit) + Fraction (63 bits)
}
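
To make the layout concrete, here is a minimal sketch that splits the two words into sign, biased exponent, integer bit, and fraction. The type x80Bits and the function fields are hypothetical names used only for illustration; they are not part of the package, whose fields are unexported.

```go
package main

import "fmt"

// x80Bits mirrors the two-word layout shown above. The name and the exported
// fields are hypothetical and exist only for this sketch.
type x80Bits struct {
	High uint16 // sign (1 bit) + biased exponent (15 bits)
	Low  uint64 // explicit integer bit (1 bit) + fraction (63 bits)
}

// fields splits the raw words into their extended-precision components.
func fields(v x80Bits) (sign bool, biasedExp uint16, integerBit bool, fraction uint64) {
	sign = v.High&0x8000 != 0
	biasedExp = v.High & 0x7FFF
	integerBit = v.Low>>63 == 1
	fraction = v.Low & (1<<63 - 1)
	return
}

func main() {
	// 1.0 is stored with the exponent equal to the bias (16383) and only the
	// explicit integer bit set in the significand.
	one := x80Bits{High: 0x3FFF, Low: 0x8000000000000000}
	sign, exp, intBit, frac := fields(one)
	fmt.Println(sign, int(exp)-16383, intBit, frac)
}
```

Unlike float32 and float64, the extended format stores the leading integer bit explicitly, which is why the low word carries 1 + 63 bits rather than an implied leading one.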

Constants

Predefined Values

  • X80Zero - Zero
  • X80One - One
  • X80MinusOne - Negative one
  • X80Pi - π (3.1415926535897932384626433832795...)
  • X80E - e (2.7182818284590452353602874713526...)
  • X80Ln2 - ln(2)
  • X80Log2E - log₂(e)
  • X80Sqrt2 - √2
  • X80InfPos - Positive infinity
  • X80InfNeg - Negative infinity
  • X80NaN - Not a number

Exception Flags

  • ExceptionInvalid - Invalid operation
  • ExceptionDenormal - Denormalized number
  • ExceptionDivbyzero - Division by zero
  • ExceptionOverflow - Result too large
  • ExceptionUnderflow - Result too small
  • ExceptionInexact - Inexact result

Rounding Modes

  • RoundNearestEven - Round to nearest, ties to even
  • RoundToZero - Round toward zero
  • RoundDown - Round toward negative infinity
  • RoundUp - Round toward positive infinity
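
The four modes differ only in how they resolve a value that falls between two representable results. The sketch below illustrates this on float64 using the standard math package; the roundingMode type and its constants are hypothetical stand-ins, not the package's own identifiers.

```go
package main

import (
	"fmt"
	"math"
)

// roundingMode is a hypothetical stand-in for the package's rounding-mode
// constants; it exists only for this illustration.
type roundingMode int

const (
	nearestEven roundingMode = iota
	toZero
	down
	up
)

// roundToIntegral rounds x to an integral value under the given mode.
func roundToIntegral(x float64, m roundingMode) float64 {
	switch m {
	case toZero:
		return math.Trunc(x)
	case down:
		return math.Floor(x)
	case up:
		return math.Ceil(x)
	default:
		return math.RoundToEven(x)
	}
}

func main() {
	for _, x := range []float64{2.5, -2.5, 3.5} {
		fmt.Println(x,
			roundToIntegral(x, nearestEven),
			roundToIntegral(x, toZero),
			roundToIntegral(x, down),
			roundToIntegral(x, up))
	}
}
```

For 2.5 the four modes give 2, 2, 2, and 3; for -2.5 they give -2, -2, -3, and -2. Ties go to the even neighbour only in the first mode.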

Methods

Arithmetic Operations

  • Add(b X80) X80 - Addition
  • Sub(b X80) X80 - Subtraction
  • Mul(b X80) X80 - Multiplication
  • Div(b X80) X80 - Division
  • Rem(b X80) X80 - Remainder
  • Sqrt() X80 - Square root
  • Ln() X80 - Natural logarithm
  • Atan() X80 - Arctangent
  • Sin() X80 - Sine
  • Cos() X80 - Cosine
  • Tan() X80 - Tangent

Comparison Operations

  • Eq(b X80) bool - Equal
  • Lt(b X80) bool - Less than
  • Le(b X80) bool - Less than or equal
  • Gt(b X80) bool - Greater than
  • Ge(b X80) bool - Greater than or equal

Conversion Operations

  • ToInt32() int32 - Convert to 32-bit integer
  • ToInt32RoundZero() int32 - Convert to 32-bit integer with round-toward-zero semantics
  • ToInt64() int64 - Convert to 64-bit integer
  • ToInt64RoundZero() int64 - Convert to 64-bit integer with round-toward-zero semantics
  • ToFloat32() float32 - Convert to 32-bit float
  • ToFloat64() float64 - Convert to 64-bit float
  • String() string - Convert to decimal string
  • Format(fmt byte, prec int) string - Formatted string

Utility Methods

  • IsNaN() bool - Check if NaN
  • IsInf() bool - Check if infinity
  • IsSignalingNaN() bool - Check if signaling NaN
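
As a rough illustration of what these checks look at, the sketch below classifies a raw 80-bit pattern. It assumes the usual x87/M6888x convention that an all-ones exponent marks infinity or NaN and that the top fraction bit distinguishes quiet from signaling NaNs; the function classify is hypothetical and the package's own tests may differ in detail.

```go
package main

import "fmt"

// classify inspects the raw words of an 80-bit value. It is a hypothetical
// helper for this sketch, not a function of the float package.
func classify(high uint16, low uint64) string {
	exp := high & 0x7FFF
	frac := low & (1<<63 - 1) // drop the explicit integer bit
	switch {
	case exp != 0x7FFF:
		return "finite"
	case frac == 0:
		return "infinity"
	case frac&(1<<62) != 0:
		return "quiet NaN"
	default:
		return "signaling NaN"
	}
}

func main() {
	fmt.Println(classify(0x3FFF, 0x8000000000000000)) // the bit pattern of 1.0
	fmt.Println(classify(0x7FFF, 0x8000000000000000))
	fmt.Println(classify(0x7FFF, 0xC000000000000000))
	fmt.Println(classify(0x7FFF, 0x8000000000000001))
}
```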

Functions

Creation Functions

  • NewFromFloat64(f float64) X80 - Create from float64
  • NewFromBytes(b []byte, order binary.ByteOrder) X80 - Create from bytes
  • Int32ToFloatX80(i int32) X80 - Create from int32
  • Int64ToFloatX80(i int64) X80 - Create from int64
  • Float32ToFloatX80(f float32) X80 - Create from float32
  • Float64ToFloatX80(f float64) X80 - Create from float64
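
Every float64 value is exactly representable in the 80-bit format, so widening needs no rounding: the exponent is rebiased from 1023 to 16383 and the 52-bit fraction is shifted up under an explicit integer bit. The following sketch shows that bit manipulation using only the standard library; widen is a hypothetical helper, not the package's implementation.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// widen returns the 80-bit extended encoding (sign+exponent word and
// significand word) of a float64. Hypothetical helper for illustration.
func widen(f float64) (high uint16, low uint64) {
	b := math.Float64bits(f)
	sign := uint16(b>>63) << 15
	exp := int(b>>52) & 0x7FF
	frac := b & (1<<52 - 1)

	switch {
	case exp == 0x7FF: // infinity or NaN: all-ones exponent, payload kept
		return sign | 0x7FFF, 1<<63 | frac<<11
	case exp == 0 && frac == 0: // signed zero
		return sign, 0
	case exp == 0: // float64 subnormal: normal in the wider format
		shift := bits.LeadingZeros64(frac)
		return sign | uint16(16383-1022-(shift-11)), frac << shift
	default: // normal number: rebias and set the explicit integer bit
		return sign | uint16(exp-1023+16383), 1<<63 | frac<<11
	}
}

func main() {
	for _, f := range []float64{1, -2, math.Pi} {
		high, low := widen(f)
		fmt.Printf("%-20g %04X %016X\n", f, high, low)
	}
}
```

A value that starts life as a float64 always has its low 11 significand bits set to zero, which is one reason to prefer the predefined constants such as X80Pi over converting math.Pi.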

Exception Handling

  • SetExceptionHandler(handler ExceptionHandler) - Set exception callback
  • GetExceptionHandler() ExceptionHandler - Get current handler
  • GetExceptions() int - Get current exception flags
  • HasException(flag int) bool - Check specific exception
  • HasAnyException() bool - Check if any exceptions
  • ClearExceptions() - Clear all exceptions
  • ClearException(flag int) - Clear specific exception
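
The functions above describe a "sticky" status word: flags accumulate until they are cleared explicitly, and a callback can observe each raise. The sketch below models that behaviour in isolation. The flag values, the flagSet type, and its methods are assumptions made for this example and do not reflect the package's actual constants or internals.

```go
package main

import "fmt"

// Hypothetical flag values for this sketch only; the package defines its own.
const (
	flagInvalid = 1 << iota
	flagDenormal
	flagDivByZero
	flagOverflow
	flagUnderflow
	flagInexact
)

// flagSet models a sticky exception word with an optional callback.
type flagSet struct {
	flags   int
	handler func(exc int)
}

// raise records the flags and notifies the handler, if one is installed.
func (s *flagSet) raise(exc int) {
	s.flags |= exc
	if s.handler != nil {
		s.handler(exc)
	}
}

// has reports whether any of the given flags is currently set.
func (s *flagSet) has(flag int) bool { return s.flags&flag != 0 }

// clear resets only the given flags and leaves the others untouched.
func (s *flagSet) clear(flag int) { s.flags &^= flag }

func main() {
	var s flagSet
	s.handler = func(exc int) { fmt.Printf("raised %#x\n", exc) }

	s.raise(flagOverflow | flagInexact)
	fmt.Println(s.has(flagOverflow), s.has(flagInvalid))

	s.clear(flagOverflow)
	fmt.Println(s.has(flagOverflow), s.has(flagInexact))
}
```

The sketch keeps its state in a value owned by the caller. A status word shared across goroutines would need synchronisation, or clearing and checking flags around each computation, to give meaningful results.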

Supported Operations

  • Basic arithmetic: Add, Sub, Mul, Div, Rem
  • Rounding: RoundToInt
  • Square root: Sqrt
  • Logarithm: Ln (natural logarithm)
  • Trigonometric: Sin, Cos, Tan
  • Arctangent: Atan
  • Comparisons: Eq, Lt, Le, Gt, Ge
  • Conversions: to/from int32, int64, float32, float64
  • Formatting: String formatting with various bases

Performance & Accuracy

Accuracy

This library implements IEEE 754 compliant 80-bit extended precision arithmetic. The transcendental functions (Ln, Atan) use series expansions with sufficient terms to achieve high accuracy:

  • Ln: Accurate to within 1 ULP (Unit in the Last Place) for most inputs
  • Atan: Accurate to within 1 ULP for most inputs
  • Sqrt: Bit-exact results for exact squares
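
For a sense of scale: with a 64-bit significand, one ULP of a normal value in [2^e, 2^(e+1)) is 2^(e-63). The sketch below computes that spacing with the standard math package; ulpX80 is a hypothetical helper and takes a float64 only to locate the binade.

```go
package main

import (
	"fmt"
	"math"
)

// ulpX80 returns the spacing between adjacent 80-bit extended values near x,
// for finite, non-zero x in the normal range. Hypothetical helper.
func ulpX80(x float64) float64 {
	_, exp := math.Frexp(x) // |x| = frac × 2^exp with frac in [0.5, 1)
	return math.Ldexp(1, exp-64)
}

func main() {
	fmt.Println(ulpX80(1))           // 2^-63, the extended-precision epsilon
	fmt.Println(ulpX80(2 * math.Pi)) // 2^-61, roughly 4.3368e-19
}
```

The residual printed by ExampleX80 above, about -4.3368e-19, has the same magnitude as 2^-61, which suggests the square-root round trip at 2π is off by a single ULP.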

Performance Characteristics

  • Arithmetic operations are optimized for speed while maintaining accuracy
  • Series expansions are tuned for convergence speed vs precision trade-offs
  • Memory layout is optimized for 64-bit architectures
  • No dynamic memory allocation during computation

Benchmarks

Run benchmarks with:

go test -bench=.

Typical performance on modern hardware:

  • Basic arithmetic: ~10-20 ns per operation
  • Transcendental functions: ~50-200 ns per operation
  • Conversions: ~20-50 ns per operation

Advanced Usage

Custom Exception Handling

package main

import (
    "fmt"
    "github.com/jenska/float"
)

// customHandler reports overflow and underflow and ignores all other flags.
func customHandler(exc int) {
    if exc&float.ExceptionOverflow != 0 {
        fmt.Println("Overflow detected!")
    }
    if exc&float.ExceptionUnderflow != 0 {
        fmt.Println("Underflow detected!")
    }
}

func main() {
    // Set custom exception handler
    float.SetExceptionHandler(customHandler)

    // The largest float64 values are far below the 80-bit overflow threshold
    // (about 1.19e4932), so square repeatedly to get past it.
    x := float.NewFromFloat64(1e308)
    for i := 0; i < 5; i++ {
        x = x.Mul(x) // ~1e616, ~1e1232, ~1e2464, ~1e4928, then overflow
    }

    fmt.Printf("Result: %s\n", x.String())
}

Working with Raw Bytes

package main

import (
    "encoding/binary"
    "fmt"
    "git
