FuzzyMatch

Fuzzy string matches at full speed

A high-performance fuzzy string matching library for Swift.

FuzzyMatch was developed for searching financial instrument databases — stock tickers, fund names, ISINs — where typo tolerance, prefix-aware ranking, and sub-millisecond latency matter. The same qualities make it well suited to any domain with a large, heterogeneous candidate set: code identifiers, file names, product catalogs, contact lists, or anything else a user might search with imprecise input.

Full API documentation is available on the Swift Package Index.

Features

  • Two Matching Modes - Damerau-Levenshtein edit distance (default, best typo handling) and Smith-Waterman local alignment (~1.7x faster, multi-word AND semantics)
  • Multi-Stage Prefiltering - Fast rejection of non-matching candidates using length bounds, character bitmasks, and trigrams
  • Zero Dependencies - Pure Swift implementation with no external dependencies
  • Zero-Allocation Hot Path - Reusable buffers eliminate allocations during scoring
  • Thread-Safe - Full Sendable compliance for concurrent usage
  • Configurable Scoring - Adjustable edit distance thresholds, score weights, and match preferences
  • Word Boundary Bonuses - Scoring that rewards matches at camelCase and snake_case boundaries
  • Subsequence Matching - Match abbreviations like "gubi" to "getUserById"
  • Acronym Matching - Match word-initial abbreviations like "bms" to "Bristol-Myers Squibb"
  • Highlight Ranges - Get [Range<String.Index>] for matched characters in scored results, with full support for typos, transpositions, and Unicode normalization
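
The length-bound and character-bitmask stages of the prefilter listed above could look roughly like the following. This is a sketch under stated assumptions, not FuzzyMatch's implementation: `charMask` and `mayMatch` are hypothetical names, only ASCII letters and digits are tracked, and the trigram stage is omitted.

```swift
// Hypothetical sketch: these names are illustrative, not FuzzyMatch's API.

/// Builds a 64-bit mask with one bit per ASCII letter (case-folded) or digit.
/// Other bytes are ignored, so they never cause a rejection.
func charMask(_ s: String) -> UInt64 {
    var mask: UInt64 = 0
    for byte in s.utf8 {
        switch byte {
        case 97...122: mask |= UInt64(1) << UInt64(byte - 97)       // a-z -> bits 0...25
        case 65...90:  mask |= UInt64(1) << UInt64(byte - 65)       // A-Z folded onto a-z
        case 48...57:  mask |= UInt64(1) << UInt64(byte - 48 + 26)  // 0-9 -> bits 26...35
        default: break
        }
    }
    return mask
}

/// Returns false when the candidate cannot match within `maxEdits` edits.
/// A true result only means the expensive scorer still has to run.
func mayMatch(query: String, candidate: String, maxEdits: Int) -> Bool {
    // Stage 1: a candidate more than maxEdits bytes shorter than the query
    // cannot cover it within the edit budget.
    if candidate.utf8.count + maxEdits < query.utf8.count { return false }
    // Stage 2: every distinct query character that is absent from the
    // candidate costs at least one edit.
    let missing = charMask(query) & ~charMask(candidate)
    return missing.nonzeroBitCount <= maxEdits
}
```

With `maxEdits: 2`, this sketch rejects "fetchData" for the query "getUser" (four query characters are missing) while letting "setUser" through to the scorer.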

Installation

Swift Package Manager

Add FuzzyMatch to your Package.swift:

dependencies: [
    .package(url: "https://github.com/ordo-one/FuzzyMatch.git", from: "1.0.0")
]

Then add it to your target dependencies:

.target(
    name: "YourTarget",
    dependencies: ["FuzzyMatch"]
)

Quick Start

import FuzzyMatch

let matcher = FuzzyMatcher()

// One-shot scoring — simplest API
if let match = matcher.score("getUserById", against: "getUser") {
    print("score=\(match.score), kind=\(match.kind)")
}

// Top-N matching — returns sorted results
let query = matcher.prepare("config")
let top3 = matcher.topMatches(
    ["appConfig", "configManager", "database", "userConfig"],
    against: query,
    limit: 3
)
for result in top3 {
    print("\(result.candidate): \(result.match.score)")
}

Try It — Interactive Search App

The Examples/FuzzySearch/ directory contains a macOS app for exploring how FuzzyMatch works interactively. It loads a 271K financial instrument corpus and live-searches as you type, showing the top results with highlighted matched characters. Switch between Edit Distance and Smith-Waterman algorithms to see how they rank differently, tweak all algorithm parameters in the inspector panel and see results update live, or use File > Open (Cmd+O) to load your own newline-delimited data.

<img width="1371" height="917" alt="FuzzySearch example app" src="https://github.com/user-attachments/assets/9d7dc9a0-8b63-4968-b857-0dff2cec7956" />

Open the Xcode project and hit Run:

open Examples/FuzzySearch/FuzzySearch.xcodeproj

Usage

Convenience API

For quick exploration, prototyping, or when scoring a small number of candidates:

let matcher = FuzzyMatcher()

// One-shot: prepare + score in a single call
if let match = matcher.score("getUserById", against: "usr") {
    print("Score: \(match.score)")
}

// Top-N: returns the best matches sorted by score
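// Sample candidates (same list as in Quick Start)
let candidates = ["appConfig", "configManager", "database", "userConfig"]
let query = matcher.prepare("config")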
let top5 = matcher.topMatches(candidates, against: query, limit: 5)

// All matches: returns every match sorted by score
let all = matcher.matches(candidates, against: query)

Note: Convenience methods allocate a new buffer per call. For high-throughput or latency-sensitive use, see High-Performance API below.

High-Performance API (Zero-Allocation Hot Path)

For scoring many candidates against the same query — the recommended path for production use, interactive search, and batch processing:

let matcher = FuzzyMatcher()

// 1. Prepare the query once (precomputes bitmask, trigrams, etc.)
let query = matcher.prepare("getUser")

// 2. Create a reusable buffer (eliminates allocations in the scoring loop)
var buffer = matcher.makeBuffer()

// 3. Score candidates — zero heap allocations per call
let candidates = ["getUserById", "getUsername", "setUser", "fetchData"]
for candidate in candidates {
    if let match = matcher.score(candidate, against: query, buffer: &buffer) {
        print("\(candidate): score=\(match.score), kind=\(match.kind)")
    }
}

Output:

getUserById: score=0.9988, kind=prefix
getUsername: score=0.9988, kind=prefix
setUser: score=0.9047619047619048, kind=prefix
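
To illustrate why a caller-owned buffer removes allocations from the scoring loop, here is a minimal sketch of a restricted Damerau-Levenshtein (optimal string alignment) distance that keeps its three DP rows in a reusable struct. `DistanceBuffer` and `editDistance` are hypothetical names; the README does not show FuzzyMatch's actual buffer layout or scoring internals.

```swift
/// Hypothetical scratch space holding three DP rows.
struct DistanceBuffer {
    var prev2: [Int] = []
    var prev: [Int] = []
    var curr: [Int] = []

    /// Grows the rows only when a longer input arrives; later calls reuse the storage.
    mutating func ensureCapacity(_ n: Int) {
        if prev.count < n {
            prev2 = [Int](repeating: 0, count: n)
            prev = [Int](repeating: 0, count: n)
            curr = [Int](repeating: 0, count: n)
        }
    }
}

/// Optimal string alignment distance over UTF-8 bytes:
/// insertions, deletions, substitutions, and adjacent transpositions.
func editDistance(_ a: [UInt8], _ b: [UInt8], buffer: inout DistanceBuffer) -> Int {
    if a.isEmpty { return b.count }
    if b.isEmpty { return a.count }
    let m = b.count
    buffer.ensureCapacity(m + 1)
    for j in 0...m { buffer.prev[j] = j }  // row 0: distance from the empty prefix
    for i in 1...a.count {
        buffer.curr[0] = i
        for j in 1...m {
            let cost = a[i - 1] == b[j - 1] ? 0 : 1
            var best = min(buffer.prev[j] + 1,         // deletion
                           buffer.curr[j - 1] + 1,     // insertion
                           buffer.prev[j - 1] + cost)  // match or substitution
            if i > 1, j > 1, a[i - 1] == b[j - 2], a[i - 2] == b[j - 1] {
                best = min(best, buffer.prev2[j - 2] + 1)  // adjacent transposition
            }
            buffer.curr[j] = best
        }
        // Rotate rows without allocating: prev2 <- prev, prev <- curr.
        swap(&buffer.prev2, &buffer.prev)
        swap(&buffer.prev, &buffer.curr)
    }
    return buffer.prev[m]
}
```

Once the rows have grown to fit the longest candidate seen so far, every further call runs without touching the heap, which is the property the buffer-taking overload above is designed around.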

UTF-8 API (Maximum Throughput)

For the highest possible throughput, use score(utf8:against:buffer:) with pre-extracted UTF-8 bytes. This @inlinable method enables cross-module inlining that the String overload cannot achieve on Swift 6.0 (where String.withUTF8 is non-inlinable), delivering 50-100% higher throughput depending on the algorithm:

let matcher = FuzzyMatcher()
let query = matcher.prepare("getUser")
var buffer = matcher.makeBuffer()
let candidates = ["getUserById", "getUsername", "setUser", "fetchData"]

for var candidate in candidates {
    candidate.withUTF8 { utf8 in
        if let match = matcher.score(utf8: utf8, against: query, buffer: &buffer) {
            print("score=\(match.score)")
        }
    }
}

Note: This performance gap is a Swift 6.0 limitation. When the library adopts Swift 6.2+ Span, the String API will recover full throughput and this method may be deprecated.

Custom Configuration

// Edit distance mode with custom tuning
let config = MatchConfig(
    minScore: 0.5,
    algorithm: .editDistance(EditDistanceConfig(
        maxEditDistance: 3,        // Allow up to 3 edits (default: 2)
        prefixWeight: 2.0,        // Boost prefix matches (default: 1.5)
        substringWeight: 0.8,     // Weight for substring matches (default: 1.0)
        wordBoundaryBonus: 0.12,  // Bonus for word boundary matches (default: 0.1)
        consecutiveBonus: 0.06,   // Bonus for consecutive matches (default: 0.05)
        gapPenalty: .affine(open: 0.04, extend: 0.01)  // Gap penalty model
    ))
)
let matcher = FuzzyMatcher(config: config)

// Smith-Waterman mode with custom tuning
let swConfig = MatchConfig(
    algorithm: .smithWaterman(SmithWatermanConfig(
        penaltyGapStart: 5,
        bonusBoundary: 10,
        bonusCamelCase: 7
    ))
)
let swMatcher = FuzzyMatcher(config: swConfig)

Scoring Bonuses

FuzzyMatcher applies position-aware bonuses and penalties to improve ranking quality:

  • Word Boundary Bonus: Matches at camelCase transitions (the U, B, and I in getUserById), at snake_case boundaries (the u in get_user), and after digits receive a bonus
  • Consecutive Bonus: Characters that match consecutively in the candidate receive a bonus
  • Gap Penalty: Gaps between matched characters incur a penalty. Two models available:
    • .affine(open:extend:) (default) - Starting a gap costs more than continuing one
    • .linear(perCharacter:) - Each gap character costs the same
  • First Match Bonus: Matches starting early in the candidate receive a bonus that decays with position

This means queries like "gubi" will rank "getUserById" higher than "debugging" because the query characters match at word boundaries.
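
The boundary effect can be made concrete with a small sketch that marks word-start positions and counts how many of them a query can hit when matched as an in-order subsequence. `wordBoundaries` and `boundaryHits` are hypothetical helpers written for this illustration; the library's actual boundary rules and scoring formula are not shown in this README and may differ.

```swift
/// Marks positions that start a "word": index 0, an uppercase letter after a
/// lowercase one (camelCase), a character after '_', '-', or ' ', or a
/// non-digit after a digit.
func wordBoundaries(_ s: String) -> [Bool] {
    let chars = Array(s)
    var flags = [Bool](repeating: false, count: chars.count)
    for i in chars.indices {
        if i == 0 { flags[i] = true; continue }
        let prev = chars[i - 1], cur = chars[i]
        if prev == "_" || prev == "-" || prev == " " {
            flags[i] = true
        } else if prev.isLowercase && cur.isUppercase {
            flags[i] = true
        } else if prev.isNumber && !cur.isNumber {
            flags[i] = true
        }
    }
    return flags
}

/// Maximum number of boundary positions hit when `query` is matched as a
/// case-insensitive, in-order subsequence of `candidate`; nil if it is not one.
func boundaryHits(query: String, candidate: String) -> Int? {
    let q = Array(query.lowercased())
    let c = Array(candidate.lowercased())
    let flags = wordBoundaries(candidate)
    // Assumes lowercasing preserves the character count (true for ASCII).
    guard c.count == flags.count else { return nil }
    // best[i][j]: most boundary hits matching q[0..<i] inside c[0..<j], or -1 if impossible.
    var best = [[Int]](repeating: [Int](repeating: -1, count: c.count + 1),
                       count: q.count + 1)
    for j in 0...c.count { best[0][j] = 0 }
    for i in 0..<q.count {
        for j in 0..<c.count {
            var value = best[i + 1][j]            // skip candidate character j
            if q[i] == c[j], best[i][j] >= 0 {    // or match it
                value = max(value, best[i][j] + (flags[j] ? 1 : 0))
            }
            best[i + 1][j + 1] = value
        }
    }
    let result = best[q.count][c.count]
    return result >= 0 ? result : nil
}
```

In this sketch, "gubi" hits four boundaries in "getUserById" (g, U, B, I), hits none in "lugubrious" where the same letters sit mid-word, and has no in-order alignment in "debugging" at all.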

// Disable bonuses for pure edit-distance scoring
let noBonusConfig = MatchConfig(
    algorithm: .editDistance(EditDistanceConfig(
        wordBoundaryBonus: 0.0,
        consecutiveBonus: 0.0,
        gapPenalty: .none,
        firstMatchBonus: 0.0
    ))
)

// Use linear gap penalty instead of affine
let linearConfig = MatchConfig(
    algorithm: .editDistance(EditDistanceConfig(
        gapPenalty: .linear(perCharacter: 0.01)
    ))
)
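
The difference between the affine and linear gap models can be sketched with a small stand-in type. `GapModel` is hypothetical, and the affine formula used here (the first gap character pays `open`, each further one pays `extend`) is an assumption; the source only states that starting a gap costs more than continuing one.

```swift
/// Hypothetical mirror of the two gap models; not the library's actual type.
enum GapModel {
    case affine(open: Double, extend: Double)
    case linear(perCharacter: Double)

    /// Penalty for a single gap of `length` unmatched characters between two matches.
    func penalty(forGapOfLength length: Int) -> Double {
        guard length > 0 else { return 0 }
        switch self {
        case let .affine(open, extend):
            // Assumed formula: opening cost once, then a smaller cost per extra character.
            return open + extend * Double(length - 1)
        case let .linear(perCharacter):
            // Every gap character costs the same.
            return perCharacter * Double(length)
        }
    }
}

let affine = GapModel.affine(open: 0.04, extend: 0.01)
let linear = GapModel.linear(perCharacter: 0.01)

// One gap of four characters versus two gaps of two characters each.
let affineOneGap = affine.penalty(forGapOfLength: 4)
let affineTwoGaps = 2 * affine.penalty(forGapOfLength: 2)
let linearOneGap = linear.penalty(forGapOfLength: 4)
let linearTwoGaps = 2 * linear.penalty(forGapOfLength: 2)
```

Under these assumptions the affine model charges about 0.07 for the single gap and about 0.10 for the two shorter gaps, so it favors matches that stay contiguous, whereas the linear model charges the same total either way.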

Concurrent Usage

FuzzyMatcher is fully thread-safe. Each task should use its own buffer:

let matcher = FuzzyMatcher()
let query = matcher.prepare("getData")
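let candidates = loadLargeCandidateList()  // placeholder: supply your own [String]

// Process concurrently using Swift TaskGroup
let workerCount = 8
// Ceiling division; max(1, ...) avoids a zero stride when candidates is empty
let chunkSize = max(1, (candidates.count + workerCount - 1) / workerCount)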

await withTaskGroup(of: [ScoredMatch].self) { group in
    for start in stride(from: 0, to: candidates.count, by: chunkSize) {
        let end = min(start + chunkSize, candidates.count)
        let chunk = candidates[start..<end]
        group.addTask {
            var buffer = matcher.makeBuffer()  // Each task gets its own buffer
            return chunk.compactMap { candidate in
                matcher.score(candidate, against: query, buffer: &buffer)
            }
        }
    }

    // Collect results from all tasks
    for await taskMatches in group {
        // Handle matches...
    }
}
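
The ceiling-division chunking used above can be checked in isolation. The helper below is a hypothetical, standard-library-only sketch (`chunkRanges` is not part of FuzzyMatch); the guard is one way to avoid a zero stride, which traps at runtime, when the candidate list is empty.

```swift
/// Splits 0..<count into at most `workers` contiguous ranges of near-equal size.
func chunkRanges(count: Int, workers: Int) -> [Range<Int>] {
    guard count > 0, workers > 0 else { return [] }
    let chunkSize = (count + workers - 1) / workers  // ceiling division, always >= 1 here
    return stride(from: 0, to: count, by: chunkSize).map { start in
        start..<min(start + chunkSize, count)        // last chunk may be shorter
    }
}

let ranges = chunkRanges(count: 10, workers: 4)
// ranges is [0..<3, 3..<6, 6..<9, 9..<10]
```

Each range can then be handed to one task together with its own buffer, as in the task group above.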

Highlighting Matched Characters

After scoring, use attributedHighlight() to get a styled AttributedString for UI display. Call it only for visible results (typically ~10-20), not the full
