YetiSearch
YetiSearch is a powerful, pure-PHP search engine library designed for modern PHP applications. This initial release provides a complete full-text search solution with advanced features typically found only in dedicated search servers, all while maintaining the simplicity of a PHP library with zero external service dependencies.
YetiSearch
A powerful, pure-PHP search engine library with advanced full-text search capabilities, designed for modern PHP applications.
Important: Requires SQLite FTS5 (full‑text search) support in your PHP’s SQLite library. See “Requirements” for a quick check.
Table of Contents
- Features
- Requirements
- Installation
- Quick Start
- Example Applications
- Usage Examples
- Configuration
- Advanced Features
- Architecture
- Testing
- API Reference
- Performance
- Future Features
- Contributing
- License
- Type-Ahead Setup
- Weighted FTS and Prefix (Optional)
- Suggestions
- Synonyms
- DSL (Domain Specific Language)
- CLI
Features
- 🔍 Full-text search powered by SQLite FTS5 with BM25 relevance scoring
- 📄 Automatic document chunking for indexing large documents
- 🎯 Smart result deduplication - shows best match per document by default
- 🌍 Multi-language support with built-in stemming for multiple languages
- ⚡ Lightning-fast indexing and searching with SQLite backend
- 🔧 Flexible architecture with interfaces for easy extension
- 📊 Advanced scoring with intelligent field boosting and exact match prioritization
- 🎨 Search highlighting with customizable tags
- 🔤 Advanced fuzzy matching with automatic typo correction and multi-algorithm consensus scoring (Trigram, Jaro-Winkler, Levenshtein, Phonetic, Keyboard Proximity)
- 🎯 Enhanced multi-word matching for more accurate search results
- 🏆 Smart result ranking prioritizing exact matches over fuzzy matches
- 📈 Faceted search and aggregations support
- 📍 Geo-spatial search with R-tree indexing for location-based queries
- 🚀 Zero dependencies except PHP extensions and small utility packages
- 💾 Persistent storage with automatic database management
- 🔐 Production-ready with comprehensive test coverage
- ✨ NEW: Multi-column FTS with native BM25 field weighting (enabled by default)
- ✨ NEW: Two-pass search for enhanced primary field prioritization (optional)
- ✨ NEW: Improved fuzzy consistency - exact matches always rank higher
- ✨ NEW: DSL Support - Natural language query syntax and JSON API-compliant URL parameters
- ✨ NEW: Query Result Caching - 10-100x faster repeated searches with automatic invalidation
- ✨ NEW: Enhanced Fuzzy Search - Modern typo correction with multi-algorithm consensus scoring (phonetic, keyboard proximity, trigram, Levenshtein, Jaro-Winkler)
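To make the document chunking feature concrete, here is a minimal sketch of splitting text into overlapping chunks. It is an illustration only and not YetiSearch's internal implementation: the function name `chunkText` is hypothetical, and it splits on plain character counts, whereas a real chunker may respect word or sentence boundaries.

```php
<?php
declare(strict_types=1);

/**
 * Split text into chunks of at most $size characters, where each chunk
 * starts with the last $overlap characters of the previous one.
 * Hypothetical helper for illustration; not part of the YetiSearch API.
 */
function chunkText(string $text, int $size = 500, int $overlap = 50): array
{
    if ($size <= 0 || $overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Require 0 <= overlap < size.');
    }

    $length = mb_strlen($text);
    $step = $size - $overlap; // how far the window advances each time
    $chunks = [];

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = mb_substr($text, $start, $size);
        if ($start + $size >= $length) {
            break; // this chunk already reaches the end of the text
        }
    }

    return $chunks;
}
```

For example, `chunkText('abcdefghij', 4, 1)` yields `['abcd', 'defg', 'ghij']`: each chunk repeats one character of its predecessor, so a phrase that straddles a chunk boundary can still match within a single chunk.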
Requirements
Important: SQLite FTS5 required
- YetiSearch uses SQLite FTS5 virtual tables for full‑text search and BM25 ranking. Your PHP build must link against a SQLite library compiled with FTS5 (SQLITE_ENABLE_FTS5).
- SQLite 3.24.0 or higher is required (for ON CONFLICT upsert support). SQLite 3.35.0+ is recommended for best performance (uses the RETURNING clause to avoid extra queries).
- Quick check: php scripts/check_sqlite_features.php should report "FTS5: OK" and show the SQLite version. On macOS, Homebrew PHP typically includes a recent SQLite with FTS5; some system PHP builds may not.
- PHP 7.4 or higher
- SQLite 3.24.0 or higher (3.35.0+ recommended)
- SQLite3 PHP extension
- PDO PHP extension with SQLite driver
- Mbstring PHP extension
- JSON PHP extension
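If the bundled check script is not at hand, the same requirements can be probed directly through PDO. The following is a minimal sketch under stated assumptions: `checkSqliteFeatures` is a hypothetical helper and may differ from what scripts/check_sqlite_features.php actually does. It tests for FTS5 by trying to create an FTS5 virtual table in an in-memory database, and derives upsert and RETURNING support from the reported SQLite version.

```php
<?php
declare(strict_types=1);

/**
 * Report which SQLite features are available to this PHP build.
 * Hypothetical helper for illustration; not part of YetiSearch.
 */
function checkSqliteFeatures(): array
{
    $report = [
        'pdo_sqlite' => extension_loaded('pdo_sqlite'),
        'version'    => null,
        'fts5'       => false,
        'upsert'     => false, // ON CONFLICT needs SQLite 3.24.0+
        'returning'  => false, // RETURNING needs SQLite 3.35.0+
    ];

    if (!$report['pdo_sqlite']) {
        return $report;
    }

    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $version = (string) $pdo->query('SELECT sqlite_version()')->fetchColumn();
    $report['version']   = $version;
    $report['upsert']    = version_compare($version, '3.24.0', '>=');
    $report['returning'] = version_compare($version, '3.35.0', '>=');

    try {
        // Creating an FTS5 virtual table fails if the module is not compiled in.
        $pdo->exec('CREATE VIRTUAL TABLE fts5_probe USING fts5(body)');
        $report['fts5'] = true;
    } catch (PDOException $e) {
        $report['fts5'] = false;
    }

    return $report;
}

// Print one feature per line, e.g. for a deployment smoke test.
foreach (checkSqliteFeatures() as $feature => $value) {
    echo str_pad($feature, 12) . var_export($value, true) . PHP_EOL;
}
```

Probing with a real CREATE VIRTUAL TABLE statement checks the SQLite library that PHP is actually linked against, which can differ from the sqlite3 command-line binary installed on the same machine.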
Installation
Install YetiSearch via Composer:
composer require yetidevworks/yetisearch
Quick Start
<?php
// Load Composer's autoloader (adjust the path to your project layout)
require __DIR__ . '/vendor/autoload.php';

use YetiSearch\YetiSearch;

// Initialize YetiSearch with configuration
$config = [
    'storage' => [
        'path' => '/path/to/your/search.db'
    ]
];
$search = new YetiSearch($config);

// Create an index
$indexer = $search->createIndex('pages');

// Index a document
$indexer->insert([
    'id' => 'doc1',
    'content' => [
        'title' => 'Introduction to YetiSearch',
        'body' => 'YetiSearch is a powerful search engine library for PHP applications...',
        'url' => 'https://example.com/intro',
        'tags' => 'search php library'
    ]
]);

// Search for documents
$results = $search->search('pages', 'powerful search');

// Search with fuzzy matching enabled (automatic typo correction)
$fuzzyResults = $search->search('pages', 'powerfull serch', ['fuzzy' => true]);
// Automatically corrects typos: "powerfull serch" → "powerful search"

// Display results
foreach ($results['results'] as $result) {
    echo $result['title'] . ' (Score: ' . $result['score'] . ")\n";
    echo $result['excerpt'] . "\n\n";
}
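To give a feel for what the typo correction in the Quick Start involves, here is a simplified sketch that uses only PHP's built-in `levenshtein()` function. YetiSearch is described as combining several algorithms (trigram, Jaro-Winkler, Levenshtein, phonetic, keyboard proximity); this example covers edit distance alone. The function `correctQuery` and the hard-coded vocabulary are assumptions made for illustration, not part of the library's API.

```php
<?php
declare(strict_types=1);

/**
 * Replace each query word with the closest vocabulary term, if one lies
 * within $maxDistance edits; otherwise keep the word unchanged.
 * Hypothetical helper for illustration; not the YetiSearch API.
 */
function correctQuery(string $query, array $vocabulary, int $maxDistance = 2): string
{
    $words = preg_split('/\s+/', strtolower(trim($query)), -1, PREG_SPLIT_NO_EMPTY);
    $corrected = [];

    foreach ($words as $word) {
        $best = $word;
        $bestDistance = $maxDistance + 1; // anything above the limit is rejected

        foreach ($vocabulary as $term) {
            $distance = levenshtein($word, $term);
            if ($distance < $bestDistance) {
                $best = $term;
                $bestDistance = $distance;
            }
        }

        $corrected[] = $best;
    }

    return implode(' ', $corrected);
}

// Stand-in for the terms an index would contain.
$vocabulary = ['powerful', 'search', 'engine', 'library', 'php'];
echo correctQuery('powerfull serch', $vocabulary) . PHP_EOL; // prints "powerful search"
```

Scanning the whole vocabulary for every query word is linear in the vocabulary size, so this approach only suits small term lists. That cost is one reason production engines narrow the candidates first, for example with trigram lookups.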
Example Applications
The examples/ directory contains fully working demonstrations of YetiSearch features:
🏢 Apartment Search Tutorial
Complete real-world example of a property search application:
- File: examples/apartment-search-simple.php
- Features demonstrated:
  - Structured content indexing (title, description)
  - Metadata fields (price, bedrooms, bathrooms, sqft, location)
  - Geo-spatial search with radius filtering
  - Price range and feature filtering
  - DSL queries with natural language syntax
  - Fluent query builder interface
  - Distance calculations and sorting
Run it:
php examples/apartment-search-simple.php
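The radius filtering and distance sorting demonstrated by this example rest on a great-circle distance calculation. The sketch below shows the idea in plain PHP using the haversine formula. It is not taken from the example file or from YetiSearch itself: the function names and the `lat`/`lng` array keys are assumptions, and an R-tree index would normally narrow the candidates before any exact distance is computed.

```php
<?php
declare(strict_types=1);

/**
 * Great-circle distance in kilometres between two latitude/longitude
 * points (haversine formula, mean Earth radius of 6371 km).
 */
function haversineKm(float $lat1, float $lng1, float $lat2, float $lng2): float
{
    $earthRadiusKm = 6371.0;
    $dLat = deg2rad($lat2 - $lat1);
    $dLng = deg2rad($lng2 - $lng1);

    $a = sin($dLat / 2) ** 2
        + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLng / 2) ** 2;

    // min() guards against rounding pushing sqrt($a) slightly above 1.
    return 2 * $earthRadiusKm * asin(min(1.0, sqrt($a)));
}

/**
 * Keep only listings within $radiusKm of the centre point, nearest first.
 * The 'lat' and 'lng' keys are assumptions made for this sketch.
 */
function withinRadius(array $listings, float $lat, float $lng, float $radiusKm): array
{
    $matches = [];
    foreach ($listings as $listing) {
        $distance = haversineKm($lat, $lng, $listing['lat'], $listing['lng']);
        if ($distance <= $radiusKm) {
            $matches[] = $listing + ['distance_km' => $distance];
        }
    }

    usort($matches, fn (array $a, array $b): int => $a['distance_km'] <=> $b['distance_km']);

    return $matches;
}
```

One degree of longitude at the equator comes out at roughly 111.2 km with this radius, which is a handy sanity check when testing.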
🔍 Other Examples
- Enhanced fuzzy search: examples/enhanced-fuzzy-search.php - Modern typo correction with multi-algorithm consensus
- Pre-chunked indexing: examples/pre-chunked-indexing.php - Custom document chunking with semantic boundaries
- Type-ahead search: examples/type-ahead.php - Interactive as-you-type search
- Geo facets and k-NN: examples/geo-facets-knn.php - Distance faceting and nearest neighbors
- DSL examples: examples/dsl-metadata-example.php - Query builder demonstrations
- Custom fields: examples/apartment-search-tutorial.php - Extended version with amenities
Usage Examples
Basic Indexing
use YetiSearch\YetiSearch;

$search = new YetiSearch([
    'storage' => ['path' => './search.db']
]);
$indexer = $search->createIndex('articles');

// Index a single document
$document = [
    'id' => 'article-1',
    'content' => [
        'title' => 'Getting Started with PHP',
        'body' => 'PHP is a popular general-purpose scripting language...',
        'author' => 'John Doe',
        'category' => 'Programming',
        'tags' => 'php programming tutorial'
    ],
    'metadata' => [
        'date' => time()
    ]
];
$indexer->insert($document);

// Index multiple documents
$documents = [
    [
        'id' => 'article-2',
        'content' => [
            'title' => 'Advanced PHP Techniques',
            'body' => 'Let\'s explore advanced PHP programming techniques...',
            'author' => 'Jane Smith',
            'category' => 'Programming',
            'tags' => 'php advanced tips'
        ]
    ],
    [
        'id' => 'article-3',
        'content' => [
            'title' => 'PHP Performance Optimization',
            'body' => 'Optimizing PHP applications for better performance...',
            'author' => 'Bob Johnson',
            'category' => 'Performance',
            'tags' => 'php performance optimization'
        ]
    ]
];
$indexer->insert($documents);

// Flush to ensure all documents are written
$indexer->flush();
Advanced Indexing
// Configure indexer with custom settings
$indexer = $search->createIndex('products', [
    'fields' => [
        'name' => ['boost' => 3.0, 'store' => true],
        'description' => ['boost' => 1.0, 'store' => true],
        'brand' => ['boost' => 2.0, 'store' => true],
        'sku' => ['boost' => 1.0, 'store' => true, 'index' => false],
        'price' => ['boost' => 1.0, 'store' => true, 'index' => false]
    ],
    'chunk_size' => 500,   // Smaller chunks for product descriptions
    'chunk_overlap' => 50, // Overlap between chunks
    'batch_size' => 100    // Process 100 documents at a time
]);

// Index products with metadata
$product = [
    'id' => 'prod-123',
    'content' => [
'
