Hayabusa
Hayabusa: Simple and Fast Full-Text Search Engine for Massive System Log Data
Install / Use
/learn @hirolovesbeer/HayabusaREADME
Hayabusa
Hayabusa: A Simple and Fast Full-Text Search Engine for Massive System Log Data
Concept
- Pure python implement
- Parallel SQLite processing engine
- SQLite3 FTS(Full Text Search)
- Core-scale architecture
Architecture
-
Design of the directory structure
- By specifying a search range of time in ”the directory path + yyyy + mm + dd + hh + min.db”, the search program can select the search time systematically.
/targetdir/yyyy/mm/dd/hh/min.db -
StoreEngine
- sample code
import os.path import sqlite3 db_file = ’test.db’ log_file = ’1m.log’ if not os.path.exists(db_file): conn = sqlite3.connect(db_file) conn.execute("CREATE VIRTUAL TABLE SYSLOG USING FTS3(LOGS)"); conn.close() conn = sqlite3.connect(db_file) with open(log_file) as fh: lines = [[line] for line in fh] conn.executemany(’INSERT INTO SYSLOG VALUES ( ? )’, lines) conn.commit() -
SearchEngine
- sample command
$ python search_engine.py -h usage: search_engine.py [-h] [--time TIME] [--match MATCH] [-c] [-s] [-v] optional arguments: -h, --help show this help message and exit --time TIME time explain regexp(YYYY/MM/DD/HH/MIN). eg: 2017/04/27/10/* --match MATCH matching keyword. eg: noc or 'noc Login' -e exact match -c count -s sum -v verbose $ python search_engine.py --time 2017/05/11/13/* --match 'keyword' -c -
Architecture image

Search condition
-
case-insensitive
- no distinguish uppercase or lowercase
-
Exact match
-e --match '192.168.0.1' -
AND
--match 'Hello World' -
OR
--match 'Hello OR World' -
NOT
--match 'Hello World -Wide' -
PHRASE
--match '"Hello World"' --match '\"192.168.0.1\"' <- IP address case(same as -e flag) --match '\"192.168.0.1\" src sent' <- PHRASE + AND search -
asterisk(*)
--match 'H* World' -
HAT
--match '^Hello World'
Development environment
- CentOS 7.3
- Python 3.5.1(use anaconda packages)
- SQLite3(version 3.9.2)
Dependency softwares
- Python 3
- SQLite3
- GNU Parallel
Benchmark
Compare with Apache Spark
-
Hayabusa and Spark time comparison

-
Comarison of distributes Spark environment and the stand-alone Hayabusa

Related Skills
node-connect
353.1kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
111.6kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
353.1kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
353.1kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
