Textalyzer
Analyze key metrics like number of words, readability, complexity, code duplication, … of any kind of text
Usage
# Word frequency histogram
textalyzer histogram <path> [<additional paths...>]
# Word frequency histogram (case-sensitive)
textalyzer histogram --case-sensitive <path> [<additional paths...>]
# Find duplicated code blocks (default: minimum 3 non-empty lines)
textalyzer duplication <path> [<additional paths...>]
# Find duplications with at least 5 non-empty lines
textalyzer duplication --min-lines=5 <path> [<additional paths...>]
# Include single-line duplications
textalyzer duplication --min-lines=1 <path> [<additional paths...>]
# Output duplications as JSON
textalyzer duplication --json <path> [<additional paths...>]
Example JSON output:
[{
  "content": "<duplicated text block>",
  "locations": [
    { "path": "file1.txt", "line": 12 },
    { "path": "file2.txt", "line": 34 }
  ]
}, {
  "content": "<another duplicated block>",
  "locations": [
    { "path": "file1.txt", "line": 56 },
    { "path": "file3.txt", "line": 78 }
  ]
}]
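To show how this JSON could be consumed downstream, here is a minimal Python sketch that ranks the reported blocks by size. It assumes the output has exactly the shape of the example above (a list of objects with `content` and `locations`). The `SAMPLE` data and the `summarize` helper are hypothetical and not part of Textalyzer.

```python
import json

# Hypothetical sample shaped like the example output above.
SAMPLE = """
[{
  "content": "let a = 1;\\nlet b = 2;\\nlet c = 3;",
  "locations": [
    { "path": "file1.txt", "line": 12 },
    { "path": "file2.txt", "line": 34 }
  ]
}, {
  "content": "fn main() {\\n}",
  "locations": [
    { "path": "file1.txt", "line": 56 },
    { "path": "file3.txt", "line": 78 },
    { "path": "file4.txt", "line": 90 }
  ]
}]
"""


def summarize(json_text):
    """Return (non_empty_lines, occurrences, places) per block, largest first."""
    rows = []
    for block in json.loads(json_text):
        # Count only non-empty lines, mirroring how --min-lines is described.
        non_empty = sum(1 for line in block["content"].splitlines() if line.strip())
        places = [f'{loc["path"]}:{loc["line"]}' for loc in block["locations"]]
        rows.append((non_empty, len(places), places))
    # Sort by block size first, then by number of occurrences.
    rows.sort(key=lambda row: (row[0], row[1]), reverse=True)
    return rows


if __name__ == "__main__":
    for non_empty, count, places in summarize(SAMPLE):
        print(f"{non_empty} lines x {count} occurrences: {', '.join(places)}")
```

In practice the input would come from the output of `textalyzer duplication --json <path>` rather than from an inline string.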
The duplication command analyzes files for duplicated text blocks. It can:
- Analyze multiple files or recursively scan directories
- Filter duplications based on minimum number of non-empty lines with --min-lines=N (default: 2)
- Detect single-line duplications when using --min-lines=1
- Rank duplications by number of consecutive lines
- Show all occurrences with file and line references
- Use multithreaded processing to spread the work across all available CPU cores
- Use memory mapping for efficient processing of large files with minimal memory overhead
- Output duplication data as JSON with --json
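The idea behind filtering by a minimum number of non-empty lines can be illustrated with a small sliding-window sketch in Python. This is not Textalyzer's actual implementation (which is written in Rust and uses multithreading and memory mapping). The `find_duplicates` function, the whitespace stripping, and the choice not to merge overlapping windows into longer blocks are all assumptions made for illustration.

```python
from collections import defaultdict


def find_duplicates(files, min_lines):
    """Find windows of `min_lines` consecutive non-empty lines that occur more than once.

    `files` maps a path to its text. The result mimics the JSON shape shown
    above: a list of {"content": ..., "locations": [{"path", "line"}, ...]}.
    """
    windows = defaultdict(list)
    for path, text in files.items():
        # Keep (1-based line number, stripped text) for non-empty lines only.
        # Stripping whitespace is an assumption of this sketch.
        lines = [
            (number, line.strip())
            for number, line in enumerate(text.splitlines(), start=1)
            if line.strip()
        ]
        # Slide a window of min_lines non-empty lines over the file.
        for start in range(len(lines) - min_lines + 1):
            chunk = lines[start:start + min_lines]
            content = "\n".join(line for _, line in chunk)
            windows[content].append({"path": path, "line": chunk[0][0]})
    # A window is a duplication only if it was seen in more than one place.
    return [
        {"content": content, "locations": locations}
        for content, locations in windows.items()
        if len(locations) > 1
    ]


if __name__ == "__main__":
    example = {
        "a.txt": "x\n\ny\nz\nw\n",
        "b.txt": "q\nx\ny\nz\n",
    }
    for duplicate in find_duplicates(example, min_lines=3):
        print(duplicate)
```

Because every window has a fixed size, a longer duplicated region shows up as several overlapping windows here. A real tool would typically merge those into one block before ranking by length.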
Related
- jscpd - Copy/paste detector for programming source code.
- megalinter - Code quality and linter tool.
- pmd - Source code analysis tool.
- qlty - Code quality and security analysis tool.
- superdiff - Find duplicate code blocks in files.
- wf - Command line utility for counting word frequency.
Rewrite in Rust
This CLI tool was originally written in JavaScript and was later rewritten in Rust to improve performance.
Before:
hyperfine --warmup 3 'time ./cli/index.js examples/1984.txt'
Benchmark #1: time ./cli/index.js examples/1984.txt
Time (mean ± σ): 390.3 ms ± 15.6 ms [User: 402.6 ms, System: 63.5 ms]
Range (min … max): 366.7 ms … 425.7 ms
After:
hyperfine --warmup 3 'textalyzer histogram examples/1984.txt'
Benchmark #1: textalyzer histogram examples/1984.txt
Time (mean ± σ): 40.4 ms ± 2.5 ms [User: 36.0 ms, System: 2.7 ms]
Range (min … max): 36.9 ms … 48.7 ms
Pretty impressive: a roughly 10x performance improvement (mean 390.3 ms down to 40.4 ms)! 😁
