CsvCharacterExtractor
Find all unique characters of a csv sorted by their columns
CSV Character Extractor

Functionality
Extracts all unique characters from each column of a CSV file, optionally combines and manipulates the results, and stores them in text files for further use.
This tool was developed to create font assets for TextMesh Pro in Unity. Creating textures with only the characters you need is essential for languages with large character sets, such as Chinese, Japanese or Korean.
Input
- Input file can be defined in config.xml, default value is "in/example.csv"
- Languages are defined in columns, first column defines the language name (see example.csv)
- Columns "ID" and "Description" will be ignored
- Newline characters (\n, \r) and all emojis will be ignored
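The input rules above can be pictured with a minimal Java sketch. It is not the tool's actual code: the class and method names are hypothetical, the rows are assumed to be already parsed (the tool itself lists univocity-parsers for that), the first row is assumed to hold the column names, and emoji filtering is left out to keep the sketch free of dependencies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not the tool's implementation.
class UniqueCharsSketch {

    // Columns that are skipped according to the input rules.
    private static final Set<String> IGNORED_COLUMNS = Set.of("ID", "Description");

    /**
     * Assumption: rows.get(0) is a header row holding the column names.
     * Returns column name -> unique characters, sorted by code point.
     */
    static Map<String, String> uniqueCharsPerColumn(List<String[]> rows) {
        String[] header = rows.get(0);
        Map<String, TreeSet<Integer>> seen = new LinkedHashMap<>();
        for (String name : header) {
            if (!IGNORED_COLUMNS.contains(name)) {
                seen.put(name, new TreeSet<>());
            }
        }
        for (String[] row : rows.subList(1, rows.size())) {
            for (int col = 0; col < header.length && col < row.length; col++) {
                TreeSet<Integer> chars = seen.get(header[col]);
                if (chars == null) {
                    continue; // ignored column
                }
                // Collect code points, dropping newline characters.
                row[col].codePoints()
                        .filter(cp -> cp != '\n' && cp != '\r')
                        .forEach(chars::add);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        seen.forEach((name, chars) -> {
            StringBuilder sb = new StringBuilder();
            chars.forEach(sb::appendCodePoint);
            result.put(name, sb.toString());
        });
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"ID", "Description", "English", "German"},
                new String[] {"1", "greeting", "Hello", "Hallo"},
                new String[] {"2", "farewell", "Bye", "Tschüss"});
        uniqueCharsPerColumn(rows).forEach(
                (name, chars) -> System.out.println(name + ": " + chars));
    }
}
```

Working on code points rather than `char` values keeps characters outside the Basic Multilingual Plane intact, which matters for some CJK text.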
Output
- Text files are created for each language and named "ColumnName.txt". Output path can be defined in config.xml
- One file per column, expecting to have one language per column
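As a rough illustration of the output step, the sketch below writes one "ColumnName.txt" file per column into an output directory. The class and method names are hypothetical, and UTF-8 is an assumed encoding; the README does not state which encoding the tool uses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical sketch, not the tool's implementation.
class OutputWriterSketch {

    /** Writes each entry to outDir/ColumnName.txt (UTF-8 is an assumption). */
    static void writeColumnFiles(Map<String, String> charsByColumn, Path outDir)
            throws IOException {
        Files.createDirectories(outDir);
        for (Map.Entry<String, String> entry : charsByColumn.entrySet()) {
            Path target = outDir.resolve(entry.getKey() + ".txt");
            Files.writeString(target, entry.getValue(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // "out" is a placeholder directory for this example.
        writeColumnFiles(Map.of("English", "BHeloy"), Path.of("out"));
    }
}
```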
Requirements
Download
Run
- Windows: Double-click Run.bat
- Windows, Mac, Linux: Run "java -jar CsvCharacterExtractor.jar" in the terminal
Config usage
- With the config you can set the input and output paths as well as characters that should always or never be included. Take a look at the example config
- Paths can be relative, e.g. "in/example.csv"
- Paths can be absolute, e.g. "C:/Users/UserName/Documents/LanguageCharacterFiles/"
- Use forward slashes ("/") only
- Automatically add lower and upper case characters to the unique characters file
- Create union files of multiple separate columns
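The character-level options above can be read as plain set operations. The following Java sketch shows one possible interpretation; all names are hypothetical, it does not reflect the config.xml schema, and the order in which "always" and "never" characters are applied (never wins) is an assumption.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not the tool's implementation.
class CharSetOpsSketch {

    /** Unique code points of a string, sorted. */
    static Set<Integer> codePointsOf(String text) {
        Set<Integer> out = new TreeSet<>();
        text.codePoints().forEach(out::add);
        return out;
    }

    /** Adds the lower and upper case variant of every character. */
    static Set<Integer> withBothCases(Set<Integer> chars) {
        Set<Integer> out = new TreeSet<>(chars);
        for (int cp : chars) {
            out.add(Character.toLowerCase(cp));
            out.add(Character.toUpperCase(cp));
        }
        return out;
    }

    /** Union of the character sets of several columns. */
    static Set<Integer> union(List<Set<Integer>> columns) {
        Set<Integer> out = new TreeSet<>();
        columns.forEach(out::addAll);
        return out;
    }

    /** Assumption: "never" characters are removed after "always" ones are added. */
    static Set<Integer> applyAlwaysNever(Set<Integer> chars, Set<Integer> always,
                                         Set<Integer> never) {
        Set<Integer> out = new TreeSet<>(chars);
        out.addAll(always);
        out.removeAll(never);
        return out;
    }

    /** Turns a set of code points back into a string. */
    static String render(Set<Integer> chars) {
        StringBuilder sb = new StringBuilder();
        chars.forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<Integer> english = codePointsOf("Hello");
        Set<Integer> german = codePointsOf("Tschüss");
        System.out.println(render(withBothCases(english)));
        System.out.println(render(union(List.of(english, german))));
    }
}
```

Note that `Character.toUpperCase(int)` only applies one-to-one case mappings, so characters such as "ß" stay unchanged in this sketch.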
Example
Roadmap
- Document code
- Add information on how to build the project
Third Party Libraries
- https://github.com/uniVocity/univocity-parsers (Apache 2.0 License)
- https://github.com/vdurmont/emoji-java (MIT License)
