Utf8
A library for UTF-8 in PHP, providing UTF-8 aware functions that mirror PHP's own string functions. The library does not require the PHP mbstring extension, although it will use it, if found, to gain performance.
PHP-UTF8
Introduction
PHP-UTF-8 is a UTF-8 aware library of functions mirroring PHP's own string functions. It does not require the PHP mbstring extension, though it will use it, if found, for a (small) performance gain.
The project was initially hosted on SourceForge, where it died due to lack of development and support. It has since been forked and moved to GitHub so that many more people can contribute with more ease.
Use the issue tracker here on GitHub to report problems and post feature requests.
Please feel free to fork the project and get back to us with pull requests for optimizations and new features.
Documentation & Usage Information
Using the php-utf8 library is quite easy. Just include php-utf8.php and
any additional functions that you may need from the functions folder.
Sample Code:
// get the core functions included ...
require('php-utf8_path/php-utf8.php');
// ... and any other functions/*.php or utils/*.php files you may need.
require('php-utf8_path/functions/trim.php');
Before using the library, make sure you understand the underlying issues by reading "Character Sets / Character Encoding Issues" and "Handling UTF-8 with PHP".
Use these functions only if you really need them and you understand why you need to use them.
In particular, do not blindly replace all uses of PHP's string functions with the functions found here. Most of the time you will not need to, and you will be introducing a significant performance overhead to your application.
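To illustrate why a blanket replacement is usually unnecessary, the following minimal sketch uses only PHP built-ins and none of this library's functions. It assumes the input is well-formed UTF-8. Under that assumption, byte-oriented functions such as explode() and str_replace() are safe, because the bytes of one UTF-8 character never occur inside the encoding of another. Only operations that count or index characters need UTF-8 awareness. The variable names are illustrative.

```php
<?php
// "Zürich, Genève" written with explicit UTF-8 byte escapes so the example
// does not depend on the encoding of this source file.
$text = "Z\xC3\xBCrich, Gen\xC3\xA8ve";

// Splitting and replacing on whole substrings is byte-safe for valid UTF-8.
$parts   = explode(', ', $text);
$swapped = str_replace("\xC3\xBC", 'ue', $text);

// Counting is where bytes and characters diverge.
$byteLength = strlen($text);                // counts bytes
$charCount  = preg_match_all('/./su', $text); // counts code points via PCRE's /u mode
```

Here strlen() reports 16 bytes while the string holds 14 characters, so length, position and substring operations are the places where a UTF-8 aware function is worth its overhead.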
Most of the functions here do not operate defensively, mainly for performance
reasons. For example, there is no extensive parameter checking, and it is assumed
that they are fed well-formed UTF-8. This is particularly relevant when it
comes to catching badly formed UTF-8. You should screen input at the outer perimeter
with help from the functions in the utils/validation.php and utils/bad.php files.
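As an illustration of screening input at the perimeter, here is a minimal sketch that relies on PCRE's /u modifier rather than on the library's own validators. The function name is_well_formed_utf8 is hypothetical and not part of php-utf8. For the helpers the library actually ships, see utils/validation.php and utils/bad.php.

```php
<?php
// Hypothetical helper, not part of php-utf8.
// Returns true when $input is well-formed UTF-8.
function is_well_formed_utf8(string $input): bool
{
    // With the /u modifier PCRE validates the subject as UTF-8 before matching.
    // The empty pattern matches any valid string (result 1); for a malformed
    // subject preg_match() returns false instead.
    return preg_match('//u', $input) === 1;
}

// Example: decide at the boundary whether a value may be passed further in.
$candidate = "caf\xE9"; // ISO-8859-1 bytes, not valid UTF-8
$accepted  = is_well_formed_utf8($candidate) ? $candidate : null;
```

Rejecting or repairing bad input once, at the boundary, is what lets the rest of the code call the non-defensive functions safely.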
Throughout the library, all ASCII characters (control characters included) are treated as valid. Make sure you take the appropriate measures before outputting into XML, since some control characters make it ill-formed.
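One possible measure is to strip the offending characters before output. This is a minimal sketch, not a function of this library, and the name strip_xml_invalid_controls is hypothetical. It assumes XML 1.0, which permits only TAB, LF and CR among the characters below U+0020.

```php
<?php
// Hypothetical helper, not part of php-utf8.
// Removes the ASCII control characters that XML 1.0 does not allow.
function strip_xml_invalid_controls(string $text): string
{
    // Keeps TAB (0x09), LF (0x0A) and CR (0x0D). Matching on bytes is safe for
    // UTF-8 input because every byte of a multi-byte sequence is >= 0x80.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $text);
}

$clean = strip_xml_invalid_controls("a\x00b\tc\x1Fd\n");
```

Whether to strip, replace or reject such characters depends on the application. Stripping is shown here only because it is the shortest to demonstrate.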
Licensing
The initial code of PHP-UTF8 is published under LGPL. Please find a copy of the license in the LICENSE file.
Parts of the code in this library come from other places, under different licenses. The authors involved have been contacted (see below). Attribution for which code came from elsewhere can be found in the source code itself.
- Andreas Gohr / Chris Smith of Dokuwiki. There is a fair degree of collaboration/exchange of ideas and code between Dokuwiki's UTF-8 library and phputf8. Although Dokuwiki is released under GPL, its UTF-8 library is released under LGPL, hence no conflict with phputf8.
- Henri Sivonen has also given permission for his code to be released under the terms of the LGPL. He ported a Unicode / UTF-8 converter from the Mozilla codebase to PHP, which is re-used in php-utf8.
