Cld3
Bindings to Google's Compact Language Detector 3
cld3
R Wrapper for Google's Compact Language Detector 3
Google's Compact Language Detector 3 is a neural network model for language identification and the successor of CLD2 (available from CRAN). This version is still experimental and uses a novel algorithm with different properties and outcomes. For more information see: https://github.com/google/cld3#readme
Example
The function detect_language() is vectorised and guesses the language of each string in text, or returns NA if the language could not be reliably determined.
> library(cld3)
> example(cld3)
cld3> # Vectorized best guess
cld3> detect_language(c("To be or not to be?", "Ce n'est pas grave.", "猿も木から落ちる"))
[1] "en" "fr" "ja"
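Because detect_language() returns NA for strings it cannot classify reliably, downstream code usually has to deal with missing values. The sketch below shows one possible way to do that. The helper name label_languages and its fallback and detector arguments are hypothetical and not part of cld3; only cld3::detect_language() comes from the package.

```r
# Hypothetical helper (not part of cld3): replace NA guesses with a fallback label.
# `detector` defaults to cld3::detect_language; any function that maps a
# character vector to a character vector of the same length can be passed instead.
label_languages <- function(text, fallback = "unknown",
                            detector = cld3::detect_language) {
  guess <- detector(text)
  # detect_language() yields NA where no language could be reliably determined
  guess[is.na(guess)] <- fallback
  guess
}

# Only call into cld3 when the package is actually installed
if (requireNamespace("cld3", quietly = TRUE)) {
  print(label_languages(c("To be or not to be?", "Ce n'est pas grave.")))
}
```

Making the detector an argument is a design choice for this sketch only: it lets the NA handling be tested with a stub function, without loading the model.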
The function detect_language_mixed() is not vectorised and detects all languages inside the entire character vector as a whole.
cld3> # Multiple languages in one text
cld3> detect_language_mixed("This piece of text is in English. Този текст е на Български.", size = 3)
  language probability reliable proportion
1       bg   0.9173891     TRUE  0.5853658
2       en   0.9999790     TRUE  0.4146341
3      und   0.0000000    FALSE  0.0000000
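The result is an ordinary data frame, so the padding rows (language "und", reliable FALSE) can be dropped with standard subsetting. The following is a minimal sketch, assuming the result has the columns shown in the output above. The helper name reliable_languages is hypothetical and not part of cld3.

```r
# Hypothetical helper (not part of cld3): keep only reliable rows and order
# them by the share of the text they cover. Expects a data frame with the
# columns language, probability, reliable and proportion.
reliable_languages <- function(result) {
  kept <- result[result$reliable, , drop = FALSE]
  kept[order(kept$proportion, decreasing = TRUE), , drop = FALSE]
}

# Only call into cld3 when the package is actually installed
if (requireNamespace("cld3", quietly = TRUE)) {
  mixed <- cld3::detect_language_mixed(
    "This piece of text is in English. Този текст е на Български.", size = 3)
  print(reliable_languages(mixed))
}
```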
Installation
Binary packages for OS-X or Windows can be installed directly from CRAN:
install.packages("cld3")
Installation from source on Linux or OS-X requires Google's Protocol Buffers library. On Debian or Ubuntu, install libprotobuf-dev and protobuf-compiler:
sudo apt-get install -y libprotobuf-dev protobuf-compiler
On Fedora we need protobuf-devel:
sudo yum install protobuf-devel
On CentOS / RHEL we install [protobuf-devel](https://src.fedoraproject.org/rpms/protobuf) via EPEL:
sudo yum install epel-release
sudo yum install protobuf-devel
On OS-X use protobuf from Homebrew:
brew install protobuf
