SkillAgentSearch skills...

Finnsyll

Finnish syllabifier and compound segmenter

Install / Use

/learn @tsnaomi/Finnsyll
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

.. image:: https://travis-ci.org/tsnaomi/finnsyll.svg?branch=master :target: https://travis-ci.org/tsnaomi/finnsyll

FinnSyll


FinnSyll is a Python library that syllabifies words according to Finnish syllabification principles. It is also equipped with a Finnish compound splitter. More details/docs to come.

Installation

::

    $ pip install FinnSyll

Basic usage

First, instantiate a FinnSyll object. ::

    >>> from finnsyll import FinnSyll
    >>> f = FinnSyll()

To syllabify: ::

    >>> f.syllabify('runoja')
    ['ru.no.ja']  # internal syllable boundaries are indicated with '.'

To segment compounds: ::

    >>> f.split('sosiaalidemokraattien')
    'sosiaali=demokraattien'  # internal word boundaries are indicated with '='

Optional arguments

The syllabifier can be customized along two different parameters: variation and compound splitting.

variation

Instantiating a FinnSyll object with variation=True (default) will allow the syllabifier to return multiple syllabifications if variation is predicted. When variation=True, the syllabifier will return a list. Setting variation to False will cause the syllabifier to return a string containing the first predicted syllabification.

Variation: ::

    >>> f = FinnSyll(variation=True) 
    >>> f.syllabify('runoja')
    ['ru.no.ja']
    >>> f.syllabify('vapaus')
    ['va.pa.us', 'va.paus']

No variation: ::

    >>> f = FinnSyll(variation=False)
    >>> f.syllabify('runoja')
    'ru.no.ja'
    >>> f.syllabify('vapaus')
    'va.pa.us'

split_compounds

When instantiating a FinnSyll object with split_compounds=True (default), the syllabifier will first attempt to split the input into constituent words before syllabifying it. This forces the syllabifier to insert a syllable boundary in between identified constituent words. The syllabifier will skip this step if split_compounds is set to False.

Compound splitting: ::

    >>> f = FinnSyll(split_compounds=True) 
    >>> f.syllabify('rahoituserien')  # rahoitus=erien
    ['ra.hoi.tus.e.ri.en']

No compound splitting: ::

    >>> f = FinnSyll(split_compounds=False) 
    >>> f.syllabify('rahoituserien')
    ['ra.hoi.tu.se.ri.en']  # incorrect  

Related Skills

View on GitHub
GitHub Stars9
CategoryDevelopment
Updated2y ago
Forks2

Languages

Python

Security Score

70/100

Audited on Jan 19, 2024

No findings