Lexrankr

LexRank for Korean.

Generate Convert Improve

Install / Use

/learn @theeluwin/Lexrankr

About this skill

Quality Score

0/100

README

lexrankr

Clustering based multi-document selective text summarization using LexRank algorithm.

This repository is a source code for the paper 설진석, 이상구. "lexrankr: LexRank 기반 한국어 다중 문서 요약." 한국정보과학회 학술발표논문집 (2016): 458-460.

Mostly designed for Korean, but not limited to.
Click here to see how to install KoNLPy properly.
Check out textrankr, which is a simpler summarizer using TextRank.

Installation

pip install lexrankr

Tokenizers

Tokenizers are not included. You have to implement one by yourself.

Example:

from typing import List

class MyTokenizer:
    def __call__(self, text: str) -> List[str]:
        tokens: List[str] = text.split()
        return tokens

한국어의 경우 KoNLPy를 사용하는 방법이 있습니다.

from typing import List
from konlpy.tag import Okt

class OktTokenizer:
    okt: Okt = Okt()

    def __call__(self, text: str) -> List[str]:
        tokens: List[str] = self.okt.pos(text, norm=True, stem=True, join=True)
        return tokens

Usage

from typing import List
from lexrankr import LexRank

# 1. init
mytokenizer: MyTokenizer = MyTokenizer()
lexrank: LexRank = LexRank(mytokenizer)

# 2. summarize (like, pre-computation)
lexrank.summarize(your_text_here)

# 3. probe (like, query-time)
summaries: List[str] = lexrank.probe()
for summary in summaries:
    print(summary)

Test

Use docker.

docker build -t lexrankr -f Dockerfile .
docker run --rm -it lexrankr

Related Skills

node-connect

347.6k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

108.4k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

347.6k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

347.6k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。