YAST - Yet Another SPLADE or Sparse Trainer 🚀
Welcome to YAST! This open-source project provides a powerful and flexible SPLADE (Sparse Lexical and Expansion) trainer. Built to integrate seamlessly with Hugging Face's Trainer API, YAST lets you leverage sparse retrieval techniques drawn from the SPLADE line of research (see References below). Our goal is to offer an accessible tool for training these models. YAST is licensed under the permissive MIT License.
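To make the idea concrete, here is a small illustrative NumPy sketch of the representation SPLADE models produce. It is not part of YAST's API (the function name and shapes are made up for illustration): SPLADE takes the masked-language-model logits for each token, applies a log-saturated ReLU, and max-pools over the sequence, yielding one sparse weight per vocabulary term.

```python
import numpy as np

def splade_pool(logits):
    """Max-pool log-saturated activations over the sequence axis.

    logits: array of shape (seq_len, vocab_size), the MLM logits
            for one document.
    Returns a sparse (vocab_size,) representation: one weight per
    vocabulary term; terms never activated stay exactly zero.
    """
    # log(1 + ReLU(x)) dampens large logits; max over tokens keeps
    # the strongest activation of each vocabulary term.
    return np.max(np.log1p(np.maximum(logits, 0.0)), axis=0)

# Toy example: a 3-token document over a 5-term vocabulary.
logits = np.array([
    [2.0, -1.0, 0.0, 0.5, -3.0],
    [0.0,  4.0, 0.0, 0.0, -1.0],
    [1.0, -2.0, 0.0, 3.0,  0.0],
])
rep = splade_pool(logits)
# Terms with only non-positive logits (columns 2 and 4) get weight 0,
# so the document representation is sparse by construction.
```

Because most vocabulary terms receive zero weight, two documents can be scored with a cheap sparse dot product, which is what makes SPLADE compatible with inverted-index retrieval.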
⚠️ Important Notice
Please note that YAST is currently an experimental project. This means you might encounter breaking changes introduced from time to time. To ensure a stable experience, we highly recommend forking this repository and working with a specific revision (commit hash).
Development Setup
This project uses uv for dependency management and requires Python 3.11 or later.
Prerequisites
- Python 3.11+
- uv package manager
Quick Start
```shell
# Clone the repository
git clone https://github.com/hotchpotch/yast.git
cd yast

# Create a virtual environment and install dependencies
uv venv --python 3.11 .venv
uv sync --extra dev

# Activate the virtual environment (optional; you can use `uv run` instead)
source .venv/bin/activate

# Run the training example
uv run python -m yast.run examples/japanese-splade/toy.yaml
```
Optional: Flash Attention 2 for Performance
For improved training speed, install Flash Attention 2:
```shell
uv pip install --no-deps flash-attn --no-build-isolation
uv pip install einops
```
Note: Requires a compatible CUDA GPU and may take time to compile.
Training a Japanese SPLADE Model
For details on training a Japanese SPLADE model, please see the Japanese SPLADE example. That document is written in Japanese; if you don't read Japanese, online translation tools can be helpful for understanding the content.
Related Blog Posts (Content in Japanese)
Here are some blog posts related to this project, written in Japanese:
- Released a high-performance Japanese SPLADE (sparse retrieval) model (高性能な日本語SPLADE(スパース検索)モデルを公開しました)
- How to build a SPLADE model: a Japanese SPLADE technical report (SPLADE モデルの作り方・日本語SPLADEテクニカルレポート)
- Released Japanese SPLADE v2: top retrieval performance for inputs up to 512 tokens (情報検索モデルで最高性能(512トークン以下)・日本語版SPLADE v2をリリース)
💡 Related Work
Another project, YASEM (Yet Another Splade | Sparse Embedder), offers a more user-friendly implementation for working with SPLADE models.
🙏 Acknowledgments
We thank the researchers behind the original SPLADE papers for their outstanding contributions to this field.
References
- SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking
- SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval
- From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective
- An Efficiency Study for SPLADE Models
- A Static Pruning Study on Sparse Neural Retrievers
- SPLADE-v3: New baselines for SPLADE
- Minimizing FLOPs to Learn Efficient Sparse Representations
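As background on the sparsity objective common to the papers above, here is a sketch of the FLOPS regularizer (introduced in "Minimizing FLOPs to Learn Efficient Sparse Representations" and adopted by SPLADE); this is the standard formulation from the literature, not a description of YAST internals. For a batch of $N$ documents with term weights $w_j^{(d_i)}$ over vocabulary $V$:

$$
\ell_{\mathrm{FLOPS}} \;=\; \sum_{j \in V} \bar{a}_j^{\,2}
\qquad \text{where} \qquad
\bar{a}_j \;=\; \frac{1}{N} \sum_{i=1}^{N} w_j^{(d_i)}
$$

Squaring the mean activation of each vocabulary term penalizes terms that fire in many documents at once, pushing the model toward representations whose expected number of floating-point operations at retrieval time is low.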
License
This project is licensed under the MIT License. See the LICENSE file for full details.
Copyright (c) 2024 Yuichi Tateno (@hotchpotch)
