RewindMCP
Pythonic library for the Rewind.ai SQLite database. Includes CLI and MCP interfaces.
RewindDB
A Python library for interfacing with the Rewind.ai SQLite database.
Changelog
2025-07-04 - Voice Export & Training Data Features
- NEW: --export-own-voice CLI option for exporting user's voice transcripts organized by day
- NEW: --speech-source filter to separate user voice (me) from other speakers (others)
- NEW: Multi-format export support: text, JSON, and audio file export
- NEW: --export-format audio with --audio-export-dir for exporting actual M4A audio files
- NEW: my-words.sh script for generating word clouds from your voice data
- ENHANCED: RewindDB core library now supports speech source filtering
- USE CASE: Perfect for collecting clean voice training data for LLM fine-tuning
- FILTER: Text exports contain only user's voice (no other speakers), audio exports contain full conversations
Project Overview
RewindDB is a Python library that provides a convenient interface to the Rewind.ai SQLite database. Rewind.ai is a personal memory assistant that captures audio transcripts and screen OCR data in real-time. This project allows you to programmatically access and search through this data, making it possible to retrieve past conversations, find specific information mentioned in meetings, or analyze screen content from previous work sessions.
The project consists of three main components:
- A core Python library (rewinddb) for direct database access
- Command-line tools for transcript retrieval, keyword searching, screen OCR data retrieval, and activity tracking
- An MCP STDIO server that exposes these capabilities to GenAI models through the standardized Model Context Protocol
The main purpose of this project, for me, was to connect Rewind to my Raycast.
Installation
Prerequisites
- Python 3.6+
Install from Source
# clone the repository
git clone https://github.com/pedramamini/RewindMCP.git
cd RewindMCP
# install the package and dependencies
pip install .
Manual Installation
# install the package in development mode
pip install -e .
Configuration
RewindDB uses a .env file to store database connection parameters. This approach avoids hardcoding sensitive information like database paths and passwords in the source code.
Setting Up the .env File
- Create a .env file in your project directory or in your home directory as ~/.rewinddb.env
- Add the following configuration parameters:
DB_PATH=/path/to/your/rewind/database.sqlite3
DB_PASSWORD=your_database_password
For example:
DB_PATH=/Users/username/Library/Application Support/com.memoryvault.MemoryVault/db-enc.sqlite3
DB_PASSWORD=your_database_password_here
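To make the file format concrete, here is a minimal sketch of how a KEY=VALUE file like the one above could be read with only the standard library. It is not RewindDB's own loader; the function names parse_env_text and load_db_config are hypothetical and exist only for illustration.

```python
from pathlib import Path


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Spaces inside the value (e.g. "Application Support") are preserved;
        # only surrounding whitespace and optional quotes are stripped.
        config[key.strip()] = value.strip().strip("'\"")
    return config


def load_db_config(env_path):
    """Hypothetical helper: return (DB_PATH, DB_PASSWORD) or raise if one is missing."""
    config = parse_env_text(Path(env_path).read_text(encoding="utf-8"))
    missing = [name for name in ("DB_PATH", "DB_PASSWORD") if name not in config]
    if missing:
        raise KeyError(f"missing settings in {env_path}: {', '.join(missing)}")
    return config["DB_PATH"], config["DB_PASSWORD"]
```

Note that a path containing spaces needs no escaping in this format, because everything after the first = is taken as the value.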
Custom .env File Location
You can also specify a custom location for your .env file when using the library or CLI tools:
# in python code
db = rewinddb.RewindDB(env_file="/path/to/custom/.env")
# with cli tools
python transcript_cli.py --relative "1 hour" --env-file /path/to/custom/.env
python search_cli.py "meeting" --env-file /path/to/custom/.env
python ocr_cli.py --relative "1 hour" --env-file /path/to/custom/.env
python activity_cli.py --relative "1 day" --env-file /path/to/custom/.env
# with mcp server
python mcp_stdio.py --env-file /path/to/custom/.env
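The three locations mentioned so far (an explicit path, a project-level .env, and ~/.rewinddb.env) suggest a lookup order. The sketch below shows one plausible order; the precedence is an assumption, not confirmed behavior of the library, and resolve_env_file is a hypothetical name.

```python
from pathlib import Path


def resolve_env_file(explicit=None, project_dir=None, home_dir=None):
    """Return the first usable .env path, or None if nothing is found.

    Assumed precedence (not confirmed by the library): an explicit path,
    then <project_dir>/.env, then <home_dir>/.rewinddb.env.
    """
    if explicit is not None:
        explicit_path = Path(explicit)
        if not explicit_path.is_file():
            # An explicit path that does not exist is treated as an error
            # instead of silently falling back to another file.
            raise FileNotFoundError(f"env file not found: {explicit_path}")
        return explicit_path

    project_dir = Path(project_dir) if project_dir else Path.cwd()
    home_dir = Path(home_dir) if home_dir else Path.home()
    for candidate in (project_dir / ".env", home_dir / ".rewinddb.env"):
        if candidate.is_file():
            return candidate
    return None
```

Failing loudly on a missing explicit path is a design choice in this sketch: it avoids connecting with credentials from a file the caller did not intend to use.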
CLI Tools
transcript_cli.py
Retrieve audio transcripts from the Rewind.ai database with advanced voice filtering and export capabilities.
Basic Transcript Retrieval
# get transcripts from the last hour
python transcript_cli.py --relative "1 hour"
# get transcripts from the last 5 hours
python transcript_cli.py --relative "5 hours"
# get transcripts from a specific time range
python transcript_cli.py --from "2023-05-11 13:00:00" --to "2023-05-11 17:00:00"
# enable debug output
python transcript_cli.py --relative "7 days" --debug
# use a custom .env file
python transcript_cli.py --relative "1 hour" --env-file /path/to/custom/.env
Voice Source Filtering
# filter for only your own voice
python transcript_cli.py --relative "1 hour" --speech-source me
# filter for other speakers only
python transcript_cli.py --relative "1 day" --speech-source others
# filter works with any time range
python transcript_cli.py --from "2025-07-01" --to "2025-07-02" --speech-source me
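Conceptually, the me/others filter is a partition of transcript segments by speaker. The sketch below illustrates that idea on plain dictionaries; the is_user field and the filter_by_speech_source function are hypothetical and do not reflect the actual Rewind schema or the rewinddb API.

```python
def filter_by_speech_source(segments, speech_source=None):
    """Keep segments spoken by the user ("me") or by anyone else ("others").

    Each segment is assumed to be a dict with a boolean "is_user" key;
    that field name is an illustration, not the real database column.
    """
    if speech_source is None:
        return list(segments)  # no filter requested
    if speech_source not in ("me", "others"):
        raise ValueError("speech_source must be 'me' or 'others'")
    want_user = speech_source == "me"
    return [seg for seg in segments if seg["is_user"] == want_user]


segments = [
    {"text": "let's start the review", "is_user": True},
    {"text": "sounds good", "is_user": False},
    {"text": "first item is the release", "is_user": True},
]
mine = filter_by_speech_source(segments, "me")
theirs = filter_by_speech_source(segments, "others")
```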
Voice Export for Training Data 🎙️
Perfect for collecting clean voice training data for LLM fine-tuning
# export your voice transcripts organized by day (text format)
python transcript_cli.py --export-own-voice "2025-01-01 to 2025-07-04"
# export as JSON with metadata
python transcript_cli.py --export-own-voice "2025-01-01 to 2025-07-04" --export-format json --save-to my_voice.json
# export actual audio files organized by day
python transcript_cli.py --export-own-voice "2025-01-01 to 2025-07-04" --export-format audio --audio-export-dir ./my_voice_audio
# generate word cloud from your voice data (requires wordcloud library)
pip install wordcloud matplotlib # install dependencies
./my-words.sh # automatically uses last 6 months of your voice data
Key Features:
- Clean Training Data: Text exports contain only YOUR voice, with other speakers filtered out
- Audio Export: M4A files organized by day with transcript summaries
- Multiple Formats: Text (readable), JSON (structured), Audio (original files)
- Day Organization: Perfect for chronological training data or analysis
- Word Cloud: Quick visualization of your most-used words with my-words.sh
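The day organization and word-frequency ideas above can be sketched in a few lines of standard-library Python. This is an illustration of the concept only, not the export code or the my-words.sh script; the (timestamp, text) input shape and both function names are assumptions.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime


def group_by_day(segments):
    """Group (timestamp, text) pairs under an ISO date key, in chronological order."""
    days = defaultdict(list)
    for timestamp, text in segments:
        days[timestamp.date().isoformat()].append(text)
    return dict(sorted(days.items()))


def top_words(texts, n=3):
    """Return the n most frequent lowercase words across the given texts."""
    words = re.findall(r"[a-z']+", " ".join(texts).lower())
    return Counter(words).most_common(n)


segments = [
    (datetime(2025, 7, 2, 9, 30), "the plan looks fine"),
    (datetime(2025, 7, 1, 14, 0), "the meeting moved"),
    (datetime(2025, 7, 2, 16, 45), "ship the plan"),
]
by_day = group_by_day(segments)
```

A word cloud is essentially a rendering of the counts that top_words produces, so this kind of tally is a quick way to sanity-check exported text before plotting it.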
search_cli.py
Search for keywords across both audio transcripts and screen OCR data.
# search for a keyword with default time range (7 days)
python search_cli.py "meeting"
# search with a specific time range
python search_cli.py "project" --from "2023-05-11 13:00:00" --to "2023-05-11 17:00:00"
# search with a relative time period
python search_cli.py "presentation" --relative "1 day"
# adjust context size and enable debug output
python search_cli.py "python" --context 5 --debug
# use a custom .env file
python search_cli.py "meeting" --env-file /path/to/custom/.env
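To illustrate what a context setting can mean for keyword search, the sketch below returns a window of words around each match. The exact unit that --context uses in search_cli.py is not stated here, so treating it as a word count is an assumption, and find_with_context is a hypothetical function.

```python
def find_with_context(text, keyword, context=3):
    """Return one snippet per match: the matching word plus `context` words on each side.

    Matching is case-insensitive and substring-based; the word-window
    interpretation of "context" is an assumption for illustration.
    """
    words = text.split()
    needle = keyword.lower()
    snippets = []
    for index, word in enumerate(words):
        if needle in word.lower():
            start = max(0, index - context)
            snippets.append(" ".join(words[start:index + context + 1]))
    return snippets


hits = find_with_context(
    "we moved the weekly meeting to friday afternoon", "meeting", context=2
)
```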
ocr_cli.py
Retrieve screen OCR (Optical Character Recognition) data from the Rewind.ai database. This tool allows you to see what text was visible on your screen during specific time periods, providing complete OCR text content rather than just metadata about frames and nodes.
# get OCR data from the last hour
python ocr_cli.py --relative "1 hour"
# get OCR data from the last 5 hours (supports short form)
python ocr_cli.py --relative "5h"
# get OCR data from a specific time range
python ocr_cli.py --from "2023-05-11 13:00:00" --to "2023-05-11 17:00:00"
# get OCR data for a single day (date-only format)
python ocr_cli.py --from "2023-05-11" --to "2023-05-11"
# get OCR data for specific hours today
python ocr_cli.py --from "13:00" --to "17:00"
# list all applications that have OCR data
python ocr_cli.py --list-apps
# filter OCR data by specific application
python ocr_cli.py --relative "1 day" --app "com.apple.Safari"
# enable debug output and use custom .env file
python ocr_cli.py --relative "7 days" --debug --env-file /path/to/custom/.env
# display times in UTC instead of local time
python ocr_cli.py --relative "1 day" --utc
Key Features:
- Time formats: Supports relative time ("1 hour", "5h", "30m", "2d", "1w") and absolute time ranges
- Application filtering: Use --list-apps to see available applications, then --app to filter by specific app
- Flexible time input: Accepts various formats including date-only, time-only, and full datetime strings
- Text extraction: Shows actual text content that was visible on screen, organized by timestamp and application
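The relative time formats listed above ("1 hour", "5h", "30m", "2d", "1w") can be handled with a small parser. The sketch below is one way to do it and is not the parser used by the CLI tools; parse_relative and the alias table are assumptions made for illustration.

```python
import re
from datetime import timedelta

# Seconds per canonical unit.
_UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400, "w": 604800}

# Accepted spellings mapped to a canonical unit (an assumed, non-exhaustive list).
_ALIASES = {
    "m": "m", "min": "m", "mins": "m", "minute": "m", "minutes": "m",
    "h": "h", "hr": "h", "hrs": "h", "hour": "h", "hours": "h",
    "d": "d", "day": "d", "days": "d",
    "w": "w", "week": "w", "weeks": "w",
}

_PATTERN = re.compile(r"^\s*(\d+)\s*([a-zA-Z]+)\s*$")


def parse_relative(spec):
    """Convert strings such as '1 hour' or '5h' into a timedelta."""
    match = _PATTERN.match(spec)
    if not match:
        raise ValueError(f"unrecognized relative time: {spec!r}")
    amount = int(match.group(1))
    unit = _ALIASES.get(match.group(2).lower())
    if unit is None:
        raise ValueError(f"unknown time unit in: {spec!r}")
    return timedelta(seconds=amount * _UNIT_SECONDS[unit])
```

Subtracting the result from the current time gives the start of the query window, for example datetime.now() - parse_relative("5h").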
activity_cli.py
Display comprehensive activity tracking data from the Rewind.ai database, including computer usage patterns, application usage statistics, and calendar meetings.
# get activity data for the last day
python activity_cli.py --relative "1 day"
# get activity data for the last 5 hours (supports short form)
python activity_cli.py --relative "5h"
# get activity data from a specific time range
python activity_cli.py --from "2023-05-11 13:00:00" --to "2023-05-11 17:00:00"
# get activity data for a single day (date-only format)
python activity_cli.py --from "2023-05-11" --to "2023-05-11"
# get activity data for specific hours today
python activity_cli.py --from "13:00" --to "17:00"
# enable debug output and use custom .env file
python activity_cli.py --relative "1 week" --debug --env-file /path/to/custom/.env
# display times in UTC instead of local time
python activity_cli.py --relative "1 day" --utc
Key Features:
- Active Hours: Shows when your computer was actively being used, with hourly and daily breakdowns
- Application Usage: Displays top applications by usage time with visual charts
- Calendar Meetings: Shows meeting statistics, duration, and distribution by time of day
- Visual Charts: Includes simple ASCII bar charts for easy data visualization
- Time Zone Support: Displays times in local timezone by default, with UTC option available
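As an illustration of the ASCII bar charts mentioned above, the sketch below renders per-application usage as scaled rows of # characters. It does not reproduce the actual output format of activity_cli.py; the function name, the seconds-based input, and the layout are all assumptions.

```python
def ascii_bar_chart(usage_seconds, width=20):
    """Render {app_name: seconds} as text bars, longest usage first.

    Bars are scaled so that the most-used application fills `width` characters.
    """
    if not usage_seconds:
        return ""
    longest = max(usage_seconds.values())
    label_width = max(len(name) for name in usage_seconds)
    lines = []
    ranked = sorted(usage_seconds.items(), key=lambda item: item[1], reverse=True)
    for name, seconds in ranked:
        bar_length = round(width * seconds / longest) if longest else 0
        hours = seconds / 3600
        lines.append(f"{name.ljust(label_width)} | {'#' * bar_length} {hours:.1f}h")
    return "\n".join(lines)


chart = ascii_bar_chart({"Terminal": 3600, "Safari": 7200}, width=10)
print(chart)
```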
MCP STDIO Server
The Model Context Protocol (MCP) server exposes RewindDB functionality to GenAI models through the standardized MCP STDIO protocol. This implementation is fully MCP-compliant and works with MCP clients like Claude, Raycast, and other AI assistants.
Quick Start
# start the STDIO MCP server
python mcp_stdio.py
# enable debug logging
python mcp_stdio.py --debug
# use a custom .env file
python mcp_stdio.py --env-file /path/to/custom/.env
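For readers new to MCP over STDIO, the exchange is JSON-RPC 2.0 with one message per line. The sketch below builds the opening messages a client would typically send (initialize, the initialized notification, then tools/list). The client name and the protocol version string are assumptions, and whether this server accepts that version has not been verified.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Serialize a JSON-RPC 2.0 request as a single line of JSON."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def jsonrpc_notification(method):
    """Serialize a JSON-RPC 2.0 notification (no id, so no response is expected)."""
    return json.dumps({"jsonrpc": "2.0", "method": method})


def handshake_lines(client_name="example-client"):
    """Messages a client would write, one per line, to the server's stdin."""
    initialize_params = {
        "protocolVersion": "2024-11-05",  # assumed version string
        "capabilities": {},
        "clientInfo": {"name": client_name, "version": "0.1.0"},  # hypothetical client
    }
    return [
        jsonrpc_request(1, "initialize", initialize_params),
        jsonrpc_notification("notifications/initialized"),
        jsonrpc_request(2, "tools/list"),
    ]
```

In practice these lines would be written, each followed by a newline, to the stdin of a process started with python mcp_stdio.py, and the responses read line by line from its stdout. Most users will let an MCP client such as Raycast or Claude handle this exchange.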
Available Tools
The MCP server provides the following tools:
get_transcripts_relative: Get audio transcripts from a relative time period (e.g., "
