HeartMuse

Local AI music generator with smart lyrics: Gradio web UI for HeartMuLa + Ollama/OpenAI, tags, history, and high-fidelity audio.

HeartMuse - AI Music Generator with Smart Lyrics

HeartMuse is a web-based interface for generating high-quality AI music entirely on your own machine. It combines HeartMuLa (a state-of-the-art open-source music generation model) with lyrics generation by local LLMs, giving you full creative control without relying on cloud services.

[Screenshot: HeartMuse interface]

Listen to an Example

"Firewall Warrior" — a funny song about IT security, fully generated by HeartMuse (HeartMuLa 3B RL)

https://github.com/strnad/heartmuse/raw/master/docs/examples/firewall_warrior.mp3

Get Started in 2 Minutes

# 1. Clone and install
git clone https://github.com/strnad/heartmuse.git
cd heartmuse
./install.sh      # Linux/macOS
# or: install.bat   # Windows

# 2. Run
./run.sh          # Linux/macOS
# or: run.bat       # Windows

Open http://localhost:7860 and start creating!

That's it! The installer automatically creates a virtual environment, clones the HeartMuLa library, and installs all dependencies. AI models download automatically on first generation.


Features at a Glance

  • Smart Text Generation - AI writes lyrics, titles, tags, and descriptions via Ollama (local) or OpenAI
  • HeartMuLa Music Generation - two model variants (RL and Base), songs up to 240 seconds long
  • Style Transfer - upload reference audio to influence the musical style (MuQ-MuLan)
  • Audio Transcription - extract lyrics from existing audio (HeartTranscriptor)
  • Batch Variants - generate 1-10 variations of the same song in one go
  • Edit Instructions - refine generated text with natural language commands
  • History & Playlist - browse, replay, and manage all past generations
  • Live Memory Monitor - real-time GPU/RAM usage tracking
  • Seed Control - reproduce exact results with fixed seeds
  • 100% Local - everything runs on your machine with Ollama (or use OpenAI API)

Why HeartMuse?

Flexible Text Generation - Total Creative Control

HeartMuse features a modular, field-by-field generation system that adapts to your creative workflow. Every field has its own "Generate/Enhance" checkbox, giving you granular control over what AI generates and what you write yourself.

Four Independent Fields, Endless Combinations

| Field | What It Does | Generate/Enhance Checkbox |
|-------|-------------|---------------------------|
| Description | Your creative brief for the song | AI can expand vague ideas into detailed descriptions |
| Title | Song name | AI suggests catchy, thematic titles |
| Lyrics | Full song text with sections | AI writes/extends verses, choruses, bridges |
| Tags | Music production tags for HeartMuLa | AI suggests genre, mood, instruments, tempo |

Each checkbox is independent - mix and match any combination:

  • All four checked - AI generates everything from scratch
  • Only lyrics checked - AI writes lyrics, you control title and tags
  • Title + tags checked - You write lyrics, AI handles metadata
  • Nothing checked - Use exactly what you entered, no AI changes

Context-Aware Intelligence

The magic happens when you provide partial content:

+----------------------------------+--------------------------------+
| You provide:                     | AI understands:                |
+----------------------------------+--------------------------------+
| Description: "upbeat summer"     | -> Uses as creative direction  |
| Title: (empty)                   | -> Will generate               |
| Lyrics: "Feel the sunshine"      | -> Context for generation      |
| Tags: (empty)                    | -> Will generate               |
|                                                                   |
| Checkboxes: [x] Title  [x] Tags  [ ] Lyrics                       |
|                                                                   |
| Result: AI generates title and tags that match YOUR lyrics        |
|         and description. Your lyrics stay untouched.              |
+-------------------------------------------------------------------+

Unchecked fields with values become context - AI reads them but won't modify them. This ensures coherent results that respect your creative input.
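
The checkbox logic described above can be sketched in a few lines of Python. This is only an illustration of the rule "checked fields are generated, unchecked non-empty fields become read-only context"; the names `FieldState` and `plan_generation` are hypothetical and are not taken from the HeartMuse source.

```python
from dataclasses import dataclass

FIELDS = ("description", "title", "lyrics", "tags")


@dataclass
class FieldState:
    """One input field (hypothetical model of the UI state)."""
    value: str = ""
    generate: bool = False  # state of the "Generate/Enhance" checkbox


def plan_generation(fields: dict[str, FieldState]) -> tuple[list[str], dict[str, str]]:
    """Split the four fields into generation targets and read-only context."""
    # Checked fields are what the AI is asked to write or extend.
    targets = [name for name in FIELDS if fields[name].generate]
    # Unchecked fields that contain text are passed along as context only.
    context = {
        name: fields[name].value
        for name in FIELDS
        if not fields[name].generate and fields[name].value.strip()
    }
    return targets, context


# The situation from the box above: title and tags checked, lyrics untouched.
fields = {
    "description": FieldState("upbeat summer"),
    "title": FieldState("", generate=True),
    "lyrics": FieldState("Feel the sunshine"),
    "tags": FieldState("", generate=True),
}
targets, context = plan_generation(fields)
print(targets)  # ['title', 'tags']
print(context)  # {'description': 'upbeat summer', 'lyrics': 'Feel the sunshine'}
```

A checked field that already has content (for example lyrics to extend) would also need its current value passed to the model; the sketch leaves that out to keep the split between targets and context easy to see.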

Smart Lyrics Preservation

When extending existing lyrics, HeartMuse offers two levels of protection:

| Syntax | Protection Level | Use Case |
|--------|-----------------|----------|
| "exact text" | 100% locked - never changed, character-for-character | Your signature lines, hooks |
| regular text | Preserved - meaning kept, minor polish allowed | Draft verses you want improved |

Example:

[chorus]
"This exact line will never change!"
This line might get slight improvements

Result: The quoted line is untouchable. The unquoted line may be refined while keeping its meaning.
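
One way to check the quoted-line guarantee yourself is to compare the lyrics before and after generation. The sketch below assumes a line counts as locked when the whole line is wrapped in double quotes; the helper names are hypothetical, and this is not how HeartMuse enforces the rule internally.

```python
import re

# A locked line: optional whitespace, a double-quoted span, optional whitespace.
QUOTED_LINE = re.compile(r'^\s*"(.+)"\s*$')


def locked_lines(lyrics: str) -> list[str]:
    """Return the text of every fully quoted line, without the quotes."""
    locked = []
    for line in lyrics.splitlines():
        match = QUOTED_LINE.match(line)
        if match:
            locked.append(match.group(1))
    return locked


def missing_locked_lines(original: str, generated: str) -> list[str]:
    """List locked lines from `original` that do not appear verbatim in `generated`."""
    return [line for line in locked_lines(original) if line not in generated]


original = (
    "[chorus]\n"
    '"This exact line will never change!"\n'
    "This line might get slight improvements"
)
generated = (
    "[chorus]\n"
    '"This exact line will never change!"\n'
    "This line may get small improvements\n"
    "\n"
    "[verse]\n"
    "New words written by the model"
)
print(missing_locked_lines(original, generated))  # []
```

An empty list means every quoted line survived character-for-character. The substring check passes whether or not the output keeps the quotation marks around the line.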

Extend, Don't Replace

When "Generate/Enhance" is checked for lyrics that already have content:

  • AI adds new sections (verses, bridge, outro)
  • AI completes unfinished sections
  • AI preserves everything you wrote
  • AI never deletes your content
  • AI never rewrites existing sections from scratch

Edit Instructions

Use the Edit Instructions field to give natural language commands for refining generated content:

Examples:
- "Change the name Eva to Ela"
- "Add two more verses"
- "Rework the chorus to be more upbeat"
- "Replace all references to rain with sunshine"

The AI applies your edits while preserving the rest of the content. Quoted text remains protected even during edits.

Duration-Aware Lyrics

Set your target song length, and AI adjusts lyrics accordingly:

| Duration | AI Behavior |
|----------|-------------|
| Under 60s | Short and punchy - 1-2 sections |
| 60-120s | Standard structure - verse, chorus, verse |
| Over 120s | Extended - multiple verses, bridge, outro |
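
The duration tiers can be read as a simple lookup. The function below is a hypothetical sketch of that mapping, with both 60s and 120s assumed to fall in the middle tier as the table suggests; the actual prompt logic in HeartMuse may differ.

```python
def lyric_structure_hint(duration_s: int) -> str:
    """Map a target song length in seconds to the structure hint from the table."""
    if duration_s < 60:
        return "Short and punchy - 1-2 sections"
    if duration_s <= 120:
        return "Standard structure - verse, chorus, verse"
    return "Extended - multiple verses, bridge, outro"


for seconds in (45, 60, 120, 240):
    print(seconds, "->", lyric_structure_hint(seconds))
```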

From-Scratch Creative Mode

Leave all fields empty and click "Generate Text" - AI creates a completely original song concept:

  • Invents a unique theme and story
  • Writes complete lyrics with proper structure
  • Suggests a fitting title
  • Recommends appropriate musical tags

Each generation is different - use this for inspiration or when you want to be surprised!


Real-World Workflow Examples

Example 1: Full AI Generation

Input:   (everything empty, all checkboxes checked)
Output:  Complete original song - title, lyrics, tags, ready for music generation

Example 2: Your Lyrics, AI Metadata

Input:   Your complete lyrics in the Lyrics field
         [ ] Description  [ ] Lyrics  [x] Title  [x] Tags
Output:  AI suggests a perfect title and musical tags that match YOUR lyrics

Example 3: Expand a Hook

Input:   Lyrics: "[chorus]\nDancing in the moonlight"
         [x] Lyrics checked
Output:  AI keeps your chorus and adds verses, bridge, outro around it

Example 4: Protected Refrain + AI Verses

Input:   Lyrics:
         [chorus]
         "We are the champions, my friend"
         "And we'll keep on fighting till the end"

         [verse]
         (something about struggle and victory)

         [x] Lyrics checked

Output:  Quoted chorus: 100% unchanged
         Verse placeholder: AI writes full lyrics about struggle and victory

Example 5: Enhance Vague Description

Input:   Description: "sad piano song"
         [x] Description checked
Output:  Description expanded to: "A melancholic piano ballad with introspective
         lyrics about lost love, featuring gentle arpeggios and emotional vocal
         delivery in a minor key"

Example 6: Iterative Refinement

Round 1: Generate title + lyrics from description
Round 2: Use Edit Instructions: "Make the chorus more energetic"
Round 3: Tweak tags manually, generate music
Round 4: Generate 3 batch variants, pick the best one

Style Transfer (Experimental)

Upload a reference audio track to influence the musical style of your generation. HeartMuse uses MuQ-MuLan to extract a style embedding from the reference and inject it into the HeartMuLa generation process.

  • Style Strength slider (0-10x) - control how strongly the reference influences output
  • Runs on CPU - no GPU memory impact, works alongside HeartMuLa
  • Supports common audio formats (MP3, WAV, FLAC, etc.)

Works best with clear, well-produced reference tracks. The model captures high-level style characteristics (genre, mood, instrumentation) rather than copying melodies.


Audio Transcription

The Transcribe tab extracts lyrics from existing audio recordings using HeartTranscriptor (a Whisper-based model).

  • Upload any audio file and get transcribed lyrics
  • Click "Send to Generator" to use transcribed lyrics as a starting point
  • For best results, use source-separated vocal tracks (e.g., via Demucs)

Toggle this feature with TRANSCRIPTION=true or TRANSCRIPTION=false in .env.
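
For readers wondering how such a flag is typically read, here is a minimal sketch. It assumes the .env values have already been loaded into the process environment, and both the function name and the default of enabled are assumptions rather than documented HeartMuse behavior.

```python
import os
from collections.abc import Mapping


def transcription_enabled(env: Mapping[str, str] = os.environ, default: bool = True) -> bool:
    """Interpret TRANSCRIPTION=true/false; fall back to `default` when unset."""
    raw = env.get("TRANSCRIPTION")
    if raw is None:
        return default  # assumed default, not confirmed by the README
    return raw.strip().lower() == "true"


print(transcription_enabled({"TRANSCRIPTION": "false"}))  # False
print(transcription_enabled({"TRANSCRIPTION": "true"}))   # True
```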


Batch Generation & Reproducibility

  • Batch Variants (1-10) - generate multiple versions from the same lyrics/tags in one run
  • Seed Control - set a specific seed to reproduce exact results, or use -1 for random
  • Post-generation Statistics - view timing breakdown (text gen, style extraction, music gen per variant), GPU peak VRAM, model variant, and seed value
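
The seed convention (-1 for random, any other value for a reproducible run) can be illustrated with Python's standard `random` module. `resolve_seed` and `demo_generate` are hypothetical stand-ins; the real generator is HeartMuLa, and the 32-bit seed range used here is an assumption.

```python
import random


def resolve_seed(seed: int) -> int:
    """Return the seed to use: a fresh random one for -1, otherwise the given value."""
    if seed == -1:
        return random.randrange(2**32)  # assumed range, for illustration only
    return seed


def demo_generate(seed: int) -> list[float]:
    """Stand-in for a sampling-based generator: same seed, same output."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


fixed = resolve_seed(1234)
print(demo_generate(fixed) == demo_generate(fixed))  # True

picked = resolve_seed(-1)
print(picked)  # log this value so the run can be repeated later
```

Recording the resolved seed, as the post-generation statistics do, is what makes a random run reproducible afterwards.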

History & Playlist

The History tab keeps all your generations organized:

  • Playlist Player - sequential or shuffle playback with next/prev controls and seeking
  • History Cards - title, description, tags, audio player, and generation parameters for each song
  • Actions - Load to Generator (reuse settings), Load for Edit (reuse settings + seed), Delete
  • Pagination - browse through all past generations, 10 per page

All generations are stored as MP3 + JSON metadata in the output/ directory.
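
Because each generation is a plain MP3 plus a JSON file, the history can also be inspected from a script. The sketch below assumes that the audio and metadata files share a file stem and that newest-first ordering is wanted; neither the naming scheme nor the JSON fields are documented here, so the helpers are hypothetical and load the metadata without assuming any keys.

```python
import json
import math
from pathlib import Path

PAGE_SIZE = 10  # the History tab shows 10 generations per page


def list_history(output_dir: Path) -> list[Path]:
    """Return metadata files in `output_dir`, newest first (assumed ordering)."""
    return sorted(output_dir.glob("*.json"), key=lambda p: p.stat().st_mtime, reverse=True)


def page_count(entries: list[Path], page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed, with at least one page for an empty history."""
    return max(1, math.ceil(len(entries) / page_size))


def get_page(entries: list[Path], page: int, page_size: int = PAGE_SIZE) -> list[Path]:
    """Return the entries on a 1-indexed page."""
    start = (page - 1) * page_size
    return entries[start:start + page_size]


def load_entry(meta_path: Path) -> dict:
    """Pair a metadata file with the MP3 that shares its stem (assumed naming)."""
    metadata = json.loads(meta_path.read_text(encoding="utf-8"))
    return {"audio": meta_path.with_suffix(".mp3"), "metadata": metadata}


if __name__ == "__main__":
    entries = list_history(Path("output"))
    print(f"{len(entries)} generations on {page_count(entries)} page(s)")
    for meta_path in get_page(entries, 1):
        print(load_entry(meta_path)["audio"].name)
```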
