Fix source-missing auto-merge and remove Pink Floyd examples from prompts

Auto-merge: when the ingest pipeline detects "source file missing", it now
checks whether the track already exists in the library by file_hash. If so,
it marks the pending entry as 'merged' instead of 'error', which avoids stale
error entries for files that were already ingested in a previous run.
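
The decision described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the `PendingEntry` record, the `resolve_missing_source` function, and the in-memory set of library hashes are all hypothetical. Only `file_hash` and the 'merged' / 'error' statuses come from this commit message.

```python
from dataclasses import dataclass


@dataclass
class PendingEntry:
    # Hypothetical shape of a pending ingest record (assumption).
    path: str
    file_hash: str
    status: str = "pending"


def resolve_missing_source(entry: PendingEntry, library_hashes: set[str]) -> PendingEntry:
    """Pick a status for a pending entry whose source file no longer exists.

    library_hashes stands in for a lookup of file hashes already present
    in the library (assumed here to be a plain set).
    """
    if entry.file_hash in library_hashes:
        # The track was ingested in an earlier run, so the missing
        # source file is expected rather than a failure.
        entry.status = "merged"
    else:
        # No matching track in the library: the file is genuinely missing.
        entry.status = "error"
    return entry
```

In a real pipeline the hash lookup would presumably be a database query rather than a set membership test.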

Prompts: replaced Pink Floyd/The Wall/Have a Cigar examples in both
normalize.txt and merge.txt with Deep Purple examples. The LLM was using
these famous artist/album/track names as fallback output when raw metadata
was empty or ambiguous, causing hallucinated metadata like
"artist: Pink Floyd, title: Have a Cigar" for completely unrelated tracks.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:05:22 +00:00
parent 8d70a5133a
commit 71d5a38f21
3 changed files with 28 additions and 16 deletions


@@ -3,10 +3,10 @@ You are a music metadata normalization assistant. Your job is to take raw metada
## Rules
1. **Artist names** must use correct capitalization and canonical spelling. Examples:
- - "pink floyd" → "Pink Floyd"
+ - "deep purple" → "Deep Purple"
- "AC DC" → "AC/DC"
- - "Guns n roses" → "Guns N' Roses"
- - "Led zepplin" → "Led Zeppelin" (fix common misspellings)
+ - "guns n roses" → "Guns N' Roses"
+ - "led zepplin" → "Led Zeppelin" (fix common misspellings)
- "саша скул" → "Саша Скул" (fix capitalization, keep the language as-is)
- If the database already contains a matching artist (same name in any case or transliteration), always use the existing canonical name exactly. For example, if the DB has "Саша Скул" and the file says "саша скул" or "Sasha Skul", use "Саша Скул".
- **Compound artist fields**: When the artist field or path contains multiple artist names joined by "и", "and", "&", "/", ",", "x", or "vs", you MUST split them. The "artist" field must contain ONLY ONE primary artist. All others go into "featured_artists". If one of the names already exists in the database, prefer that one as the primary artist.
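
For illustration, the compound-artist rule above could be approximated deterministically, for example as a sanity check on the model's output. This is a sketch under assumptions: `split_artists` and the `known_artists` set are hypothetical names, and the separator list is taken from the rule text. A naive split on "/" would break names such as "AC/DC", so the sketch first checks whether the whole string is already a known artist.

```python
import re

# Word separators need surrounding whitespace; "/" and "," do not.
_SEPARATORS = re.compile(r"\s+(?:и|and|&|x|vs\.?)\s+|\s*[/,]\s*", re.IGNORECASE)


def split_artists(raw: str, known_artists: frozenset = frozenset()):
    """Return (primary_artist, featured_artists) for a raw artist field."""
    known = {k.casefold() for k in known_artists}
    # A name that already exists in the database is never split ("AC/DC").
    if raw.strip().casefold() in known:
        return raw.strip(), []
    names = [n.strip() for n in _SEPARATORS.split(raw) if n.strip()]
    if not names:
        return "", []
    # Prefer a name that is already in the database as the primary artist.
    primary = next((n for n in names if n.casefold() in known), names[0])
    featured = [n for n in names if n != primary]
    return primary, featured
```

The sketch does not canonicalize capitalization or transliteration; it only shows the split and the preference for an existing database name.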
@@ -43,12 +43,12 @@ You are a music metadata normalization assistant. Your job is to take raw metada
- Preserve original language for non-English albums.
- If the database already contains a matching album under the same artist, use the existing name exactly.
- Do not alter the creative content of album names (same principle as track titles).
- - **Remastered editions**: A remastered release is a separate album entity, even if it shares the same title and tracks as the original. If the tags or path indicate a remaster (e.g., "Remastered", "Remaster", "REMASTERED" anywhere in tags, filename, or path), append " (Remastered)" to the album name if not already present, and use the year of the remaster release (not the original). Example: original album "The Wall" (1979) remastered in 2011 → album: "The Wall (Remastered)", year: 2011.
+ - **Remastered editions**: A remastered release is a separate album entity, even if it shares the same title and tracks as the original. If the tags or path indicate a remaster (e.g., "Remastered", "Remaster", "REMASTERED" anywhere in tags, filename, or path), append " (Remastered)" to the album name if not already present, and use the year of the remaster release (not the original). Example: original album "Paranoid" (1970) remastered in 2009 → album: "Paranoid (Remastered)", year: 2009.
4. **Track titles** must use correct capitalization, but their content must be preserved exactly.
- Use title case for English titles.
- Preserve original language for non-English titles.
- - Remove leading track numbers if present (e.g., "01 - Have a Cigar" → "Have a Cigar").
+ - Remove leading track numbers if present (e.g., "01 - Smoke on the Water" → "Smoke on the Water").
- **NEVER remove, add, or alter words, numbers, suffixes, punctuation marks, or special characters in titles.** Your job is to fix capitalization and encoding, not to edit the creative content. If a title contains unusual punctuation, numbers, apostrophes, or symbols — they are intentional and must be kept as-is.
- If all tracks in the same album follow a naming pattern (e.g., numbered names like "Part 1", "Part 2"), preserve that pattern consistently. Do not simplify or truncate individual track names.
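
Two of the rules above (stripping leading track numbers and appending " (Remastered)") are mechanical enough to sketch in code. The following is a rough illustration only: the function names and regular expressions are assumptions, not part of the prompt or the agent, and the remaster year is not handled here.

```python
import re

# A leading number counts as a track number only when followed by a
# separator ("-" or ".") and whitespace, so "99 Luftballons" is untouched.
_LEADING_TRACK_NO = re.compile(r"^\s*\d{1,3}\s*[-.]\s+")
_REMASTER = re.compile(r"remaster(ed)?", re.IGNORECASE)


def strip_track_number(title: str) -> str:
    """Remove a leading track number; leave the rest of the title as-is."""
    return _LEADING_TRACK_NO.sub("", title, count=1)


def apply_remaster_rule(album: str, hints: list) -> str:
    """Append " (Remastered)" when any tag/filename/path hint mentions a remaster.

    hints is an assumed list of raw strings (tags, filename, path).
    """
    mentions_remaster = any(_REMASTER.search(h) for h in hints)
    if mentions_remaster and not _REMASTER.search(album):
        return f"{album} (Remastered)"
    return album
```

For example, `strip_track_number("01 - Smoke on the Water")` yields `"Smoke on the Water"`. A title that legitimately begins with a number followed by a dash would be mis-stripped by this heuristic, which is one reason the prompt leaves the judgment to the model.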