Mostly Mortals

A D&D campaign companion that turns raw session audio into transcripts, illustrated scenes, animated recaps, and narrated 'Previously on...' intros through a multi-model AI pipeline.

Stack: Astro, OpenAI Whisper, GPT Image 2.0, Seedance 2.0, Nano Banana 2, ElevenLabs, Vercel. Status: live.

A weekly tabletop session ends. A few hours later, the players open a web page that reads like a TV recap: a written session report, an illustrated scene gallery in a consistent 2D animation style, per-scene animated clips, a quest log with updated status, and a narrated “Previously on…” intro waiting for next week. Nobody wrote a single note by hand. This is Mostly Mortals, a companion site for my seven-player D&D group and its Level 10 campaign, live at mostly-mortals.bydom.io.

What gets produced

Each session ships as an episode. The site renders a player-facing recap, a separate Best Moments page (quotes, table culture, the off-topic gold), and a DM-only summary kept off the public nav. The Session Gallery holds 8 to 10 illustrated scenes per session in a consistent western fantasy animation style, viewable through a GLightbox carousel. Per-scene Seedance video clips animate the same key moments. A Quest Log accordion tracks every active thread with status badges (Current, Active, Escalated, On Hold, Completed) and updates per session. The Recap page plays the next session’s “Previously on…” narration mixed with an ambient music bed. The campaign site itself runs on Astro with a “Grimoire” theme, currently v1.12.0.

The pipeline behind it

Raw Zoom H2 audio drops into a folder, and a series of slash commands and Node scripts move it through the chain. The audio is combined and compressed with ffmpeg, then transcribed via OpenAI Whisper cloud, with an auto-generated vocabulary prompt built from the knowledgebase so character names, locations, and spells get spelled right. Structured content is extracted with a three-phase human-in-the-loop validation, where I review and correct the extracted facts before anything gets generated downstream. Scene images come from GPT Image 2.0 (Nano Banana 2 and Pro available as alternates), and Seedance 2.0 i2v on inference.sh animates each scene. The narration script is drafted and reviewed, voiced with ElevenLabs v3 in a custom “Old Adventurer” voice, and mixed with an ambient track from the ElevenLabs Music API. The finished episode deploys through the Vercel CLI. The AI does the work; editorial control stays human.
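As a rough illustration of the vocabulary-prompt step, the sketch below flattens a knowledgebase into a deduplicated, comma-separated string that could be passed as the transcription prompt. The knowledgebase shape, the example names, and the character budget are all assumptions made for the example, not the project's actual data or limits.

```javascript
// Sketch only: the knowledgebase shape and every name below are hypothetical.
function buildVocabularyPrompt(kb, maxChars = 800) {
  // Flatten every category, trim whitespace, and drop empty entries.
  const terms = Object.values(kb)
    .flat()
    .map((term) => term.trim())
    .filter((term) => term.length > 0);

  // Deduplicate case-insensitively, keeping the first spelling seen.
  const seen = new Set();
  const unique = [];
  for (const term of terms) {
    const key = term.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(term);
    }
  }

  // Add terms in order until the (assumed) character budget is used up.
  const kept = [];
  let length = 0;
  for (const term of unique) {
    const added = (kept.length === 0 ? 0 : 2) + term.length; // 2 for ", "
    if (length + added > maxChars) break;
    kept.push(term);
    length += added;
  }
  return kept.join(", ");
}

// Made-up example data standing in for the real knowledgebase.
const exampleKb = {
  characters: ["Aldric Venn", "Mirelle"],
  locations: ["Thornwatch", "thornwatch "],
  spells: ["Counterspell"],
};

const vocabPrompt = buildVocabularyPrompt(exampleKb);
console.log(vocabPrompt);
```

Ordering matters when the budget is tight, so a real version would probably put the names most likely to appear in the session first.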

Solving the consistency problem

The thing that makes a multi-month illustrated archive hold together is not the image model; it is the anchoring. A locked character portrait registry plus a versioned style block in scripts/lib/style.js is the single source of truth for visual canon. When a scene’s prompt mentions a character by name, the script auto-includes that character’s locked portrait as an images.edit input reference, anchoring both likeness and style across every session and every model. Add a new character to the registry once, and every future scene featuring them stays on-model. This is the part that turns a script chain into a system.
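The anchoring idea can be sketched as a pure function that scans a scene prompt for registered names and returns the prompt plus the portraits to attach. This is an assumption-heavy illustration: the registry shape, file paths, alias handling, and style text are hypothetical, and the actual image API call is left out.

```javascript
// Sketch only: registry entries, paths, and the style text are hypothetical.
const STYLE_BLOCK = {
  version: "example-1",
  text: "2D western fantasy animation style, clean line art, flat shading",
};

const portraitRegistry = [
  { name: "Aldric Venn", aliases: ["Aldric"], portrait: "portraits/aldric-venn.png" },
  { name: "Mirelle", aliases: [], portrait: "portraits/mirelle.png" },
];

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Whole-word, case-insensitive match of a name inside the prompt.
function mentions(prompt, name) {
  return new RegExp(`\\b${escapeRegExp(name)}\\b`, "i").test(prompt);
}

// Append the versioned style block and collect portraits for every
// registered character the prompt names (by full name or alias).
function buildSceneRequest(scenePrompt, registry, style) {
  const referenced = registry.filter((character) =>
    [character.name, ...character.aliases].some((n) => mentions(scenePrompt, n))
  );
  return {
    prompt: `${scenePrompt}\n\nStyle (${style.version}): ${style.text}`,
    referenceImages: referenced.map((character) => character.portrait),
  };
}

const sceneRequest = buildSceneRequest(
  "Aldric braces the door while the vault floods.",
  portraitRegistry,
  STYLE_BLOCK
);
console.log(sceneRequest.referenceImages);
```

Keeping this step free of network calls means the same request object could be handed to whichever image model is in use, which is one plausible way to get consistency across models.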

At-the-table tools

The same knowledgebase powers a live game assistant during play. /dm answers combat tactics, spell rule questions, NPC recall, and inventory lookups in real time at the table. /dm note: captures quick observations that get timestamped, saved, and folded back into the next session’s Whisper vocabulary prompt to improve transcription accuracy. The system runs both during the session and after it.
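A minimal sketch of the note-capture loop, assuming a simple command grammar: anything after "/dm note:" becomes a timestamped note, anything else after "/dm" is treated as a question. The capitalized-word heuristic for pulling vocabulary terms out of notes is a deliberately crude stand-in, and the example note text is invented.

```javascript
// Sketch only: the grammar beyond "/dm" and "/dm note:" is assumed.
function parseDmCommand(input, now = new Date()) {
  const match = /^\/dm\s+(.*)$/s.exec(input.trim());
  if (!match) return null; // not a /dm command
  const body = match[1].trim();
  const note = /^note:\s*(.*)$/is.exec(body);
  if (note) {
    return { kind: "note", text: note[1].trim(), timestamp: now.toISOString() };
  }
  return { kind: "question", text: body };
}

// Crude heuristic: treat runs of capitalized words as candidate proper nouns.
// It will also catch sentence-initial words, so a real version needs review.
function vocabularyFromNotes(notes) {
  const terms = new Set();
  for (const note of notes) {
    const found = note.text.match(/\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b/g) ?? [];
    for (const term of found) terms.add(term);
  }
  return [...terms];
}

const capturedNote = parseDmCommand(
  "/dm note: party met Captain Ysolde at Thornwatch",
  new Date("2024-01-01T20:00:00Z")
);
const noteTerms = vocabularyFromNotes([capturedNote]);
console.log(capturedNote.timestamp, noteTerms);
```

The extracted terms could then be merged into the knowledgebase lists that feed the next session's transcription prompt.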

What it represents

Seven AI services chained into one production pipeline: Whisper, Claude, GPT Image 2.0, Nano Banana 2, Seedance 2.0, ElevenLabs voice, ElevenLabs Music. Each link can fail independently without breaking the chain, and each has a fallback (WhisperX for noisy chunks, alternate image models, multi-shot keyframe anchoring for video continuity). Built for a real group running a real campaign, not a demo. Eighteen months of session history, fully illustrated, animated, and narrated, automated end-to-end, with editorial review built in at the points that matter.
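The per-link fallback behaviour can be sketched as a small helper that tries providers in order and returns the first success. The provider functions here are stand-ins that only simulate a failure on a noisy chunk; they are not real API clients, and the names and error text are hypothetical.

```javascript
// Sketch only: providers are simulated stand-ins, not real API clients.
async function runWithFallback(stepName, providers, input) {
  const errors = [];
  for (const { name, run } of providers) {
    try {
      const output = await run(input);
      return { step: stepName, provider: name, output };
    } catch (err) {
      // Record the failure and move on to the next provider in the list.
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`${stepName} failed on every provider (${errors.join("; ")})`);
}

const transcribeProviders = [
  {
    name: "whisper-cloud",
    run: async (chunk) => {
      if (chunk.noisy) throw new Error("simulated failure on noisy audio");
      return `cloud transcript of ${chunk.id}`;
    },
  },
  {
    name: "whisperx",
    run: async (chunk) => `whisperx transcript of ${chunk.id}`,
  },
];

const fallbackDemo = runWithFallback("transcribe", transcribeProviders, {
  id: "chunk-03",
  noisy: true,
});
fallbackDemo.then((result) => console.log(result.provider, "->", result.output));
```

Because each step reports which provider actually produced its output, a run log built this way would show where fallbacks fired without halting the rest of the chain.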
