GitHub - wesm/transcript-playbook

A guide to creating AI-generated transcripts with summaries and key quotes from talks and podcasts using Whisper and Claude.

See https://wesmckinney.com/presentations for examples.

Overview

This workflow takes a video or podcast URL and produces a formatted markdown transcript with:

YAML frontmatter for metadata
A first-person summary of key topics
Extracted "money quotes"
Full verbatim transcript with speaker labels

Prerequisites

Required Tools

# yt-dlp for downloading audio from YouTube/Vimeo/etc
brew install yt-dlp

# ffmpeg for audio compression (if needed)
brew install ffmpeg

# Python 3.8+ with these packages
pip install openai pyyaml

API Keys

OpenAI API key for Whisper transcription (set as OPENAI_API_KEY environment variable)

Complete Workflow

Step 1: Get Video Metadata

First, extract metadata from the video URL:

yt-dlp --print "%(title)s|||%(upload_date)s|||%(duration)s" "VIDEO_URL"

Important: The upload date may differ from the actual talk date. Verify the actual event date from the video description or event website.
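The `|||`-delimited line printed by the command above is easy to split in a script. Here is a minimal sketch, assuming exactly the output template shown in this step. `parse_metadata` is a hypothetical helper, not one of the repository's scripts. yt-dlp prints `upload_date` as `YYYYMMDD`, so the sketch converts it to `YYYY-MM-DD`. The result is still the upload date and may not be the talk date.

```python
from datetime import datetime


def parse_metadata(line: str) -> dict:
    """Split one 'title|||upload_date|||duration' line into a dict."""
    # rsplit keeps the title intact even if it happens to contain '|||'.
    title, upload_date, duration = line.strip().rsplit("|||", 2)
    return {
        "title": title,
        # Convert YYYYMMDD to ISO format (YYYY-MM-DD).
        "upload_date": datetime.strptime(upload_date, "%Y%m%d").date().isoformat(),
        # Duration is reported in seconds.
        "duration_seconds": int(float(duration)),
    }
```

If a field is unavailable for a given site, the parsing will raise an error, which is probably what you want before continuing with the workflow.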

Step 2: Download Audio

# Download as MP3 (quality 5 is a good balance of size and quality)
yt-dlp -x --audio-format mp3 --audio-quality 5 -o "/tmp/talk-audio.%(ext)s" "VIDEO_URL"

Or use the provided script:

./scripts/download-audio.sh "VIDEO_URL" "output-name"

Step 3: Check File Size

The Whisper API rejects uploads larger than 25MB. Check the file size and compress if needed:

ls -lh /tmp/talk-audio.mp3

# If > 25MB, compress to 64 kbps mono, then use the compressed file in Step 4:
ffmpeg -i /tmp/talk-audio.mp3 -b:a 64k -ac 1 /tmp/talk-audio-compressed.mp3
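A fixed 64 kbps is not always enough. At that bitrate an hour of audio is about 28.8 MB (3600 s × 8000 bytes/s), which is still over the limit. The sketch below picks a bitrate from the duration instead. It assumes the limit means 25 × 1024 × 1024 bytes and leaves a 5% margin for container overhead. Both figures and both helper names are assumptions for illustration.

```python
import os

LIMIT_BYTES = 25 * 1024 * 1024  # assumed interpretation of "25MB"


def needs_compression(path: str, limit: int = LIMIT_BYTES) -> bool:
    """True if the file on disk is larger than the upload limit."""
    return os.path.getsize(path) > limit


def max_bitrate_kbps(duration_seconds: float, limit: int = LIMIT_BYTES,
                     margin: float = 0.95) -> int:
    """Largest whole-kbps audio bitrate that keeps the file under the limit."""
    bits_available = limit * 8 * margin
    return int(bits_available / duration_seconds / 1000)
```

For a 60-minute talk this returns 55, so passing something like `-b:a 48k` to ffmpeg would be a safer choice than `64k`.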

Step 4: Transcribe with Whisper

Using the OpenAI Whisper API:

python scripts/transcribe.py /tmp/talk-audio.mp3 > /tmp/raw-transcript.txt

Or manually via the API:

from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment.
client = OpenAI()
with open("/tmp/talk-audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        response_format="text",  # return plain text rather than JSON
    )
print(transcript)

Step 5: Save Raw Transcript

Always save the raw transcript before formatting:

cp /tmp/raw-transcript.txt transcripts/raw/YYYY-MM-DD-event-slug-raw.txt

Step 6: Generate Summary with Claude

Use Claude to create a first-person summary. Provide the raw transcript and this prompt:

Please analyze this transcript and create:

1. A SUMMARY section written in first person ("I", "my") that covers the key topics discussed. Organize into subsections if there are distinct topics. Be factual - avoid grandiose language like "groundbreaking" or "revolutionary".

2. A KEY QUOTES section with 3-5 impactful direct quotes from the transcript, formatted as blockquotes with context.

3. A cleaned TRANSCRIPT with speaker names in bold followed by colons.

Raw transcript:
[paste transcript here]
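If you prefer to assemble the prompt in a script rather than pasting by hand, a small sketch could look like this. `build_prompt` is a hypothetical helper. How the prompt is then sent to Claude (web UI, API, or CLI) is left open here.

```python
PROMPT_TEMPLATE = """Please analyze this transcript and create:

1. A SUMMARY section written in first person ("I", "my") that covers the key topics discussed. Organize into subsections if there are distinct topics. Be factual - avoid grandiose language like "groundbreaking" or "revolutionary".

2. A KEY QUOTES section with 3-5 impactful direct quotes from the transcript, formatted as blockquotes with context.

3. A cleaned TRANSCRIPT with speaker names in bold followed by colons.

Raw transcript:
{transcript}
"""


def build_prompt(raw_transcript: str) -> str:
    """Insert the raw transcript into the prompt shown above."""
    return PROMPT_TEMPLATE.format(transcript=raw_transcript.strip())
```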

Step 7: Create Formatted Transcript File

Create the file at transcripts/YYYY-MM-DD-event-slug.md:

---
title: "Talk Title"
date: YYYY-MM-DD
event: "Event Name"
location: "City, State/Country"
video_url: "https://..."
video_type: "Talk"
transcribed: YYYY-MM-DD
---

*This transcript and summary were AI-generated and may contain errors.*

## Summary

[First-person summary here]

## Key Quotes

> "Quote text here" — Context or speaker

## Transcript

**Speaker Name:** Dialogue text...

**Other Speaker:** Response text...

See templates/transcript-template.md for a complete template.
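The same layout can be produced from a script. The following is a sketch only, using the standard library rather than the repository's `format-transcript.py` (whose contents are not shown here). `render_transcript` and its parameters are hypothetical. Strings are quoted with `json.dumps`, relying on JSON double-quoted strings also being valid YAML scalars, and `date` values are written unquoted as in the template.

```python
import json
from datetime import date

DISCLAIMER = "*This transcript and summary were AI-generated and may contain errors.*"


def render_transcript(meta: dict, summary: str, quotes: list, transcript: str) -> str:
    """Assemble frontmatter, disclaimer, and the three sections into one document."""
    lines = []
    for key, value in meta.items():
        if isinstance(value, date):
            lines.append(f"{key}: {value.isoformat()}")
        else:
            lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    front = "\n".join(lines)
    # Each quote is expected to be pre-formatted text; only the '> ' marker is added.
    quote_block = "\n\n".join(f"> {q}" for q in quotes)
    return (
        f"---\n{front}\n---\n\n"
        f"{DISCLAIMER}\n\n"
        f"## Summary\n\n{summary.strip()}\n\n"
        f"## Key Quotes\n\n{quote_block}\n\n"
        f"## Transcript\n\n{transcript.strip()}\n"
    )
```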

Step 8: Add to talks.yml (if applicable)

If you're integrating with a Quarto blog like the original site, add an entry to talks.yml:

- date: 'YYYY-MM-DD'
  type: podcast  # or: talk, interview, keynote, tutorial
  role: guest    # for podcasts: guest or co-host
  event: "Event Name"
  title: "Talk Title"
  location: Remote
  links:
  - type: Video
    url: https://...

The date must match the transcript filename exactly for auto-linking to work.
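The auto-linking logic itself is not shown in this guide, so the following is only a sketch of a pre-flight check. It assumes "match" means the transcript filename begins with the entry's `date` string followed by a hyphen. `date_matches_filename` is a hypothetical helper.

```python
from pathlib import Path


def date_matches_filename(entry_date: str, transcript_path: str) -> bool:
    """True if the file name starts with the talks.yml date plus a hyphen."""
    return Path(transcript_path).name.startswith(f"{entry_date}-")
```

For example, `date_matches_filename("2024-05-15", "transcripts/2024-05-15-talk-python-to-me-pandas.md")` is true, while the same path with `"2024-05-16"` is not.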

Step 9: Preview and Verify

quarto preview
# Check that transcript renders correctly
# Verify links work

File Naming Convention

Pattern: YYYY-MM-DD-event-slug.md

Use the actual talk date, not the upload date
Use lowercase with hyphens for the slug
Keep slugs short but descriptive

Examples:

2024-05-15-talk-python-to-me-pandas.md
2023-09-20-pycon-keynote.md
2022-03-21-gresearch-interview.md
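The naming rules can be enforced with a short helper. This is a sketch under the convention described above. The regular expression and the function names are assumptions for illustration, not something the repository ships.

```python
import re
from datetime import date

# YYYY-MM-DD, then a lowercase hyphenated slug, then .md
FILENAME_RE = re.compile(r"^\d{4}-\d{2}-\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*\.md$")


def slugify(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def transcript_filename(talk_date: date, event: str) -> str:
    """Build a filename from the actual talk date and a short event description."""
    return f"{talk_date.isoformat()}-{slugify(event)}.md"
```

The pattern checks only the shape of the name. It cannot tell whether the date is the talk date or the upload date, so that still needs a manual check.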

YAML Frontmatter Reference

| Field | Required | Description |
|---|---|---|
| `title` | Yes | Talk or episode title |
| `date` | Yes | Actual talk date (YYYY-MM-DD) |
| `event` | Yes | Event, conference, or podcast name |
| `location` | Yes | City, State/Country or "Remote" |
| `video_url` | No* | URL to video/podcast |
| `video_type` | No* | Talk, Keynote, Podcast, Interview, Tutorial |
| `slides_url` | No* | URL to slides (if no video) |
| `transcribed` | No | Date transcript was created |

*Use either video_url + video_type OR slides_url, not both.
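These rules are simple enough to check mechanically once the frontmatter has been loaded into a dict (for example with `yaml.safe_load`). The sketch below assumes that `video_url` and `video_type` must appear together, which is one reading of the footnote. It does not flag a file that has neither a video nor slides, since the table does not say whether that is allowed. `frontmatter_problems` is a hypothetical name.

```python
REQUIRED = ("title", "date", "event", "location")


def frontmatter_problems(meta: dict) -> list:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = [f"missing required field: {name}" for name in REQUIRED if not meta.get(name)]
    has_video = bool(meta.get("video_url"))
    has_type = bool(meta.get("video_type"))
    has_slides = bool(meta.get("slides_url"))
    if has_video != has_type:
        problems.append("video_url and video_type must be given together")
    if has_video and has_slides:
        problems.append("use video_url + video_type or slides_url, not both")
    return problems
```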

Summary Writing Guidelines

Voice and Tone

First person: Write as if you gave the talk ("I discussed...", "My approach...")
Factual: Focus on what was actually said, not interpretation
No puffery: Avoid "groundbreaking", "revolutionary", "transformative", etc.
Organized: Use subsections (### Topic) for distinct themes

Structure

## Summary

Brief overview paragraph of the main topics covered.

### First Major Topic

Details about this topic...

### Second Major Topic

Details about this topic...

Key Quotes Formatting

## Key Quotes

> "The exact quote from the transcript" — Context about when/why this was said

> "Another impactful quote" — Speaker attribution if multiple speakers

Transcript Formatting

Speaker names in bold followed by colon
Each speaker turn on its own paragraph
Preserve natural speech (can clean up minor filler words)
Use *[brackets]* for non-speech elements: *[laughter]*, *[applause]*

## Transcript

**Host:** Welcome to the show. Today we're talking about data science.

**Guest:** Thanks for having me. I'm excited to discuss this topic.

*[Brief pause]*

**Host:** Let's start with your background.
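If speaker turns are already available as structured data, the formatting rules above reduce to a few lines. This is a sketch only. The `(speaker, text)` tuple shape, with `None` marking a non-speech element, is an assumption made for illustration.

```python
def format_turns(turns) -> str:
    """Render (speaker, text) pairs; a speaker of None marks a non-speech note."""
    paragraphs = []
    for speaker, text in turns:
        if speaker is None:
            # Non-speech elements go in italic brackets, e.g. *[laughter]*
            paragraphs.append(f"*[{text.strip()}]*")
        else:
            paragraphs.append(f"**{speaker}:** {text.strip()}")
    # A blank line between turns puts each one in its own paragraph.
    return "\n\n".join(paragraphs)
```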

Quality Checklist

Before publishing, verify:

Filename follows YYYY-MM-DD-event-slug.md pattern
Date is actual talk date (not upload date)
All required YAML fields present
AI disclaimer included
Summary is in first person
Summary avoids grandiose language
Key quotes use blockquote format
Speaker names are bolded in transcript
No obvious transcription errors or gaps
Raw transcript saved in raw/ folder
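Several of these items can be checked automatically before a human read-through. The sketch below covers only the mechanical ones (disclaimer, section headings, blockquotes, bolded speakers, and the listed grandiose words within the summary). `checklist_failures` is a hypothetical helper, and the regular expressions are one possible interpretation of the formatting rules.

```python
import re

DISCLAIMER = "*This transcript and summary were AI-generated and may contain errors.*"
PUFFERY = ("groundbreaking", "revolutionary", "transformative")


def checklist_failures(markdown: str) -> list:
    """Return the checklist items that the formatted transcript fails."""
    failures = []
    if DISCLAIMER not in markdown:
        failures.append("AI disclaimer missing")
    for heading in ("## Summary", "## Key Quotes", "## Transcript"):
        if not re.search(rf"^{re.escape(heading)}\s*$", markdown, re.MULTILINE):
            failures.append(f"section missing: {heading}")
    if not re.search(r"^> ", markdown, re.MULTILINE):
        failures.append("no blockquote found for key quotes")
    if not re.search(r"^\*\*[^*]+:\*\* ", markdown, re.MULTILINE):
        failures.append("no bolded speaker names found")
    # Only the summary is scanned; the verbatim transcript may use these words.
    summary = markdown.split("## Summary", 1)[-1].split("## Key Quotes", 1)[0].lower()
    failures += [f"grandiose word in summary: {w}" for w in PUFFERY if w in summary]
    return failures
```

Transcription errors, first-person voice, and the talk date still need to be verified by reading the file.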

Troubleshooting

Whisper API 25MB Limit

Compress audio to reduce file size:

ffmpeg -i input.mp3 -b:a 64k -ac 1 output.mp3

Missing Transcript Sections

If the Whisper output has gaps (runs of repeated symbols or [inaudible] markers), try the following:

Re-run transcription on that section
Manually transcribe from video
Note gaps with [inaudible] markers
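To locate suspected gaps in a long raw transcript, a regular expression scan can help. This is a rough sketch. Treating five or more repeats of the same symbol as a gap is an arbitrary threshold chosen so that an ordinary ellipsis is not flagged, and `find_gaps` is a hypothetical helper.

```python
import re

# Matches "[inaudible]" (any case) or one non-word symbol repeated 5+ times.
GAP_RE = re.compile(r"\[inaudible\]|([^\w\s])\1{4,}", re.IGNORECASE)


def find_gaps(text: str) -> list:
    """Return (character_offset, matched_text) for each suspected gap."""
    return [(m.start(), m.group(0)) for m in GAP_RE.finditer(text)]
```

The offsets give a rough position in the text, which can be used to find the corresponding part of the recording to re-transcribe.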

Directory Structure

your-project/
├── transcripts/
│   ├── raw/                    # Raw transcript backups
│   │   └── YYYY-MM-DD-*.txt
│   ├── _metadata.yml           # Quarto metadata (optional)
│   ├── transcript-styles.css   # Custom styles (optional)
│   └── YYYY-MM-DD-*.md         # Formatted transcripts
├── scripts/
│   ├── download-audio.sh
│   ├── transcribe.py
│   └── format-transcript.py
└── templates/
    └── transcript-template.md

Cost Considerations

Whisper API: ~$0.006 per minute of audio
Claude: Varies by usage for summary generation

A 1-hour talk costs approximately $0.36 for Whisper transcription.
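The arithmetic is duration in minutes times the per-minute rate. A small sketch, assuming the approximate rate quoted above still applies (check current pricing before relying on it); `whisper_cost_usd` is a hypothetical helper.

```python
WHISPER_USD_PER_MINUTE = 0.006  # approximate rate quoted in this guide


def whisper_cost_usd(duration_seconds: float) -> float:
    """Estimated Whisper transcription cost in USD, rounded to cents."""
    return round(duration_seconds / 60 * WHISPER_USD_PER_MINUTE, 2)
```

The `duration` value from Step 1 is already in seconds, so it can be passed in directly.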

License

These scripts and templates are provided as-is for educational purposes.