GitHub - WeLiveToServe/cli-whisper-push-to-transcript

Whisper Push to Transcript is a Python project for capturing audio notes, transcribing them with OpenAI Whisper, and producing clean text outputs. It is intended as a simple foundation for voice-driven coding, documentation, and automation workflows.

Project Story

Whisper Push to Transcript started as a way to take audio notes without a dedicated app and without manually transferring files between recorders and other tools. The solution is built around a command-line interface.

The repository began with three core files:

a Python script for recording
a Python script for transcription
a Python script that combines recording and transcription into one flow

Initial scaffolding and coding work was done with ChatGPT Plus. Later, after Anthropic released Claude Sonnet 4.5, development moved to Claude, which helped clean up and improve the structure. Although it is a casual project, it has already proven useful.

Typical usage flow:

Run flow_rec_TRX_Claude.py
Press Yes to begin
Hold spacebar to record; release to pause; press again to resume
Press backspace to finish recording
Choose whether to transcribe using OpenAI Whisper
Optionally run post-processing for cleaner output

This project represents my first end-to-end app and GitHub repository.

Features

Record audio locally from microphone
Push audio files to OpenAI Whisper for transcription
Clean and normalize transcripts for readability
Pause/resume audio capture with keyboard controls
Simple CLI-driven workflow
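
To illustrate the cleanup feature, here is a minimal sketch of a transcript normalizer. The function name `normalize_transcript` and the rules it applies (collapsing whitespace, tightening punctuation, capitalizing sentence starts) are assumptions made for this example; the actual postprocess.py may work differently.

```python
import re


def normalize_transcript(text: str) -> str:
    """Hypothetical cleanup pass; the rules here are illustrative assumptions."""
    # Collapse runs of whitespace (including newlines) into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove stray spaces before punctuation: "world ." -> "world."
    text = re.sub(r"\s+([,.!?;:])", r"\1", text)
    # Capitalize the first letter of the text and of each new sentence.
    text = re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )
    return text


if __name__ == "__main__":
    print(normalize_transcript("  hello   world .  this is a   test ?"))
    # -> Hello world. This is a test?
```

Keeping each rule as a separate substitution makes it easy to add or drop a rule without touching the others.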

Repository Layout

whisper-push-to-transcript/
├── README.md
├── requirements.txt
├── .gitignore
├── src/
│   └── whisper_transcript/
│       ├── __init__.py
│       ├── transcriber.py       # handles Whisper API calls and returns raw text
│       ├── postprocess.py       # cleanup functions: normalize text, punctuation
│       ├── utils.py             # helpers: paths, save, logging
│       └── cli.py               # CLI entry point
├── poc/
│   └── record.py                # proof-of-concept microphone recorder
├── flow_rec_TRX_Claude.py       # integrated record+transcribe flow with keyboard control
└── tests/
    ├── test_transcriber.py
    ├── test_postprocess.py
    └── test_utils.py
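
As a rough idea of the kind of helper utils.py could hold, the sketch below saves a transcript to a timestamped text file. `save_transcript`, its parameters, and the file-naming scheme are hypothetical and are not taken from the repository.

```python
from datetime import datetime
from pathlib import Path


def save_transcript(text: str, out_dir: str = "transcripts", stem: str = "note") -> Path:
    """Write text to a timestamped .txt file and return its path (hypothetical helper)."""
    directory = Path(out_dir)
    # Create the output directory on first use.
    directory.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    path = directory / f"{stem}-{stamp}.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

Returning the path lets the caller print or log where the transcript ended up.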

Installation

Clone the repository:

git clone https://github.com/WeLiveToServe/whisper-push-to-transcript.git
cd whisper-push-to-transcript

Create and activate a virtual environment (run only the activate line that matches your platform):

python -m venv .venv
.venv\Scripts\activate    # Windows
source .venv/bin/activate # Linux/Mac

Install dependencies:
```
pip install -r requirements.txt
```

Set your OpenAI API key (use the line that matches your shell):

$env:OPENAI_API_KEY="your_key_here"     # PowerShell
export OPENAI_API_KEY="your_key_here"   # Linux/Mac
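
A script can fail early with a readable message when the key is missing. The check below is a sketch; `get_api_key` is a hypothetical helper name, not a function from this repository.

```python
import os


def get_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; set it in your shell before running."
        )
    return key
```

The official `openai` Python client also reads `OPENAI_API_KEY` from the environment by default, so a check like this mainly serves to produce a friendlier error before any audio is recorded.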

Usage

Run the combined flow

python flow_rec_TRX_Claude.py

Hold spacebar to record
Release spacebar to pause
Press spacebar again to resume
Press backspace to stop, then choose whether to transcribe
The transcript is saved and, optionally, post-processed
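
The hold, release, and stop controls above can be modeled as a small state machine. This sketch is an assumption about how such logic might look, not the code in flow_rec_TRX_Claude.py: the `PushToTalk` class and the string key names `"space"` and `"backspace"` are hypothetical, and real keyboard libraries deliver their own key objects. Audio capture is left out; `feed` stands in for whatever callback receives audio buffers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PushToTalk:
    """Tracks recording state from key events; audio capture is out of scope."""

    recording: bool = False
    finished: bool = False
    chunks: List[bytes] = field(default_factory=list)

    def on_press(self, key: str) -> None:
        if self.finished:
            return
        if key == "space":
            self.recording = True    # hold to record, or resume after a pause
        elif key == "backspace":
            self.recording = False
            self.finished = True     # end the session

    def on_release(self, key: str) -> None:
        if key == "space" and not self.finished:
            self.recording = False   # releasing the spacebar pauses

    def feed(self, chunk: bytes) -> None:
        """Called for every audio buffer; keeps it only while recording."""
        if self.recording:
            self.chunks.append(chunk)

    def audio(self) -> bytes:
        return b"".join(self.chunks)


if __name__ == "__main__":
    session = PushToTalk()
    session.on_press("space")
    session.feed(b"a")           # kept
    session.on_release("space")
    session.feed(b"b")           # dropped: paused
    session.on_press("space")
    session.feed(b"c")           # kept: resumed
    session.on_press("backspace")
    session.feed(b"d")           # dropped: finished
    print(session.audio())       # b'ac'
```

Separating key handling from audio capture keeps the pause and resume logic testable without a microphone.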

Transcribe an existing file

python -m whisper_transcript.cli --file samples/test.wav --output transcript.txt
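
For orientation, a CLI with these two flags might be wired up roughly as follows. This is a sketch under stated assumptions, not the contents of cli.py: `build_parser`, `transcribe`, and `main` are hypothetical names, the default output path is invented for the example, and the API call assumes the `openai` package (version 1.x) with the hosted `whisper-1` model.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Transcribe an audio file with Whisper.")
    parser.add_argument("--file", required=True, type=Path, help="audio file to transcribe")
    parser.add_argument("--output", type=Path, default=Path("transcript.txt"),
                        help="where to write the transcript (default is an assumption)")
    return parser


def transcribe(audio_path: Path) -> str:
    # Imported lazily so the parser works even without the openai package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with audio_path.open("rb") as audio_file:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
    return result.text


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    text = transcribe(args.file)
    args.output.write_text(text, encoding="utf-8")
    print(f"Wrote {args.output}")
    return 0
```

Calling `main(["--file", "samples/test.wav", "--output", "transcript.txt"])` would mirror the command above; it needs network access and a valid API key.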

Proof-of-concept recorder

python poc/record.py

Roadmap

Add real-time transcription (streaming microphone input)
Integrate with Prefect flows for automated pipelines
Add advanced post-processing (summaries, code block extraction, task lists)
Support local Whisper models (e.g., whisper.cpp)

License

MIT License