| title | CORE Discovery Framework |
|---|---|
| description | An AI-powered product discovery coaching platform built on the CORE methodology (Capture, Orient, Refine, Execute) with a Next.js frontend, FastAPI backend, pluggable AI agents, and bidirectional Vertex integration. |
| ms.date | 2026-04-09 |
Overview
CORE Discovery is a full-stack application that guides product teams through structured discovery engagements using the CORE methodology. Each phase builds on the previous one, creating a connected chain of evidence, insights, and decisions.
The four phases work together:
- Capture: Probe the system, gather evidence, map stakeholders. Generate tailored interview questions and analyze meeting transcripts. Extracted evidence flows automatically into the Evidence Board.
- Orient: Recognize patterns, frame the real problem. Evidence from Capture pre-populates the context. Build structured problem statements and use cases that carry forward.
- Refine: Test assumptions, match solutions to capabilities. The problem statement from Orient loads automatically. Generate solution architecture blueprints. Assumptions and solution matches persist across sessions.
- Execute: Deliver quick wins, resolve blockers, prepare handoff. The validated problem statement and assumptions from earlier phases display as context.
Beyond the four phases, the platform includes:
- Five specialized AI agents (discovery coach, problem analyst, transcript analyst, use case analyst, solution architect) that power the analysis behind each phase.
- An AI advisor that generates use case proposals and solution architecture blueprints from accumulated evidence.
- Bidirectional Vertex engagement integration for importing context from and exporting deliverables to Vertex repositories.
- Local documentation scanning that ingests PDF, PPTX, DOCX, XLSX, and text files to feed context into the AI agents.
- Real-time WebSocket collaboration so multiple users can work on a discovery session simultaneously.
- JSON and CSV export for sharing discovery outputs externally.
Architecture
┌─────────────────────────┐ ┌──────────────────────────┐
│ Next.js Frontend │ │ FastAPI Backend │
│ (React 19, TS, TW4) │────▶│ (Python 3.11+) │
│ localhost:3000 │ │ localhost:8000 │
└─────────────────────────┘ └──────────┬───────────────┘
│
┌──────▼──────┐
│ AI Agents │
│ (5 agents) │
└──────┬──────┘
│
┌────────────┼────────────┐
│ │ │
┌────▼───┐ ┌─────▼────┐ ┌────▼────┐
│ LLM │ │ Storage │ │ Blob │
│Provider│ │ Provider │ │Provider │
└────┬───┘ └─────┬────┘ └────┬────┘
│ │ │
┌───────────┼──┐ ┌────┼────┐ ┌───┼────┐
│Azure │Local │ │Cosmos│Local│ │Azure│Local│
│OpenAI │Ollama│ │ DB │JSON │ │Blob │File │
└───────┴─────┘ └─────┴─────┘ └─────┴─────┘
The backend uses a provider abstraction pattern. Swap between local development and Azure production by changing environment variables. No code changes required. See ADR-001 for the architectural rationale.
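As a rough sketch of the pattern (the class and function names here are illustrative, not the actual ones under backend/app/providers/), an environment variable selects the implementation behind a shared interface:

```python
import os
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Shared interface; routers and agents only ever see this type."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class OllamaProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        return f"[ollama] {prompt}"  # placeholder for a real local HTTP call

class AzureOpenAIProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        return f"[azure] {prompt}"  # placeholder for a real Azure SDK call

def get_llm_provider() -> LLMProvider:
    # The env var alone decides the implementation; calling code never changes.
    provider = os.environ.get("LLM_PROVIDER", "local")
    return {"local": OllamaProvider, "azure": AzureOpenAIProvider}[provider]()
```

Storage, blob, speech, and auth providers follow the same shape, which is why flipping .env between local and Azure needs no code changes.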
Tech Stack
| Layer | Technology |
|---|---|
| Frontend | Next.js 16, React 19, TypeScript 5.9 |
| Styling | Tailwind CSS v4, shadcn/ui (base-nova) |
| Backend | FastAPI, Pydantic 2, uvicorn |
| AI Agents | 5 specialized agents with base class and registry |
| LLM | Azure OpenAI, OpenAI direct, or Ollama (local) |
| Storage | Azure Cosmos DB or local JSON files |
| Blobs | Azure Blob Storage or local filesystem |
| Speech | Azure Speech Services (optional) |
| Auth | Azure Entra ID or none (local) |
| Realtime | WebSocket hub for live collaboration |
| Docs Parsing | PDF, PPTX, DOCX, XLSX via pymupdf and python-pptx |
| Integration | Vertex engagement repos (scan and export) |
| CI/CD | GitHub Actions (lint, test, build) |
Prerequisites
You need these installed before proceeding:
| Tool | Version | Purpose |
|---|---|---|
| Node.js | 20+ | Frontend runtime |
| pnpm | 10+ | Frontend package manager |
| Python | 3.11+ | Backend runtime |
| Git | 2.40+ | Version control |
For local-only development (no Azure), you also need:
| Tool | Version | Purpose |
|---|---|---|
| Ollama | latest | Local LLM inference |
For Azure deployment, you also need:
| Tool | Version | Purpose |
|---|---|---|
| Azure CLI | 2.60+ | Azure resource management |
Running Locally
1. Clone and install frontend dependencies
git clone https://github.com/zitro/core-framework.git
cd core-framework
pnpm install
2. Configure the frontend environment
cp .env.local.example .env.local
The default configuration points at the backend on http://localhost:8000. Edit .env.local if your
backend runs on a different port.
NEXT_PUBLIC_API_URL=http://localhost:8000
3. Set up the backend
cd backend
python -m venv .venv
Activate the virtual environment:
# Windows PowerShell
& .\.venv\Scripts\Activate.ps1
# macOS/Linux
source .venv/bin/activate
Install dependencies. For local-only development:
pip install -e ".[local,dev]"
For Azure-backed development:
pip install -e ".[azure,dev]"
4. Configure the backend environment
The default .env uses local providers. This requires Ollama running with a model pulled:
ollama pull llama3.1
ollama serve
Your .env should look like this for local development:
LLM_PROVIDER=local
STORAGE_PROVIDER=local
AUTH_PROVIDER=none
SPEECH_PROVIDER=none
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1
CORS_ORIGINS=["http://localhost:3000"]
5. Start the backend
cd backend
uvicorn app.main:app --host 0.0.0.0 --port 8000
Verify it's running:
curl http://localhost:8000/api/health
You should see:
{
"status": "healthy",
"providers": {
"llm": "local",
"storage": "local",
"speech": "none",
"auth": "none"
}
}
6. Start the frontend
Open a second terminal at the project root and start the dev server:
pnpm dev
Open http://localhost:3000 in your browser.
Running with Azure
Azure mode replaces local storage and LLM with managed Azure services. This gives you persistent Cosmos DB storage, Azure OpenAI for question generation and transcript analysis, Azure Blob Storage for file uploads, and Azure Speech Services for audio transcription.
Required Azure Resources
| Resource | SKU/Tier | Purpose |
|---|---|---|
| Azure OpenAI | Standard S0 | LLM inference (gpt-4o deployment) |
| Azure Cosmos DB | Serverless | Discovery and evidence persistence |
| Azure Storage Account | Standard LRS | Transcript and export file storage |
| Azure Speech Services | Free F0 or S0 | Audio-to-text transcription |
Azure Resource Setup
All examples assume a resource group named my-resource-group. Adjust names as needed.
Create an Azure OpenAI resource and deploy a model:
az cognitiveservices account create \
  --name "<OPENAI_RESOURCE_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --kind "OpenAI" \
  --sku "S0" \
  --location "eastus"

az cognitiveservices account deployment create \
  --name "<OPENAI_RESOURCE_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --deployment-name "gpt-4o" \
  --model-name "gpt-4o" \
  --model-version "2024-11-20" \
  --model-format "OpenAI" \
  --sku-capacity 10 \
  --sku-name "GlobalStandard"
Create a Cosmos DB account with a database and containers:
az cosmosdb create \
  --name "<COSMOS_ACCOUNT_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --capabilities "EnableServerless" \
  --locations regionName="eastus2"

az cosmosdb sql database create \
  --account-name "<COSMOS_ACCOUNT_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --name "core-discovery"

az cosmosdb sql container create \
  --account-name "<COSMOS_ACCOUNT_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --database-name "core-discovery" \
  --name "discoveries" \
  --partition-key-path "/id"

az cosmosdb sql container create \
  --account-name "<COSMOS_ACCOUNT_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --database-name "core-discovery" \
  --name "evidence" \
  --partition-key-path "/discoveryId"
Create a Storage Account:
az storage account create \
  --name "<STORAGE_ACCOUNT_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --sku "Standard_LRS" \
  --kind "StorageV2"

az storage container create --name "transcripts" --account-name "<STORAGE_ACCOUNT_NAME>"
az storage container create --name "exports" --account-name "<STORAGE_ACCOUNT_NAME>"
Create a Speech Services resource (optional, for audio transcription):
az cognitiveservices account create \
  --name "<SPEECH_RESOURCE_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --kind "SpeechServices" \
  --sku "F0" \
  --location "eastus"
RBAC Configuration
The application authenticates to Azure using DefaultAzureCredential, which works with your Azure CLI
login during development. Assign these roles to your user principal:
# Get your principal ID
PRINCIPAL_ID=$(az ad signed-in-user show --query id -o tsv)

# Azure OpenAI
az role assignment create \
  --role "Cognitive Services OpenAI User" \
  --assignee "$PRINCIPAL_ID" \
  --scope "/subscriptions/<SUB_ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<OPENAI_RESOURCE_NAME>"

# Cosmos DB (SQL RBAC, not ARM RBAC)
az cosmosdb sql role assignment create \
  --account-name "<COSMOS_ACCOUNT_NAME>" \
  --resource-group "<RESOURCE_GROUP>" \
  --role-definition-id "00000000-0000-0000-0000-000000000002" \
  --principal-id "$PRINCIPAL_ID" \
  --scope "/"

# Blob Storage
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --assignee "$PRINCIPAL_ID" \
  --scope "/subscriptions/<SUB_ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.Storage/storageAccounts/<STORAGE_ACCOUNT_NAME>"

# Speech Services
az role assignment create \
  --role "Cognitive Services Speech User" \
  --assignee "$PRINCIPAL_ID" \
  --scope "/subscriptions/<SUB_ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<SPEECH_RESOURCE_NAME>"
Replace <SUB_ID> with your Azure subscription ID.
Azure Backend Configuration
Update backend/.env to use Azure providers:
LLM_PROVIDER=azure
STORAGE_PROVIDER=azure
AUTH_PROVIDER=none
SPEECH_PROVIDER=azure
AZURE_OPENAI_ENDPOINT=https://<OPENAI_RESOURCE_NAME>.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_API_VERSION=2024-12-01-preview
COSMOS_ENDPOINT=https://<COSMOS_ACCOUNT_NAME>.documents.azure.com:443/
COSMOS_DATABASE=core-discovery
AZURE_STORAGE_ACCOUNT=<STORAGE_ACCOUNT_NAME>
AZURE_STORAGE_CONTAINER=transcripts
AZURE_SPEECH_REGION=eastus
AZURE_SPEECH_RESOURCE_ID=/subscriptions/<SUB_ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<SPEECH_RESOURCE_NAME>
CORS_ORIGINS=["http://localhost:3000"]
Note
No API keys are needed when using DefaultAzureCredential with RBAC. Make sure you are logged in
with az login before starting the backend.
Start the backend (Azure CLI must be on PATH):
# Windows: refresh PATH so DefaultAzureCredential can find az CLI
$env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [System.Environment]::GetEnvironmentVariable("Path","User")
cd backend
& .\.venv\Scripts\Activate.ps1
uvicorn app.main:app --host 0.0.0.0 --port 8000
# macOS/Linux
cd backend
source .venv/bin/activate
uvicorn app.main:app --host 0.0.0.0 --port 8000
Project Structure
core-framework/
├── src/ # Next.js frontend
│ ├── app/ # App router pages
│ │ ├── page.tsx # Dashboard: create and select discoveries
│ │ ├── capture/page.tsx # Question generation + transcript analysis
│ │ ├── orient/page.tsx # Sensemaking + problem statement builder
│ │ ├── refine/page.tsx # Assumption tracker + solution matcher
│ │ ├── execute/page.tsx # Quick wins, blockers, handoff
│ │ ├── discoveries/page.tsx # Browse all past discoveries
│ │ └── evidence/page.tsx # Cross-phase evidence board
│ ├── components/ # Shared and feature components
│ │ ├── layout/ # Sidebar, header, theme toggle, phase shell
│ │ ├── capture/ # TranscriptResults
│ │ ├── orient/ # ProblemStatementBuilder, UseCaseBuilder
│ │ ├── refine/ # AssumptionTracker, SolutionArchitect, SolutionMatcher
│ │ ├── execute/ # QuickWinTracker, BlockerList, HandoffPanel
│ │ ├── settings/ # DocsPathConfig, VertexConfig
│ │ └── ui/ # shadcn/ui primitives (16 components)
│ ├── stores/ # React context stores
│ │ └── discovery-store.tsx # Active discovery state management
│ ├── lib/ # API client, utilities, realtime hook
│ ├── hooks/ # Custom React hooks
│ ├── __tests__/ # Frontend tests (vitest)
│ └── types/ # TypeScript type definitions
├── backend/ # FastAPI backend
│ ├── app/
│ │ ├── main.py # App factory, middleware, router registration
│ │ ├── config.py # Settings and startup validation
│ │ ├── dependencies.py # Auth dependency for route protection
│ │ ├── models/ # Pydantic models (Discovery, Evidence, etc.)
│ │ ├── agents/ # AI sub-agents
│ │ │ ├── base.py # Base agent class with AgentMeta/AgentResult
│ │ │ ├── registry.py # Agent discovery and registry
│ │ │ ├── discovery_coach.py # Phase-appropriate question generation
│ │ │ ├── problem_analyst.py # Evidence-to-problem-statement analysis
│ │ │ ├── transcript_analyst.py # Transcript insight extraction
│ │ │ ├── use_case_analyst.py # Persona and use case generation
│ │ │ └── solution_architect.py # Solution blueprint proposals
│ │ ├── routers/ # API endpoints
│ │ │ ├── discovery.py # CRUD for discovery sessions
│ │ │ ├── questions.py # Question generation + solution matching
│ │ │ ├── transcripts.py # Transcript analysis + audio upload
│ │ │ ├── evidence.py # Evidence CRUD scoped by discovery
│ │ │ ├── problem_statements.py # Problem statement generation
│ │ │ ├── advisor.py # Use case proposal generation
│ │ │ ├── blueprints.py # Solution architecture generation
│ │ │ ├── export.py # JSON/CSV export downloads
│ │ │ ├── docs.py # Local documentation scanning
│ │ │ ├── vertex.py # Vertex engagement integration
│ │ │ └── realtime.py # WebSocket hub for live collaboration
│ │ ├── utils/ # Utilities
│ │ │ ├── context.py # Context gathering (includes Vertex data)
│ │ │ ├── local_docs.py # Binary document parsing (PDF, PPTX, etc.)
│ │ │ └── vertex.py # Vertex repo scanning and frontmatter parsing
│ │ └── providers/ # Pluggable service providers
│ │ ├── llm/ # Azure OpenAI, OpenAI, Ollama
│ │ ├── storage/ # Cosmos DB, local JSON
│ │ ├── blob/ # Azure Blob, local filesystem
│ │ ├── speech/ # Azure Speech Services
│ │ └── auth/ # Azure Entra ID, no-auth
│ ├── .env.example # Template for backend environment
│ ├── tests/ # Backend tests (pytest)
│ └── pyproject.toml # Python dependencies and project metadata
├── .github/
│ └── workflows/ci.yml # CI pipeline: lint, test, build
├── docs/ # Documentation
│ ├── BRD/ # Business Requirements Document
│ ├── TRD/ # Technical Requirements Document
│ └── ADR/ # Architecture Decision Records
├── docker-compose.yml # Production containerization
├── docker-compose.dev.yml # Local containerization
├── Dockerfile # Frontend container
├── .env.local.example # Template for frontend environment
├── vitest.config.ts # Frontend test configuration
└── package.json # Frontend dependencies and scripts
API Endpoints
All endpoints are prefixed with /api.
| Method | Path | Purpose |
|---|---|---|
| GET | /api/health | Health check with provider status |
| GET | /api/discovery/ | List all discoveries |
| POST | /api/discovery/ | Create a new discovery |
| GET | /api/discovery/{id} | Get a single discovery |
| PATCH | /api/discovery/{id} | Update a discovery |
| DELETE | /api/discovery/{id} | Delete a discovery |
| POST | /api/questions/generate | Generate phase-specific questions |
| POST | /api/questions/solution-match | Match problems to capabilities |
| POST | /api/transcripts/analyze | Analyze a text transcript |
| POST | /api/transcripts/upload-audio | Upload audio for speech-to-text |
| GET | /api/evidence/{discoveryId} | List evidence for a discovery |
| POST | /api/evidence/ | Create an evidence item |
| PATCH | /api/evidence/{id} | Update an evidence item |
| DELETE | /api/evidence/{id} | Delete an evidence item |
| POST | /api/problem-statements/generate | Generate an evidence-backed problem statement |
| POST | /api/advisor/use-cases/generate | Generate use case proposals |
| POST | /api/blueprints/generate | Generate solution architecture |
| POST | /api/docs/scan | Scan local documentation directory |
| POST | /api/vertex/scan | Scan a Vertex engagement repo |
| POST | /api/vertex/export | Export deliverables to Vertex format |
| GET | /api/export/{id}?format=json | Export discovery as JSON |
| GET | /api/export/{id}?format=csv | Export discovery as CSV |
| WS | /ws/{discoveryId} | Real-time collaboration channel |
Interactive API documentation is available at http://localhost:8000/docs when the backend is running.
Data Flow
Data flows forward through the CORE phases to build a connected narrative:
Capture ──▶ Orient ──▶ Refine ──▶ Execute
│ │ │ │
│ Evidence │ Problem │ Assump. │ Handoff
│ extracted │ statement│ validated│ package
│ from │ built │ solution │ with full
│ transcripts│ from │ matches │ context
│ │ evidence │ persisted│
▼ ▼ ▼ ▼
Evidence Board (cross-phase, scoped per discovery)
All data is scoped to the active discovery session. When you select a discovery from the Dashboard or the sidebar, every phase page and the Evidence Board operate within that discovery's context.
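The scoping rule can be sketched as a simple filter, with field names modeled on the evidence container's /discoveryId partition key (illustrative only):

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    id: str
    discoveryId: str
    text: str

def evidence_for(items: list[Evidence], discovery_id: str) -> list[Evidence]:
    # Every phase page queries through a filter like this, so evidence
    # from one discovery session never leaks into another's context.
    return [e for e in items if e.discoveryId == discovery_id]
```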
Documentation
- Business Requirements Document: stakeholder needs and success criteria
- Technical Requirements Document: technical design and constraints
- ADR-001: Provider Abstraction: rationale for the pluggable provider pattern
Development
Linting
Frontend:
pnpm lint
Backend:
cd backend
ruff check .
ruff format --check .
Building for Production
pnpm build
This creates an optimized production build in .next/. All pages are statically generated where possible.
Running Tests
Backend (27 tests):
cd backend
pytest tests/ -v
Frontend (7 tests):
pnpm test
Security
All API routes (except /api/health) require authentication. The auth dependency uses the configured
provider:
- AUTH_PROVIDER=none: All requests pass through with a local-dev user (development only).
- AUTH_PROVIDER=azure: Validates Entra ID JWT bearer tokens. Requires AZURE_TENANT_ID and AZURE_CLIENT_ID.
Rate limiting is enforced globally at 100 requests per minute per IP address. Configure via the
RATE_LIMIT environment variable (default: 100/minute).
Startup validation checks that required environment variables are set for the selected providers and logs warnings if any are missing.
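The validation logic amounts to a per-provider table of required variables; the mapping structure below is an assumption (the real table lives in backend/app/config.py), though the variable names come from the configuration sections above:

```python
import logging

logger = logging.getLogger("startup")

REQUIRED_VARS = {
    ("LLM_PROVIDER", "azure"): ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_DEPLOYMENT"],
    ("STORAGE_PROVIDER", "azure"): ["COSMOS_ENDPOINT", "COSMOS_DATABASE"],
    ("LLM_PROVIDER", "local"): ["OLLAMA_BASE_URL", "OLLAMA_MODEL"],
}

def validate_startup(env: dict[str, str]) -> list[str]:
    """Return required variables missing for the selected providers."""
    missing = []
    for (selector, value), names in REQUIRED_VARS.items():
        if env.get(selector) == value:
            missing += [n for n in names if not env.get(n)]
    for name in missing:
        logger.warning("missing required environment variable: %s", name)
    return missing
```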
Real-Time Collaboration
The WebSocket endpoint at /ws/{discoveryId} enables live multi-user collaboration on discovery
sessions. Connected clients receive:
- Presence updates when users join or leave.
- Phase change notifications.
- Evidence additions relayed to all participants.
The frontend useRealtime hook manages the connection lifecycle and exposes send, activeUsers,
and connected state.
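The hub's fan-out behavior can be sketched like this, with plain lists standing in for WebSocket connections (illustrative only, not the backend's actual implementation):

```python
from collections import defaultdict

class RealtimeHub:
    """Per-discovery fan-out, as performed by the /ws/{discoveryId} endpoint."""
    def __init__(self):
        # One room per discovery; each member is an outbox of pending messages.
        self.rooms: dict[str, list[list]] = defaultdict(list)

    def join(self, discovery_id: str) -> list:
        outbox: list = []
        self.rooms[discovery_id].append(outbox)
        self.broadcast(discovery_id, {"type": "presence", "event": "join"})
        return outbox

    def broadcast(self, discovery_id: str, message: dict) -> None:
        # Phase changes and evidence additions are relayed the same way.
        for outbox in self.rooms[discovery_id]:
            outbox.append(message)
```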
AI Agents
The backend includes five specialized AI agents, each focused on one aspect of the discovery
process. All agents share a common base class (AgentMeta for identity, AgentResult for output)
and register themselves in a central registry for discoverability.
| Agent | Phase | Responsibility |
|---|---|---|
| Discovery Coach | All | Generates phase-appropriate interview questions |
| Problem Analyst | Orient | Synthesizes evidence into structured problem statements |
| Transcript Analyst | Capture | Extracts insights, evidence, themes from meeting notes |
| Use Case Analyst | Orient | Builds personas and use case proposals from evidence |
| Solution Architect | Refine | Proposes architecture blueprints with service selections |
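The shared base-class and registry pattern can be sketched as follows; the field names and decorator are assumptions modeled on the AgentMeta/AgentResult names above, not the actual code in backend/app/agents/:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMeta:
    name: str
    phase: str

@dataclass
class AgentResult:
    content: str
    meta: dict = field(default_factory=dict)

AGENT_REGISTRY: dict[str, type] = {}

def register(cls):
    """Class decorator: agents self-register under their meta name."""
    AGENT_REGISTRY[cls.meta.name] = cls
    return cls

@register
class DiscoveryCoach:
    meta = AgentMeta(name="discovery_coach", phase="all")

    def run(self, context: str) -> AgentResult:
        # A real agent would build a prompt and call the LLM provider here.
        return AgentResult(content=f"questions for: {context}")
```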
Each agent defines its own LLM system prompt with phase-specific guidance. The Discovery Coach, for example, uses different prompt strategies for Capture (listening and probing) versus Orient (sensemaking and pattern recognition) versus Refine (assumption testing).
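The phase-keyed selection can be sketched as a lookup; the prompt wording below paraphrases the phase descriptions above and is not the agents' actual prompt text:

```python
COACH_PROMPTS = {
    "capture": "You are listening and probing. Ask open questions that surface evidence.",
    "orient": "You are sensemaking. Look for patterns and reframe the real problem.",
    "refine": "You are testing assumptions. Challenge each proposed solution match.",
}

def system_prompt(phase: str) -> str:
    # Fall back to a generic coaching prompt for phases without a dedicated strategy.
    return COACH_PROMPTS.get(phase, "You are a product discovery coach.")
```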
Vertex Integration
CORE integrates bidirectionally with Vertex engagement repositories. Vertex serves as the system of record for customer engagements, while CORE provides the AI analysis layer.
Ingest (Vertex to CORE)
Point CORE at a Vertex repo path and it scans the directory structure, auto-detects the customer directory (skipping templates and samples), and parses YAML frontmatter from all markdown files. The scan returns structured metadata: customer name, initiative list, and file counts grouped by type (call transcripts, decisions, stakeholders, architecture, and 15+ other Vertex categories).
When a discovery session has a vertex_repo_path set, the context gathering utility automatically
calls read_vertex_context() to load relevant files, grouped by type, into the AI agents' context
window. Size caps (200 KB per file, 500 KB total) prevent prompt overflow.
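The capping logic can be sketched as below, assuming file contents are already read into memory as {filename: text} (the real utility lives in app/utils/context.py, and measures sizes in bytes rather than characters):

```python
PER_FILE_CAP = 200 * 1024   # ~200 KB per file
TOTAL_CAP = 500 * 1024      # ~500 KB across all files

def cap_context(files: dict[str, str], per_file_cap: int = PER_FILE_CAP,
                total_cap: int = TOTAL_CAP) -> dict[str, str]:
    loaded: dict[str, str] = {}
    total = 0
    for name, text in files.items():
        text = text[:per_file_cap]            # enforce the per-file cap
        if total + len(text) > total_cap:
            break                             # stop before overflowing the prompt budget
        loaded[name] = text
        total += len(text)
    return loaded
```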
Export (CORE to Vertex)
The export endpoint renders CORE deliverables (problem statements, use cases, solution blueprints)
as Vertex-compatible markdown files with type: decision YAML frontmatter. Drop the exported files
into the Vertex repo and they become part of the engagement record.
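The rendering step can be sketched as below. Only the type: decision key is documented; the title key and the body layout are assumptions for illustration:

```python
def render_decision(title: str, body: str) -> str:
    """Render a CORE deliverable as Vertex-compatible markdown with frontmatter."""
    frontmatter = "\n".join([
        "---",
        "type: decision",   # the frontmatter type Vertex uses for decisions
        f"title: {title}",
        "---",
    ])
    return f"{frontmatter}\n\n# {title}\n\n{body}\n"
```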
Frontend Configuration
The settings panel provides a Vertex configuration card where you enter the repo path, scan to preview the contents (customer name, initiative count, file counts by type), and export CORE outputs back to the repo.
Local Documentation Scanning
The /api/docs/scan endpoint reads a local directory of documents and feeds their content into the
AI context. Supported formats:
- PDF (via PyMuPDF)
- PowerPoint (.pptx)
- Word (.docx)
- Excel (.xlsx)
- Plain text and markdown
The frontend settings panel includes a docs path configuration card for pointing CORE at a local folder of existing engagement materials (SOWs, slide decks, architecture docs). The parsed content is available to all AI agents during analysis.
Dark Mode
The application supports light and dark themes. Toggle via the sun/moon button in the sidebar footer.
Theme preference persists in localStorage and respects the system setting by default.