Deep Research Cockpit
Steerable AI Research Platform for Mapping Knowledge in Real Time
An AI-powered research cockpit that explores web knowledge with real-time controls, prioritizes primary sources, and builds a verifiable graph of insights. Built for technical professionals who demand transparency and control: research engineers, PMs, data analysts, and tech leads.
🚧 Project Status: Deep Research Cockpit is under active development (Alpha – Phase 3: Advanced Features). The core research pipeline is functional, with Phase 1 (Foundations) and Phase 2 (Ranking & Core-First) complete. Advanced features, including graph memory and JS rendering, are currently being implemented.
Table of Contents
- Why This Matters
- Key Concepts
- Core Features
- Architecture Overview
- Quick Start
- Core Metrics & Performance
- Documentation Map
- Roadmap & Status
- Contributing
- Glossary
- License & Support
Why This Matters
The Problem
Traditional research and search tools fall short for knowledge work:
- Aggregation over exploration: Search results are ranked lists, not maps of adjacent concepts
- Black-box prioritization: You don't know why one source ranked above another
- Lost provenance: Claims are divorced from original sources and extraction context
- No steering: Once a search is launched, you can't adjust depth, breadth, or source priority mid-flight
- Opacity: No audit trail of how conclusions were reached
The Deep Research Cockpit Difference
1. Real-Time Steering Adjust depth, breadth, source priority, verification intensity, and comparison scope live with immediate visual feedback. Every control change shows previewed impact within 400ms.
2. Core-First Architecture Automatically prioritizes papers, specifications, and repositories (L0/L1) over blogs and forums (L3/L4), and enforces reading order with transparent reasoning for each ranking decision.
3. Graph Memory Concepts, claims, and relationships guide exploration. Every discovered fact is linked to sources and previous findings. The knowledge graph evolves with your session, enabling discovery of contradictions and novel connections.
4. Complete Verifiable Lineage Every claim has:
- The exact source document with URL
- Character span of the extracted evidence
- Timestamp of when it was discovered
- Why-ranked explanation of source prioritization
- Independent corroborating sources
5. Deterministic Replay & Analysis Replay any session exactly as it happened. Analyze how strategy changes affected outcomes. Compare A/B runs with synchronized timelines and metric deltas. Export complete evidence ledgers.
Use Cases
- Literature reviews with primary source emphasis and contradiction detection
- Technology spec hunts with origin tracing and related standards mapping
- Controversy analysis with evidence clustering and disagreement discovery
- Vendor comparison with specification matching and bias detection
- Learning paths with curated reading queues and prerequisite ordering
Key Concepts
Source Tiers (L0–L4)
Understanding source classification is central to Deep Research Cockpit's core-first approach:
| Tier | Classification | Examples |
|---|---|---|
| L0 | Foundational artifacts | Academic papers, technical specifications, official standards (RFCs, ISO), core repositories |
| L1 | Primary sources | Conference talks, official documentation, white papers, working groups |
| L2 | Informed secondary | Technical blogs, tutorials, conference proceedings, theses |
| L3 | Practitioner knowledge | StackOverflow, MathOverflow, professional forums, product docs |
| L4 | General discussion | Twitter, Reddit, general forums, unvetted blogs |
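As a rough illustration of automatic tier assignment, a classifier might try URL and domain patterns first, before falling back to metadata signals. Everything below — the function name, the pattern table, and the specific domains — is a hypothetical sketch, not the project's actual classifier.

```typescript
// Hypothetical sketch of tier classification by URL pattern.
// A real classifier would also use metadata (DOI, venue, citation counts).
type Tier = "L0" | "L1" | "L2" | "L3" | "L4";

const TIER_PATTERNS: Array<[RegExp, Tier]> = [
  [/arxiv\.org|doi\.org|rfc-editor\.org|iso\.org/, "L0"], // papers, specs, standards
  [/docs\.|readthedocs\.io|\.github\.io/, "L1"],          // official documentation
  [/medium\.com|dev\.to/, "L2"],                          // technical blogs
  [/stackoverflow\.com|mathoverflow\.net/, "L3"],         // practitioner Q&A
];

function classifySource(url: string): Tier {
  for (const [pattern, tier] of TIER_PATTERNS) {
    if (pattern.test(url)) return tier;
  }
  return "L4"; // default: general discussion
}
```

The fall-through default matters: anything the classifier cannot place confidently lands in the lowest-authority tier rather than an inflated one.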
The Frontier
A priority queue of concepts, pages, and claims the system recommends exploring next. Scored by:
- Novelty: How new this information is relative to existing memory
- Centrality: How connected to core concepts
- Disagreement: Potential contradictions with existing claims
- Recency: When the source was published
- User interest: Explicit knob adjustments
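The scoring above can be sketched as a weighted sum over normalized factors. The weights and field names below are illustrative defaults, not the real implementation:

```typescript
// Minimal frontier-scoring sketch: a weighted sum of the five factors,
// each normalized to [0, 1]. Weights are illustrative assumptions.
interface FrontierItem {
  novelty: number;      // how new relative to existing memory
  centrality: number;   // connectedness to core concepts
  disagreement: number; // contradiction potential
  recency: number;      // publish-date decay
  userInterest: number; // from explicit knob adjustments
}

const DEFAULT_WEIGHTS = {
  novelty: 0.3,
  centrality: 0.25,
  disagreement: 0.2,
  recency: 0.1,
  userInterest: 0.15,
};

function frontierScore(item: FrontierItem, w = DEFAULT_WEIGHTS): number {
  return (
    w.novelty * item.novelty +
    w.centrality * item.centrality +
    w.disagreement * item.disagreement +
    w.recency * item.recency +
    w.userInterest * item.userInterest
  );
}

// The highest-scoring frontier item is explored next.
function popNext(frontier: FrontierItem[]): FrontierItem | undefined {
  return [...frontier].sort((a, b) => frontierScore(b) - frontierScore(a))[0];
}
```

Strategy-knob changes would re-weight this sum, which is why a knob adjustment can visibly reorder the frontier.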
Strategy Controls (The Knobs)
Real-time parameters that steer exploration:
- Depth ↔ Breadth: Focus on depth (following chains of logic) vs. breadth (exploring diverse angles)
- Core-First: Enforce reading order from L0→L4 vs. treat all sources equally
- Verify: High corroboration requirements vs. accept single-source claims
- Compare: Find contradicting sources and alternative viewpoints
- Pivot: Jump to adjacent concepts vs. stay focused on current topic
GraphRAG (Graph-guided Retrieval)
Hybrid retrieval combining:
- Text search: BM25 for keyword matching
- Vector search: Dense embeddings for semantic similarity
- Graph-guided: Follow edges in the knowledge graph to discover related claims
- Community summaries: Each cluster of related concepts has an AI-generated summary
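A common way to fuse ranked lists from several retrievers is reciprocal rank fusion (RRF), which rewards documents that rank well across signals. A minimal sketch — the function name and the conventional k=60 default are assumptions:

```typescript
// Reciprocal rank fusion: each document scores sum(1 / (k + rank)) across
// the ranked lists it appears in (rank is 1-based). Documents ranked well
// by several retrievers rise to the top of the fused list.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((doc, i) => {
      scores.set(doc, (scores.get(doc) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([doc]) => doc);
}
```

RRF needs only ranks, not comparable scores, which is what makes it practical for mixing BM25, dense, sparse, and graph-guided signals.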
Event Stream & Replay
Every action (search, fetch, extract, rank, think, decide) emits a JSONL event with:
- Timestamp and run ID
- Action type and agent responsible
- Input parameters and decision rationale
- Output summary and cost (tokens, time, money)
- Artifacts created
Full replay determinism enables exact reproduction of any session.
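A plausible TypeScript shape for one such event — field names follow the list above, while the concrete values are made up for illustration:

```typescript
// Illustrative shape of one JSONL event; field names follow the schema above.
interface ResearchEvent {
  ts: string;           // ISO timestamp
  run_id: string;
  action: string;       // e.g. "search", "fetch", "rank"
  agent: string;        // component responsible
  input: unknown;       // input parameters
  output_summary: string;
  cost: { tokens_in: number; tokens_out: number; ms: number };
  artifacts: string[];  // IDs of artifacts created
}

const event: ResearchEvent = {
  ts: new Date(0).toISOString(),
  run_id: "run-001",
  action: "fetch",
  agent: "fetcher",
  input: { url: "https://example.com" },
  output_summary: "fetched 12kB, extracted markdown",
  cost: { tokens_in: 0, tokens_out: 0, ms: 412 },
  artifacts: ["doc-042"],
};

// One event per line, appended to the run's JSONL stream:
const jsonlLine = JSON.stringify(event);
```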
Core Features
Run Mode – "Pilot View"
Real-time research exploration with full agency:
- Command Palette: Macros for common patterns (Survey, Spec Hunt, Contradiction Hunt, Origin Trace, Video-First)
- Strategy Compass: Adjustable control knobs with what-if previews (ghost re-ranking)
- Frontier Map: Interactive concept graph with overlays showing:
- Core proximity (distance to foundational sources)
- Disagreement (potential contradictions)
- Freshness (recent discoveries)
- Memory writes (new concepts added this session)
- Evidence & Reading Queue: L0–L4 columns with rank cards showing:
- Why-ranked breakdown (relevance, authority, core proximity, recency, independence)
- Source metadata (venue, citations, OA status)
- Confidence scores and contradiction flags
- Live Log: Filterable event stream grouped by decision phases
- Status Strip: Real-time operational metrics
- Latency, token usage, %JS renders, diversity index, contradiction density
Review Mode – "Black-Box Recorder"
After-the-fact analysis of any completed research session:
- Timeline Replay: Scrub through session snapshots; jump to landmarks
- First core source discovered
- Contradiction spike events
- Cost inflection points
- Strategy control changes
- Effectiveness Dashboard: Key Performance Indicators
- TTFC (Time-to-First-Core source)
- Authority Mix (% L0/L1 sources)
- Evidence Robustness (independent sources per claim)
- Frontier Entropy (breadth measurement)
- Operational metrics (latency, cost, cache hits)
- Evidence Ledger: Complete claim inventory with
- Confidence scores
- Supporting sources and quotes
- Timestamps and extraction spans
- Contradiction flags and corrections
- Learning Map: Diff of graph memory
- New concepts and entities discovered
- New relationships and claims added
- Summary changes and retractions
- Session Summary: Machine-generated brief with key findings
- A/B Run Comparator: Compare two sessions side-by-side
- Synchronized timelines
- Metric deltas
- Frontier divergence analysis
- Exports: Multiple output formats
- JSONL (complete event stream)
- CSV (claims and sources)
- GraphML (knowledge graph structure)
- PDF (formatted research brief)
Orchestrator & Strategy Controller
Central planning brain that decomposes research goals into exploration task graphs:
- Task Decomposition: Break objectives into search → fetch → extract → index → retrieve → synthesize tasks
- Frontier Scoring: Maintain a priority queue of what to explore next using multi-factor scoring
- Strategy Application: Apply user control changes to re-weight scoring and branching
- Decision Logging: Every decision includes inputs, score vector, and outcomes
- Deterministic Replay: Same inputs always produce same decisions (seeded randomness)
- Budget Enforcement: Hard caps on tokens, time, domains, and JS renders
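Seeded randomness is what makes replay deterministic: the same seed yields the same tie-break sequence on every run. A sketch using the well-known mulberry32 PRNG — the choice of PRNG here is an assumption, not the project's documented one:

```typescript
// mulberry32: a small, fast, seedable PRNG. Given the same seed, it emits
// the same sequence of floats in [0, 1), so any randomized tie-breaking in
// the scheduler can be reproduced exactly from the logged seed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two runs started with the same seed make identical random choices:
const runA = mulberry32(42);
const runB = mulberry32(42);
```

Logging the seed alongside the event stream is then sufficient to replay every decision, since all other inputs are captured in the events themselves.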
Response Times (P95):
- Knob change → preview: ≤400ms
- Knob change → scheduler impact: ≤800ms
Fetch & Extraction Pipeline
Cost-aware content acquisition with intelligent routing:
- Non-JS Fetch (Default): HTTP with redirect handling, charset detection, and robots.txt compliance; main-content extraction via Trafilatura
- Smart Router: Escalates to JS rendering when it detects:
- Placeholder/empty DOM
- SPA patterns (React, Vue, Angular)
- Interstitial content gates
- Dynamic loading patterns
- JS Fetch Path: Browserbase + Stagehand + Playwright with
- Site-specific playbooks for reliable extraction
- Structured extraction hooks for tables, lists, metadata
- Session management for multi-step authentication
- HTML→Markdown Normalization: Turndown service with
- Heading hierarchy preservation
- Code block and table fidelity
- Figure captions and link preservation
- Boilerplate removal
- Smart Caching: Idempotent by URL+ETag with configurable TTLs
- Compliance: Robots.txt respect, rate limiting, user-agent identification
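The router's escalation decision can be sketched as a cheap heuristic over the non-JS fetch result. The SPA markers and the 200-character threshold below are illustrative assumptions:

```typescript
// Hedged sketch of the router heuristic: escalate to the expensive JS path
// only when the cheap fetch looks like an empty client-rendered shell.
function needsJsRender(html: string, extractedText: string): boolean {
  // Markers typical of React/Vue/Angular mount points.
  const spaMarkers = [/id="root"/, /id="app"/, /ng-app/, /data-reactroot/];
  const looksLikeSpa = spaMarkers.some((m) => m.test(html));
  // Placeholder/empty DOM: almost no extractable text survived.
  const nearlyEmpty = extractedText.trim().length < 200;
  return looksLikeSpa && nearlyEmpty;
}
```

Requiring both conditions keeps the JS render rate down: an SPA shell that still ships readable server-rendered text never triggers escalation.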
Performance (P95):
- Basic fetch: ≤600ms
- JS fetch: ≤3s
- Cache hit rate target: ≥40%
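Idempotent caching by URL+ETag reduces to a stable cache key that changes only when the server reports new content. A sketch — the key derivation and TTL constant are assumptions:

```typescript
import { createHash } from "node:crypto";

// Cache key derived from URL + ETag: the same URL re-fetched with an
// unchanged ETag hits the cache; a new ETag produces a new key.
function cacheKey(url: string, etag: string | null): string {
  return createHash("sha256").update(`${url}|${etag ?? ""}`).digest("hex");
}

// Hypothetical default TTL; in practice this would be configurable per
// content type (metadata vs. HTML vs. embeddings).
const DEFAULT_TTL_MS = 24 * 60 * 60 * 1000;
```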
Source Ranking & Core-First Prioritization
Multi-factor scoring with transparent reasoning:
- Source Classification: Automatic categorization into tiers
- Paper/spec/repo detection via metadata and URL patterns
- Academic vs. practitioner vs. general source
- Authority level via venue, citation count, DOI presence
- Metadata Enrichment: Integration with scholarly databases
- Crossref (DOI, venue, citations)
- OpenAlex (comprehensive metadata)
- Semantic Scholar (citation context)
- Unpaywall (open access links)
- Retraction Watch (correction flags)
- Normalized Scoring: Combine factors with user-tunable weights
- Text relevance (BM25 similarity)
- Authority (venue rank, citation count, L-tier)
- Core proximity (citation-graph distance to origins)
- Recency (publish date with time decay)
- Independence (avoiding duplication and bias)
- Penalties for retractions/corrections
- Why-Ranked Explanations: Human-readable breakdown showing
- Which factors helped (green)
- Which factors hurt (red)
- What-if previews when knobs adjust
- Reading Queue Builder: Enforce L0→L4 reading order
- Quota per tier
- Gap detection (e.g., missing specification)
- Drag-to-override with audit trail
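A why-ranked breakdown can be derived directly from the weighted scoring: each factor's signed contribution is reported separately, so the UI can color positive terms green and negative ones (e.g. a retraction penalty) red. The function below is a hypothetical sketch:

```typescript
// Per-factor contribution to a source's final score. Negative weights
// (penalties such as retraction flags) yield negative contributions.
interface FactorContribution {
  factor: string;
  contribution: number;
}

function whyRanked(
  factors: Record<string, number>,
  weights: Record<string, number>
): FactorContribution[] {
  return Object.keys(weights)
    .map((f) => ({ factor: f, contribution: weights[f] * (factors[f] ?? 0) }))
    .sort((a, b) => b.contribution - a.contribution); // helped first, hurt last
}

// Example: a retraction penalty enters as a negative weight.
const breakdown = whyRanked(
  { relevance: 0.9, authority: 0.7, retraction: 1 },
  { relevance: 0.5, authority: 0.4, retraction: -0.6 }
);
```

What-if previews fall out of the same function: re-run it with the knob-adjusted weights and diff the two breakdowns.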
Performance (P95):
- Enrichment latency: ≤1.5s (batched)
- Scoring with explanations: ≤200ms
Graph Memory & GraphRAG Retrieval
Persistent knowledge graph with intelligent exploration:
- Schema: Typed nodes and edges
- Nodes: Concept, Entity, Claim, Source, Note
- Edges: SUPPORTED_BY, ABOUT, RELATED_TO, CONTRADICTS, CITES
- Merge Operations: Upsert claims with automatic deduplication
- Concept linking
- Source attribution
- Relationship creation
- ≤200ms per claim (P95)
- Community Detection: Automatic clustering and summarization
- Identify dense clusters of related concepts
- Generate community-level summaries
- Measure cluster coherence
- Origin Path Calculation: Shortest path from any claim to foundational sources
- Enables "why should I trust this?" answers
- Visualize citation chains
- Detect chain-of-logic breaks
- Hybrid Retrieval: Combine multiple strategies
- BM25 (keyword matching)
- Dense vectors (semantic similarity)
- Neural sparse (entity-aware keyword search)
- Graph-guided expansion (follow edges to neighbors)
- RRF (reciprocal rank fusion) to combine signals
- Query Subgraph Latency: ≤400ms P95
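Origin-path calculation is essentially a shortest-path search over CITES edges from a claim's source toward the nearest L0/L1 node. A breadth-first sketch over an in-memory graph — the real system would query Neo4j/Memgraph, and the data shapes here are assumptions:

```typescript
// BFS from a starting node along CITES edges; the first L0/L1 node reached
// gives the shortest origin path. Returning null flags a chain-of-logic
// break: no foundational ancestor exists.
interface GraphNode {
  id: string;
  tier: "L0" | "L1" | "L2" | "L3" | "L4";
}

function originPath(
  start: string,
  nodes: Map<string, GraphNode>,
  cites: Map<string, string[]> // node id -> cited node ids
): string[] | null {
  const queue: string[][] = [[start]];
  const seen = new Set([start]);
  while (queue.length > 0) {
    const path = queue.shift()!;
    const last = path[path.length - 1];
    const node = nodes.get(last);
    if (node && (node.tier === "L0" || node.tier === "L1")) return path;
    for (const next of cites.get(last) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push([...path, next]);
      }
    }
  }
  return null; // no foundational ancestor found
}
```

The returned path is exactly the citation chain the UI visualizes when answering "why should I trust this?".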
Reading Queue & "Watch the Act" Pipeline
Core-first reading order with integrated video/talk content:
- Origin Detection: Identify earliest highly-cited foundational works
- Patent searches
- Spec document discovery
- GitHub repository analysis
- Academic paper citation trails
- L0→L4 Queue Building: Enforce reading order with
- Tier quotas (e.g., 40% L0, 30% L1, 20% L2, 10% L3/L4)
- Gap marking (missing prerequisites or standards)
- Override capability with full audit trail
- Video Pipeline:
- Fetch captions (auto-transcription as fallback)
- Align snippets to extracted claims
- Add jump-to timestamps in UI
- ≤3s per video with caching
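Quota-based queue building can be sketched as filling tier by tier, L0 first, up to each tier's share of the queue. The default quotas below mirror the 40/30/20/10 example above (with the L3/L4 share split evenly); the function itself is hypothetical:

```typescript
// Fill the reading queue tier by tier, foundational sources first, each
// tier capped at its quota share of the target size.
type QueueTier = "L0" | "L1" | "L2" | "L3" | "L4";

const DEFAULT_QUOTAS: Record<QueueTier, number> = {
  L0: 0.4,
  L1: 0.3,
  L2: 0.2,
  L3: 0.05,
  L4: 0.05,
};

function buildQueue(
  byTier: Record<QueueTier, string[]>, // ranked source IDs per tier
  size: number,
  quotas: Record<QueueTier, number> = DEFAULT_QUOTAS
): string[] {
  const queue: string[] = [];
  for (const tier of ["L0", "L1", "L2", "L3", "L4"] as QueueTier[]) {
    const want = Math.round(size * quotas[tier]);
    queue.push(...byTier[tier].slice(0, want));
  }
  return queue.slice(0, size);
}
```

A fuller version would backfill from lower tiers when a quota cannot be met and mark the shortfall as a gap (e.g. "missing specification").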
Observability & Event Schema
Complete audit trail enabling transparency and replay:
- Event Stream: JSONL format with core fields
- ts (ISO timestamp), run_id, step_id
- agent (which component), action (operation type)
- input, output_summary, artifacts
- source (document URL or system)
- cost_ms, tokens_in, tokens_out, decision
- Real-Time Streaming: SSE/WebSocket for live views
- Durable Storage: Object store + indexed queryable database
- Snapshot Generator: Pre-computed snapshots for ≤2s replay load times
- Queryable: By run ID, time range, action, agent, decision type
- Deterministic: Same inputs always produce same decision log
Security, Privacy & Compliance
Enterprise-grade safety:
- Site Compliance
- Robots.txt and Terms of Service respect
- Per-domain rate limiting
- Appropriate user-agent identification
- Retraction/correction flagging
- Data Protection
- PII minimization in storage
- Encryption at rest and in transit
- Secret vault for API keys and credentials
- Data retention policies with automatic cleanup
- Access Control
- Role-based access (viewer, editor, admin)
- Workspace isolation for teams
- Audit logs for all actions
- Export & deletion endpoints
- Compliance Documentation
- DPA/Terms of Service summaries
- Opt-in for auto-transcription features
- GDPR-ready data handling
Cost & Performance Management
Predictable operational costs:
- JS Rendering Budget: Policy enforcement
- Auto/Conservative/Aggressive modes
- Percentage-of-run caps
- Fallback to cached content when budget exceeded
- Per-Run Caps: Configurable limits
- Token budget (input + output)
- Time budget (wall-clock duration)
- Domain count limit
- JS render count limit
- Caching Layers: Reduce redundant work
- HTTP cache (ETag-aware)
- HTML→Markdown reduction cache
- Embeddings cache
- Metadata cache (scholarly API results)
- Batching & Optimization
- Batch API requests to scholarly databases
- Circuit breakers for failing services
- Adaptive backoff and retry logic
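Adaptive backoff for flaky scholarly APIs is typically exponential with jitter: the delay doubles per attempt up to a cap, and randomized jitter spreads retries out so clients don't retry in lockstep. A sketch with assumed base and cap values:

```typescript
// "Equal jitter" exponential backoff: half the exponential delay is fixed,
// half is randomized. Base (250ms) and cap (30s) are illustrative.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return exp / 2 + Math.random() * (exp / 2);
}
```

Combined with a circuit breaker, this keeps a failing metadata provider from stalling the whole enrichment batch.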
Evaluation & QA Framework
Continuous quality assurance:
- Golden Tasks: Reference queries with expected findings
- Known sources to discover
- Claims to validate
- Contradictions to surface
- Automated Scoring: RAGAS-style metrics
- Faithfulness (claims match sources)
- Answer relevancy (retrieved content answers question)
- Context precision (no irrelevant sources)
- Context recall (all relevant sources found)
- Claim-Level Judgment: LLM-as-judge with evidence links
- A/B Testing Infrastructure:
- Vary knobs and component configurations
- Track outcome differences
- Pareto front analysis for trade-offs
- Automated Reporting:
- Weekly quality reports
- Regression thresholds with alerts
- Component performance breakdown
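For intuition, the two context metrics above reduce to precision/recall over retrieved versus known-relevant sources. The real scoring is LLM-judged; these counting versions only show what the ratios measure:

```typescript
// Context precision: fraction of retrieved sources that are relevant
// (penalizes irrelevant sources in the context).
function contextPrecision(retrieved: string[], relevant: Set<string>): number {
  if (retrieved.length === 0) return 0;
  return retrieved.filter((d) => relevant.has(d)).length / retrieved.length;
}

// Context recall: fraction of relevant sources that were retrieved
// (penalizes relevant sources that were missed).
function contextRecall(retrieved: string[], relevant: Set<string>): number {
  if (relevant.size === 0) return 1;
  return retrieved.filter((d) => relevant.has(d)).length / relevant.size;
}
```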
External Scholarly Integrations
Authority-based metadata and enrichment:
- Integrated APIs:
- Crossref: DOI lookup, venue metadata, citation counts
- OpenAlex: Comprehensive scholarly metadata
- Semantic Scholar: Citation context and author information
- Unpaywall: Open access link discovery
- Venue Databases: Journal/conference impact rankings
- Video Caption APIs: YouTube, Vimeo, etc.
- Normalized Schema: Consistent "source profile" structure across providers
- Caching Strategy: Local caching with periodic refresh
- Rate Limit & Retry: Respectful API usage with exponential backoff
- 95%+ Enrichment Target: Scholarly metadata available for 95%+ of academic sources
Architecture Overview
System Diagram
```
+--------------------------------------------------------------+
|              User UI (Run Mode / Review Mode)                |
|              (Pilot View / Black-Box Recorder)               |
+------------------------------+-------------------------------+
                               |
                SSE/WebSocket + Real-time Control
                               |
+------------------------------v-------------------------------+
|         Orchestrator (Planner + Strategy Controller)         |
|         - Task decomposition                                 |
|         - Frontier scoring & prioritization                  |
|         - Budget enforcement                                 |
|         - Decision logging                                   |
+--------------+--------------------------------+--------------+
               |                                |
               v                                v
     +-------------------+          +---------------------+
     |    Fetch Basic    |          |      Fetch JS       |
     |    (httpx +       |          |   (Browserbase +    |
     |    Trafilatura)   |          |     Stagehand)      |
     +---------+---------+          +----------+----------+
               |                               |
               +---------------+---------------+
                               v
               +------------------------------+
               | Turndown (HTML -> Markdown)  |
               +--------------+---------------+
                              v
               +------------------------------+
               | Indexer (Hybrid OpenSearch)  |
               | - BM25 (keyword)             |
               | - Dense vectors (semantic)   |
               | - Neural sparse (entities)   |
               +--------------+---------------+
                              |
               +--------------+--------------+
               v                             v
     +-------------------+      +-------------------------+
     |   Claim/Entity    |      |  Graph Memory           |
     |   Extractor       |      |  (Neo4j/Memgraph)       |
     |   (w/ citations)  |      |  - Concepts/Entities    |
     +---------+---------+      |  - Claims/Sources       |
               |                |  - Relationships        |
               |                |  - Community summaries  |
               |                +------------+------------+
               |                             |
               +--------------+--------------+
                              v
               +-----------------------------+
               |  Retrieval (Hybrid +        |
               |  Graph-Guided)              |
               +--------------+--------------+
                              v
                    +---------------------+
                    |      Synthesis      |
                    |   (Notes/Reports)   |
                    +----------+----------+
                               |
                    +----------+----------+
                    v                     v
              +-----------+      +----------------+
              |  Events   |      |   Snapshots    |
              |  (JSONL)  |      |  (for replay)  |
              +-----------+      +----------------+
```
Key Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Graph Database | Neo4j / Memgraph | Knowledge graph storage |
| Search Index | OpenSearch | Hybrid retrieval (BM25 + vector) |
| Content Fetch | httpx + Trafilatura | Non-JS content acquisition |
| JS Rendering | Browserbase + Stagehand | Dynamic site rendering |
| HTML→Markdown | Turndown | Content normalization |
| Embedding Model | (Configurable) | Semantic search vectors |
| Event Streaming | SSE / WebSocket | Real-time client updates |
| Event Storage | Object store (S3/GCS) + OpenSearch | Queryable audit trail |
| Scheduling | (DAG-based task runner) | Orchestration engine |
| API Clients | Scholarly integrations | Metadata enrichment |
Design Principles
- Cost-Aware: Non-JS fetch by default; JS rendering only when necessary and within budget
- Verifiable: Every claim links to source spans with timestamps and confidence
- Deterministic: Exact replay of sessions with seeded randomness
- Transparent: All decisions include reasoning; users see trade-offs
- Scalable: Horizontal scaling of components; streaming architecture
- Compliant: Respect robots.txt, TOS, rate limits, and privacy regulations
Quick Start
Complete Setup Guide: For detailed installation instructions, troubleshooting, and verification steps, see the Getting Started Guide (30-45 minutes).
Prerequisites
System Requirements:
- Node.js 18.0.0 or higher
- npm 9.0.0 or higher
- Docker Desktop (or Docker Engine + Docker Compose v2.0+)
- 8GB RAM minimum (16GB recommended)
- 20GB disk space for Docker volumes
Optional:
- API keys for LLM providers (OpenAI, Anthropic) - for future integrations
Installation
5-Minute Setup:
```bash
# 1. Clone the repository
git clone https://github.com/yourusername/deep-research-cockpit.git
cd deep-research-cockpit

# 2. Install dependencies
npm install

# 3. Start infrastructure services (Neo4j, OpenSearch, Redis, MinIO)
npm run docker:up

# 4. Wait for services to be healthy (~1-2 minutes)
docker-compose ps  # All services should show "healthy"

# 5. Start development servers (backend + frontend)
npm run dev
```
Access the Application:
- Frontend: http://localhost:3000
- Backend API: http://localhost:3001
- Health Check: http://localhost:3001/health
Verify Installation
```bash
# Check backend health
curl http://localhost:3001/health | jq
```

Expected output:

```json
{
  "status": "healthy",
  "services": {
    "neo4j": { "status": "connected", "latency_ms": 12 },
    "opensearch": { "status": "connected", "latency_ms": 45 },
    "redis": { "status": "connected", "latency_ms": 3 }
  },
  "uptime_seconds": 42,
  "version": "0.1.0"
}
```
First Research Session
Note: In the current alpha phase, many UI features are still in development. The backend event system and storage are functional.
- Open the UI at http://localhost:3000
- Create a new research session (when UI is ready)
- Monitor events in real time:

```bash
# Watch event stream
curl -N http://localhost:3001/events/stream
```

- Check stored events:
- OpenSearch Dashboards: http://localhost:5601
- MinIO Console: http://localhost:9001
Troubleshooting
Services won't start?
```bash
# Check Docker is running
docker ps

# Check port conflicts
lsof -i :7474  # Neo4j
lsof -i :9200  # OpenSearch

# View logs
npm run docker:logs
```
Need more help?
- See comprehensive Getting Started Guide with detailed troubleshooting
- Check Getting Started - Troubleshooting for common issues
- Review Development Guide for development workflow
- Open an issue on GitHub
Next Steps
For Users:
- Explore the system as features are implemented
- Provide feedback on GitHub Discussions
For Developers:
- Read Development Guide - Workflow and standards
- Read Architecture Overview - System design
- Check Contributing Guide - How to contribute
- Review Testing Guide - Testing practices
Complete Documentation:
- Documentation Index - All documentation organized by role and topic
Core Metrics & Performance
Key Performance Indicators
| Metric | Target | Purpose |
|---|---|---|
| TTFC | ≤90s P95 | Time to First Core (L0/L1) source discovery |
| Authority Mix | ≥60% | Proportion of L0/L1 sources in core-first mode |
| Evidence Robustness | ≥2.0 | Independent sources per high-impact claim |
| Control Responsiveness | ≤400ms | Preview latency for knob changes |
| Commit Latency | ≤800ms | Strategy change application |
| JS Render Rate | ≤25% | Percentage of fetches using JS rendering on general topics |
| Cache Hit Rate | ≥40% | HTTP/metadata caching effectiveness |
| Frontier Entropy | Tunable | Breadth measurement |
Operational Characteristics
- Deterministic Replay: 100% reproducibility of event sequence
- Frontier Scoring: Measurable changes when controls shift
- Domain Diversity: Tunable with visible impact on results
- Contradiction Density: Observable clustering of disagreements
Success Criteria (MVP)
- TTFC ≤120s with Authority Mix ≥50% L0/L1
- Complete event logging and basic replay
- Evidence Robustness ≥2.0 independent sources per claim
Documentation Map
Complete Documentation Index: See docs/INDEX.md for all documentation organized by role and topic
This README provides a high-level overview. For detailed information:
- System-level docs: See docs/ directory for guides, architecture, and PRDs
- Package-level docs: Each package (`backend/`, `frontend/`, `shared/`) has its own detailed documentation in `packages/{name}/docs/`
- API references: Available in both the root ARCHITECTURE.md and package-specific API.md files
Quick Links by Role
For Developers
Essential Guides:
- Getting Started Guide - Setup and installation (30-45 min)
- Development Guide - Workflow and standards
- Testing Guide - Testing strategy and practices
- Architecture Overview - System design and components
- Contributing Guide - How to contribute code
Architecture Deep Dives:
- System Components - Backend, frontend, services
- Data Flow Architecture - Request/event flow
- Event-Driven Architecture - Event sourcing
- Database Schemas - Neo4j, OpenSearch, Redis
For Research Engineers & Analysts
User Guides (Coming Soon):
- Pilot View UI Guide - Master the live research interface
- Strategy Controls Guide - Steer exploration effectively
- Source Tiers & Core-First - Understanding authority ranking
- Reading Queue Management - Building optimal reading order
- Review Mode Deep Dive - Analyzing completed sessions
For Technical Leaders & Product Managers
Strategic Documentation:
- Project Overview - Vision, features, and roadmap
- Architecture Overview - System design and scalability
- Complete PRD Suite - Product requirements and specifications
- Deployment Guide - Deployment strategies and operations
Future Documentation:
- Evaluation Framework - Quality metrics and measurement
- Cost & Performance Management - Budget controls and optimization
- Security & Compliance - Privacy, compliance, and safety
For Data Scientists
Technical Implementation (Planned):
- GraphRAG Implementation - Graph-guided retrieval details
- Ranking Algorithms - Scoring function design
- Evaluation & Metrics - Measurement methodology
- Source Classification - L0-L4 detection algorithms
Core Documentation
| Document | Description | Status |
|---|---|---|
| README.md | Project overview and quick start | ✅ Current |
| ARCHITECTURE.md | Complete system architecture | ✅ Current |
| CONTRIBUTING.md | Contribution guidelines | ✅ Current |
| Getting Started | Setup and installation | ✅ Current |
| Development Guide | Development workflow | ✅ Current |
| Testing Guide | Testing practices | ✅ Current |
| Deployment Guide | Deployment strategies | ✅ Current |
| Documentation Index | Complete documentation map | ✅ Current |
Package Documentation
Each package has its own detailed documentation:
Backend (packages/backend/docs/):
- Backend README - Backend overview and setup
- Backend API - API endpoint reference
- Services - Service layer documentation
- Middleware - Express middleware
- Utils - Utility functions
Frontend (packages/frontend/docs/):
- Frontend Architecture - Frontend design patterns
- Frontend Development - Development workflow
- Components - React component library
- Hooks - Custom React hooks
- State Management - Redux state structure
Shared (packages/shared/docs/):
- Shared API - Shared utilities and types
- Types - TypeScript type definitions
- Validation - Validation schemas
Additional Resources
- Glossary - Key terminology and concepts
- PRD Documents - Product requirements and specifications
- GitHub Issues - Bug reports and feature requests
- GitHub Discussions - Questions and community
Roadmap & Status
Current Status
Alpha – Phase 3: Core research pipeline functional (Phases 1–2 complete); currently implementing advanced features (graph memory, JS rendering, video pipeline); evaluating user workflows and performance characteristics
Rollout Phases
✅ Phase 1: Foundations (Weeks 1–3)
- Event schema and bus
- Basic fetch + Turndown normalization
- Hybrid search index (OpenSearch)
- Orchestrator with frontier scoring
- Pilot View shell (layout, basic log, queue)
Exit Criteria: MVP research pipeline works end-to-end
✅ Phase 2: Ranking & Core-First (Weeks 4–6)
- Source classification and metadata enrichment
- Multi-factor ranking with why-ranked explanations
- Reading queue builder with L0→L4 enforcement
- Router heuristics for JS escalation
- HTTP and metadata caching
Exit Criteria: Authority Mix ≥50%, TTFC ≤120s
🚧 Phase 3: Advanced Features (Weeks 7–9)
- JS rendering path (Browserbase + Stagehand)
- Graph memory with community summaries
- Graph overlays in Pilot View
- Origin detection and reading queue optimization
- Video caption alignment
Exit Criteria: ≤25% JS renders for general topics; graph structure validated
Phase 4: Analysis & Exports (Weeks 10–12)
- Review Mode implementation
- Timeline replay with KPI dashboard
- Evidence ledger and learning map
- Export formats (JSONL, CSV, GraphML, PDF)
- A/B run comparison
- Evaluation framework and dashboards
- RBAC and audit logs
Exit Criteria: Full replay works; exports verified; team workflow tested
Phase 5 (V2): Advanced Analytics & Scaling
- Advanced attribution analytics
- Team multi-tenancy and shared workspaces
- Failure autopsy and automated debugging
- Extended scholarly integrations (arXiv, PubMed, etc.)
- Multi-language support
- Custom ranking rule builder
Contributing
We welcome contributions! Here's how to get involved:
Report Issues
Found a bug? Have a feature idea? → Open an issue
Contribute Code
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes following our style guide
- Add tests for new functionality
- Submit a pull request
See Contributing Guide for detailed setup and process.
Improve Documentation
- Clarify existing docs
- Add examples and tutorials
- Report broken links or unclear sections
- Translate documentation
Share Use Cases
Tell us how you're using Deep Research Cockpit! We'd love to feature your research or workflow.
Glossary
Authority Mix Percentage of sources in L0/L1 tiers; target ≥60% for core-first runs.
Core Proximity Citation-graph distance to foundational sources (papers, specs, repos).
Frontier Priority queue of next concepts, pages, and claims to explore; scored by novelty, centrality, disagreement, recency, and user interest.
GraphRAG Retrieval Augmented Generation combined with graph structure; uses knowledge graph to guide discovery and provide context.
L0 / L1 / L2 / L3 / L4 Source tier classification from foundational (L0) to general discussion (L4).
TTFC Time-to-First-Core source; how quickly the system discovers L0/L1 materials.
Why-Ranked Human-readable breakdown of factors contributing to a source's rank; shows which factors helped (green) and hurt (red).
Deterministic Replay Ability to reproduce any session exactly given same inputs; enabled by seeded randomness and event logging.
Snapshot Pre-computed state capture of a research session at a point in time; enables fast review mode loading.
Evidence Robustness Average number of independent sources supporting each high-impact claim; target ≥2.0.
Frontier Entropy Measure of breadth in exploration; higher values indicate more diverse concept coverage.
License & Support
License
This project is licensed under the MIT License; see the LICENSE file for details.
Support
- Questions? Check the Getting Started Guide and Documentation Index
- Troubleshooting? See Getting Started - Troubleshooting section
- Found a bug? Report it on GitHub
- Security issue? Email security@example.com (do not open a public issue)
- General discussion? Join our community:
- GitHub Discussions
- Discord community (link coming soon)
Credits
Built with gratitude for these excellent projects:
- OpenSearch - Hybrid search engine
- Neo4j and Memgraph - Graph databases
- Browserbase - Browser automation
- Stagehand - Structured web scraping
- Trafilatura - Content extraction
- Turndown - HTML to Markdown
- Crossref, OpenAlex, Semantic Scholar - Scholarly metadata
Quick Links
- Documentation Index - All docs organized by role
- Getting Started - 30-minute setup guide
- Architecture - Complete system design
- Development Guide - Workflow and standards
- Testing Guide - Testing practices
- Contributing - How to contribute
- Roadmap - Development phases
- Discussions - Community Q&A
Deep Research Cockpit – Making research exploration transparent, verifiable, and steerable.