A vector-embedding-based semantic code search tool with an MCP server and multi-model integration. It can also be used as a pure CLI tool. Ollama support enables fully local embedding and reranking, allowing complete offline operation and keeping your code repository private.
## Semantic code search - Find code by meaning, not just keywords

```
╭─ ~/workspace/autodev-codebase
╰─❯ codebase search "user manage" --demo
Found 20 results in 5 files for: "user manage"
==================================================
File: "hello.js"
==================================================
< class UserManager > (L7-20)
class UserManager {
  constructor() {
    this.users = [];
  }

  addUser(user) {
    this.users.push(user);
    console.log('User added:', user.name);
  }

  getUsers() {
    return this.users;
  }
}
……
```

## Call graph analysis - Trace function call relationships and execution paths

```
╭─ ~/workspace/autodev-codebase
╰─❯ codebase call --demo --query="app,addUser"
Connections between app, addUser:

Found 2 matching node(s):
  - demo/app:L1-29
  - demo/hello.UserManager.addUser:L12-15

Direct connections:
  - demo/app:L1-29 → demo/hello.UserManager.addUser:L12-15

Chains found:
  - demo/app:L1-29 → demo/hello.UserManager.addUser:L12-15
```

## Code outline with AI summaries - Understand code structure at a glance

```
╭─ ~/workspace/autodev-codebase
╰─❯ codebase outline 'hello.js' --demo --summarize
# hello.js (23 lines)
└─ Defines a greeting function that logs a personalized hello message and returns a welcome string. Implements a UserManager class managing an array of users with methods to add users and retrieve the current user list. Exports both components for external use.

2--5 | function greetUser
└─ Implements user greeting logic by logging a personalized hello message and returning a welcome message

7--20 | class UserManager
└─ Manages user data with methods to add users to a list and retrieve all stored users

12--15 | method addUser
└─ Adds a user to the users array and logs a confirmation message with the user's name.
```
## 🚀 Features
- 🔍 Semantic Code Search: Vector-based search using advanced embedding models
- 🔗 Call Graph Analysis: Trace function call relationships and execution paths
- 🌐 MCP Server: HTTP-based MCP server with SSE and stdio adapters
- 💻 Pure CLI Tool: Standalone command-line interface without GUI dependencies
- ⚙️ Layered Configuration: CLI, project, and global config management
- 🎯 Advanced Path Filtering: Glob patterns with brace expansion and exclusions
- 🌲 Tree-sitter Parsing: Support for 40+ programming languages
- 💾 Qdrant Integration: High-performance vector database
- 🔄 Multiple Providers: OpenAI, Ollama, Jina, Gemini, Mistral, OpenRouter, Vercel
- 📊 Real-time Watching: Automatic index updates
- ⚡ Batch Processing: Efficient parallel processing
- 📝 Code Outline Extraction: Generate structured code outlines with AI summaries
- 💨 Dependency Analysis Cache: Intelligent caching for 10-50x faster re-analysis
## 📦 Installation
### 1. Dependencies
```bash
brew install ollama ripgrep
ollama serve
ollama pull nomic-embed-text
```
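To verify that Ollama is running and the embedding model was pulled, an optional sanity check:

```bash
# Optional check: the pulled embedding model should appear in this list
ollama list
```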
### 2. Qdrant
```bash
docker run -d -p 6333:6333 -p 6334:6334 --name qdrant qdrant/qdrant
```
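A quick way to confirm Qdrant is reachable before indexing (optional; `/collections` is a standard Qdrant REST endpoint):

```bash
# Optional check: should return a JSON list of collections (empty on a fresh install)
curl http://localhost:6333/collections
```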
### 3. Install
```bash
npm install -g @autodev/codebase
codebase config --set embedderProvider=ollama,embedderModelId=nomic-embed-text
```
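If the install succeeded, the CLI should respond with its command and option listing:

```bash
# Confirm the CLI is on your PATH and see all commands and flags
codebase --help
```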
## 🛠️ Quick Start
```bash
# Demo mode (recommended for first-time users)
# Creates a demo directory in the current working directory for testing

# Index & search
codebase index --demo
codebase search "user greet" --demo

# Call graph analysis
codebase call --demo --query="app,addUser"

# MCP server
codebase index --serve --demo
```
## 📋 Commands
### 📝 Code Outlines
```bash
# Extract code structure (functions, classes, methods)
codebase outline "src/**/*.ts"

# Generate code structure with AI summaries
codebase outline "src/**/*.ts" --summarize

# View only file-level summaries
codebase outline "src/**/*.ts" --summarize --title

# Clear summary cache
codebase outline --clear-summarize-cache
```
### 🔗 Call Graph Analysis
```bash
# 📊 Statistics overview (no --query)
codebase call                                 # Show statistics overview
codebase call --json                          # JSON format
codebase call src/commands                    # Analyze specific directory

# 🔍 Function query (with --query)
codebase call --query="getUser"               # Single function call tree (default depth: 3)
codebase call --query="main" --depth=5        # Custom depth
codebase call --query="getUser,validateUser"  # Multi-function connections (default depth: 10)

# 🎨 Visualization
codebase call --viz graph.json                # Export Cytoscape.js format
codebase call --open                          # Open interactive viewer
codebase call --viz graph.json --open         # Export and open

# Specify workspace (works for both modes)
codebase call --path=/my/project --query="main"
```
Query Patterns:
- Exact match: `--query="functionName"` or `--query="*ClassName.methodName"`
- Wildcards: `*` (any characters), `?` (single character)
- Examples: `--query="get*"`, `--query="*User*"`, `--query="*.*.get*"`

Examples:
- Single function: `--query="main"` shows the call tree (upward + downward). Default depth: 3 (avoids excessive output)
- Multiple functions: `--query="main,helper"` analyzes connection paths between the functions. Default depth: 10 (path finding needs a deeper search)
Supported Languages:
- TypeScript/JavaScript (.ts, .tsx, .js, .jsx)
- Python (.py)
- Java (.java)
- C/C++ (.c, .h, .cpp, .cc, .cxx, .hpp, .hxx, .c++)
- C# (.cs)
- Rust (.rs)
- Go (.go)
### 🔍 Indexing & Search
```bash
# Index the codebase
codebase index --path=/my/project --force

# Search with filters
codebase search "error handling" --path-filters="src/**/*.ts"

# Search with custom limit and minimum score
codebase search "authentication" --limit=20 --min-score=0.7
codebase search "API" -l 30 -S 0.5

# Search in JSON format
codebase search "authentication" --json

# Clear index data
codebase index --clear-cache --path=/my/project
```
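The `--json` output is meant for scripting; for example, it can be piped through `jq` (a separate tool, not bundled with codebase) for pretty-printing or further filtering:

```bash
# Pretty-print machine-readable search results (requires jq)
codebase search "error handling" --json | jq '.'
```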
### 🌐 MCP Server
```bash
# HTTP mode (recommended)
codebase index --serve --port=3001 --path=/my/project

# Stdio adapter
codebase stdio --server-url=http://localhost:3001/mcp
```
### ⚙️ Configuration
```bash
# View config
codebase config --get
codebase config --get embedderProvider --json

# Set config
codebase config --set embedderProvider=ollama,embedderModelId=nomic-embed-text
codebase config --set --global qdrantUrl=http://localhost:6333
```
## 🚀 Advanced Features
### 🔍 LLM-Powered Search Reranking
Enable LLM reranking to dramatically improve search relevance:
```bash
# Enable reranking with Ollama (recommended)
codebase config --set rerankerEnabled=true,rerankerProvider=ollama,rerankerOllamaModelId=qwen3-vl:4b-instruct

# Or use OpenAI-compatible providers
codebase config --set rerankerEnabled=true,rerankerProvider=openai-compatible,rerankerOpenAiCompatibleModelId=deepseek-chat

# Search with automatic reranking
codebase search "user authentication"  # Results are automatically reranked by LLM
```
Benefits:
- 🎯 Higher precision: LLM understands semantic relevance beyond vector similarity
- 📊 Smart scoring: Results are reranked on a 0-10 scale based on query relevance
- ⚡ Batch processing: Efficiently handles large result sets with configurable batch sizes
- 🎛️ Threshold control: Filter results with `rerankerMinScore` to keep only high-quality matches (see the example below)
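For example, assuming `rerankerMinScore` is set like any other config key, a threshold of 6 on the 0-10 scale would drop weaker matches (the value here is illustrative):

```bash
# Illustrative threshold: keep only matches the reranker scores >= 6 out of 10
codebase config --set rerankerMinScore=6
```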
### Path Filtering & Export
```bash
# Path filtering with brace expansion and exclusions
codebase search "API" --path-filters="src/**/*.ts,lib/**/*.js"
codebase search "utils" --path-filters="{src,test}/**/*.ts"

# Export results in JSON format for scripts
codebase search "auth" --json
```
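The examples above show brace expansion; an exclusion might look like the sketch below, assuming the leading-`!` convention used by common glob libraries (verify the exact syntax with `codebase search --help`):

```bash
# Hypothetical exclusion: search src/**/*.ts but skip test files
# (the leading "!" is an assumption based on common glob conventions)
codebase search "config" --path-filters="src/**/*.ts,!src/**/*.test.ts"
```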
## ⚙️ Configuration
### Config Layers (Priority Order)
1. CLI Arguments - Runtime parameters (`--path`, `--config`, `--log-level`, `--force`, etc.)
2. Project Config - `./autodev-config.json` (or a custom path via `--config`)
3. Global Config - `~/.autodev-cache/autodev-config.json`
4. Built-in Defaults - Fallback values
Note: CLI arguments provide runtime overrides for paths, logging, and operational behavior. For persistent configuration (`embedderProvider`, API keys, search parameters), use `config --set` to save to the config files.
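For instance, a runtime flag wins over the same setting in any config file, but only for that invocation:

```bash
# The --log-level flag overrides any configured log level for this run,
# without persisting anything to disk
codebase index --path=/my/project --log-level=info
```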
### Common Config Examples
Ollama:

```json
{
  "embedderProvider": "ollama",
  "embedderModelId": "nomic-embed-text",
  "qdrantUrl": "http://localhost:6333"
}
```

OpenAI:

```json
{
  "embedderProvider": "openai",
  "embedderModelId": "text-embedding-3-small",
  "embedderOpenAiApiKey": "sk-your-key",
  "qdrantUrl": "http://localhost:6333"
}
```

OpenAI-Compatible:

```json
{
  "embedderProvider": "openai-compatible",
  "embedderModelId": "text-embedding-3-small",
  "embedderOpenAiCompatibleApiKey": "sk-your-key",
  "embedderOpenAiCompatibleBaseUrl": "https://api.openai.com/v1"
}
```

### Key Configuration Options
| Category | Options | Description |
|---|---|---|
| Embedding | `embedderProvider`, `embedderModelId`, `embedderModelDimension` | Provider and model settings |
| API Keys | `embedderOpenAiApiKey`, `embedderOpenAiCompatibleApiKey` | Authentication |
| Vector Store | `qdrantUrl`, `qdrantApiKey` | Qdrant connection |
| Search | `vectorSearchMinScore`, `vectorSearchMaxResults` | Search behavior |
| Reranker | `rerankerEnabled`, `rerankerProvider` | Result reranking |
| Summarizer | `summarizerProvider`, `summarizerLanguage`, `summarizerBatchSize` | AI summary generation |
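As a worked example, the search-behavior keys from the table can be persisted with `config --set` like any other option (the values are illustrative, not documented defaults):

```bash
# Illustrative values: require similarity >= 0.4 and cap results at 20
codebase config --set vectorSearchMinScore=0.4,vectorSearchMaxResults=20
```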
Key CLI Arguments:
- `index` - Index the codebase
- `search <query>` - Search the codebase (required positional argument)
- `outline <pattern>` - Extract code outlines (supports glob patterns)
- `call` - Analyze function call relationships and dependency graphs
- `stdio` - Start the stdio adapter for MCP
- `config` - Manage configuration (use with `--get` or `--set`)
- `--serve` - Start the MCP HTTP server (use with the `index` command)
- `--summarize` - Generate AI summaries for code outlines
- `--dry-run` - Preview operations before execution
- `--title` - Show only file-level summaries
- `--clear-summarize-cache` - Clear all summary caches
- `--path`, `--demo`, `--force` - Common options
- `--limit`/`-l <number>` - Maximum number of search results (default: from config, max 50)
- `--min-score`/`-S <number>` - Minimum similarity score for search results (0-1, default: from config)
- `--query <patterns>` - Query patterns for call graph analysis (comma-separated)
- `--viz <file>` - Export full dependency data for visualization (cannot be used with `--query`)
- `--open` - Open the interactive graph viewer
- `--depth <number>` - Set analysis depth for call graphs
- `--help` - Show all available options
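Putting several of these together, a typical session might look like the following (the project path, query text, and function name are placeholders):

```bash
# Index a project, search it, then trace a call graph for a matched function
codebase index --path=/my/project
codebase search "token refresh" --limit=10 --min-score=0.5
codebase call --path=/my/project --query="refreshToken" --depth=3
```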
Configuration Commands:
```bash
# View config
codebase config --get
codebase config --get --json

# Set config (saves to file)
codebase config --set embedderProvider=ollama,embedderModelId=nomic-embed-text
codebase config --set --global embedderProvider=openai,embedderOpenAiApiKey=sk-xxx

# Use custom config file
codebase --config=/path/to/config.json config --get
codebase --config=/path/to/config.json config --set embedderProvider=ollama

# Runtime override (paths, logging, etc.)
codebase index --path=/my/project --log-level=info --force
```
For the complete configuration reference, see CONFIG.md.
## 🔌 MCP Integration
### HTTP Streamable Mode (Recommended)
```bash
codebase index --serve --port=3001
```
IDE Config:
```json
{
  "mcpServers": {
    "codebase": {
      "url": "http://localhost:3001/mcp"
    }
  }
}
```

### Stdio Adapter
```bash
# First start the MCP server in one terminal
codebase index --serve --port=3001

# Then connect via the stdio adapter in another terminal (for IDEs that require stdio)
codebase stdio --server-url=http://localhost:3001/mcp
```
IDE Config:
```json
{
  "mcpServers": {
    "codebase": {
      "command": "codebase",
      "args": ["stdio", "--server-url=http://localhost:3001/mcp"]
    }
  }
}
```

## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request or open an Issue on GitHub.
## 📄 License
This project is licensed under the MIT License.
## 🙏 Acknowledgments
This project is a fork and derivative work based on Roo Code. We've built upon their excellent foundation to create this specialized codebase analysis tool with enhanced features and MCP server capabilities.
🌟 If you find this tool helpful, please give us a star on GitHub!
Made with ❤️ for the developer community