PicoCode/REST_API.md at master · CodeAtCode/PicoCode

PicoCode REST API Documentation

This document describes the REST API for integrating with PicoCode's local RAG backend.

Overview

PicoCode provides a production-ready local RAG (Retrieval-Augmented Generation) backend with per-project persistent storage. Each project gets its own SQLite database for isolation, making it ideal for IDE integration and custom tooling.

Base URL

http://127.0.0.1:8080/api

Authentication

Currently, the API does not require authentication. For production deployments, consider adding authentication via reverse proxy or API gateway.

API Endpoints

Health Check

Returns server status and available features.

Response:

{
  "status": "ok",
  "version": "0.2.0",
  "features": [
    "rag",
    "per-project-db",
    "incremental-indexing",
    "rate-limiting",
    "caching"
  ]
}

Project Management

Create or Get Project

POST /api/projects
Content-Type: application/json

{
  "path": "/absolute/path/to/project",
  "name": "Optional Project Name"
}

Creates a new project or returns existing one. Each project gets its own database.

Response:

{
  "id": "1234567890abcdef",
  "name": "MyProject",
  "path": "/absolute/path/to/project",
  "database_path": "~/.picocode/projects/1234567890abcdef.db",
  "created_at": "2025-11-06T14:30:00",
  "last_indexed_at": null,
  "status": "created",
  "settings": null
}

List All Projects

Returns list of all registered projects.

Response:

[
  {
    "id": "1234567890abcdef",
    "name": "MyProject",
    "path": "/absolute/path/to/project",
    "status": "ready",
    "created_at": "2025-11-06T14:30:00",
    "last_indexed_at": "2025-11-06T15:00:00"
  }
]

Get Project Details

GET /api/projects/{project_id}

Returns details for a specific project including indexing statistics.

Response:

{
  "id": "1234567890abcdef",
  "name": "MyProject",
  "path": "/absolute/path/to/project",
  "database_path": "~/.picocode/projects/1234567890abcdef.db",
  "status": "ready",
  "created_at": "2025-11-06T14:30:00",
  "last_indexed_at": "2025-11-06T15:00:00",
  "indexing_stats": {
    "file_count": 150,
    "embedding_count": 450,
    "is_indexed": true
  }
}

Indexing Stats Fields:

  • file_count: Number of files indexed in the project
  • embedding_count: Number of embeddings (chunks) generated
  • is_indexed: Boolean indicating if project has any indexed files

Delete Project

DELETE /api/projects/{project_id}

Permanently deletes project and its database.

Response:


Indexing

Index Project

POST /api/projects/index
Content-Type: application/json

{
  "project_id": "1234567890abcdef"
}

Starts background indexing of the project. This processes all files, generates embeddings, and stores them in the project's database.

Features:

  • Incremental indexing (skips unchanged files)
  • Code-aware chunking
  • Background processing

Rate Limit: 10 requests per minute per IP

Response:

{
  "status": "indexing",
  "project_id": "1234567890abcdef"
}

Project Status Values:

  • created - Project created but not indexed
  • indexing - Currently indexing
  • ready - Indexed and ready for queries
  • error - Indexing failed

Code Intelligence

Semantic Search

POST /api/query
Content-Type: application/json

{
  "project_id": "1234567890abcdef",
  "query": "How does authentication work?",
  "top_k": 5
}

Performs semantic search across the indexed project using vector embeddings.

Rate Limit: 100 requests per minute per IP

Response:

{
  "results": [
    {
      "file_id": 123,
      "path": "src/auth.py",
      "chunk_index": 0,
      "score": 0.8542,
      "content": "def authenticate(username, password):\n    ..."
    }
  ],
  "project_id": "1234567890abcdef",
  "query": "How does authentication work?",
  "count": 5
}

Code Completion / Question Answering

POST /code
Content-Type: application/json

{
  "project_id": "1234567890abcdef",
  "prompt": "Explain the authentication flow",
  "context": "",
  "use_rag": true,
  "top_k": 5
}

Gets code suggestions or answers using RAG + LLM.

Parameters:

  • project_id - (optional) Project to search, uses first available if not provided
  • prompt - (required) Question or request
  • context - (optional) Additional context
  • use_rag - (optional, default: true) Use semantic search
  • top_k - (optional, default: 5) Number of results to retrieve

Response:

{
  "response": "The authentication flow works as follows...",
  "used_context": [
    {
      "path": "src/auth.py",
      "score": 0.8542
    }
  ]
}

Error Handling

All endpoints return standard HTTP status codes:

  • 200 - Success
  • 400 - Bad request (validation error)
  • 404 - Resource not found
  • 429 - Rate limit exceeded
  • 500 - Server error

Error responses include a JSON object:

{
  "error": "Description of the error"
}

For rate limiting errors:

{
  "error": "Rate limit exceeded",
  "retry_after": 42
}

Configuration

Create a .env file in the project root:

# OpenAI API configuration
API_URL=https://api.openai.com/v1/
API_KEY=your-api-key-here

# Model selection
EMBEDDING_MODEL=text-embedding-3-small
CODING_MODEL=gpt-4o

# Server configuration
UVICORN_HOST=127.0.0.1
UVICORN_PORT=8080

# File processing
MAX_FILE_SIZE=200000

Example Client (Python)

import requests

class PicoCodeClient:
    def __init__(self, base_url="http://127.0.0.1:8080/api"):
        self.base_url = base_url
    
    def create_project(self, path, name=None):
        """Create or get a project."""
        response = requests.post(
            f"{self.base_url}/projects",
            json={"path": path, "name": name}
        )
        response.raise_for_status()
        return response.json()
    
    def index_project(self, project_id):
        """Start indexing a project."""
        response = requests.post(
            f"{self.base_url}/projects/index",
            json={"project_id": project_id}
        )
        response.raise_for_status()
        return response.json()
    
    def get_project(self, project_id):
        """Get project details."""
        response = requests.get(f"{self.base_url}/projects/{project_id}")
        response.raise_for_status()
        return response.json()
    
    def query(self, project_id, query, top_k=5):
        """Perform semantic search."""
        response = requests.post(
            f"{self.base_url}/query",
            json={
                "project_id": project_id,
                "query": query,
                "top_k": top_k
            }
        )
        response.raise_for_status()
        return response.json()
    
    def get_code_suggestion(self, project_id, prompt, use_rag=True):
        """Get code suggestion or answer."""
        response = requests.post(
            f"{self.base_url}/../code",  # Note: /code is at root level
            json={
                "project_id": project_id,
                "prompt": prompt,
                "use_rag": use_rag
            }
        )
        response.raise_for_status()
        return response.json()

# Usage example
client = PicoCodeClient()

# Create project
project = client.create_project("/path/to/my/project", "MyProject")
print(f"Created project: {project['id']}")

# Index project
result = client.index_project(project["id"])
print(f"Indexing status: {result['status']}")

# Wait for indexing to complete (poll status)
import time
while True:
    proj = client.get_project(project["id"])
    if proj["status"] == "ready":
        break
    elif proj["status"] == "error":
        print("Indexing failed")
        break
    time.sleep(2)

# Perform semantic search
results = client.query(project["id"], "authentication flow")
for result in results["results"]:
    print(f"- {result['path']} (score: {result['score']:.4f})")

# Get code suggestion
suggestion = client.get_code_suggestion(project["id"], "Explain auth")
print(suggestion["response"])

Interactive Documentation

PicoCode provides interactive API documentation:

These interfaces allow you to:

  • Browse all endpoints
  • See request/response schemas
  • Test API calls directly
  • View example requests

Rate Limiting

Rate limits are enforced per IP address:

Endpoint Type Limit Window
Queries 100 requests 1 minute
Indexing 10 requests 1 minute
General 200 requests 1 minute

When rate limited, the response includes a Retry-After header indicating seconds to wait.


Caching

The API uses LRU caching for performance:

  • Project metadata: 5 minute TTL
  • Project stats: 1 minute TTL
  • Search results: 10 minute TTL
  • File content: 5 minute TTL

Caches are automatically invalidated on data updates.


Support

For issues, feature requests, or questions: