quickmaths: Your LLM agent for math research

An intelligent research paper analysis system that enables natural language interaction with a knowledge base of academic papers. Built with LangGraph, the system uses a multi-agent architecture to route each query to the most appropriate processing pipeline.

Features

  • Intelligent Query Routing - Automatically determines whether to use RAG retrieval, database query tools, or direct LLM responses based on the question type
  • PDF Ingestion Pipeline - Extracts text and metadata from research papers using PyMuPDF and LLM-powered metadata extraction
  • Semantic Search - Vector-based retrieval using LlamaIndex and HuggingFace embeddings for finding relevant paper content
  • Structured Data Queries - SQL-based metadata lookups for questions about authors, paper counts, and other structured data
  • Citation-Aware Responses - Provides grounded answers with references to source papers

Architecture

The system uses a LangGraph-based agentic workflow with four specialized agents:

User Query
    │
    ▼
┌─────────────────┐
│  Router Agent   │ ─── Classifies query type
└────────┬────────┘
         │
    ┌────┴────┬──────────┐
    ▼         ▼          ▼
┌───────┐ ┌───────┐ ┌─────────┐
│  RAG  │ │ Query │ │ Answer  │
│ Agent │ │ Agent │ │ Agent   │
└───┬───┘ └───┬───┘ └────┬────┘
    │         │          │
    └─────────┴──────────┘
              │
              ▼
      Final Response

Route      Use Case
RAG        Content questions about paper topics, concepts, or findings
DATABASE   Metadata queries (paper counts, author lookups, etc.)
ANSWER     General knowledge questions unrelated to the paper collection
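The routing decision can be sketched as a small classifier. Everything below is illustrative: `route_query` and the keyword heuristic stand in for the actual LLM-based Router Agent, which the project implements with LangGraph.

```python
from enum import Enum


class Route(Enum):
    RAG = "RAG"            # content questions about the paper collection
    DATABASE = "DATABASE"  # structured metadata lookups
    ANSWER = "ANSWER"      # general knowledge, answered directly by the LLM


def route_query(question: str) -> Route:
    """Toy keyword heuristic standing in for the LLM-based Router Agent."""
    q = question.lower()
    # Metadata-style questions go to the SQL tools.
    if any(kw in q for kw in ("how many", "count", "author")):
        return Route.DATABASE
    # Questions about paper content go to semantic retrieval.
    if any(kw in q for kw in ("paper", "discuss", "finding")):
        return Route.RAG
    # Everything else is answered directly.
    return Route.ANSWER
```

In the real system the classification is made by the LLM rather than keywords, but the three-way branch is the same.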

Tech Stack

Category        Technologies
Orchestration   LangGraph
LLM Framework   LangChain
LLM Provider    Anthropic Claude
RAG/Indexing    LlamaIndex
Embeddings      HuggingFace (BAAI/bge-small-en-v1.5)
Database        SQLite + SQLModel
PDF Processing  PyMuPDF
Testing         pytest + DeepEval

Installation

Prerequisites: Python 3.13+

# Clone the repository
git clone https://github.com/stephenhermes/quickmaths.git
cd quickmaths

# Install dependencies
pip install -e .

# Or with dev dependencies
pip install -e ".[dev]"

Configuration

Create a .env file in the project root:

# Required
PDF_DIRECTORY='/path/to/your/pdf/papers'
ANTHROPIC_API_KEY='your-api-key'

# Optional
LLM_MODEL='claude-sonnet-4-20250514'
EMBED_MODEL='BAAI/bge-small-en-v1.5'
LOG_LEVEL='INFO'
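A sketch of how these variables might be read at startup. The `Settings` class and `load_settings` function are hypothetical names for illustration; only the variable names and defaults come from the table above.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    # Required settings have no default; missing keys raise KeyError.
    pdf_directory: str
    anthropic_api_key: str
    # Optional settings fall back to the documented defaults.
    llm_model: str = "claude-sonnet-4-20250514"
    embed_model: str = "BAAI/bge-small-en-v1.5"
    log_level: str = "INFO"


def load_settings(env: dict[str, str]) -> Settings:
    """Build settings from an environment mapping (e.g. os.environ)."""
    return Settings(
        pdf_directory=env["PDF_DIRECTORY"],          # required
        anthropic_api_key=env["ANTHROPIC_API_KEY"],  # required
        llm_model=env.get("LLM_MODEL", Settings.llm_model),
        embed_model=env.get("EMBED_MODEL", Settings.embed_model),
        log_level=env.get("LOG_LEVEL", Settings.log_level),
    )
```

Passing the environment as a plain mapping keeps the loader easy to unit test without touching the process environment.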

Usage

1. Ingest Papers

First, ingest your PDF papers into the system:

quickmaths-ingest ${PDF_DIRECTORY}

# Options:
#   --rebuild-index    Rebuild the vector index from scratch
#   --rebuild-db       Rebuild the metadata database
#   --num-files-limit  Limit number of files to process

The ingestion pipeline:

  1. Converts PDFs to markdown text
  2. Extracts metadata (title, authors) using LLM
  3. Chunks documents with sentence-aware splitting
  4. Embeds and indexes chunks for semantic search
  5. Stores metadata in SQLite for structured queries
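The chunking step (step 3) can be sketched as follows. This is a minimal illustration of sentence-aware splitting, not the project's actual splitter (which is built on LlamaIndex):

```python
import re


def sentence_chunks(text: str, max_sentences: int = 3) -> list[str]:
    """Split text on sentence boundaries, then group consecutive
    sentences into chunks so no sentence is cut mid-way."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i : i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]
```

Grouping whole sentences keeps each embedded chunk semantically coherent, which generally improves retrieval quality over fixed-width character splits.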

2. Interactive Chat

Start the interactive Q&A interface:

Example queries:

  • "What papers discuss transformer architectures?" → RAG retrieval
  • "How many papers are in the database?" → Database query
  • "What is gradient descent?" → Direct LLM answer
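The chat interface boils down to a read-answer loop. The names below (`answer`, `chat_loop`) are hypothetical stand-ins; the real CLI wires this loop to the routed agent pipeline shown in the Architecture section.

```python
def answer(question: str) -> str:
    """Stand-in for the full routed pipeline (Router -> RAG/Query/Answer)."""
    return f"(routed answer to: {question})"


def chat_loop() -> None:
    """Read questions until the user types 'exit' or 'quit'."""
    while True:
        question = input("quickmaths> ").strip()
        if question.lower() in {"exit", "quit"}:
            break
        print(answer(question))
```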

Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=quickmaths

# Run specific test categories
pytest tests/unit
pytest tests/integration

The test suite includes DeepEval integration for LLM evaluation metrics.
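An example of the kind of unit test the suite might contain. The test names and the stub router are hypothetical and exist only to keep the example self-contained; the real suite also uses DeepEval metrics for LLM-level evaluation.

```python
# Illustrative pytest-style unit tests for routing behavior.
def classify(question: str) -> str:
    """Stub router used only to make this example runnable."""
    return "DATABASE" if "how many" in question.lower() else "RAG"


def test_metadata_questions_route_to_database():
    assert classify("How many papers are in the database?") == "DATABASE"


def test_content_questions_route_to_rag():
    assert classify("What papers discuss transformers?") == "RAG"
```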