An intelligent research paper analysis system that enables natural language interaction with a knowledge base of academic papers. Built with LangGraph, the system uses a multi-agent architecture to intelligently route queries to the most appropriate processing pipeline.
## Features
- **Intelligent Query Routing** - Automatically determines whether to use RAG retrieval, database query tools, or a direct LLM response, based on the question type
- **PDF Ingestion Pipeline** - Extracts text and metadata from research papers using PyMuPDF and LLM-powered metadata extraction
- **Semantic Search** - Vector-based retrieval using LlamaIndex and HuggingFace embeddings for finding relevant paper content
- **Structured Data Queries** - SQL-based metadata lookups for questions about authors, paper counts, and other structured data
- **Citation-Aware Responses** - Grounded answers with references to the source papers
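Semantic search follows a standard embed-and-rank pattern: embed each chunk, embed the query, and rank chunks by cosine similarity. The sketch below shows only that ranking step in plain Python with tiny hypothetical 3-dimensional vectors; the actual system uses LlamaIndex with BAAI/bge-small-en-v1.5 embeddings, so treat this as an illustration of the principle rather than the project's code:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], chunks: list[tuple[list[float], str]], k: int = 2) -> list[str]:
    """Rank (vector, text) chunk pairs by similarity to the query vector."""
    scored = sorted(chunks, key=lambda c: cosine_similarity(query_vec, c[0]), reverse=True)
    return [text for _, text in scored[:k]]

# Hypothetical 3-dimensional embeddings, for illustration only.
chunks = [
    ([0.9, 0.1, 0.0], "Attention is all you need."),
    ([0.1, 0.9, 0.0], "Gradient descent converges under convexity."),
    ([0.8, 0.2, 0.1], "Transformers use multi-head attention."),
]
print(top_k([1.0, 0.0, 0.0], chunks, k=2))
```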
## Architecture
The system uses a LangGraph-based agentic workflow with four specialized agents:
```
User Query
     │
     ▼
┌─────────────────┐
│  Router Agent   │ ─── Classifies query type
└────────┬────────┘
         │
    ┌────┴────┬──────────┐
    ▼         ▼          ▼
┌───────┐ ┌───────┐ ┌─────────┐
│  RAG  │ │ Query │ │ Answer  │
│ Agent │ │ Agent │ │  Agent  │
└───┬───┘ └───┬───┘ └────┬────┘
    │         │          │
    └─────────┴──────────┘
              │
              ▼
       Final Response
```
The Router Agent classifies each incoming query into one of three routes:

| Route | Use Case |
|---|---|
| RAG | Content questions about paper topics, concepts, or findings |
| DATABASE | Metadata queries (paper counts, author lookups, etc.) |
| ANSWER | General knowledge questions unrelated to the paper collection |
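As a rough illustration of the decision the Router Agent makes, here is a keyword-based stand-in in plain Python. The real router is LLM-based, so these rules are a hypothetical simplification, not the project's actual classification logic:

```python
def route_query(query: str) -> str:
    """Toy stand-in for the LLM router: map a query to a route name."""
    q = query.lower()
    # Metadata-style questions go to the SQL tools.
    if any(kw in q for kw in ("how many", "count", "author", "list papers")):
        return "DATABASE"
    # Content questions about the paper collection go to RAG retrieval.
    if "paper" in q:
        return "RAG"
    # Everything else is answered directly by the LLM.
    return "ANSWER"

print(route_query("What papers discuss transformer architectures?"))  # RAG
print(route_query("How many papers are in the database?"))            # DATABASE
print(route_query("What is gradient descent?"))                       # ANSWER
```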
## Tech Stack
| Category | Technologies |
|---|---|
| Orchestration | LangGraph |
| LLM Framework | LangChain |
| LLM Provider | Anthropic Claude |
| RAG/Indexing | LlamaIndex |
| Embeddings | HuggingFace (BAAI/bge-small-en-v1.5) |
| Database | SQLite + SQLModel |
| PDF Processing | PyMuPDF |
| Testing | pytest + DeepEval |
## Installation
Prerequisites: Python 3.13+
```bash
# Clone the repository
git clone https://github.com/stephenhermes/quickmaths.git
cd quickmaths

# Install dependencies
pip install -e .

# Or with dev dependencies
pip install -e ".[dev]"
```
## Configuration
Create a `.env` file in the project root:
```bash
# Required
PDF_DIRECTORY='/path/to/your/pdf/papers'
ANTHROPIC_API_KEY='your-api-key'

# Optional
LLM_MODEL='claude-sonnet-4-20250514'
EMBED_MODEL='BAAI/bge-small-en-v1.5'
LOG_LEVEL='INFO'
```
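Assuming the application resolves these settings at startup, the optional values would typically fall back to the defaults shown above while required keys fail fast when missing. A minimal sketch with `os.getenv` (the `require` helper is a hypothetical illustration, not the project's actual config code):

```python
import os

# Optional settings fall back to the documented defaults.
LLM_MODEL = os.getenv("LLM_MODEL", "claude-sonnet-4-20250514")
EMBED_MODEL = os.getenv("EMBED_MODEL", "BAAI/bge-small-en-v1.5")
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")

def require(name: str) -> str:
    """Fail fast if a required environment variable is missing."""
    value = os.getenv(name)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

print(EMBED_MODEL, LOG_LEVEL)
```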
## Usage
### 1. Ingest Papers
First, ingest your PDF papers into the system:
```bash
quickmaths-ingest ${PDF_DIRECTORY}

# Options:
#   --rebuild-index     Rebuild the vector index from scratch
#   --rebuild-db        Rebuild the metadata database
#   --num-files-limit   Limit number of files to process
```
The ingestion pipeline:
- Converts PDFs to markdown text
- Extracts metadata (title, authors) using an LLM
- Chunks documents with sentence-aware splitting
- Embeds and indexes chunks for semantic search
- Stores metadata in SQLite for structured queries
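The sentence-aware splitting step above can be sketched in plain Python: pack whole sentences into chunks up to a size limit, never splitting a sentence mid-way. The real pipeline uses LlamaIndex's splitter; the regex and size limit below are illustrative assumptions:

```python
import re

def chunk_sentences(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole sentences into chunks no longer than max_chars.

    A sentence longer than max_chars becomes its own chunk, so no
    sentence is ever split mid-way (the "sentence-aware" property).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip() if current else sentence
    if current:
        chunks.append(current)
    return chunks

text = "Transformers rely on attention. Attention weighs token pairs. Training uses gradient descent."
print(chunk_sentences(text, max_chars=60))
```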
### 2. Interactive Chat
Start the interactive Q&A interface and ask questions in natural language. Example queries:
- "What papers discuss transformer architectures?" → RAG retrieval
- "How many papers are in the database?" → Database query
- "What is gradient descent?" → Direct LLM answer
## Testing
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=quickmaths

# Run specific test categories
pytest tests/unit
pytest tests/integration
```
The test suite includes DeepEval integration for LLM evaluation metrics.