Claude Parser v2.1.0
Git-like disaster recovery for Claude Code conversations
Claude Parser treats every Claude API call as a git commit, enabling powerful recovery and analysis capabilities when things go wrong.
What's New in v2.1.0
New Features
- Export Domain - Export conversations to different formats for indexing
- LlamaIndex Export - export_for_llamaindex() for semantic search
- Fixed Discovery - discover_claude_files() now properly returns file paths
- Complete API - All filtering functions now properly exported
v2.0.0 Major Changes
- Complete API Redesign - Clean, intuitive Python API with 30+ functions
- 15 Domain Architecture - Organized into focused, composable modules
- CG Commands - Full Git-like CLI for disaster recovery
- CH Hook System - Composable hooks for Claude Code integrations
- DuckDB Backend - Efficient JSONL querying without intermediate storage
- LNCA Compliant - Every file <80 LOC, 100% framework delegation
- Full Documentation - Complete API reference, user guides, and examples
Breaking Changes from v1
- New import structure: from claude_parser import load_session (not from claude_parser.main)
- Removed god objects - now uses focused domain modules
- All functions return plain dicts, not custom objects
- Pydantic schema normalization handles all JSONL variations
Quick Start
Installation
pip install claude-parser
Basic Usage
    from claude_parser import load_latest_session, analyze_session

    # Load your most recent Claude session
    session = load_latest_session()
    print(f"Found session with {len(session['messages'])} messages")

    # Analyze the session
    analysis = analyze_session(session)
    print(f"Total tokens: {analysis['total_tokens']}")
    print(f"Estimated cost: ${analysis['estimated_cost']:.2f}")
Disaster Recovery with CG Commands
    # Oh no! Claude deleted important files!
    python -m claude_parser.cli.cg find "important_file.py"

    # Found it! Now restore it
    python -m claude_parser.cli.cg checkout /path/to/important_file.py

    # See what happened
    python -m claude_parser.cli.cg reflog

    # Jump back to before the disaster
    python -m claude_parser.cli.cg reset abc123
Core Features
Git-like Commands (CG)
- cg status - Current session state
- cg log - Conversation history
- cg find <pattern> - Search across all sessions
- cg blame <file> - Who last modified a file
- cg checkout <file> - Restore deleted files
- cg reflog - Operation history
- cg show <uuid> - Operation details
- cg reset <uuid> - Time travel to any point
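Since each command runs as `python -m claude_parser.cli.cg <subcommand>`, a recovery can be scripted from Python by building that argument list. The following is a minimal sketch that assumes only the invocation form shown in Quick Start; `cg_argv` is a hypothetical helper and not part of the package.

```python
import sys


def cg_argv(subcommand, *args):
    """Build the argv list for a CG subcommand (hypothetical helper)."""
    return [sys.executable, "-m", "claude_parser.cli.cg", subcommand, *args]


# A typical recovery: search for the file, then restore it.
recovery_plan = [
    cg_argv("find", "important_file.py"),
    cg_argv("checkout", "/path/to/important_file.py"),
]
for argv in recovery_plan:
    # Show the command without the interpreter path.
    print(" ".join(argv[1:]))
```

Each list can be handed to `subprocess.run(argv, check=True)` to execute the command with the same interpreter that runs the script.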
Composable Hooks (CH)
    # Run hooks with custom executors
    python -m claude_parser.cli.ch run --executor my_executor

    # Or set default executor
    export CLAUDE_HOOK_EXECUTOR=my_executor
Python SDK
    from claude_parser import (
        # Session Management
        load_session, load_latest_session, discover_all_sessions,
        # Analytics
        analyze_session, analyze_tool_usage, analyze_token_usage,
        calculate_session_cost, calculate_context_window,
        # File Operations
        restore_file_content, generate_file_diff, compare_files,
        # Navigation
        find_message_by_uuid, get_message_sequence, get_timeline_summary,
        # Discovery
        discover_claude_files, discover_current_project_files,
        # Filtering (NEW in v2!)
        filter_messages_by_type, filter_messages_by_tool,
        search_messages_by_content, exclude_tool_operations,
        # Export (NEW in v2.1!)
        export_for_llamaindex,  # Export conversations for semantic search
    )
Real-World Use Cases
1. Disaster Recovery
    from claude_parser import load_latest_session, restore_file_content

    session = load_latest_session()
    content = restore_file_content("/deleted/important.py", session)
    with open("recovered.py", "w") as f:
        f.write(content)
2. Cost Analysis
    from claude_parser import load_latest_session, calculate_session_cost

    session = load_latest_session()
    cost = calculate_session_cost(
        input_tokens=100000,
        output_tokens=50000,
        model="claude-3-5-sonnet-20241022"
    )
    print(f"This session cost: ${cost['total_cost']:.2f}")
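The arithmetic behind a token-based cost estimate can be sketched without the library. This is an illustration under stated assumptions, not the package's implementation: `estimate_cost` is a hypothetical function, the per-million-token rates are placeholder values, and only the `total_cost` key is taken from the example above.

```python
def estimate_cost(input_tokens, output_tokens, usd_per_mtok_in, usd_per_mtok_out):
    """Estimate cost in USD from token counts and per-million-token rates."""
    input_cost = input_tokens / 1_000_000 * usd_per_mtok_in
    output_cost = output_tokens / 1_000_000 * usd_per_mtok_out
    return {
        "input_cost": input_cost,    # assumed key, for illustration
        "output_cost": output_cost,  # assumed key, for illustration
        "total_cost": input_cost + output_cost,
    }


# Placeholder rates; check current pricing for the model you use.
cost = estimate_cost(100_000, 50_000, usd_per_mtok_in=3.00, usd_per_mtok_out=15.00)
print(f"Estimated total: ${cost['total_cost']:.2f}")
```

Keeping the rates as parameters makes it easy to compare the same session across models.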
3. Message Filtering
    from claude_parser import load_latest_session, MessageType
    from claude_parser.filtering import filter_messages_by_type

    session = load_latest_session()
    user_messages = filter_messages_by_type(session['messages'], MessageType.USER)
    print(f"You sent {len(user_messages)} messages")
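Because messages are plain dicts, the same kind of filter can be approximated in a few lines when the library is not at hand. The sketch below assumes each message carries a `"type"` key with values such as `"user"` and `"assistant"`; that field name, the sample data, and `filter_by_type` are illustrative assumptions.

```python
def filter_by_type(messages, message_type):
    """Return the messages whose "type" field equals message_type."""
    return [m for m in messages if m.get("type") == message_type]


# Hand-written sample data; real sessions carry many more fields.
messages = [
    {"type": "user", "content": "Fix the failing test"},
    {"type": "assistant", "content": "Looking at the traceback now."},
    {"type": "user", "content": "Thanks"},
]

user_messages = filter_by_type(messages, "user")
print(f"You sent {len(user_messages)} messages")
```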
4. Real-time Monitoring
    from claude_parser.watch import watch

    def on_assistant(message):
        print(f"Claude says: {message['content'][:100]}...")

    watch("~/.claude/projects/current/session.jsonl", on_assistant=on_assistant)
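One way to picture what a watcher has to do is incremental reading of an append-only JSONL file. The sketch below is a stdlib stand-in rather than how `watch` is implemented: `read_new_messages` and `dispatch` are hypothetical helpers, and the `"type"` field is an assumption about the message dicts.

```python
import json


def read_new_messages(path, offset=0):
    """Parse complete JSONL lines appended to path since byte offset.

    Returns (messages, new_offset). A trailing line without a newline is
    treated as still being written and is left for the next call.
    """
    messages = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partially written line
            offset = f.tell()
            if line.strip():
                messages.append(json.loads(line))
    return messages, offset


def dispatch(messages, on_assistant):
    """Call on_assistant for every message typed as "assistant"."""
    for message in messages:
        if message.get("type") == "assistant":
            on_assistant(message)
```

A polling loop would call `read_new_messages` with the previously returned offset, pass the result to `dispatch`, and sleep briefly between iterations.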
5. Export for Semantic Search (NEW in v2.1!)
    from claude_parser import export_for_llamaindex

    # Export conversations to LlamaIndex format
    docs = export_for_llamaindex("session.jsonl")
    # Returns: [{"text": "message", "metadata": {...}}, ...]

    # Use with semantic search services
    for doc in docs:
        print(f"Text: {doc['text'][:50]}...")
        print(f"Speaker: {doc['metadata']['speaker']}")
Architecture
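Before handing exported documents to an indexer, it can help to sanity-check them. The sketch below works on hand-written data in the `{"text": ..., "metadata": {...}}` shape described above; the speaker values and the `documents_per_speaker` helper are assumptions made for illustration.

```python
from collections import Counter

# Sample data in the documented shape; the values are made up.
docs = [
    {"text": "How do I restore a deleted file?", "metadata": {"speaker": "user"}},
    {"text": "Use cg checkout with the file path.", "metadata": {"speaker": "assistant"}},
    {"text": "That worked, thanks.", "metadata": {"speaker": "user"}},
]


def documents_per_speaker(docs):
    """Count exported documents by their metadata speaker value."""
    return Counter(doc["metadata"]["speaker"] for doc in docs)


counts = documents_per_speaker(docs)
for speaker, count in counts.items():
    print(f"{speaker}: {count} documents")
```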
Clean Domain Organization (19 modules)
claude_parser/
├── analytics/    # Session and tool analysis
├── api/          # API utilities
├── cli/          # CG and CH commands
├── core/         # Core utilities
├── discovery/    # File and project discovery
├── export/       # Export formats (NEW in v2.1!)
├── extensions/   # Extension system
├── filtering/    # Message filtering
├── hooks/        # Hook system and API
├── loaders/      # Session loading
├── messages/     # Message utilities
├── models/       # Data models
├── navigation/   # Timeline and UUID navigation
├── operations/   # File operations
├── queries/      # DuckDB SQL queries
├── session/      # Session management
├── storage/      # DuckDB engine
├── tokens/       # Token counting and billing
└── watch/        # Real-time monitoring
LNCA Principles
- <80 LOC per file - Optimized for LLM comprehension
- 100% Framework Delegation - No custom loops or error handling
- Single Source of Truth - One file per feature
- Pydantic Everything - Schema normalization for all JSONL variations
Documentation
Full documentation available at: https://alicoding.github.io/claude-parser/
Migration from v1
Old v1 Code
    # v1 - Complex imports, god objects
    from claude_parser.main import ClaudeParser

    parser = ClaudeParser()
    parser.load_and_analyze_everything()  # God object doing too much
New v2 Code
    # v2 - Clean, focused functions
    from claude_parser import load_latest_session, analyze_session

    session = load_latest_session()
    analysis = analyze_session(session)  # One function, one purpose
Key Differences
- No more god objects - Each function does one thing well
- Plain dicts everywhere - No custom classes to learn
- Explicit imports - Import only what you need
- Better error handling - Framework delegation (Pydantic/Typer)
- More features - Filtering, watching, complete hook system
Deployment
PyPI Release
    # Build and upload to PyPI
    python -m build
    twine upload dist/*
Documentation
Documentation auto-deploys to GitHub Pages on every push to main.
Export Format Roadmap
Currently Available (v2.1)
- LlamaIndex - export_for_llamaindex() - For semantic search indexing
Planned Export Formats
- Mem0 - Long-term memory for AI agents
- ChromaDB - Vector database format
- Pinecone - Cloud vector database
- Markdown - Human-readable conversation logs
- JSON-LD - Structured data with context
- OpenAI Messages - Direct OpenAI API format
- Anthropic Messages - Direct Anthropic API format
- LangChain Documents - LangChain document format
- Haystack Documents - Haystack NLP framework
Export Domain Architecture
claude_parser/export/
├── __init__.py      # Export registry
├── llamaindex.py    # LlamaIndex format (DONE)
├── mem0.py          # Mem0 format (TODO)
├── chroma.py        # ChromaDB format (TODO)
├── markdown.py      # Markdown format (TODO)
└── ...              # More formats
Roadmap
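An export registry of this kind is commonly a mapping from format name to exporter function. The sketch below shows one possible shape under that assumption; `EXPORTERS`, `register_exporter`, `export_as_markdown`, and `export` are hypothetical names, and the package's actual `__init__.py` may be organized differently.

```python
EXPORTERS = {}


def register_exporter(name):
    """Decorator that records an exporter function under a format name."""
    def decorator(func):
        EXPORTERS[name] = func
        return func
    return decorator


@register_exporter("markdown")
def export_as_markdown(messages):
    # Hypothetical exporter: one bold role label per message.
    return "\n\n".join(f"**{m['type']}**: {m['content']}" for m in messages)


def export(messages, fmt):
    """Look up the exporter for fmt and apply it to messages."""
    try:
        exporter = EXPORTERS[fmt]
    except KeyError:
        raise ValueError(f"Unknown export format: {fmt!r}") from None
    return exporter(messages)


print(export([{"type": "user", "content": "hi"}], "markdown"))
```

With this layout, adding a format means adding one module that registers one function, which fits the one-file-per-feature principle stated earlier.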
v3.0.0 - UI-Ready API (Coming Soon)
Complete redesign for zero-boilerplate UI development.
Core/Feature Layer Separation
- Audit complete - identified all internal vs public functions
- Framework delegation mapped (humanize, babel, arrow, rich)
- claude_parser.core - Low-level utilities for advanced users
- claude_parser - High-level UI-ready functions
Display-Ready Functions
    # Coming in v3.0.0
    from claude_parser import (
        get_session_summary,       # "436 messages, 3 hours, $12.45"
        get_formatted_messages,    # Markdown-formatted conversation
        get_token_breakdown,       # "45,678 tokens ($12.45)"
        get_file_changes_display,  # Formatted diff with colors
        export_as_html,            # Complete HTML report
    )

    # One-liner, zero parsing needed:
    print(get_session_summary(session))  # That's it!
Planned Features
- Session Display: Pre-formatted messages with timestamps, roles, emojis
- Analytics Dashboard: Human-readable metrics (not raw numbers)
- File Operations: Formatted diffs, file lists with status icons
- Export Formats: HTML, Markdown, JSON, PDF - all display-ready
- Smart Defaults: "No messages found" instead of empty arrays
- Number Formatting: "$12.45" not 0.01245, "45,678" not 45678
- Time Formatting: "2:34 PM" not timestamps, "3 hours ago" not seconds
Framework Delegation
All formatting delegated to specialized libraries:
- humanize - Number and size formatting
- babel - Currency formatting
- arrow - Time and date formatting
- rich - Terminal colors and tables
- emoji - Status indicators (✅ ❌ ⚠️)
- tabulate - Markdown/HTML tables
- jinja2 - HTML report generation
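The target strings from the roadmap can be approximated today with Python's built-in format specifiers. This sketch is a stdlib stand-in, not the planned humanize/babel delegation; the three function names are hypothetical and unrelated to the v3 API.

```python
def format_count(n):
    """Format an integer with thousands separators, e.g. for token counts."""
    return f"{n:,}"


def format_usd(amount):
    """Format a dollar amount with two decimal places."""
    return f"${amount:,.2f}"


def format_token_breakdown(tokens, cost):
    """Combine a token count and its cost into one display string."""
    return f"{format_count(tokens)} tokens ({format_usd(cost)})"


print(format_token_breakdown(45678, 12.45))
```

Unlike babel, these specifiers are not locale-aware, so they suit only a fixed US-style display.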
v2.2.0 - Bug Fixes (Next Release)
- Fixed token counting to match UI (v2.1.1)
- Fixed None message field handling
- Additional message extraction improvements
Contributing
We welcome contributions! Please ensure:
- Files stay under 80 lines of code
- Use framework delegation (no custom loops)
- Add tests for new features
- Update documentation
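The 80-line limit can be checked before opening a pull request. The sketch below assumes that "lines of code" means lines that are neither blank nor comment-only, so docstring lines count; that definition and both helper names are assumptions, and the project may measure LOC differently.

```python
def count_loc(source):
    """Count lines that are neither blank nor comment-only."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def files_over_limit(sources, limit=80):
    """Return {path: loc} for every source with limit or more lines of code."""
    counts = {path: count_loc(text) for path, text in sources.items()}
    return {path: loc for path, loc in counts.items() if loc >= limit}


sample = "import os\n\n# a comment\nprint(os.name)\n"
print(count_loc(sample))
```

To scan a checkout, build `sources` with `{str(p): p.read_text() for p in pathlib.Path("claude_parser").rglob("*.py")}` and inspect the returned mapping.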
License
MIT License - See LICENSE file for details.
Acknowledgments
- Built for the Claude Code community
- Inspired by git's powerful recovery capabilities
- Designed with LNCA principles for LLM-native development
Stats
- 19 specialized domains
- 35+ public functions
- <80 lines per file
- 100% framework delegation
- 0 custom error handling
Ready to never lose code again? Install v2.1.0 and experience the power of Git-like recovery for Claude Code!
pip install claude-parser==2.1.0
Documentation | GitHub | PyPI