Travel Itinerary Analyzer - Complete Documentation
An AI-powered platform for comparing and analyzing travel itineraries, with Hybrid RAG (Retrieval-Augmented Generation) capabilities.
Features Overview
Core Features
| Feature | Description |
|---|---|
| Multi-format Upload | Supports PDF, Word (DOCX), and images |
| AI Analysis | Extracts structured data from itineraries |
| Comparison | Side-by-side comparison of multiple itineraries |
| Smart Insights | RAG-enhanced strategic recommendations |
| Q&A Chat | Ask questions about your documents |
| Bilingual Support | English (Roboto Slab) and Thai (Sarabun) fonts |
Advanced RAG Features
| Feature | Description |
|---|---|
| Knowledge Base | Store and index documents for retrieval |
| Hybrid Search | Combines vector similarity with graph traversal |
| Thai RAG | Optimized embedding for Thai documents |
| Multimodal | Process images with GPT-4 Vision |
Architecture
    ┌───────────────────────────────────────────────────────────┐
    │                  Frontend (React + Vite)                  │
    ├───────────────────────────────────────────────────────────┤
    │  ┌───────────┐   ┌───────────┐   ┌───────────────────┐    │
    │  │  Upload   │   │ Analysis  │   │  Knowledge Base   │    │
    │  │ Component │   │  Output   │   │      Manager      │    │
    │  └───────────┘   └───────────┘   └───────────────────┘    │
    ├───────────────────────────────────────────────────────────┤
    │                      Services Layer                       │
    │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────────┐   │
    │  │   AI    │  │  File   │  │ Arango  │  │  Thai RAG   │   │
    │  │ Service │  │ Parser  │  │ Service │  │   Service   │   │
    │  └─────────┘  └─────────┘  └─────────┘  └─────────────┘   │
    ├───────────────────────────────────────────────────────────┤
    │                     External Services                     │
    │  ┌────────────┐    ┌────────────┐    ┌────────────────┐   │
    │  │   OpenAI   │    │  ArangoDB  │    │    ChromaDB    │   │
    │  │   GPT-4o   │    │  (Hybrid)  │    │    (Backup)    │   │
    │  └────────────┘    └────────────┘    └────────────────┘   │
    └───────────────────────────────────────────────────────────┘
Project Structure
    itin-analyzer/
    ├── components/                  # React UI components
    │   ├── AnalysisOutput.tsx       # Main output container
    │   ├── InsightsView.tsx         # RAG-enhanced insights display
    │   ├── ComparisonView.tsx       # Side-by-side comparison
    │   ├── QnaView.tsx              # Q&A chat interface
    │   ├── KnowledgeBase.tsx        # KB management UI
    │   ├── RagProgressOverlay.tsx   # Loading overlay
    │   └── icons/                   # SVG icon components
    │
    ├── services/                    # Business logic services
    │   ├── aiService.ts             # OpenAI API integration
    │   ├── arangoService.ts         # ArangoDB Hybrid RAG
    │   ├── thaiRagService.ts        # Thai language processing
    │   ├── fileParser.ts            # PDF/DOCX/Image parsing
    │   ├── multimodalRagService.ts  # GPT-4 Vision
    │   ├── dbService.ts             # IndexedDB local storage
    │   └── exportService.ts         # PDF/Excel export
    │
    ├── utils/                       # Utility functions
    │   └── markdownRenderer.tsx     # Markdown to React
    │
    ├── App.tsx                      # Main application component
    ├── types.ts                     # TypeScript type definitions
    ├── index.html                   # HTML entry point
    ├── index.css                    # Global styles
    ├── vite.config.ts               # Vite configuration
    ├── docker-compose.yml           # Docker services
    └── .env                         # Environment variables
How It Works
1. Document Upload & Parsing
    graph LR
    A[Upload File] --> B{File Type?}
    B -->|PDF| C[pdfjs-dist]
    B -->|DOCX| D[mammoth.js]
    B -->|Image| E[GPT-4 Vision]
    C --> F[Extracted Text]
    D --> F
    E --> F
    F --> G[AI Analysis]
Supported formats:
- PDF: Uses `pdfjs-dist` for text extraction
- Word: Uses `mammoth` for DOCX parsing
- Images: Uses GPT-4 Vision for OCR
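The routing by file type can be sketched as a small dispatch function. This is a minimal illustration, not the project's actual `fileParser.ts`: the `ParserKind` type, the `pickParser` name, and the list of image extensions are assumptions made for the example.

```typescript
// Hypothetical sketch: choose a parsing strategy from the file name.
// The real fileParser.ts may inspect MIME types or file contents instead.
type ParserKind = "pdf" | "docx" | "image";

// Assumed set of image extensions; the supported list is not documented.
const IMAGE_EXTENSIONS: string[] = ["png", "jpg", "jpeg", "webp"];

function pickParser(fileName: string): ParserKind {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (ext === "pdf") return "pdf"; // text extraction
  if (ext === "docx") return "docx"; // DOCX parsing with mammoth
  if (IMAGE_EXTENSIONS.indexOf(ext) >= 0) return "image"; // OCR via GPT-4 Vision
  throw new Error(`Unsupported file type: ${fileName}`);
}
```

Rejecting unknown extensions up front keeps unsupported files from reaching the analysis step with empty text.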
2. AI Analysis Flow
When you click "Analyze & Compare":
- Text Extraction → Parse uploaded files
- Structured Analysis → GPT-4o extracts:
  - Tour name, duration, destinations
  - Pricing breakdown
  - Inclusions/exclusions
  - Day-by-day itinerary
- Comparison → Generate side-by-side table
- Geocoding → Map destinations
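To make the structured-analysis step concrete, here is one way the extracted fields could be typed and a model reply parsed defensively. The field names and the `parseItineraryReply` helper are assumptions for illustration; the project's `types.ts` defines the real `ItineraryData`, which may differ.

```typescript
// Hypothetical shape for the structured result (not the project's ItineraryData).
interface ItinerarySketch {
  tourName: string;
  durationDays: number;
  destinations: string[];
  pricing: { label: string; amount: number; currency: string }[];
  inclusions: string[];
  exclusions: string[];
  dailyPlan: { day: number; summary: string }[];
}

// Keep only string entries; anything that is not an array becomes [].
function stringList(value: unknown): string[] {
  return Array.isArray(value)
    ? value.filter((x): x is string => typeof x === "string")
    : [];
}

// Parse the model's JSON reply. Top-level fields that are missing or
// mistyped fall back to empty defaults; nested entries are not validated here.
function parseItineraryReply(reply: string): ItinerarySketch {
  const parsed: unknown = JSON.parse(reply);
  const raw = (typeof parsed === "object" && parsed !== null ? parsed : {}) as {
    [key: string]: unknown;
  };
  const tourName = raw["tourName"];
  const durationDays = raw["durationDays"];
  const pricing = raw["pricing"];
  const dailyPlan = raw["dailyPlan"];
  return {
    tourName: typeof tourName === "string" ? tourName : "",
    durationDays: typeof durationDays === "number" ? durationDays : 0,
    destinations: stringList(raw["destinations"]),
    pricing: Array.isArray(pricing) ? (pricing as ItinerarySketch["pricing"]) : [],
    inclusions: stringList(raw["inclusions"]),
    exclusions: stringList(raw["exclusions"]),
    dailyPlan: Array.isArray(dailyPlan) ? (dailyPlan as ItinerarySketch["dailyPlan"]) : [],
  };
}
```

Treating the reply as untrusted text means a partially malformed answer still yields a usable record for the comparison table.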
3. Knowledge Base & RAG
    ┌────────────────────────────────────────────────────────┐
    │                      RAG Pipeline                      │
    ├────────────────────────────────────────────────────────┤
    │                                                        │
    │  1. INDEXING (Upload to KB)                            │
    │  ┌─────────┐     ┌─────────┐     ┌─────────┐           │
    │  │  Chunk  │  →  │  Embed  │  →  │  Store  │           │
    │  │  Text   │     │ Vectors │     │ ArangoDB│           │
    │  └─────────┘     └─────────┘     └─────────┘           │
    │       ↓                                                │
    │  ┌─────────┐     ┌─────────┐                           │
    │  │ Extract │  →  │  Build  │                           │
    │  │ Entities│     │  Graph  │                           │
    │  └─────────┘     └─────────┘                           │
    │                                                        │
    │  2. RETRIEVAL (Get Insights / Q&A)                     │
    │  ┌─────────┐     ┌─────────────────────┐               │
    │  │  Query  │  →  │    Hybrid Search    │               │
    │  │  Embed  │     │  (Vector + Graph)   │               │
    │  └─────────┘     └─────────────────────┘               │
    │                             ↓                          │
    │  ┌─────────────────────────────────────┐               │
    │  │  LLM Generate Answer with Context   │               │
    │  └─────────────────────────────────────┘               │
    │                                                        │
    └────────────────────────────────────────────────────────┘
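The chunking and vector half of this pipeline can be sketched in memory. This is an illustration under stated assumptions, not the ArangoDB-backed implementation: the chunk size, overlap, `StoredChunk` shape, and function names are hypothetical, embeddings are passed in as plain number arrays, and entity extraction and graph traversal are left out.

```typescript
// Split text into overlapping chunks (sizes in characters are illustrative).
function chunkText(text: string, size = 500, overlap = 50): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("size must be positive and overlap smaller than size");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Hypothetical stored record: one chunk with its embedding.
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Rank stored chunks against a query embedding and keep the best topK.
function vectorSearch(query: number[], store: StoredChunk[], topK = 3) {
  return store
    .map((c) => ({ id: c.id, text: c.text, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.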
4. Language Detection
The app automatically detects Thai text and applies appropriate processing:
| Detection | Embedding Model | Font |
|---|---|---|
| English | OpenAI text-embedding-3-small | Roboto Slab |
| Thai | OpenThaiGPT or Thai-preprocessed | Sarabun |
| Mixed | Smart hybrid embedding | Both |
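One simple way to detect Thai text is to count characters in the Thai Unicode block (U+0E00 to U+0E7F). The sketch below is an assumption about how such detection could work, not the app's actual logic; the `detectLanguage` name and the 0.9 / 0.1 thresholds are illustrative.

```typescript
type DetectedLanguage = "english" | "thai" | "mixed";

// Thai script occupies the Unicode block U+0E00 to U+0E7F.
function isThaiChar(code: number): boolean {
  return code >= 0x0e00 && code <= 0x0e7f;
}

// Classify by the share of Thai characters among Thai and basic Latin letters.
// The thresholds are illustrative assumptions.
function detectLanguage(text: string): DetectedLanguage {
  let thai = 0;
  let latin = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (isThaiChar(code)) {
      thai++;
    } else if ((code >= 65 && code <= 90) || (code >= 97 && code <= 122)) {
      latin++;
    }
  }
  const total = thai + latin;
  if (total === 0) return "english"; // no letters: fall back to the default
  const ratio = thai / total;
  if (ratio >= 0.9) return "thai";
  if (ratio <= 0.1) return "english";
  return "mixed";
}
```

The result could then select the embedding model and font listed in the table above.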
Configuration
Environment Variables (.env)
    # Required: OpenAI API Key
    OPENAI_API_KEY=sk-proj-your_key_here

    # ArangoDB (Hybrid RAG)
    ARANGO_URL=http://localhost:8529
    ARANGO_USER=root
    ARANGO_PASSWORD=password123
    ARANGO_DATABASE=itinerary_kb

    # ChromaDB (Backup vector store)
    CHROMA_URL=http://localhost:8000

    # Optional: Thai RAG
    OPENTHAI_ENABLED=false
    OPENTHAI_API_URL=http://localhost:5000
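A typed loader for these variables might look like the sketch below. It is an assumption, not the project's code: `AppConfig`, `EnvSource`, and `loadConfig` are hypothetical names, and how the values reach the app (for example through Vite or the container environment) is not specified here. Defaults mirror the sample values above, except the password, which is left empty rather than hard-coded.

```typescript
interface AppConfig {
  openAiKey: string;
  arangoUrl: string;
  arangoUser: string;
  arangoPassword: string;
  arangoDatabase: string;
  chromaUrl: string;
  openThaiEnabled: boolean;
  openThaiApiUrl: string;
}

// Any key/value source works: a parsed .env file, a test fixture, and so on.
type EnvSource = { [key: string]: string | undefined };

function loadConfig(env: EnvSource): AppConfig {
  const key = env["OPENAI_API_KEY"];
  if (!key) throw new Error("OPENAI_API_KEY is required");
  return {
    openAiKey: key,
    arangoUrl: env["ARANGO_URL"] || "http://localhost:8529",
    arangoUser: env["ARANGO_USER"] || "root",
    arangoPassword: env["ARANGO_PASSWORD"] || "", // no default secret
    arangoDatabase: env["ARANGO_DATABASE"] || "itinerary_kb",
    chromaUrl: env["CHROMA_URL"] || "http://localhost:8000",
    openThaiEnabled: env["OPENTHAI_ENABLED"] === "true", // off unless exactly "true"
    openThaiApiUrl: env["OPENTHAI_API_URL"] || "http://localhost:5000",
  };
}
```

Failing fast on a missing API key surfaces the problem at startup instead of at the first analysis request.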
Docker Services
    services:
      dev:       # Vite dev server (port 3000)
      arangodb:  # Hybrid RAG database (port 8529)
      chromadb:  # Vector backup (port 8000)
Getting Started
Prerequisites
- Node.js 18+
- Docker & Docker Compose
- OpenAI API Key
Quick Start
    # 1. Clone the repository
    git clone https://github.com/SSaksit23/package-tour-comparison.git
    cd package-tour-comparison

    # 2. Copy environment file
    cp env.example .env
    # Edit .env and add your OPENAI_API_KEY

    # 3. Start with Docker
    docker-compose up -d

    # 4. Open browser
    # http://localhost:3000
Development Mode
    # Install dependencies
    npm install

    # Start dev server only
    npm run dev

    # Or start with databases
    docker-compose up -d arangodb chromadb
    npm run dev
Usage Guide
Basic Workflow
- Upload Itineraries
  - Drag & drop PDF/DOCX/images into upload zones
  - Add competitor names for each itinerary
- Analyze
  - Click "Analyze & Compare"
  - View structured data, comparison table, insights
- Build Knowledge Base (for RAG)
  - Click "Knowledge Base" button
  - Upload reference documents
  - Wait for indexing (check console logs)
- Get Enhanced Insights
  - With KB populated, click "Get Insights"
  - System searches KB for relevant context
  - Generates RAG-enhanced recommendations
- Q&A
  - Ask questions about your documents
  - System uses hybrid search for answers
Tips for Best Results
| Tip | Description |
|---|---|
| Populate KB First | Upload similar itineraries to KB before analysis |
| Thai Documents | System auto-detects Thai and uses optimized processing |
| File Size | Keep files under 10 MB for best performance |
| Q&A Context | More KB documents = better Q&A answers |
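The file-size tip can be enforced before upload with a small guard. This is a hedged sketch: `UploadCandidate` and `splitBySize` are hypothetical names, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```typescript
// Assumed limit: 10 MB interpreted as 10 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Minimal stand-in for a browser file object: a name and a size in bytes.
interface UploadCandidate {
  name: string;
  size: number;
}

// Separate files that are under the limit from those that should be flagged.
function splitBySize(files: UploadCandidate[]) {
  const accepted: UploadCandidate[] = [];
  const oversized: UploadCandidate[] = [];
  for (const f of files) {
    if (f.size < MAX_UPLOAD_BYTES) accepted.push(f);
    else oversized.push(f);
  }
  return { accepted, oversized };
}
```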
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| "No KB docs used" | Upload documents to Knowledge Base first |
| 409 Conflict errors | Normal - collections already exist |
| Embedding stuck | Check OpenAI API key and rate limits |
| Thai text garbled | Ensure Sarabun font is loaded |
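When embedding stalls because of rate limits, retrying with exponential backoff is a common remedy. The sketch below only computes the retry schedule and decides which HTTP statuses are worth retrying; the function names and the default timings are assumptions, and it does not describe how the app itself handles retries.

```typescript
// Delay before each retry: baseMs * 2^attempt, capped at maxMs.
function backoffDelays(retries: number, baseMs = 500, maxMs = 8000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(Math.min(maxMs, baseMs * Math.pow(2, attempt)));
  }
  return delays;
}

// 429 signals rate limiting and 5xx responses are usually transient.
// Other 4xx codes (such as 401 for a bad API key) will not succeed on retry.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}
```

A caller would wait for each delay in turn and stop as soon as a request succeeds or a non-retryable status appears.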
Checking Logs
    # View Docker logs
    docker-compose logs -f dev
    docker-compose logs -f arangodb

    # Browser console shows:
    # ArangoDB Hybrid RAG initialized
    # Knowledge Base: X chunks total
    # RAG Context: Found X relevant documents
API Reference
AI Service Functions
    // Analyze itinerary text
    analyzeItinerary(text: string, language: string): Promise<ItineraryData>

    // Generate comparison table
    getComparison(competitors: Competitor[], language: string): Promise<string>

    // Get recommendations (with optional RAG context)
    getRecommendations(
      competitors: Competitor[],
      history: AnalysisRecord[],
      language: string,
      ragContext?: string
    ): Promise<string>

    // Q&A with RAG
    generateAnswer(
      contexts: {name: string, text: string}[],
      question: string,
      chatHistory: ChatMessage[],
      language: string,
      ragContext?: string
    ): Promise<string>
ArangoDB Service Functions
    // Index document in knowledge base
    indexDocumentInArango(doc: Document): Promise<{chunks: number, entities: number}>

    // Hybrid search (vector + graph)
    hybridSearch(query: string, topK?: number): Promise<HybridSearchResult[]>

    // RAG query with chat
    arangoHybridQuery(
      question: string,
      chatHistory: ChatMessage[],
      language: string
    ): Promise<HybridRAGResponse>
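To illustrate what "vector + graph" merging could mean and how ranked results might become a `ragContext` string, here is a self-contained sketch. It does not reproduce `hybridSearch` or the real `HybridSearchResult` type: `ScoredHit`, `mergeHybrid`, `buildRagContext`, and the 0.7 / 0.3 weighting are all assumptions for the example.

```typescript
// Hypothetical hit shape; the real HybridSearchResult may differ.
interface ScoredHit {
  chunkId: string;
  text: string;
  score: number;
}

// Merge vector hits with graph hits. Each hit contributes its score times
// the weight of its source, so a chunk found by both sources gets the sum.
function mergeHybrid(
  vectorHits: ScoredHit[],
  graphHits: ScoredHit[],
  topK = 5,
  vectorWeight = 0.7 // illustrative weighting
): ScoredHit[] {
  const merged: { [id: string]: ScoredHit } = {};
  const add = (hits: ScoredHit[], weight: number): void => {
    for (const h of hits) {
      const prev = merged[h.chunkId];
      const score = (prev ? prev.score : 0) + weight * h.score;
      merged[h.chunkId] = { chunkId: h.chunkId, text: h.text, score };
    }
  };
  add(vectorHits, vectorWeight);
  add(graphHits, 1 - vectorWeight);
  return Object.keys(merged)
    .map((id) => merged[id])
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Format ranked hits as one numbered string, e.g. for a ragContext argument.
function buildRagContext(hits: ScoredHit[]): string {
  return hits.map((h, i) => `[${i + 1}] ${h.text}`).join("\n");
}
```

Numbering the snippets lets the generated answer point back to the passages it relied on.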
Technologies Used
| Category | Technology |
|---|---|
| Frontend | React 18, TypeScript, Tailwind CSS |
| Build | Vite |
| AI | OpenAI GPT-4o, GPT-4 Vision |
| Vector DB | ArangoDB (primary), ChromaDB (backup) |
| Graph | ArangoDB Graph |
| PDF | pdfjs-dist |
| Word | mammoth.js |
| Maps | Leaflet |
| Fonts | Roboto Slab, Sarabun |
| Container | Docker, Docker Compose |
License
MIT License - See LICENSE file for details.
Author
Created by Saksit Saelow
Last updated: December 2024