FluxLoop OSS
Agentic Testing for AI Agents
"Hey, test my agent for order cancellation with angry customers"
β FluxLoop handles the rest: setup, CLI install, synthesis, execution, and analysis.
π€ Agent-First Workflow
Your coding agent (Claude Code) orchestrates the entire testing flow. Just describe what you want to testβFluxLoop does the heavy lifting.
π― Simulate at Scale
Run thousands of realistic multi-turn scenarios in parallel. Find edge cases before production.
π Align to Your Standards
Capture your implicit decision criteria. Turn intuition into automated evaluation.
Getting Started
β Claude Code Plugin (Recommended)
Install the plugin, then just talk.
/plugin install Fluxloop-AI/fluxloop-claude-plugin
That's it. Now say:
"test my agent for refund scenarios"
The Agent Test Skill handles everything:
- β Installs FluxLoop CLI (if needed)
- β Logs you in
- β Creates project/scenario
- β Synthesizes test inputs
- β Runs simulations
- β Analyzes results and suggests fixes
No commands to memorize. No manual setup. Just ask.
Example Conversation
User: "Test my chatbot for refund scenarios with frustrated customers"
Agent: Let me set up FluxLoop and run tests...
β FluxLoop CLI installed
β Logged in
β Project created
β 10 test inputs synthesized (40% hard cases)
β Running simulation...
π Results: 8/10 passed (80%)
β οΈ Failed on edge case: customer requesting partial refund
π‘ Suggested fix: Add handling for partial refund requests
Would you like me to analyze the failures in detail?
π Documentation: docs.fluxloop.ai/claude-code
π¦ Packages
1. Claude Code Plugin β
The primary way to use FluxLoop. Your coding agent orchestrates the entire testing workflow through natural conversation.
| Feature | Description |
|---|---|
| Agent Test Skill | Auto-activates on "test my agent", handles everything |
| Zero Config | Skill installs CLI, logs in, creates projects automatically |
| Context-Aware | Knows your setup state, guides you through missing steps |
π Location: packages/fluxloop-plugin/
π Docs: docs.fluxloop.ai/claude-code
2. CLI
For power users and CI/CD pipelines. Direct command-line control when you need it.
pip install fluxloop-cli
fluxloop test --scenario my-testπ Docs: docs.fluxloop.ai/cli
π¦ PyPI: fluxloop-cli
3. SDK (Python 3.11+)
Core instrumentation library. Add @fluxloop.agent() decorator to trace agent execution.
import fluxloop @fluxloop.agent() def my_agent(input: str) -> str: # Your agent logic return response
π Docs: docs.fluxloop.ai/sdk
π¦ PyPI: fluxloop
Key Features
π€ Agentic Testing with Claude Code
Just talk naturally:
"Test my order-bot for cancellation scenarios"
"Generate edge cases for payment failures"
"Why did the last test fail?"
The skill understands context and adapts to your state.
π― Simple Instrumentation
Works with any Python agent framework:
@fluxloop.agent() def my_agent(input: str) -> str: # LangChain, LlamaIndex, customβanything works return response
π Evaluation-First Testing
Define criteria, run reproducible experiments, get actionable insights.
π§ͺ Offline-First Simulation
Run experiments locally with full control. No cloud dependency for testing.
βοΈ Seamless Web Integration
FluxLoop combines local execution with cloud intelligence for a powerful testing workflow.
1. Cloud-Powered Synthesis
When you say "generate edge cases", FluxLoop Web synthesizes realistic, diverse test data using advanced LLMs. This data is instantly synced to your local environment for testing.
2. Deep Evaluation & Analysis
Test results are automatically uploaded to alpha.app.fluxloop.ai for deep inspection:
- π΅οΈ Trace Analysis: Step-by-step debugging of agent conversations
- π Performance Metrics: Success rates, latency, token usage trends
- βοΈ Comparison: Side-by-side view of how recent changes affected behavior
3. The Perfect Loop
- You: "Test my agent" (Claude Code)
- Web: Generates test scenarios (Cloud)
- CLI: Runs tests locally (Local)
- Web: Analyzes results (Cloud)
- You: Review summary in IDE & detailed report on Web
What You Can Do
| Capability | How |
|---|---|
| π€ Conversational Testing | "test my agent with angry customers" |
| π― Instrument Agents | @fluxloop.agent() decorator |
| π Synthesize Inputs | Skill generates realistic test data |
| π§ͺ Run Simulations | Batch experiments with parallel execution |
| π¬ Multi-Turn Conversations | Auto-extend into dialogues |
| π Analyze Results | Get insights and fix suggestions |
Links
| Resource | URL |
|---|---|
| FluxLoop Web | alpha.app.fluxloop.ai |
| Documentation | docs.fluxloop.ai |
| Claude Code Plugin | docs.fluxloop.ai/claude-code |
| CLI Docs | docs.fluxloop.ai/cli |
| SDK Docs | docs.fluxloop.ai/sdk |
π€ Why Contribute?
We're building the future of AI agent testingβwhere your coding agent tests your AI agents.
- Improve agentic workflows: Make the Claude Code skill smarter
- Build framework adapters: LangChain, LlamaIndex, CrewAI
- Enhance synthesis: Better intent-to-input generation
- Develop evaluation methods: Novel agent performance metrics
Check out our contribution guide and open issues.
π¨ Community & Support
- Issues: GitHub Issues
- Docs: docs.fluxloop.ai
π License
FluxLoop is licensed under the Apache License 2.0.
