🕸️ Decentralized Dark Web OSINT via Signal Diffusion 🕸️
Quick Start • Architecture • Specialists • Enrichment • Reports
A Rust reimagining of Robin that replaces central LLM orchestration with SMESH's plant-inspired signal diffusion protocol.
The Difference
| Aspect | Python Robin | Robin×SMESH |
|---|---|---|
| Orchestration | Sequential pipeline | Emergent via signals |
| Search | ThreadPool, 16 engines | N crawler agents, scales with agent count |
| Filtering | Single LLM call | Multiple filter agents + consensus |
| Fault tolerance | Breaks on timeout | Signals decay, others pick up |
| Performance | ~seconds per stage | ~μs signal ops + async I/O |
Architecture
┌──────────────────────────────────────────────────────────────────────────────┐
│                             SHARED SIGNAL FIELD                              │
│ Signals decay over time · Reinforcement = consensus · No central controller  │
└──────────────────────────────────────────────────────────────────────────────┘
         ▲                   ▲                   ▲                   ▲
    ┌────┴────┐         ┌────┴────┐         ┌────┴────┐         ┌────┴────┐
    │ REFINER │         │ CRAWLER │         │ FILTER  │         │ ANALYST │
    │  Agent  │         │  Swarm  │         │  Agent  │         │  Agent  │
    └─────────┘         └─────────┘         └─────────┘         └─────────┘
Signal Flow
- UserQuery → Refiner senses, emits RefinedQuery
- RefinedQuery → Crawlers sense, emit RawResult (per .onion link)
- RawResult → Filter senses batch, emits FilteredResult (top 20)
- FilteredResult → Scrapers sense, emit ScrapedContent
- ScrapedContent → Extractor senses, emits ExtractedArtifacts (IOCs)
- ExtractedArtifacts → Enricher senses, queries surface web, emits EnrichedArtifacts
- ScrapedContent + Artifacts → Analyst senses, emits Summary
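The chain above can be read as a small reaction table: each agent role watches for one kind of signal and emits the next. The following is a minimal sketch under stated assumptions, not the crate's actual types: `SignalKind`, `reaction`, and `main_path` are hypothetical names, and the Analyst's two-input condition is simplified to a single trigger.

```rust
/// Signal kinds named in the flow above. The enum itself is a hypothetical
/// illustration, not the crate's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SignalKind {
    UserQuery,
    RefinedQuery,
    RawResult,
    FilteredResult,
    ScrapedContent,
    ExtractedArtifacts,
    EnrichedArtifacts,
    Summary,
}

/// For a sensed signal, return the agent roles that react and what they emit.
/// The Analyst is listed under `ExtractedArtifacts` only; in the flow above it
/// also needs `ScrapedContent` to be present in the field.
pub fn reaction(sensed: SignalKind) -> Vec<(&'static str, SignalKind)> {
    use SignalKind::*;
    match sensed {
        UserQuery => vec![("Refiner", RefinedQuery)],
        RefinedQuery => vec![("Crawler", RawResult)],
        RawResult => vec![("Filter", FilteredResult)],
        FilteredResult => vec![("Scraper", ScrapedContent)],
        ScrapedContent => vec![("Extractor", ExtractedArtifacts)],
        ExtractedArtifacts => vec![("Enricher", EnrichedArtifacts), ("Analyst", Summary)],
        EnrichedArtifacts | Summary => vec![],
    }
}

/// Follow the first reaction at each step, collecting the kinds visited.
pub fn main_path(start: SignalKind) -> Vec<SignalKind> {
    let mut path = vec![start];
    let mut current = start;
    while let Some(&(_, next)) = reaction(current).first() {
        path.push(next);
        current = next;
    }
    path
}
```

Calling `main_path(SignalKind::UserQuery)` walks the chain from the user's query to the enriched artifacts. No function here calls an agent directly, which is the point of the design: agents are coupled only through the kinds of signal they sense and emit.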
Quick Start
    # Build
    cargo build --release

    # Check Tor connection
    ./target/release/robin-smesh status

    # Run investigation (Anthropic is default)
    ANTHROPIC_API_KEY=sk-ant-... ./target/release/robin-smesh query \
      -q "ransomware payments" \
      --timeout 300

    # Multi-specialist mode (6 expert analysts + lead synthesis)
    ANTHROPIC_API_KEY=sk-ant-... ./target/release/robin-smesh query \
      -q "threat actor infrastructure" \
      --specialists

    # External OSINT enrichment (GitHub + Brave search)
    ANTHROPIC_API_KEY=sk-ant-... ./target/release/robin-smesh query \
      -q "data breach credentials" \
      --enrich \
      --specialists

    # Blockchain temporal analysis (BTC/ETH wallet patterns)
    ANTHROPIC_API_KEY=sk-ant-... ./target/release/robin-smesh query \
      -q "ransomware bitcoin wallets" \
      --blockchain \
      --specialists

    # Use OpenAI instead
    OPENAI_API_KEY=sk-... ./target/release/robin-smesh query \
      -q "ransomware payments" \
      --openai

    # Use OpenRouter (Claude Sonnet 4.5)
    OPENROUTER_API_KEY=... ./target/release/robin-smesh query \
      -q "data breach credentials" \
      --openrouter

    # Use OpenRouter with permissive mode for security research
    # (uses Mistral Large - less restrictive for threat intel queries)
    OPENROUTER_API_KEY=... ./target/release/robin-smesh query \
      -q "stealer logs redline raccoon vidar" \
      --openrouter --permissive \
      --specialists
LLM Model Selection
Robin×SMESH picks a default model for each provider. Override it with -m:
| Provider | Flag | Default Model | Notes |
|---|---|---|---|
| Anthropic | (default) | claude-sonnet-4-20250514 | Best quality, recommended |
| OpenAI | --openai | gpt-4o | Strong reasoning |
| OpenRouter | --openrouter | anthropic/claude-sonnet-4.5 | Claude via OpenRouter |
| OpenRouter | --openrouter --permissive | mistralai/mistral-large-2512 | Less restrictive for security research |
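The table's selection rule is simple enough to write down. This is a hedged sketch of how such a lookup could work, not the CLI's actual code: `Provider` and `resolve_model` are hypothetical names, and what --permissive does without --openrouter is not documented here, so the sketch ignores it in that case.

```rust
/// LLM providers listed in the table above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Anthropic,
    OpenAi,
    OpenRouter,
}

/// Resolve the model name: an explicit `-m` override wins, otherwise the
/// table's default applies. Assumption: `permissive` only matters for
/// OpenRouter and is ignored for the other providers.
pub fn resolve_model(
    provider: Provider,
    permissive: bool,
    override_model: Option<&str>,
) -> String {
    if let Some(model) = override_model {
        return model.to_string();
    }
    let default_model = match (provider, permissive) {
        (Provider::Anthropic, _) => "claude-sonnet-4-20250514",
        (Provider::OpenAi, _) => "gpt-4o",
        (Provider::OpenRouter, false) => "anthropic/claude-sonnet-4.5",
        (Provider::OpenRouter, true) => "mistralai/mistral-large-2512",
    };
    default_model.to_string()
}
```

Matching on the `(provider, permissive)` pair keeps the mapping exhaustive, so adding a provider variant forces the table to be updated at compile time.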
Permissive Mode
For security research queries that may trigger content filters (malware names, exploit terminology), use --permissive with OpenRouter:
    # These queries work with --permissive
    robin-smesh query -q "stealer logs redline raccoon" --openrouter --permissive
    robin-smesh query -q "infostealer malware analysis" --openrouter --permissive
    robin-smesh query -q "ransomware bitcoin wallets" --openrouter --permissive
Custom Models
Override the default model with -m:
    # Use a specific OpenRouter model
    robin-smesh query -q "threat actor" --openrouter -m meta-llama/llama-3.1-70b-instruct

    # Use GPT-4o-mini for cost savings
    robin-smesh query -q "dark web market" --openai -m gpt-4o-mini
Requirements
- Rust 1.75+
- Tor running on port 9050:
    # Linux
    sudo apt install tor && sudo systemctl start tor

    # Mac
    brew install tor && brew services start tor
- LLM API key (one of):
  - ANTHROPIC_API_KEY (default, recommended)
  - OPENAI_API_KEY (with the --openai flag)
  - OPENROUTER_API_KEY (with the --openrouter flag)
- Optional, for enrichment:
  - GITHUB_TOKEN: increases GitHub API rate limits
  - BRAVE_API_KEY: enables Brave Search integration
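Before running a query, it can help to confirm that something is listening on the Tor port. The sketch below is a rough stand-alone check using only the standard library; it is not how `robin-smesh status` works internally, the helper names are hypothetical, and 127.0.0.1 is assumed as the proxy host.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Return true if something accepts TCP connections at `addr` within `timeout`.
/// This only shows that the port is open; it does not prove the listener is Tor.
pub fn port_is_open(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Address of a local Tor SOCKS proxy on the port named in the requirements.
/// Assumption: Tor runs on the same machine (loopback).
pub fn default_tor_addr() -> SocketAddr {
    SocketAddr::from(([127, 0, 0, 1], 9050))
}
```

For example, `port_is_open(default_tor_addr(), Duration::from_secs(2))` returns false when the Tor service has not been started.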
Crate Structure
robin-smesh/
├── robin-core/      # Signals, artifacts, field, search engines
├── robin-tor/       # Tor proxy, crawler, scraper
├── robin-agents/    # Specialized OSINT agents (refiner, crawler, filter, etc.)
├── robin-runtime/   # SMESH swarm coordinator
└── robin-cli/       # CLI binary
Key Concepts from SMESH
- Signals: Messages with intensity that decays over time
- Field: Shared space where signals propagate
- Reinforcement: Agreement from multiple agents boosts confidence
- Emergence: No central controller; coordination emerges from simple rules
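To make decay and reinforcement concrete, here is a minimal sketch. The struct fields, the exponential half-life model, and the noisy-OR reinforcement rule are all assumptions chosen for illustration; SMESH's actual formulas may differ.

```rust
/// A signal with a confidence-like intensity in 0.0..=1.0.
/// Field names are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Signal {
    pub payload: String,
    pub intensity: f64,
}

impl Signal {
    /// Intensity after `elapsed_secs`, halving every `half_life_secs`.
    pub fn decayed(&self, elapsed_secs: f64, half_life_secs: f64) -> f64 {
        self.intensity * 0.5_f64.powf(elapsed_secs / half_life_secs)
    }

    /// Reinforce with another agent's agreement, saturating towards 1.0
    /// (noisy-OR: 1 - (1 - a)(1 - b)).
    pub fn reinforce(&mut self, agreement: f64) {
        let a = self.intensity.clamp(0.0, 1.0);
        let b = agreement.clamp(0.0, 1.0);
        self.intensity = 1.0 - (1.0 - a) * (1.0 - b);
    }
}

/// Keep only signals whose decayed intensity is still at or above `threshold`.
pub fn prune(
    field: Vec<Signal>,
    elapsed_secs: f64,
    half_life_secs: f64,
    threshold: f64,
) -> Vec<Signal> {
    field
        .into_iter()
        .filter(|s| s.decayed(elapsed_secs, half_life_secs) >= threshold)
        .collect()
}
```

Under this model a signal that nobody reinforces fades below the threshold and is pruned, while one that several agents agree on climbs towards 1.0. That matches the fault-tolerance claim: a stalled agent's unanswered signal simply expires.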
Artifact Extraction
Automatically extracts:
- Onion addresses
- Bitcoin/Ethereum/Monero addresses
- Email addresses
- File hashes (MD5, SHA1, SHA256)
- CVE identifiers
- MITRE ATT&CK TTPs
- Domains and IPs
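As an illustration of one extractor from the list above, the sketch below pulls CVE identifiers out of scraped text using only the standard library. It is an assumption-laden example, not the project's extractor: `extract_cves` is a hypothetical name, and the real implementation may use regular expressions and stricter boundary checks.

```rust
/// Find CVE identifiers of the form CVE-YYYY-NNNN (four-digit year, four or
/// more sequence digits). Matching is ASCII case-insensitive; results are
/// upper-cased and de-duplicated in order of first appearance.
/// Simplification: no word-boundary check is made before "CVE-".
pub fn extract_cves(text: &str) -> Vec<String> {
    let bytes = text.as_bytes();
    let mut found: Vec<String> = Vec::new();
    let mut i = 0;
    while i + 4 <= bytes.len() {
        if bytes[i..i + 4].eq_ignore_ascii_case(b"CVE-") {
            let year_end = i + 8;
            let year_ok = bytes.len() > year_end
                && bytes[i + 4..year_end].iter().all(u8::is_ascii_digit)
                && bytes[year_end] == b'-';
            if year_ok {
                let seq_start = year_end + 1;
                let seq_len = bytes[seq_start..]
                    .iter()
                    .take_while(|b| b.is_ascii_digit())
                    .count();
                if seq_len >= 4 {
                    let end = seq_start + seq_len;
                    // All matched bytes are ASCII, so this slice is valid UTF-8.
                    let id = text[i..end].to_ascii_uppercase();
                    if !found.contains(&id) {
                        found.push(id);
                    }
                    i = end;
                    continue;
                }
            }
        }
        i += 1;
    }
    found
}
```

The other artifact types follow the same shape: scan for a recognizable prefix or character class, validate the length and alphabet, then normalize and de-duplicate.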
Multi-Specialist Analysis
With --specialists, six expert personas analyze the findings before a lead analyst synthesizes them:
| Specialist | Focus |
|---|---|
| Threat Intel | Actor TTPs, campaign patterns, IOC correlation |
| Financial Crime | Cryptocurrency flows, money laundering, fraud |
| Technical | Malware, exploits, infrastructure analysis |
| Geopolitical | Nation-state activity, regional threats |
| Legal/Regulatory | Compliance, jurisdiction, evidence handling |
| Strategic | Trend forecasting, risk assessment |
External OSINT Enrichment
With --enrich, extracted artifacts are queried against surface web sources:
- GitHub Code Search: emails, usernames, code snippets, hashes
- Brave Search: IPs, domains, malware hashes, threat intel
This bridges dark web findings with public attribution data.
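One way to picture the enrichment step is as a routing table from artifact type to source. The sketch below covers only the artifact types named in both lists above; `ArtifactKind`, `Source`, and `sources_for` are hypothetical names, and the real enricher may route differently.

```rust
/// A subset of artifact types from the enrichment lists above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtifactKind {
    Email,
    FileHash,
    IpAddress,
    Domain,
}

/// The two surface web sources named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    GitHubCodeSearch,
    BraveSearch,
}

/// Which sources to query for a given artifact type. Hashes appear under
/// both sources in the lists above, so they are routed to both.
pub fn sources_for(kind: ArtifactKind) -> &'static [Source] {
    match kind {
        ArtifactKind::Email => &[Source::GitHubCodeSearch],
        ArtifactKind::FileHash => &[Source::GitHubCodeSearch, Source::BraveSearch],
        ArtifactKind::IpAddress | ArtifactKind::Domain => &[Source::BraveSearch],
    }
}
```

Routing by type keeps API usage down: an email address is never sent to a source that is unlikely to index it, which matters when GITHUB_TOKEN or BRAVE_API_KEY rate limits apply.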
Blockchain Temporal Analysis
With --blockchain, extracted cryptocurrency addresses are analyzed for temporal patterns:
- Bitcoin: Blockstream API (no key required)
- Ethereum: Etherscan API (optional ETHERSCAN_API_KEY for higher rate limits)
Analysis includes:
- Wallet age (first/last transaction)
- Transaction frequency and volume
- Temporal patterns: regular intervals, burst activity, dormancy periods
- Timezone inference: activity concentration by hour
- Risk indicators (high volume, recent activity, contract interactions)
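The timezone inference item can be sketched as an hour-of-day histogram over transaction timestamps. This is an illustrative approach under stated assumptions, not the project's algorithm: the function names are hypothetical, timestamps are assumed to be Unix seconds, and the "busiest window" heuristic is one simple choice among many.

```rust
/// Count transactions per UTC hour of day (index 0..=23) from Unix timestamps.
pub fn hourly_histogram(timestamps: &[i64]) -> [u32; 24] {
    let mut hist = [0u32; 24];
    for &ts in timestamps {
        // rem_euclid keeps the result in 0..86_400 even for negative timestamps.
        let hour = ts.rem_euclid(86_400) / 3_600;
        hist[hour as usize] += 1;
    }
    hist
}

/// Start hour (UTC) of the contiguous `window`-hour span, wrapping past
/// midnight, that contains the most transactions. Ties resolve to the
/// earliest start hour.
pub fn busiest_window(hist: &[u32; 24], window: usize) -> usize {
    let window = window.clamp(1, 24);
    let mut best = (0usize, 0u32);
    for start in 0..24 {
        let total: u32 = (0..window).map(|k| hist[(start + k) % 24]).sum();
        if total > best.1 {
            best = (start, total);
        }
    }
    best.0
}
```

If most activity falls in, say, an eight-hour window starting at 06:00 UTC, that is weak evidence of a working day in a particular range of timezones. Treat it as a hint rather than attribution, since scheduled or automated transactions can produce the same pattern.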
Paste Site Monitoring
With --pastes, public paste sites are searched for leaked data matching query terms:
- Pastebin: via psbdmp.ws API (paste dump search)
- Rentry.co: slug-based discovery
- dpaste.org: recent pastes API
- ControlC: search interface
- JustPaste.it: search interface
This catches leaked credentials, wallet addresses, and IOCs that often appear on paste sites before propagating to dark web markets.
Example Reports
Sample investigation reports are available in reports/:
reports/
├── summary_2026-01-20_15-24-29.md   # Ransomware payment investigation
├── summary_2026-01-20_15-26-30.md   # Threat actor infrastructure
├── summary_2026-01-20_15-51-10.md   # Multi-specialist analysis
└── summary_2026-01-20_16-09-02.md   # With external enrichment
License
MIT OR Apache-2.0
