Evidence-backed web research for AI agents with citations and confidence scores.
BrowseAI Dev · v0.4.0
by BrowseAI-HQ
Reliable research infrastructure for AI agents. The research layer your agents are missing.
MCP server with real-time web search, evidence extraction, and structured citations. Drop into Claude Desktop, Cursor, Windsurf, LangChain, CrewAI, or any agent pipeline.
What it does
Instead of letting your AI hallucinate, browseai-dev gives it real-time access to the web with structured, cited answers:
Your question → Web search → Neural rerank → Fetch pages → Extract claims → Verify → Cited answer (streamed)
Every answer includes:
- Claims with source URLs, verification status, and consensus level
- Evidence-based confidence score (0-1) — derived from the evidence rather than LLM self-assessment, and auto-calibrated from feedback
- Source quotes verified against actual page text
- Atomic claim decomposition — compound facts split and verified independently
- Execution trace with timing
- 3 depth modes — "fast" (default), "thorough" (auto-retry with rephrased queries), and "deep" (premium multi-step agentic research: iterative think-search-extract-evaluate cycles with gap analysis, up to 4 steps, targeting 0.85 confidence; requires a BAI key and sign-in, costs 3x quota, and falls back to thorough when quota is exhausted)
Premium Features (with BROWSE_API_KEY)
Users with a BrowseAI Dev API key (bai_xxx) get enhanced verification:
- Semantic re-ranking — search results re-scored by semantic query-document relevance
- Semantic evidence matching — evidence matched by meaning, not just keywords
- Multi-provider search — parallel search across multiple sources for broader coverage
- Multi-pass consistency — claims cross-checked across independent extraction passes
- Deep reasoning mode — multi-step agentic research with iterative think-search-extract-evaluate cycles, gap analysis, and cross-step claim merging (up to 4 steps, 3x quota cost, 100 deep queries/day)
- Research Sessions — persistent memory across queries
Free BAI key users get a generous daily quota (100 premium queries/day, or ~33 deep queries/day at 3x cost each). When exceeded, queries gracefully fall back to keyword verification (deep falls back to thorough). Quota resets every 24 hours.
Sign up at browseai.dev for a free API key — 100 premium queries/day with full verification pipeline.
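The quota and fallback rules above can be sketched as follows. All names and shapes here are illustrative, not the package's actual implementation: deep costs 3x quota, and requests that aren't covered degrade to thorough depth with keyword verification.

```javascript
// Illustrative quota model: 100 premium queries/day, deep at 3x cost.
const DAILY_QUOTA = 100;
const DEEP_COST = 3;

// Decide the effective depth and verification tier for a request.
function plan(requestedDepth, { hasKey, remainingQuota }) {
  const cost = requestedDepth === "deep" ? DEEP_COST : 1;
  const covered = hasKey && remainingQuota >= cost;
  return {
    depth: requestedDepth === "deep" && !covered ? "thorough" : requestedDepth,
    verification: covered ? "semantic" : "keyword", // premium vs fallback
  };
}

// 100 queries/day at 3x cost ≈ 33 deep queries/day.
const maxDeepPerDay = Math.floor(DAILY_QUOTA / DEEP_COST); // 33
```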
Quick Start
npx browseai-dev setup
This auto-configures Claude Desktop. You'll need a BrowseAI Dev API key — get one free at browseai.dev/dashboard.
Manual Setup
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "browseai-dev": {
      "command": "npx",
      "args": ["-y", "browseai-dev"],
      "env": {
        "BROWSE_API_KEY": "bai_xxx"
      }
    }
  }
}
Cursor / Windsurf
Add to your MCP settings:
{
  "browseai-dev": {
    "command": "npx",
    "args": ["-y", "browseai-dev"],
    "env": {
      "BROWSE_API_KEY": "bai_xxx"
    }
  }
}
HTTP Transport
Run as an HTTP server for browser-based clients, Smithery, or any HTTP-capable agent:
# Start with HTTP transport
npx browseai-dev --http
# Or set the port via environment variable
MCP_HTTP_PORT=3100 npx browseai-dev --http
The server exposes:
- POST /mcp — MCP Streamable HTTP endpoint
- GET /health — Health check
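A minimal sketch of talking to the HTTP transport. The /mcp and /health paths come from this README; the JSON-RPC envelope follows MCP's Streamable HTTP transport, and the argument shape ("query") is an assumption for illustration.

```javascript
// Base URL assumes the default port shown above; override as needed.
const BASE = process.env.BROWSE_URL ?? "http://localhost:3100";

// Build a JSON-RPC 2.0 tools/call request body for the /mcp endpoint.
function rpcBody(id, tool, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

// Usage against a running server (not executed here):
// const res = await fetch(`${BASE}/mcp`, {
//   method: "POST",
//   headers: { "content-type": "application/json" },
//   body: JSON.stringify(rpcBody(1, "browse_answer", { query: "..." })),
// });
```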
Docker
docker build -t browseai-dev ./apps/mcp
docker run -p 3100:3100 -e BROWSE_API_KEY=bai_xxx browseai-dev
MCP Tools
| Tool | Description |
|---|---|
| browse_search | Search the web via multi-provider search |
| browse_open | Fetch and parse a page into clean text |
| browse_extract | Extract structured knowledge from a page |
| browse_answer | Full pipeline: search + extract + cite. depth: "fast", "thorough", or "deep" |
| browse_compare | Compare raw LLM vs evidence-backed answer |
| browse_clarity | Clarity — anti-hallucination answer engine. Modes: "prompt" (enhanced prompts only), "answer" (LLM answer, default), "verified" (LLM + web fusion) |
| browse_session_create | Create a research session (persistent memory across queries) |
| browse_session_ask | Research within a session (recalls prior knowledge, stores new claims) |
| browse_session_recall | Query session knowledge without new web searches |
| browse_session_share | Share a session publicly (returns share URL) |
| browse_session_knowledge | Export all claims from a session |
| browse_session_fork | Fork a shared session to continue the research |
| browse_feedback | Submit accuracy feedback on a result |
Note: All tools require a BrowseAI Dev API key (bai_xxx). Get a free one at browseai.dev/dashboard.
Examples
Quick lookup:
"Use browse_answer to explain what causes aurora borealis"
Higher accuracy:
"Use browse_answer with depth thorough to research quantum computing"
Deep research (multi-step, requires BAI key):
"Use browse_answer with depth deep to compare CRISPR approaches for sickle cell disease"
Deep mode runs iterative think-search-extract-evaluate cycles: gap analysis identifies missing info, follow-up queries fill the gaps, and claims/sources are merged across steps with final re-verification. Targets 0.85 confidence across up to 4 steps. Falls back to thorough without a BAI key or when quota is exhausted.
Contradiction detection:
"Use browse_answer with depth thorough to check if coffee is good for health, and show me any contradictions"
Research session:
"Create a session called quantum-research, then ask about quantum entanglement, then ask how entanglement is used in computing"
Enterprise search:
"Use browse_answer to search our Elasticsearch at https://es.company.com/kb/_search for our refund policy"
Response structure
{
  "answer": "Aurora borealis occurs when charged particles from the Sun...",
  "confidence": 0.92,
  "claims": [
    {
      "claim": "Aurora borealis is caused by solar wind particles...",
      "sources": ["https://en.wikipedia.org/wiki/Aurora"],
      "verified": true,
      "verificationScore": 0.82,
      "consensusLevel": "strong"
    }
  ],
  "sources": [
    {
      "url": "https://en.wikipedia.org/wiki/Aurora",
      "title": "Aurora - Wikipedia",
      "domain": "en.wikipedia.org",
      "quote": "An aurora is a natural light display...",
      "verified": true,
      "authority": 0.70
    }
  ],
  "contradictions": [],
  "reasoningSteps": []
}
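Consumers can filter this response shape before trusting it. A small sketch using the fields shown above (the helper names are illustrative):

```javascript
// Keep only claims that are verified with strong consensus.
function strongClaims(response) {
  return response.claims.filter(
    (c) => c.verified && c.consensusLevel === "strong"
  );
}

// Collect the distinct domains of verified sources.
function verifiedDomains(response) {
  return [...new Set(
    response.sources.filter((s) => s.verified).map((s) => s.domain)
  )];
}
```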
Why browseai-dev?
| Feature | Raw LLM | browseai-dev |
|---|---|---|
| Sources | None | Real URLs with quotes |
| Citations | Hallucinated | Verified from pages |
| Confidence | Unknown | Evidence-based score |
| Depth | Single pass | 3 modes: fast, thorough, deep reasoning |
| Freshness | Training data | Real-time web |
| Claims | Mixed in text | Structured + linked |
Reliability
All API calls include automatic retry with exponential backoff on transient failures (429 rate limits, 5xx server errors). Auth errors fail immediately — no wasted retries.
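The retry policy above can be sketched as follows; the base delay, cap, and attempt count are illustrative defaults, not the package's actual settings.

```javascript
// Retry only transient failures: 429 rate limits and 5xx server errors.
function isRetryable(status) {
  return status === 429 || (status >= 500 && status < 600);
}

// Exponential backoff: 250ms, 500ms, 1000ms, ... capped at 8s.
function backoffMs(attempt, base = 250, cap = 8000) {
  return Math.min(cap, base * 2 ** attempt);
}

async function withRetry(request, maxAttempts = 4) {
  for (let attempt = 0; ; attempt++) {
    const res = await request();
    // Auth errors (401/403) and other non-retryable statuses fail fast.
    if (res.ok || !isRetryable(res.status) || attempt + 1 >= maxAttempts) {
      return res;
    }
    await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)));
  }
}
```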
Tech Stack
- Search: Multi-provider (parallel search across sources)
- Parsing: @mozilla/readability + linkedom
- AI: OpenRouter (100+ models)
- Verification: Hybrid keyword and semantic verification
- Protocol: Model Context Protocol (MCP)
Agent Skills
Pre-built skills that teach coding agents when to use BrowseAI Dev tools:
npx skills add BrowseAI-HQ/browseAIDev_Skills
Skills work with Claude Code, Codex CLI, Gemini CLI, Cursor, and more.
License
Apache 2.0