may 2, 2026 — we built a shared internet for AI agents

Every AI agent on earth fetches the same web pages independently. Same 403s. Same captchas. Same HTML-to-markdown conversion. Millions of times a day. We thought that was insane, so we fixed it.

agentsweb.org is a global shared cache of web pages as clean markdown. The first agent to fetch a URL caches it at the edge. Every agent after that gets it in about 50 milliseconds or less. The network gets smarter with every request.

But shared caches have a problem: poisoning. If anyone can write, anyone can lie. So we built a self-healing consensus engine. Entries gain trust as independent sources confirm the content, and independence is verified by IP address, not by self-reported IDs. Poisoned entries self-destruct on the next legitimate read. An attacker would need to control multiple IP addresses AND produce content that gets past checks for 30+ prompt injection patterns, XSS filters, unicode steganography detection, and vocabulary analysis. And even if they did, trust decays on mismatch.
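To make the trust idea concrete, here is a minimal Python sketch of one way such a rule could work. It is not the code running in the Worker. The CacheEntry class, the observe function, and the exact decay rule (one point of trust lost per mismatching read, entry replaced when trust reaches zero) are all assumptions made for illustration, and the content filters are left out entirely.

```python
from dataclasses import dataclass, field
import hashlib


def content_hash(markdown: str) -> str:
    """Fingerprint a page so two observations can be compared cheaply."""
    return hashlib.sha256(markdown.encode("utf-8")).hexdigest()


@dataclass
class CacheEntry:
    """Hypothetical cache record: a content hash plus who confirmed it."""
    content_hash: str
    confirming_ips: set[str] = field(default_factory=set)
    mismatches: int = 0

    @property
    def trust(self) -> int:
        # Trust counts distinct IPs, so one address writing many times gains nothing.
        return len(self.confirming_ips) - self.mismatches


def observe(entry, markdown, source_ip):
    """Fold one fresh observation into the cache and return the surviving entry."""
    digest = content_hash(markdown)
    if entry is None:
        return CacheEntry(digest, {source_ip})
    if entry.content_hash == digest:
        entry.confirming_ips.add(source_ip)
        return entry
    # Mismatch: trust decays (assumed rule: one point per conflicting read).
    entry.mismatches += 1
    if entry.trust <= 0:
        # The old entry has no trust left, so the new observation replaces it.
        return CacheEntry(digest, {source_ip})
    return entry
```

Under these assumed rules, an entry written by a single IP is replaced by the first read that disagrees with it, while an entry confirmed by several IPs survives a lone conflicting write.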

Good luck.

the architecture

It's a single Cloudflare Worker with a KV store. That's it. No databases, no containers, no Kubernetes. One file, deployed globally to 300+ edge locations.

When you search, 6 search backends race in parallel — 5 SearXNG instances plus DuckDuckGo. First with results wins. Results are cached for 5 minutes at the KV layer and 2 minutes at the edge.

When you fetch a URL, 9 content sources race in parallel — Cloudflare Browser Run (JS/SPA rendering), Jina Reader, Codetabs, Wayback Machine, Arquivo.pt, Google Cache, archive.ph, AllOrigins, and raw fetch. First success wins — 20x faster than sequential fallback. Content is validated against prompt injection, XSS, captcha patterns, login walls, and structural integrity checks. Then it's cached globally.
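The "first success wins" race used for both search and fetch can be sketched in a few lines of Python with asyncio. This is an illustration of the pattern, not the Worker's code. The race_first_success and fake_source names, the 5 second timeout, and the rule that an empty body counts as a failure are assumptions.

```python
import asyncio


async def race_first_success(sources, url, timeout=5.0):
    """Start every source at once and return (name, content) from the first success."""
    pending = {
        asyncio.create_task(fetch(url), name=name) for name, fetch in sources.items()
    }
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    try:
        while pending:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            done, pending = await asyncio.wait(
                pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED
            )
            # Errors and empty bodies count as failures; keep waiting on the rest.
            winners = [t for t in done if t.exception() is None and t.result()]
            if winners:
                return winners[0].get_name(), winners[0].result()
        raise RuntimeError(f"no source returned content for {url}")
    finally:
        for task in pending:
            task.cancel()  # stop the losers once a winner is known


def fake_source(delay, body=None):
    """Hypothetical stand-in for a real backend: sleeps, then returns body or fails."""
    async def fetch(url):
        await asyncio.sleep(delay)
        if body is None:
            raise ConnectionError("source unavailable")
        return body
    return fetch
```

Calling asyncio.run(race_first_success({"reader": fake_source(0.01, "# page"), "raw": fake_source(0.0)}, "https://example.com")) returns the reader result, because the faster raw source fails and is skipped. With a sequential fallback the caller would pay for every failure in turn; in a race, a failing source costs nothing beyond its own request.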

Three-tier caching on every read: edge cache (sub-1ms) → KV cache (~50ms) → live fetch (1-5s). At scale, most requests never touch a backend.
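The read path above is a classic read-through cache. Here is a rough Python sketch of the lookup order, with plain dictionaries standing in for the edge cache and the KV store. The TieredCache class is hypothetical, and the default TTLs are borrowed from the search-result numbers (2 minutes at the edge, 5 minutes in KV) purely for illustration; the TTLs used for fetched pages are not stated here.

```python
import time


class TieredCache:
    """Hypothetical read-through cache: edge tier, then KV tier, then live fetch."""

    def __init__(self, fetch_live, edge_ttl=120.0, kv_ttl=300.0, clock=time.monotonic):
        self._fetch_live = fetch_live
        self._edge_ttl = edge_ttl
        self._kv_ttl = kv_ttl
        self._clock = clock
        self._edge = {}  # url -> (expires_at, markdown)
        self._kv = {}    # url -> (expires_at, markdown)

    def _lookup(self, store, url):
        hit = store.get(url)
        if hit is not None and hit[0] > self._clock():
            return hit[1]
        store.pop(url, None)  # drop expired entries lazily
        return None

    def get(self, url):
        """Return (tier, markdown), where tier says which layer answered."""
        now = self._clock()
        body = self._lookup(self._edge, url)
        if body is not None:
            return "edge", body
        body = self._lookup(self._kv, url)
        if body is not None:
            # Promote to the faster tier so the next read is an edge hit.
            self._edge[url] = (now + self._edge_ttl, body)
            return "kv", body
        body = self._fetch_live(url)
        self._kv[url] = (now + self._kv_ttl, body)
        self._edge[url] = (now + self._edge_ttl, body)
        return "live", body
```

Only the first read of a URL pays the live-fetch cost. Later reads are answered by whichever tier still holds an unexpired copy.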

the legal question

Is caching the web legal? Yes — the same way Google Cache, CDN caches, and browser caches are legal. We operate under DMCA 512(b) (system caching safe harbor). The content is a transformative derivative (HTML → markdown) for a fundamentally different purpose (machine processing, not human reading). All entries expire. Content owners can request instant removal.

We chose .org deliberately. This is public interest infrastructure. No ads, no tracking, no VC, no paid tiers. Open source under MIT. The kind of thing that should just exist.

what you can do with it

One API call to search the web, fetch the results, and cache them as markdown:

curl "agentsweb.org/research?q=react+server+components"

That's it. No API keys. No authentication. No SDK required. Any HTTP client works.
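For example, the same call from Python needs nothing beyond the standard library. This is a minimal sketch: the https:// scheme and the decision to treat the response as plain text are assumptions, and research_url and research are illustrative helper names, not part of any published SDK. Only the /research path and the q parameter come from the command above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://agentsweb.org"  # assumption: the service is reachable over https


def research_url(query: str) -> str:
    """Build the /research URL, encoding spaces as '+' like the curl example."""
    return f"{BASE_URL}/research?{urlencode({'q': query})}"


def research(query: str, timeout: float = 10.0) -> str:
    """Fetch the response body as text. The body format is not assumed here."""
    with urlopen(research_url(query), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling research("react server components") sends the same request as the curl command.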

For tighter integration, use intercept-mcp (Node/MCP) or pip install agentsweb (Python). Both use agentsweb.org as tier 0 automatically.

what's next

The cache is live with more than 40 pages seeded. Every fetch from every intercept-mcp instance worldwide contributes back. The network effect kicks in as adoption grows: more agents mean more cached pages, which means faster responses, which attracts more agents.

We're watching the stats. When the free tier limits start pinching, we'll scale. Cloudflare's Workers paid plan is $5/month for 10 million requests, which works out to roughly 330,000 requests a day. The whole thing can serve hundreds of thousands of daily users for the cost of a coffee.