Works everywhere
Any website, any time. No per-site integration, no API stubs, no 'connect your account' dance. The agent reads the page through standards the browser already speaks.
Auto Browser sits in your Chrome side panel. Describe a task in plain English — it sees the page, acts on your behalf, and reports back. No scripts, no servers, no vendor lock-in.
Auto Browser runs a perceive → plan → act loop. It keeps going until the task is done or it needs you. Every action is yours to approve.
Type what you want in plain English. Attach screenshots or audio if useful.
“Find me a flight from NYC to Tokyo next Thursday, one-way.” The agent reads the page's accessibility tree and captures a snapshot — it understands structure, not pixels.
Snapshot taken → 47 elements, search form detected It clicks, types, scrolls, navigates, or runs the page's own WebMCP tools — always gated by your permissions.
Fill form → submit → read results Any website, any time. No per-site integration, no API stubs, no 'connect your account' dance. The agent reads the page through standards the browser already speaks.
Four providers, one dropdown. Run Gemini Nano on-device for privacy, OpenRouter for any model, Google Gemini for frontier quality, or Ollama for air-gapped setups.
Learn moreNo analytics. No telemetry. Your API keys stay on your machine. Passwords and secret tokens are redacted before they can reach the model. Pick on-device AI and zero data leaves your browser.
Learn moreThe agent asks before every write. Script execution always prompts. Redirect attacks are neutralized automatically. Banking, gov, and healthcare sites are protected by default.
Learn moreA generic, app-agnostic agent means the same install handles radically different jobs. Tap a card to watch it run.
The agent searches the store, reads specs and reviews on each option, and hands back a side-by-side summary with a clear recommendation.
Complex HTML tables stop being complex. The agent writes and runs eval scripts (with your approval) to filter, sort, and extract exactly what you asked for.
Point it at any X profile and ask what they're up to. The agent scrolls the timeline, reads the posts, and hands back a concise brief.
Drop it on a blog post, whitepaper, or release note. The agent reads the full page and returns a summary plus the key insights.
It finds the reservation form, asks only for what it doesn't already know, and submits autonomously — you're in the loop only when it matters.
It scans the showtimes on a cinema site, checks in on which film you want, and walks through seat-and-pay — pausing for your final approval.
Most agent tools marry you to a single model. Auto Browser lets you pick — and switch — without reconfiguring the agent or losing your conversation.
Free. On-device. Private.
Gemini Nano runs locally inside Chrome. No API key, no network calls, no account. Downloaded once (~3–5 GB) and it works offline forever.
Frontier quality, direct API.
Point at Google's API directly. Great for complex reasoning tasks where latency and model quality both matter.
Every model, one key.
Claude, GPT, Llama, Mistral — any model in the OpenRouter catalog, switched with a dropdown. Image and audio support auto-detected per model.
Your machine. Your rules.
Point at Ollama, LM Studio, or any OpenAI-compatible endpoint on localhost. Full control, zero cloud, custom fine-tunes welcome.
Your conversation history carries over. Try Gemini Nano for a task, hit a hard one, flip to Claude on OpenRouter, finish, flip back. No reconfiguration.
Every surface is explainable and auditable. Read the full architecture in the docs, or skim the four things that make it work.
Perceive, interact, navigate, observe — everything the agent needs to work a page without guessing. Works on pages built long before AI agents existed.
Pages that describe themselves with WebMCP get preferred treatment. A tagged form beats a dozen DOM traversals — reliable, explicit, fast.
The agent never acts silently. Writes prompt. Sensitive origins are blocked. Redirects invalidate prior approvals. Secrets are redacted before they can leak.
The agent tracks its own steps, notices when a strategy isn't making progress, and summarizes long sessions so it stays coherent turn after turn.
Auto Browser speaks WebMCP — the open web spec that lets your pages describe themselves to any AI agent, not just ours. Tag your forms, expose your actions, and the agent uses your tools instead of poking around the DOM.
A web-native protocol for pages to expose tools and actions to AI agents. Developed in the open by the Web Machine Learning community group.
github.com/webmachinelearning/webmcpAn agent that can click anything is dangerous by default. Auto Browser is designed to be safe first — six layers of protection, starting with the one you control.
Clicks, typing, navigation — the agent requests approval before it acts. Read-only looks (snapshots, reading the page) don't interrupt you; writes always do.
Running arbitrary JavaScript is the most powerful thing the agent can do, so it's also the most gated. Even broad approvals never cover it.
Banking, government, and healthcare origins are off-limits by default. The agent is disabled entirely on those sites — there's no way to override inline.
If a page redirects mid-action (login loop, phishing hop), prior approvals no longer apply. The agent has to ask again for the new origin.
Passwords and tokens are redacted before they can leave your browser. The agent can tell a field has a value, but it never sees the value itself.
Gemini Nano runs entirely inside Chrome — no cloud, no account, no network. Pick it and your data never leaves your machine.