v2.3.0 — Free to Use

Your AI Agent
for the Browser

Type what you want in natural language. Crab-Agent navigates pages, clicks elements, fills forms, reads content, manages tabs, and records workflows, all autonomously. Bring your own API key. Works with Anthropic, OpenAI, Google Gemini, OpenRouter, Ollama, and any OpenAI-compatible provider.

Click & Type
Navigate
Read Pages
Workflows
Memory
Works with your favorite LLM provider
Anthropic
OpenAI
Google Gemini
OpenRouter
Ollama (Local)
OpenAI-compatible

Everything you need to automate the web

A complete agent toolkit built into a Chrome side panel. No coding required.

Vision + DOM Understanding

Takes screenshots and reads the full DOM tree with interactive element references. Sees your page like a human would.
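
To make "interactive element references" concrete, here is a minimal TypeScript sketch of the idea. It is not Crab-Agent's actual code: the `PageNode` shape, the `ref_N` naming, and the list of interactive tags are all assumptions chosen for illustration.

```typescript
// Hypothetical, simplified stand-in for a DOM node (not the extension's real types).
interface PageNode {
  tag: string;
  text?: string;
  rect: { x: number; y: number; width: number; height: number };
  children?: PageNode[];
}

// One entry per interactive element: a short reference plus its center point.
interface ElementRef {
  ref: string;
  tag: string;
  text: string;
  centerX: number;
  centerY: number;
}

// Assumed set of tags treated as interactive in this sketch.
const INTERACTIVE_TAGS = new Set(["a", "button", "input", "select", "textarea"]);

// Walk the tree depth-first and assign sequential references to interactive nodes.
function buildElementMap(root: PageNode): ElementRef[] {
  const refs: ElementRef[] = [];
  const visit = (node: PageNode): void => {
    const tag = node.tag.toLowerCase();
    if (INTERACTIVE_TAGS.has(tag)) {
      refs.push({
        ref: `ref_${refs.length + 1}`,
        tag,
        text: node.text ?? "",
        centerX: node.rect.x + node.rect.width / 2,
        centerY: node.rect.y + node.rect.height / 2,
      });
    }
    for (const child of node.children ?? []) visit(child);
  };
  visit(root);
  return refs;
}
```

A map like this lets the model say "click ref_2" instead of guessing pixel positions from the screenshot alone.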

Full Browser Control

Click, type, scroll, drag, navigate, open tabs, fill forms, upload files, execute JavaScript — all via Chrome DevTools Protocol.
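
As a rough illustration of what a click through the Chrome DevTools Protocol can look like, the sketch below builds the `Input.dispatchMouseEvent` sequence for a left click. The helper names (`buildClickCommands`, `click`, `SendCommand`) are hypothetical, and the command sender is injected so the code runs outside Chrome. Crab-Agent's own implementation may differ.

```typescript
// A CDP command is a method name plus a parameter object.
interface CdpCommand {
  method: string;
  params: Record<string, unknown>;
}

// Build the move / press / release sequence for a single left click at (x, y).
function buildClickCommands(x: number, y: number): CdpCommand[] {
  const button = { x, y, button: "left", clickCount: 1 };
  return [
    { method: "Input.dispatchMouseEvent", params: { type: "mouseMoved", x, y } },
    { method: "Input.dispatchMouseEvent", params: { ...button, type: "mousePressed" } },
    { method: "Input.dispatchMouseEvent", params: { ...button, type: "mouseReleased" } },
  ];
}

// Hypothetical sender type. Inside an extension this could wrap
// chrome.debugger.sendCommand({ tabId }, method, params) after
// chrome.debugger.attach({ tabId }, "1.3").
type SendCommand = (method: string, params: Record<string, unknown>) => Promise<unknown>;

// Send the commands in order, waiting for each to complete.
async function click(send: SendCommand, x: number, y: number): Promise<void> {
  for (const command of buildClickCommands(x, y)) {
    await send(command.method, command.params);
  }
}
```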

Persistent Memory

Remembers your preferences, rules, and personal info across sessions. Dream consolidation keeps memory clean and relevant.

Workflow Recording

Record browser interactions as reusable workflows. Replay them on demand or let the agent invoke them automatically.

Task Scheduler

Schedule tasks for the future — one-time or recurring with cron expressions. The agent runs them automatically via Chrome alarms.
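
Chrome alarms take a timestamp or a period rather than a cron expression, so a scheduler has to translate one into the other. The sketch below shows one possible translation for the simplest case, a daily `minute hour * * *` expression. The function names and the restriction to daily expressions are assumptions for illustration, not Crab-Agent's actual scheduler.

```typescript
// Compute the next local-time run for a daily cron expression "M H * * *".
// Only this subset is supported here; anything else throws.
function nextRunForDailyCron(expr: string, now: Date): Date {
  const parts = expr.trim().split(/\s+/);
  if (parts.length !== 5 || parts.slice(2).some((p) => p !== "*")) {
    throw new Error(`Unsupported cron expression in this sketch: ${expr}`);
  }
  const minute = Number(parts[0]);
  const hour = Number(parts[1]);
  const valid =
    Number.isInteger(minute) && minute >= 0 && minute <= 59 &&
    Number.isInteger(hour) && hour >= 0 && hour <= 23;
  if (!valid) throw new Error(`Invalid minute or hour in: ${expr}`);

  const next = new Date(now.getTime());
  next.setHours(hour, minute, 0, 0);
  // If today's slot has already passed, move to the same time tomorrow.
  if (next.getTime() <= now.getTime()) next.setDate(next.getDate() + 1);
  return next;
}

// Shape accepted by chrome.alarms.create(name, alarmInfo) for a one-shot alarm.
function alarmInfoFor(next: Date): { when: number } {
  return { when: next.getTime() };
}
```

In an extension, the result could be passed to `chrome.alarms.create("task-id", alarmInfoFor(next))`, with the `chrome.alarms.onAlarm` listener computing the following run after each firing. The `"task-id"` name is a placeholder.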

Permission System

You stay in control. Choose between Ask, Auto, or Strict modes. Sensitive pages always require explicit approval.
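
One way to picture the permission check is as a small decision function. The only rule taken from the description above is that sensitive pages always need explicit approval. The meanings given to Ask, Auto, and Strict here, along with the allowlist and the function name, are assumptions for illustration and may not match how Crab-Agent defines them.

```typescript
type PermissionMode = "ask" | "auto" | "strict";
type Decision = "allow" | "prompt" | "deny";

// Decide whether an action on a given host may run.
// Assumed semantics (not confirmed by the product description):
//   ask    -> prompt the user before acting
//   auto   -> act without prompting
//   strict -> act only on hosts the user has allowlisted, otherwise deny
function decidePermission(
  mode: PermissionMode,
  hostname: string,
  sensitiveHosts: ReadonlySet<string>,
  allowlist: ReadonlySet<string>,
): Decision {
  // Sensitive pages always require explicit approval, whatever the mode.
  if (sensitiveHosts.has(hostname)) return "prompt";

  switch (mode) {
    case "auto":
      return "allow";
    case "ask":
      return "prompt";
    case "strict":
      return allowlist.has(hostname) ? "allow" : "deny";
  }
}
```

Checking the sensitive list first keeps that rule from being bypassed by any mode setting.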

From natural language to action

A tool-use agent loop that keeps going until the task is done.

1. You describe the task

Open the side panel and type what you want in plain language, for example: "Book the cheapest flight to Tokyo for next Friday". Attach screenshots if needed.

2. Agent observes the page

Crab-Agent takes a screenshot and reads the DOM, building a map of every interactive element with coordinate references.

3. LLM decides the next action

The conversation (including the visual context) is sent to your chosen LLM. It selects a tool — click, type, navigate, read — and the extension executes it via CDP.

4. Repeat until done

The result is appended to the conversation and the loop continues. The agent handles multi-step flows, tab switching, form filling, and error recovery automatically.
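
The four steps above can be sketched as a short loop. This is a simplified illustration rather than Crab-Agent's source: the `AgentDeps` interface, the message shape, and the `maxSteps` safety limit are assumptions, and the observe, LLM, and tool functions are injected so the example runs without a browser.

```typescript
type ToolCall = { tool: string; args: Record<string, unknown> };
type Message = { role: "user" | "assistant" | "tool"; content: string };

// Hypothetical dependencies. A real extension would back these with
// screenshots, DOM reads, an LLM provider, and CDP tool executors.
interface AgentDeps {
  observe: () => Promise<string>;
  callLlm: (history: Message[]) => Promise<ToolCall>;
  executeTool: (call: ToolCall) => Promise<string>;
}

// Observe -> ask the LLM -> execute the chosen tool -> append the result -> repeat.
async function runAgentLoop(
  task: string,
  deps: AgentDeps,
  maxSteps = 20, // assumed safety limit so a confused model cannot loop forever
): Promise<Message[]> {
  const history: Message[] = [{ role: "user", content: task }];
  for (let step = 0; step < maxSteps; step++) {
    const observation = await deps.observe();
    history.push({ role: "user", content: `Observation: ${observation}` });

    const call = await deps.callLlm(history);
    history.push({ role: "assistant", content: JSON.stringify(call) });
    if (call.tool === "done") return history; // the model signals completion

    const result = await deps.executeTool(call);
    history.push({ role: "tool", content: result });
  }
  throw new Error(`Task not finished after ${maxSteps} steps`);
}
```

Because every tool result goes back into the conversation, the model sees the effect of its previous action before choosing the next one, which is what makes error recovery possible.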

A tool for every browser interaction

The agent picks the right tool for each step. Here's what's in the toolkit.

computer (click/type/scroll)
navigate
read_page
find (accessibility)
form_input
get_page_text
tabs_context
tabs_create
switch_tab
close_tab
javascript_tool
file_upload
download_file
document_generator
gif_creator
canvas_toolkit
visualize (charts)
code_editor
run_workflow
schedule_task
memory (CRUD)
suggest_rule
update_plan
set_of_mark
resize_window
shortcuts_list
shortcuts_execute
read_console
read_network
ask_user / done

Built for reliability

Chrome MV3 service worker, React side panel, multi-provider LLM gateway.

Side Panel (React + Zustand)
  Chat | Workflows | Schedule | Memory | Settings
        |
        |  Port messages
        |
Background Service Worker
  Session management | Tab groups | Memory dreams | Alarms
        |
Agent Loop (tool-use cycle)
  Screenshot -> Read DOM -> Call LLM -> Execute Tool -> Repeat
        |                          |
  LLM Gateway                Tool Executors
  Anthropic                  CDP: click, type, scroll
  OpenAI                     Browser: tabs, navigate
  Gemini                     Page: read, find, JS
  OpenRouter                 Files: upload, download
  Ollama                     Docs, GIF, workflows
  OpenAI-compatible          Memory, scheduler
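
To give a feel for what a multi-provider gateway involves, the sketch below builds a chat request for any endpoint that follows the OpenAI chat-completions format, which covers OpenAI, OpenRouter, and Ollama's compatibility mode. Anthropic and Gemini use their own native request shapes and are not covered here. The type and function names are hypothetical, and nothing here is taken from Crab-Agent's code.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical provider settings as a user might enter them.
interface GatewayConfig {
  baseUrl: string; // e.g. "https://api.openai.com/v1" or "http://localhost:11434/v1"
  apiKey?: string; // local servers such as Ollama typically need none
  model: string;
}

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build a POST request for an OpenAI-compatible /chat/completions endpoint.
function buildChatRequest(config: GatewayConfig, messages: ChatMessage[]): HttpRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (config.apiKey) headers["Authorization"] = `Bearer ${config.apiKey}`;
  return {
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: `${config.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    headers,
    body: JSON.stringify({ model: config.model, messages }),
  };
}
```

Keeping the request builder separate from the network call means switching providers only changes the config object.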

Ready to automate your browser?

Free to use. Bring your own API key and let Clawd handle the rest.

Standing on the shoulders of giants

Crab-Agent wouldn't exist without these amazing projects and their creators.

Clawd Tank

Assets & Mascot

The adorable "Clawd" crab pixel-art mascot and SVG animations used throughout Crab-Agent are derived from the Clawd Tank project by Marcio Granzotto. Thank you for the amazing crab character!

Claude Computer Use (Anthropic)

Agent Logic

Core agent loop architecture and browser automation logic inspired by Anthropic's computer-use approach. The tool-use cycle pattern — screenshot, observe, decide, act — draws heavily from their pioneering work on AI browser agents.