Hacker News • 98일 전

Zig 기반 AI 에이전트용 브라우저 자동화 도구 'Kuri'

IMP

7/10

핵심 요약

Node.js 의존성 없이 Zig 언어로 작성된 초경량 브라우저 자동화 및 웹 크롤링 도구인 Kuri가 소개되었습니다. AI 에이전트 루프에 최적화되어 기존 도구(agent-browser) 대비 토큰 사용량을 16% 줄이고, 464KB 크기의 단일 바이너리로 3ms의 매우 빠른 콜드 스타트를 자랑합니다. 복잡한 자바스크립트(JS) 렌더링이 필요 없는 독립 실행형 페처(Fetcher) 및 대화형 터미널 브라우저 모드도 내장하고 있습니다.

번역된 본문

Kuri 🌰: AI 에이전트를 위한 브라우저 자동화 및 웹 크롤링 도구입니다. Zig로 작성되었으며 Node.js 의존성이 전혀(Zero) 없습니다.

주요 기능: CDP 자동화 · A11y(접근성) 스냅샷 · HAR 녹화 · 독립 실행형 페처 · 대화형 터미널 브라우저 · 에이전트 CLI · 보안 테스트

팀들이 Kuri로 전환하는 이유: 464 KB 크기의 바이너리 파일과 약 3ms의 콜드 스타트를 제공합니다. Google Flights에서 전체 에이전트 루프(go→snap→click→snap→eval)를 실행할 때 agent-browser 대비 16% 적은 4,110개의 토큰이 소요되며, 이는 다단계 작업에서 비용을 크게 절감해 줍니다.

에이전트에 최적화된 Kuri의 장점: 대부분의 브라우저 자동화 도구는 QA 엔지니어링을 위해 만들어졌습니다. Kuri는 에이전트 루프, 즉 페이지 읽기, 토큰 비용 최소화, 안정적인 참조(Refs) 기반 동작 및 이동에 최적화되어 있습니다. 이 제품의 핵심은 '많은 명령어 제공'이 아니라 '최소한의 모델 비용으로 실제 페이지에서 유용한 상태 가져오기'입니다. 작은 출력 크기는 페이지가 실제로 렌더링되었을 때만 의미가 있습니다. 빈 껍데기 상태로 출력하는 것은 효율이 아니라 실패입니다.

동일한 페이지, 세션, 토크나이저를 사용한 비교를 통해 이를 증명합니다. 스냅샷 토큰 수 비교: Google Flights SIN → TPE (동일 Chrome 세션, tiktoken cl100k_base로 측정)

kuri snap (compact): 13,479 바이트 / 4,328 토큰 (기준점)
kuri snap --interactive: 7,024 바이트 / 1,927 토큰 (0.4배, 에이전트 루프에 가장 적합)
kuri snap --json: 102,124 바이트 / 31,280 토큰 (7.2배, 이전 기본값)
agent-browser snapshot: 17,103 바이트 / 4,641 토큰 (1.1배)
agent-browser snapshot -i: 8,704 바이트 / 2,425 토큰 (0.6배)
lightpanda semantic_tree: 67,830 바이트 / 26,244 토큰 (6.1배, JS 없음 - raw DOM)
lightpanda semantic_tree_text: 1,909 바이트 / 507 토큰 (0.1배, JS 없음 - 빈 껍데기)

전체 워크플로우 비용: go → snap → click → snap → eval

kuri-agent: 사이클당 4,110 토큰
agent-browser: 사이클당 4,880 토큰 Kuri는 워크플로우 사이클당 16%의 토큰을 절약하며, 다단계 작업에서 그 효과가 누적됩니다. 액션 응답은 복잡한 CDP 구조 대신 단순한 JSON(예: {"ok":true})으로 제공되어 토큰을 크게 줄여줍니다: click = 9 토큰, back = 5 토큰, scroll = 5 토큰.

lightpanda 점수가 낮은 이유: lightpanda는 JS가 많이 사용되는 SPA(단일 페이지 애플리케이션)를 실행할 수 없습니다. Google Flights는 클라이언트 측 fetch()를 통해 렌더링되는데, lightpanda는 항공편 데이터가 전혀 없는 507 토큰짜리 빈 탐색 레이아웃만 반환합니다. 이 낮은 토큰 수는 효율성이 아니라 렌더링 실패를 의미합니다.

작은 바이너리, 빠른 시작 (Apple M3 Pro, macOS 15.3에서 측정, kuri는 -Doptimize=ReleaseFast로 빌드):

CLI 바이너리: agent-browser 6.0 MB vs kuri 464 KB (13배 작음)
콜드 스타트(--version): agent-browser 3.4ms vs kuri 3.0ms (비슷함)
설치 용량(npm): agent-browser 33 MB vs kuri 3.3 MB (3개 바이너리 포함, 10배 작음)
명령어 수: agent-browser 140개 이상 vs kuri 40개 이상 (엔드포인트, 목적이 다름)
독립 실행형 페처(Chrome 불필요): agent-browser ❌ vs kuri ✅ (kuri-fetch)
터미널 브라우저(대화형 REPL): agent-browser ❌ vs kuri ✅ (kuri-browse)
JS 엔진(Chrome 없음): agent-browser ❌ vs kuri ✅ (QuickJS SSR 스타일 DOM)
HTTP API 서버: agent-browser ❌ (CLI 전용) vs kuri ✅ (스레드 기반 연결)

agent-browser는 더 넓은 범위의 브라우저 제어 기능을 제공합니다. 반면 Kuri는 에이전트 통합, 토큰 경제성 및 배포 단순성에 맞춰 의도적으로 기능을 좁히고 가벼운 HTTP API와 CLI 스택을 최적화했습니다.

해결하는 문제: 기존 브라우저 자동화 도구는 Playwright(약 300MB), Node.js 런타임 및 수많은 npm 종속성을 필요로 합니다. AI 에이전트가 원하는 것은 단순히 페이지를 읽고, 버튼을 클릭한 뒤 넘어가는 것뿐입니다. Kuri는 단일 Zig 바이너리로 구성된 4가지 모드를 통해 런타임 없이 이를 해결합니다:

kuri → CDP 서버 (Chrome 자동화, a11y 스냅샷, HAR)
kuri-fetch → 독립 실행형 페처 (Chrome 없음, QuickJS 사용, 약 2MB)
kuri-browse → 대화형 터미널 브라우저 (탐색, 링크 이동, 검색)
kuri-agent → 에이전트 CLI (스크립트 가능한 Chrome 자동화 + 보안 테스트)

설치 방법: 한 줄 설치 (macOS / Linux): curl -fsSL https://raw.githubusercontent.com/justrach/kuri/main/install.sh | sh 플랫폼을 자동으로 감지하여 적절한 바이너리를 다운로드하고 ~/.local/bin에 설치합니다. macOS 바이너리는 공증(notarized)되어 있어 보안 경고창이 뜨지 않습니다.

bun / npm: bun install -g kuri-agent (또는 npm install -g kuri-agent) 설치 시 플랫폼에 맞는 네이티브 바이너리를 자동으로 다운로드합니다.

수동 설치: GitHub Releases에서 플랫폼에 맞는 tarball을 다운로드하여 $PATH에 압축을 풉니다.

소스에서 빌드: Zig ≥ 0.15.0이 필요합니다. git clone https://github.com...

원문 보기

원문 보기 (영어)

Kuri 🌰 Browser automation & web crawling for AI agents. Written in Zig. Zero Node.js. CDP automation · A11y snapshots · HAR recording · Standalone fetcher · Interactive terminal browser · Agentic CLI · Security testing Quick Start · Benchmarks · kuri-agent · Security Testing · API · Changelog Why teams switch to Kuri: 464 KB binary, ~3 ms cold start. On Google Flights, a full agent loop (go→snap→click→snap→eval) costs 4,110 tokens vs 4,880 for agent-browser — 16% less per cycle , compounding across multi-step tasks. Why Kuri Wins for Agents Most browser tooling was built for QA engineers. Kuri is built for agent loops: read the page, keep token cost low, act on stable refs, and move on. The product story is not "most commands." It is "useful state from real pages at the lowest model cost." A tiny output only counts if the page actually rendered. Empty-shell output is a failure mode, not a win. The best proof is same-page, same-session, same-tokenizer comparisons. Snapshot tokens: Google Flights SIN → TPE Same Chrome session, measured with tiktoken cl100k_base . Run ./bench/token_benchmark.sh to reproduce. Tool / Mode Bytes Tokens vs kuri Note kuri snap (compact) 13,479 4,328 baseline kuri snap --interactive 7,024 1,927 0.4x Best for agent loops kuri snap --json 102,124 31,280 7.2x Old default agent-browser snapshot 17,103 4,641 1.1x agent-browser snapshot -i 8,704 2,425 0.6x lightpanda semantic_tree 67,830 26,244 6.1x ⚠ no JS — raw DOM lightpanda semantic_tree_text 1,909 507 0.1x ⚠ no JS — empty shell Full workflow cost: go → snap → click → snap → eval Tool Tokens per cycle kuri-agent 4,110 agent-browser 4,880 kuri saves 16% tokens per workflow cycle — compounding across multi-step tasks. Action responses are flat JSON ( {"ok":true} ) instead of nested CDP, which adds up: click = 9 tokens, back = 5 tokens, scroll = 5 tokens. Why lightpanda scores low: Lightpanda can't execute JS-heavy SPAs. Google Flights renders via client-side fetch() — lightpanda returns a 507-token empty nav shell with zero flight data. The low token count is a failed render, not efficiency. Small binary, fast start Measured on Apple M3 Pro, macOS 15.3. kuri built with -Doptimize=ReleaseFast . agent-browser v0.20.0. agent-browser kuri delta (v0.20) (v0.2) ───────────────────────────────────────────────────────────────────── CLI binary 6.0 MB 464 KB 13× smaller Cold start (--version) 3.4 ms 3.0 ms ~same Install (npm) 33 MB 3.3 MB (3 bins) 10× smaller Commands 140+ 40+ endpoints different focus Standalone fetcher ❌ ✅ kuri-fetch no Chrome needed Terminal browser ❌ ✅ kuri-browse interactive REPL JS engine (no Chrome) ❌ ✅ QuickJS SSR-style DOM HTTP API server ❌ (CLI only) ✅ kuri thread-per-conn agent-browser exposes a broader browser-control surface. Kuri is intentionally narrower: a lightweight HTTP API and CLI stack optimized for agent integration, token economy, and deployment simplicity. The Problem Every browser automation tool drags in Playwright (~300 MB), a Node.js runtime, and a cascade of npm dependencies. Your AI agent just wants to read a page, click a button, and move on. Kuri is a single Zig binary. Four modes, zero runtime: kuri → CDP server (Chrome automation, a11y snapshots, HAR) kuri-fetch → standalone fetcher (no Chrome, QuickJS for JS, ~2 MB) kuri-browse → interactive terminal browser (navigate, follow links, search) kuri-agent → agentic CLI (scriptable Chrome automation + security testing) 📦 Installation One-line install (macOS / Linux) curl -fsSL https://raw.githubusercontent.com/justrach/kuri/main/install.sh | sh Detects your platform, downloads the right binary, installs to ~/.local/bin . macOS binaries are notarized — no Gatekeeper prompt. bun / npm bun install -g kuri-agent # or: npm install -g kuri-agent Downloads the correct native binary for your platform at install time. Manual Download the tarball for your platform from GitHub Releases and unpack it to your $PATH . Build from source Requires Zig ≥ 0.15.0 . git clone https://github.com/justrach/kuri.git cd kuri zig build -Doptimize=ReleaseFast # Binaries in zig-out/bin/: kuri kuri-agent kuri-fetch kuri-browse ⚡ Quick Start Requirements: Zig ≥ 0.15.1 · Chrome/Chromium (for CDP mode) git clone https://github.com/justrach/kuri.git cd kuri zig build # build everything zig build test # run 230+ tests # CDP mode — launches Chrome automatically ./zig-out/bin/kuri # Standalone mode — no Chrome needed ./zig-out/bin/kuri-fetch https://example.com # Interactive browser — browse from your terminal ./zig-out/bin/kuri-browse https://example.com First run, shortest path # start the server; if CDP_URL is unset, kuri launches managed Chrome for you ./zig-out/bin/kuri # discover tabs from that managed browser curl -s http://127.0.0.1:8080/discover # inspect the discovered tab list curl -s http://127.0.0.1:8080/tabs If you already have Chrome running with remote debugging, set CDP_URL to either the WebSocket or HTTP endpoint: CDP_URL=ws://127.0.0.1:9222/devtools/browser/... ./zig-out/bin/kuri # or CDP_URL=http://127.0.0.1:9222 ./zig-out/bin/kuri Browse vercel.com in 4 commands # 1. Discover Chrome tabs curl -s http://localhost:8080/discover # → {"discovered":1,"total_tabs":1} # 2. Get tab ID curl -s http://localhost:8080/tabs # → [{"id":"ABC123","url":"chrome://newtab/","title":"New Tab"}] # 3. Navigate curl -s " http://localhost:8080/navigate?tab_id=ABC123&url=https://vercel.com " # 4. Get accessibility snapshot (token-optimized for LLMs) curl -s " http://localhost:8080/snapshot?tab_id=ABC123&filter=interactive " # → [{"ref":"e0","role":"link","name":"VercelLogotype"}, # {"ref":"e1","role":"button","name":"Ask AI"}, ...] 🌐 HTTP API All endpoints return JSON. Optional auth via KURI_SECRET env var. Core Path Description GET /health Server status, tab count, version GET /tabs List all registered tabs GET /discover Auto-discover Chrome tabs via CDP GET /browdie 🌰 (easter egg) Browser Control Path Params Description GET /navigate tab_id , url Navigate tab to URL GET /tab/new url Create a new tab GET /window/new url Create a new window/tab target GET /snapshot tab_id , filter , format A11y tree snapshot with @eN refs GET /text tab_id Extract page text GET /screenshot tab_id , format , quality Capture screenshot (base64) GET /action tab_id , ref , kind Click/type/scroll by ref GET /evaluate tab_id , expression Execute JavaScript GET /close tab_id Close tab + cleanup Content Extraction Path Description GET /markdown Convert page to Markdown GET /links Extract all links GET /dom/query CSS selector query GET /dom/html Get element HTML GET /pdf Print page to PDF HAR Recording & API Replay Path Description GET /har/start?tab_id= Start recording network traffic GET /har/stop?tab_id= Stop + return HAR 1.2 JSON GET /har/status?tab_id= Recording state + entry count GET /har/replay?tab_id=&filter=api&format=all API map with curl/fetch/python code snippets Navigation & State Path Description GET /back Browser back GET /forward Browser forward GET /reload Reload page GET /cookies Get cookies GET /cookies/delete Delete cookies GET /cookies/clear Clear all cookies GET /storage/local Get localStorage GET /storage/session Get sessionStorage GET /storage/local/clear Clear localStorage GET /storage/session/clear Clear sessionStorage GET /session/save Save browser session GET /session/load Restore browser session GET /session/list List saved browser sessions GET /auth/profile/save Save cookies + storage as a named auth profile GET /auth/profile/load Restore a named auth profile into a tab GET /auth/profile/list List saved auth profiles GET /auth/profile/delete Delete a saved auth profile GET /debug/enable Enable in-page debug HUD and optional freeze mode GET /debug/disable Disable in-page debug HUD GET /headers Set custom request headers GET /perf/lcp Capture Largest Contentful Paint timing, optionally after navigation On macOS, auth profile secrets are stored in the user Keychain. On other platforms, Kuri falls back to .kuri/auth-profiles/ .

오픈소스 AI 에이전트 웹 자동화 Zig 토큰 최적화