Hacker News • 104 days ago

Libretto: an open-source toolkit for making AI browser automation reliable

IMP

7/10

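Key summary

Libretto is a tool that lets coding agents analyze browser and network traffic in real time, and record user actions and replay them as automation scripts. It can convert UI automation into scripts that call the site's network API directly, which is faster and more reliable, and it makes broken automations easier to debug. That makes it useful for building dependable web integrations.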

Translated text

Libretto
Website: libretto.sh
Repository: github.com/saffron-health/libretto
Docs: libretto.sh/docs
Discord: discord.gg/NYrG56hVDt

Libretto is a toolkit for building robust web integrations. It gives your coding agent a live browser and a token-efficient CLI to:

Inspect live pages with minimal context overhead
Capture network traffic to reverse-engineer site APIs
Record user actions and replay them as automation scripts
Debug broken workflows interactively against the real site

We at Saffron Health built Libretto to help us maintain our browser integrations with common healthcare software. We are open-sourcing it so other teams have an easier time doing the same thing.

(Demo video: libretto-demo.mov)

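Installation

npm install libretto

First-time onboarding: install the skill, download Chromium, and pin the default snapshot model

npx libretto setup

Check workspace readiness at any time

npx libretto status

Manually change the snapshot analysis model (advanced override)

npx libretto ai configure <openai | anthropic | gemini | vertex>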

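setup detects available provider credentials (e.g. OPENAI_API_KEY) and automatically pins the default model in .libretto/config.json. Re-running setup on a healthy workspace shows the current configuration instead of prompting again. If credentials are missing for a previously configured provider, setup offers an interactive repair flow. Use ai configure when you want to explicitly switch providers or set a custom model string.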
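The credential detection described above can be pictured with a small sketch. This is not Libretto's implementation: the function name detectProvider, the Env type, and the order in which providers are checked are assumptions made for illustration. Only the environment variable names come from the README's configuration section.

```typescript
type Provider = "openai" | "anthropic" | "gemini" | "vertex";
type Env = Record<string, string | undefined>;

// Credential variables per provider, as named in the README.
// The check order below is an assumption, not documented behavior.
const CREDENTIAL_VARS: ReadonlyArray<readonly [Provider, readonly string[]]> = [
  ["openai", ["OPENAI_API_KEY"]],
  ["anthropic", ["ANTHROPIC_API_KEY"]],
  ["gemini", ["GEMINI_API_KEY", "GOOGLE_GENERATIVE_AI_API_KEY"]],
  ["vertex", ["GOOGLE_CLOUD_PROJECT"]],
];

// Hypothetical helper: return the first provider that has a non-empty
// credential variable, or undefined when none is set.
function detectProvider(env: Env): Provider | undefined {
  for (const [provider, names] of CREDENTIAL_VARS) {
    if (names.some((name) => (env[name] ?? "").trim() !== "")) {
      return provider;
    }
  }
  return undefined;
}
```

In a Node script this could be called as detectProvider(process.env). A result of undefined would correspond to the case where setup has no credentials to work with.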

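Use cases

Libretto is designed to be used as a skill through your coding agent. Here are some example prompts:

One-shot script generation

"Use the Libretto skill. Go on LinkedIn and scrape the first 10 posts for content, who posted it, the number of reactions, the first 25 comments, and the first 25 reposts."

Your coding agent will open a window for you to log into LinkedIn, and then automatically start exploring.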

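Interactive script building

"I'm going to show you a workflow in the eclinicalworks EHR to get a patient's primary insurance ID. Use the Libretto skill to turn it into a Playwright script that takes the patient name and date of birth as input and returns the insurance ID. URL is ..."

Libretto can read the actions you perform in the browser, so you can perform a workflow yourself, then ask it to rebuild the workflow from those actions.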

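Convert browser automation to network requests

"We have a browser script at ./integration.ts that automates going to Hacker News and getting the first 10 posts. Convert it to direct network scripts instead. Use the Libretto skill."

Libretto can read network requests from the browser, which it can use to reverse-engineer the API and create a script that calls those requests directly. Making API calls directly is faster and more reliable than UI automation. You can also ask Libretto to conduct a security analysis, which checks the requests for common security cookies, so you can understand whether a network request approach will be safe.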
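As a rough illustration of what a direct network script for the Hacker News example might look like, the sketch below uses the public Hacker News Firebase API (topstories.json and item/<id>.json). It is not output from Libretto, and using top stories as a stand-in for "the first 10 posts" is an assumption. The names GetJson, itemUrl, and firstStories are hypothetical, and the JSON fetcher is passed in so the logic can be exercised without a network connection.

```typescript
// Base URL of the public Hacker News API.
const HN_API = "https://hacker-news.firebaseio.com/v0";

// Subset of the item fields this sketch cares about.
interface Story {
  id: number;
  title: string;
  by: string;
  score: number;
  url?: string;
}

// Minimal JSON fetcher signature (hypothetical); lets callers inject fetch or a fake.
type GetJson = (url: string) => Promise<unknown>;

// Build the URL for a single item.
function itemUrl(id: number): string {
  return `${HN_API}/item/${id}.json`;
}

// Fetch the IDs of the current top stories, then load the first `count` items in parallel.
// Responses are cast without validation to keep the sketch short.
async function firstStories(getJson: GetJson, count = 10): Promise<Story[]> {
  const ids = (await getJson(`${HN_API}/topstories.json`)) as number[];
  const items = await Promise.all(ids.slice(0, count).map((id) => getJson(itemUrl(id))));
  return items as Story[];
}
```

On a runtime with a global fetch, this could be called as firstStories((url) => fetch(url).then((r) => r.json())). Two plain HTTP round trips replace launching a browser and waiting for the page to render, which is where the speed and reliability gain comes from.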

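Fix broken integrations

"We have a browser script at ./integration.ts that is supposed to go to Availity and perform an eligibility check for a patient. But I'm getting a broken selector error when I run it. Fix it. Use the Libretto skill."

Agents can use Libretto to reproduce the failure, pause the workflow at any point, inspect the live page, and fix issues, all autonomously.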

CLI usage

You can also use Libretto directly from the command line. All commands accept --session <name> to target a specific session.

npx libretto setup  # interactive first-run onboarding; run it yourself, not through an agent
npx libretto status  # check AI config health and open sessions
npx libretto open <url>  # launch the browser and open a URL (headed by default)
npx libretto snapshot --objective "..." --context "..."  # capture PNG + HTML and analyze with an LLM
npx libretto exec "<code>"  # execute Playwright TypeScript against the open page (single quoted argument)
echo "<code>" | npx libretto exec -  # intentionally read Playwright TypeScript from stdin
npx libretto run <file>  # run the file's default-exported workflow

Original text (English)

Libretto
Website: libretto.sh
Repository: github.com/saffron-health/libretto
Docs: libretto.sh/docs
Discord: discord.gg/NYrG56hVDt

Libretto is a toolkit for building robust web integrations. It gives your coding agent a live browser and a token-efficient CLI to:

Inspect live pages with minimal context overhead
Capture network traffic to reverse-engineer site APIs
Record user actions and replay them as automation scripts
Debug broken workflows interactively against the real site

We at Saffron Health built Libretto to help us maintain our browser integrations to common healthcare software. We're open-sourcing it so other teams have an easier time doing the same thing.

(Demo video: libretto-demo.mov)

Installation

npm install libretto

# First-time onboarding: install skill, download Chromium, and pin the default snapshot model
npx libretto setup

# Check workspace readiness at any time
npx libretto status

# Manually change the snapshot analysis model (advanced override)
npx libretto ai configure <openai | anthropic | gemini | vertex>

setup detects available provider credentials (e.g. OPENAI_API_KEY) and automatically pins the default model to .libretto/config.json. Re-running setup on a healthy workspace shows the current configuration instead of re-prompting. If credentials are missing for a previously configured provider, setup offers an interactive repair flow. Use ai configure when you want to explicitly switch providers or set a custom model string.

Use cases

Libretto is designed to be used as a skill through your coding agent. Here are some example prompts:

One-shot script generation

"Use the Libretto skill. Go on LinkedIn and scrape the first 10 posts for content, who posted it, the number of reactions, the first 25 comments, and the first 25 reposts."

Your coding agent will open a window for you to log into LinkedIn, and then automatically start exploring.
Interactive script building

"I'm gonna show you a workflow in the eclinicalworks EHR to get a patient's primary insurance ID. Use libretto skill to turn it into a playwright script that takes patient name and dob as input to get back the insurance ID. URL is ..."

Libretto can read the actions you perform in the browser, so you can perform a workflow, then ask it to use your actions to rebuild the workflow.

Convert browser automation to network requests

"We have a browser script at ./integration.ts that automates going to Hacker News and getting the first 10 posts. Convert it to direct network scripts instead. Use the Libretto skill."

Libretto can read network requests from the browser, which it can use to reverse-engineer the API and create a script that directly calls those requests. Directly making API calls is faster and more reliable than UI automation. You can also ask Libretto to conduct a security analysis, which analyzes the requests for common security cookies, so you can understand whether a network request approach will be safe.

Fix broken integrations

"We have a browser script at ./integration.ts that is supposed to go to Availity and perform an eligibility check for a patient. But I'm getting a broken selector error when I run it. Fix it. Use the Libretto skill."

Agents can use Libretto to reproduce the failure, pause the workflow at any point, inspect the live page, and fix issues, all autonomously.

CLI usage

You can also use Libretto directly from the command line. All commands accept --session <name> to target a specific session.

npx libretto setup  # interactive first-run onboarding; run yourself, not through an agent
npx libretto status  # check AI config health and open sessions
npx libretto open <url>  # launch browser and open a URL (headed by default)
npx libretto snapshot --objective "..." --context "..."  # capture PNG + HTML and analyze with an LLM
npx libretto exec "<code>"  # execute Playwright TypeScript against the open page (single quoted argument)
echo "<code>" | npx libretto exec -  # intentionally read Playwright TypeScript from stdin
npx libretto run <file>  # run the file's default-exported workflow
npx libretto resume  # resume a paused workflow
npx libretto pages  # list open pages in the session
npx libretto save <domain>  # save browser session (cookies, localStorage) for reuse
npx libretto close  # close the browser
npx libretto ai configure <provider>  # manually change snapshot analysis model

Configuration

All Libretto state lives in a .libretto/ directory at your project root. Configuration is stored in .libretto/config.json.

Config file

.libretto/config.json controls snapshot analysis and viewport settings:

{
  "version": 1,
  "ai": {
    "model": "openai/gpt-5.4",
    "updatedAt": "2026-01-01T00:00:00.000Z"
  },
  "viewport": { "width": 1280, "height": 800 }
}

The ai field configures which model Libretto uses for snapshot analysis: extracting selectors, identifying interactive elements, or diagnosing why a step failed. This keeps heavy visual context out of your coding agent's context window.

Snapshot analysis is required. npx libretto setup automatically pins the default model for the first provider whose credentials it finds. To explicitly change the provider or model afterward:

npx libretto ai configure <openai | anthropic | gemini | vertex>

To inspect the current configuration without changing anything:

npx libretto status

Provider credentials are read from environment variables or a .env file at your repository root (next to your .git directory): OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY / GOOGLE_GENERATIVE_AI_API_KEY, or GOOGLE_CLOUD_PROJECT for Vertex. Set LIBRETTO_DISABLE_DOTENV=1 to skip .env loading.

The viewport field sets the default browser viewport size.
Both the ai and viewport fields are optional.

Sessions

Each Libretto session gets its own directory under .libretto/sessions/<name>/ containing runtime state. Sessions are git-ignored.

state.json: session metadata (debug port, PID, status)
logs.jsonl: structured session logs
network.jsonl: captured network requests
actions.jsonl: recorded user actions
snapshots/: screenshot PNGs and HTML snapshots

Profiles

Profiles save browser sessions (cookies, localStorage) so you can reuse authenticated state across runs. They are stored in .libretto/profiles/<domain>.json, created via npx libretto save <domain>. Profiles are machine-local and git-ignored.

Community

Have a question, idea, or want to share what you've built? Join the conversation on Discord for quick help or GitHub Discussions for longer-form threads.

Q&A: Ask questions and get help
Ideas: Suggest new features or improvements
Show and tell: Share your workflows and automations
General: Chat about anything Libretto-related

Found a bug? Please open an issue.

Authors

Maintained by the team at Saffron Health.

Development

For local development in this repository:

pnpm i
pnpm build
pnpm type-check
pnpm test

Source layout:

packages/libretto/src/cli/: CLI commands
packages/libretto/src/runtime/: browser runtime (network, recovery, downloads, extraction)
packages/libretto/src/shared/: shared utilities (config, LLM client, logging, state)
packages/libretto/test/: test files (*.spec.ts)
packages/libretto/README.template.md: source of truth for the repo and package READMEs
packages/libretto/skills/libretto/: source of truth for the Libretto skill

Run pnpm sync:mirrors after editing packages/libretto/README.template.md or anything under packages/libretto/skills/libretto/. To check that generated READMEs, skill mirrors, and skill version metadata are in sync without fixing them, run pnpm check:mirrors. To release, run pnpm prepare-release.
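Returning to the config file described under Configuration, the sketch below shows one way a script might read and validate .libretto/config.json. The LibrettoConfig interface is inferred from the sample file and is not an official type. The helper names, the error messages, and the choice to treat version as required are all assumptions.

```typescript
// Shape inferred from the sample config in the README; not an official type definition.
interface LibrettoConfig {
  version: number; // assumed required; the README only says ai and viewport are optional
  ai?: { model: string; updatedAt: string };
  viewport?: { width: number; height: number };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: parse the text of .libretto/config.json and check field types.
function parseLibrettoConfig(jsonText: string): LibrettoConfig {
  const raw: unknown = JSON.parse(jsonText);
  if (!isRecord(raw)) throw new Error("config must be a JSON object");

  const version = raw["version"];
  if (typeof version !== "number") throw new Error("version must be a number");
  const config: LibrettoConfig = { version };

  const ai = raw["ai"];
  if (ai !== undefined) {
    if (!isRecord(ai)) throw new Error("ai must be an object");
    const model = ai["model"];
    const updatedAt = ai["updatedAt"];
    if (typeof model !== "string" || typeof updatedAt !== "string") {
      throw new Error("ai.model and ai.updatedAt must be strings");
    }
    config.ai = { model, updatedAt };
  }

  const viewport = raw["viewport"];
  if (viewport !== undefined) {
    if (!isRecord(viewport)) throw new Error("viewport must be an object");
    const width = viewport["width"];
    const height = viewport["height"];
    if (typeof width !== "number" || typeof height !== "number") {
      throw new Error("viewport.width and viewport.height must be numbers");
    }
    config.viewport = { width, height };
  }

  return config;
}
```

Validating the file up front means a hand-edited config fails with a specific message instead of surfacing later as an unexplained snapshot or viewport problem.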

Web automation • Open source • Coding agents • Reverse engineering • Debugging