Hacker News • 104일 전

에이전트 - 네이티브 맥OS 코딩 IDE

IMP

7/10

핵심 요약

오픈소스 기반의 네이티브 macOS용 코딩 IDE 및 자동화 도구인 'Agent!'가 공개되었습니다. 이 프로젝트는 Claude Code, Cursor 등을 대체하는 것을 목표로 하며, 17개 이상의 다양한 클라우드 및 로컬 LLM 제공업체를 단일 앱에 통합했습니다. 특히 온디바이스 Apple AI를 활용해 UI 자동화를 수행하고, 컨텍스트를 압축하여 API 토큰 비용을 획기적으로 절감할 수 있는 것이 가장 큰 특징입니다.

번역된 본문

🎖️ 이 프로젝트의 창립자님께서 현재 암 투병 중이십니다. 여러분의 Stars와 Forks에 감사드립니다. 🎖️

🦾 Agent! - macOS 26.4 이상을 위한 애플  Mac 데스크톱용 에이전트 AI Claude Code, Cursor, Cline, OpenClaw를 대체하는 오픈소스 솔루션

새로운 기능 🚀

진정한 도구 호출 에이전트로서의 Apple AI: 온디바이스 Apple Intelligence(FoundationModels.Tool)가 "Photo Booth로 사진 찍기"와 같은 UI 자동화 요청을 로컬에서 처리합니다. 다단계 도구 호출이 가능하며, 클라우드 LLM 토큰을 전혀 사용하지 않고 실패 시에만 클라우드 LLM으로 대체(Fallback)됩니다.
SDEF 및 런타임 앱 탐색: Bundle ID 해석에 하드코딩이 전혀 없습니다. Agent/SDEFs/ 디렉토리와 /Applications, /System/Applications, ~/Applications 경로에 있는 모든 .app이 런타임에 자동으로 탐지됩니다. 새로운 앱을 설치하면 코드 수정 없이 에이전트가 제어할 수 있는 대상이 확장됩니다.
모든 OpenAI 형식 제공업체에 대한 프롬프트 캐싱: Z.ai, OpenAI, Grok, Mistral, DeepSeek, Qwen, Gemini, BigModel, Hugging Face 등의 응답에서 cached_tokens를 파싱하여 LLM 사용량 패널에 표시합니다. 또한 JSON 요청 본문에 .sortedKeys를 사용하여 바이트 안정적인 접두사(Byte-stable prefixes)가 실제로 제공업체의 캐시에 적중하도록 설계되었습니다.
온디바이스 토큰 압축: 컨텍스트가 3만 토큰을 초과하면 Apple AI가 이전 대화 내용을 요약합니다(tieredCompact의 Tier 1). 이는 무료이며 프라이버시가 보장되고 API 토큰을 소모하지 않습니다. 뇌 아이콘 팝오버에서 이 기능을 켜고 끌 수 있습니다.
환각(Hallucination) 방지 프롬프트 규칙: 모든 시스템 프롬프트에 불완전한 도구 읽기 결과를 조작하지 못하도록 명시적인 지침이 포함되었습니다. '10번 연속 읽기' 가드(10-consecutive-reads guard)는 모델이 '임의로 추측'하는 대신 '범위를 좁히거나 done()을 호출'하도록 유도합니다.
자율적인 작업 루프, Xcode 통합, AXorcist 데스크톱 자동화, 권한이 부여된 데몬(Privileged daemon), 멀티탭 LLM 구성, LLMRegistry를 통한 Ollama 사전 웜업(Warm-up) 등 이전에 제공되던 모든 기본 기능은 그대로 유지됩니다.

하나의 앱. 어떤 AI든 가능. 당신의 Mac을 완벽하게 통제하세요. Agent!는 17개의 LLM 제공업체 — Claude, GPT, Gemini, Grok, Mistral, DeepSeek, Qwen, Z.ai, BigModel, Hugging Face, Ollama(클라우드 및 로컬), vLLM, LM Studio, Codestral, Mistral Vibe 및 온디바이스 Apple Intelligence — 를 네이티브 macOS 앱에 연결합니다. 이 앱은 단순히 수행하는 척하는 것이 아니라 실제로 작업을 수행합니다.

당신이 커피를 내리는 동안 이 앱이 코드베이스를 읽고, 버그를 수정하고, Xcode 프로젝트를 빌드하고, 변경 사항을 커밋하는 것을 지켜보세요. Safari를 열고 도쿄행 항공권 가격을 문자로 보내달라고 지시해 보세요. 방 건너편에서 "Agent!"라고 말하여 음성으로 테스트 스위트를 실행해 보세요. iMessage로 Mac에 문자를 보내고 차에 도착하기 전에 완벽하게 다듬어진 답변을 받아보세요.

이 앱은 외과적 수술처럼 정밀하게 문자열 교체(String-replace diff) 방식으로 파일을 편집하며, 모든 변경 사항은 타임머신 스타일의 롤백을 통해 클릭 한 번으로 실행 취소할 수 있습니다. AppleScript 없이 접근성 API(Accessibility API)를 통해 모든 Mac 앱을 구동합니다. 세션 간에 사용자의 기본 설정을 기억합니다. 분산된 작업을 처리하기 위해 병렬 하위 에이전트(Sub-agents)를 생성합니다. 모든 LLM이 소비할 수 있는 휴대용 JSONL 형식의 레포지토리 맵(Repo-map)으로 전체 코드베이스를 인덱싱합니다. 사용자의 권한으로, 또는 단 한 번 승인하면 Launch Daemon을 통해 root 권한으로 셸 명령을 실행합니다.

자신만의 API 키를 가져오세요(Bring your own API key). Ollama, vLLM 또는 LM Studio를 통해 완전히 로컬에서 실행할 수 있습니다. 또는 Apple Intelligence를 사용하여 평생 무료로 실행할 수 있습니다. 구독료가 없습니다. 원격 측정(Telemetry) 데이터 수집이 없습니다. 공급업체 종속성(Vendor lock-in)이 없습니다. 당신의 키, 당신의 기계, 당신의 데이터입니다. 다운로드하고, 필요한 것을 말한 뒤, 그것이 이루어지는 것을 지켜보세요.

빠른 시작 (다운로드)

Agent!를 다운로드하고 응용 프로그램 폴더로 드래그하세요.
Agent!를 열면 모든 것이 자동으로 설정됩니다.
AI 선택 — Settings(설정) → 제공업체 선택 → API 키 입력

빠른 시작 (소스에서 빌드) 저장소 복제: git clone https://github.com/toddbruss/Agent.git cd Agent

옵션 A: Xcode로 빌드 (Apple Developer 계정 필요)

Xcode에서 Agent.xcodeproj를 엽니다.
Agent 타겟을 빌드하고 실행(Build and Run)합니다.
Helper Tool 승인: 메시지가 표시되면 root 수준의 명령 실행을 허용하도록 권한 있는 데몬(Privileged daemon)에 권한을 부여합니다.

옵션 B: Apple Developer 계정 없이 빌드 (Xcode Command Line Tools만 필요) 빌드 스크립트 실행: ./build.sh # 디버그 빌드 또는: ./build.sh Release # 릴리스 빌드 앱은 build/DerivedData/Build/Products/Debug/Agent!.app 경로에 생성됩니다. 실행: open "build/DerivedData/Build/Products/Debug/Agent!.app"

⚠️ 개발자 계정이 없는 경우 앱은 임시(Ad-hoc) 서명됩니다.

원문 보기

원문 보기 (영어)

🎗️ Our Founder! of this project is battling cancer. Your Stars and Forks are appreciated. 🎗️ 🦾 Agent! for macOS 26.4+ Agentic AI for your  Mac Desktop Open Source replacement for Claude Code, Cursor, Cline, OpenClaw What's New 🚀 Apple AI as a real tool-calling agent: On-device Apple Intelligence (FoundationModels.Tool) handles UI automation requests like "take a photo using Photo Booth" locally — multi-step tool calls, zero cloud LLM tokens, falls through to the cloud LLM only on failure. SDEF + runtime app discovery: Bundle ID resolution is now zero-hardcoded. Apps in Agent/SDEFs/ plus every .app in /Applications , /System/Applications , ~/Applications are discovered at runtime — installing a new app extends what the agent can target with no code edit. Prompt caching for every OpenAI-format provider: Z.ai, OpenAI, Grok, Mistral, DeepSeek, Qwen, Gemini, BigModel, Hugging Face — cached_tokens is parsed from the response and shown in the LLM Usage panel. JSON request bodies use .sortedKeys so byte-stable prefixes actually hit the provider's cache. On-device token compression: Apple AI summarizes old conversation turns when context exceeds 30K tokens (Tier 1 of tieredCompact ) — free, private, no API tokens consumed. Toggleable in the brain icon popover. Anti-hallucination prompt rule: Every system prompt now includes explicit guidance against fabricating findings from incomplete tool reads. The 10-consecutive-reads guard pushes the model toward "narrow or call done()" instead of "guess". Autonomous task loop, Xcode integration, AXorcist desktop automation, privileged daemon, multi-tab LLM config, Ollama pre-warming via LLMRegistry — all the previously-shipped fundamentals are still there. One app. Any AI. Total command over your Mac. Agent! wires 17 LLM providers — Claude, GPT, Gemini, Grok, Mistral, DeepSeek, Qwen, Z.ai, BigModel, Hugging Face, Ollama (cloud and local), vLLM, LM Studio, Codestral, Mistral Vibe, and on-device Apple Intelligence — into a native macOS app that doesn't just talk about doing things. It does them. Watch it read your codebase, fix the bug, build the Xcode project, and commit the diff while you make coffee. Tell it to open Safari and text you the price of flights to Tokyo. Say "Agent!" from across the room and have it run your test suite by voice. Text your Mac from iMessage and get a polished answer before you reach your car. It edits files with surgical string-replace diffs — every change one-click undoable from a Time-Machine-style rollback. It drives any Mac app through the Accessibility API — no AppleScript required. It remembers your preferences across sessions. It spawns parallel sub-agents for work that fans out. It indexes entire codebases into a portable JSONL repo-map that any LLM can consume. It runs shell commands as you, or as root via a Launch Daemon you approve exactly once. Bring your own API key. Run it fully local on Ollama, vLLM, or LM Studio. Or run it free, forever, on Apple Intelligence. No subscription. No telemetry. No vendor lock-in. Your keys, your machine, your data. Download it. Say what you need. Watch it happen. Quick Start (Download) Download Agent! and drag to Applications Open Agent! -- it sets up everything automatically Pick your AI -- Settings → choose a provider → enter API key Quick Start (Build from Source) Clone the repository: git clone https://github.com/toddbruss/Agent.git cd Agent Option A: Build with Xcode (Apple Developer account) Open Agent.xcodeproj in Xcode. Build and Run the Agent target. Approve the Helper Tool: When prompted, authorize the privileged daemon to allow root-level command execution. Option B: Build without an Apple Developer account Run the build script (requires only Xcode Command Line Tools): ./build.sh # Debug build ./build.sh Release # Release build The app lands in build/DerivedData/Build/Products/Debug/Agent!.app Run it: open "build/DerivedData/Build/Products/Debug/Agent!.app" ⚠️ Without a developer account the app is ad-hoc signed. The Launch Agent/Daemon helpers won't register (SMAppService needs a team ID), but the LLM loop, all tools, accessibility, AppleScript, shell, and MCP all work. Then: Configure your AI Provider: Go to Settings and enter your API key or select a local provider like Ollama. 💡 Cheapest cloud path? GLM-5.1 (the latest) is now available on all four of the cheap cloud providers — Ollama , Hugging Face , Z.ai , and BigModel . Pennies per million tokens vs Claude/GPT pricing. Pick whichever you already have an account with; pricing is competitive across all of them. 💡 Z.ai is still the recommended starting point if you don't have an account anywhere yet — fastest signup, GLM-5.1 is the default model, no infrastructure to provision. ⚠️ Running GLM locally? Only GLM-4.7-Turbo (32B) runs well on consumer hardware — M2/M3/M4 Mac with 64-128GB unified memory via Ollama. GLM-5 (744B MoE) and GLM-5.1 (754B MoE) are too large to run locally (~1.6TB full weight) — use them via Z.ai , BigModel , Hugging Face cloud, or Ollama cloud. What Can It Do? "Play my Workout playlist in Music" "Build the Xcode project and fix any errors" "Take a photo with Photo Booth" "Send an iMessage to Mom saying I'll be home at 6" "Open Safari and search for flights to Tokyo" "Refactor this class into smaller files" "What calendar events do I have today?" Just type what you want. Agent! figures out how and makes it happen. Key Features 🧠 Agentic AI Framework Built-in autonomous task loop that reasons, executes, and self-corrects. Agent! doesn't just run code; it observes the results, debugs errors, and iterates until the task is complete. 🛠 Agentic Coding Full coding environment built in. Reads codebases, edits files with precision, runs shell commands, builds Xcode projects, manages git, and auto-enables coding mode to focus the AI on development tools. Replaces Claude Code, Cursor, and Cline -- no terminal, no IDE plugins, no monthly fee. Features Time Machine-style backups for every file change, letting you revert any edit instantly. 🔍 Dynamic Tool Discovery Automatically detects and uses available tools (Xcode, Playwright, Shell, etc.) based on your prompt. No manual configuration required for core tools. 🛡 Privileged Execution Securely runs root-level commands via a dedicated macOS Launch Daemon. The user approves the daemon once, then the agent can execute commands autonomously via XPC. 🖥 Desktop Automation (AXorcist) Control any Mac app through the Accessibility API. Click buttons, type into fields, navigate menus, scroll, drag -- all programmatically. Powered by AXorcist for reliable, fuzzy-matched element finding. 🤖 17 AI Providers The provider picker (LLM Settings, toolbar button #7) shows 16 providers; Apple Intelligence is reached via the separate brain icon (#8). Source of truth: AgentTools.APIProvider . Provider API key Best for Claude (Anthropic) Paid Long autonomous tasks, complex reasoning, prompt caching OpenAI Paid General purpose, tool calling, vision Google Gemini Paid (free tier) Long context, vision, fast Grok (xAI) Paid Real-time info Mistral Paid Open-weight cloud, fast tool calling Codestral (Mistral) Paid Code-specialized Mistral Mistral Vibe Paid Mistral's chat/agent product DeepSeek Cheap Budget cloud, strong coding, prompt cache hit reporting Hugging Face Varies Open-source models hosted serverless or on dedicated endpoints Z.ai Cheap GLM-5.1 via API — recommended starting point BigModel (Zhipu) Cheap GLM family via Zhipu's API Qwen (Alibaba) Cheap Qwen 2.5 / 3 via Dashscope Ollama (cloud) Free tier Run open models via Ollama's hosted endpoint Local Ollama Free + hardware Self-hosted Ollama daemon — fully offline, no account vLLM Free + hardware Self-hosted vLLM server with prefix caching LM Studio Free + hardware Self-hosted, easiest GUI for local models Apple Intelligence Free, on-device Triage, summary, accessibility intent (via brain icon, not the provider picker) 💡 Self-hosted "free" providers (Local Ollama, vLLM, LM

오픈소스 macOS 코딩 에이전트 로컬 LLM 자동화