Hacker News • 118일 전

AMD 레모네이드: GPU·NPU 활용 오픈소스 로컬 LLM 서버

IMP

7/10

핵심 요약

AMD가 GPU와 NPU를 모두 활용하여 매우 빠르고 프라이빗한 로컬 AI 환경을 제공하는 오픈소스 서버 '레모네이드(Lemonade)'를 공개했습니다. 이 도구는 단 2MB 크기의 가벼운 C++ 백엔드 기반으로 작동하며, Windows, Linux, macOS 모두를 지원하고 표준 OpenAI API와 호환되어 수많은 앱과 즉시 연동됩니다. 텍스트, 이미지 생성, 음성 인식 및 생성까지 아우르는 통합 API를 제공함으로써, 누구나 자신의 PC에 부담 없이 고성능 로컬 AI를 구축할 수 있는 실용적인 솔루션을 제안합니다.

번역된 본문

GPU와 NPU를 활용하여 놀라울 정도로 빠른 이미지 및 LLM(대형 언어 모델)을 구동하세요. 오픈소스 기반이며, 개인정보를 안전하게 보호하고, 어떤 PC에서든 몇 분 안에 사용할 수 있습니다.

Windows 11 다운로드 | 개발자 환경 설정: Linux, Windows, macOS

채팅 (Chat)

128GB의 통합 메모리(UNIFIED RAM)로 무엇을 할 수 있나요? 고급 도구 활용을 위해 gpt-oss-120b 또는 Qwen-Coder-Next와 같은 모델을 로드할 수 있습니다.
무엇을 먼저 튜닝해야 하나요? --no-mmap을 사용하여 로드 시간을 단축하고 컨텍스트 크기를 64 이상으로 늘릴 수 있습니다.

이미지 생성 (Image Generation)

르네상스 회화 스타일의 레모네이드 물병

음성 (Speech)

"안녕하세요, 저는 여러분의 AI 비서입니다. 오늘 어떤 도움을 드릴까요?"

오픈소스 (Open Source) 모든 PC를 위해 로컬 AI 커뮤니티가 직접 구축했습니다. 레모네이드는 로컬 AI가 무료이고, 공개되어 있으며, 빠르고 프라이빗해야 한다는 철학에서 탄생했습니다.

커뮤니티 참여: GitHub 2.1k 스타 | Discord 117명 접속 중

최고의 추론 엔진 기반 구축

생태계 (Ecosystem) 훌륭한 앱들과 완벽하게 연동됩니다. 레모네이드는 다양한 앱에 통합되어 있으며, OpenAI API 표준을 지원하여 수백 개의 앱과 바로 사용할 수 있습니다.

연동 앱: Open WebUI, n8n, Gaia, Infinity Arcade, Continue, GitHub Copilot, OpenHands, Dify, Deep Tutor, Iterate.ai 등
마켓플레이스 둘러보기

기술 사양 (Tech Specs) 실용적인 로컬 AI 워크플로우를 위해 설계되었습니다. 설치부터 런타임까지 모든 것이 빠른 설정, 광범위한 호환성, 로컬 우선 실행에 맞게 최적화되어 있습니다.

네이티브 C++ 백엔드: 단 2MB 크기의 가벼운 서비스.
1분 설치: 스택을 자동으로 설정하는 간편한 설치 프로그램.
OpenAI API 호환: 수백 개의 앱과 즉시 작동하며 몇 분 안에 통합됨.
하드웨어 자동 구성: 사용자의 GPU 및 NPU에 맞게 종속성을 자동으로 구성.
멀티 엔진 호환성: llama.cpp, Ryzen AI SW, FastFlowLM 등과 함께 작동.
다중 모델 동시 실행: 두 개 이상의 모델을 동시에 실행.
크로스 플랫폼: Windows, Linux, macOS(beta)에서 일관된 경험 제공.
빌트인 앱: 모델을 빠르게 다운로드, 테스트 및 전환할 수 있는 GUI 제공.
통합 API: 모든 모달리티를 위한 단일 로컬 서비스. 앱을 레모네이드에 연결하면 표준 API를 통해 채팅, 비전, 이미지 생성, 텍스트 변환(전사), 음성 생성 등을 사용할 수 있습니다.

지원 모달리티: 채팅(Chat) | 비전(Vision) | 이미지 생성(Image Gen) | 텍스트 변환(Transcription) | 음성 생성(Speech Gen)

POST /api/v1/chat/completions 서버 사양 보기

최신 릴리즈 (Latest Release) 항상 개선되고 있습니다. 레모네이드 릴리즈 스트림에서 최신 개선 사항과 주요 내용을 확인하세요. ‹ 릴리즈 로드 중... 릴리즈 보기 ›

원문 보기

원문 보기 (영어)

Refreshingly fast images LLMs on GPUs and NPUs Open source. Private. Ready in minutes on any PC. Download for Windows 11 Developer Setup Linux, Windows, macOS Chat What can I do with 128 GB of unified RAM? Load up models like gpt-oss-120b or Qwen-Coder-Next for advanced tool use. What should I tune first? You can use --no-mmap to speed up load times and increase context size to 64 or more. Image Generation A pitcher of lemonade in the style of a renaissance painting Speech Hello, I am your AI assistant. What can I do for you today? Open Source Built by the local AI community for every PC. Lemonade exists because local AI should be free, open, fast, and private. Join the community GitHub 2.1k stars Discord 117 online now Built on the best inference engines Ecosystem Works with great apps. Lemonade is integrated in many apps and works out-of-box with hundreds more thanks to the OpenAI API standard. Open WebUI n8n Gaia Infinity Arcade Continue GitHub Copilot OpenHands Dify Deep Tutor Iterate.ai Explore the Marketplace Tech Specs Built for practical local AI workflows. Everything from install to runtime is optimized for fast setup, broad compatibility, and local-first execution. Native C++ Backend Lightweight service that is only 2MB. One Minute Install Simple installer that sets up the stack automatically. OpenAI API Compatible Works with hundreds of apps out-of-box and integrates in minutes. Auto-configures for your hardware Configures dependencies for your GPU and NPU. Multi-engine compatibility Works with llama.cpp, Ryzen AI SW, FastFlowLM, and more. Multiple Models at Once Run more than one model at the same time. Cross-platform A consistent experience across Windows, Linux, and macOS (beta). Built-in app A GUI that lets you download, try, and switch models quickly. Unified API One local service for every modality. Point your app at Lemonade and get chat, vision, image gen, transcription, speech gen, and more with standard APIs. Chat Vision Image Gen Transcription Speech Gen POST /api/v1/chat/completions View the Server Spec Latest Release Always improving. Track the newest improvements and highlights from the Lemonade release stream. ‹ Loading releases... View Release ›

오픈소스 로컬 AI AMD LLM 서버 NPU