r/LocalLLaMA • 63 days ago

PrismML announces text-to-image models that run locally in the browser

IMP

8/10

Key Summary

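The PrismML team has released Binary and Ternary Bonsai Image 4B, text-to-image diffusion transformers that use 1-bit and ternary weights. At roughly 3GB, they are far smaller than the existing FLUX.2 Klein 4B model (roughly 16GB), and they run entirely locally in the browser via WebGPU. The models are released under Apache-2.0, a permissive open-source license that lets anyone use and modify them, which is the release's main significance.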
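To make the size claim concrete, here is a rough back-of-the-envelope sketch of how bits per weight translate into storage for a 4-billion-parameter network. The parameter count is taken from the "4B" in the model name, the helper names are hypothetical, and the ternary figure uses the theoretical minimum of log2(3) bits per weight; none of this is PrismML's published accounting.

```python
import math

PARAMS = 4e9  # assumption: "4B" in the model name means about 4 billion weights

# Bits spent per weight under each scheme. log2(3) is the theoretical floor
# for a three-valued weight; real packing formats usually spend a bit more.
SCHEMES = {
    "fp16/bf16": 16.0,
    "ternary (ideal)": math.log2(3),
    "binary (1-bit)": 1.0,
}


def weight_storage_gb(num_params: float, bits_per_weight: float) -> float:
    """Raw storage for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def estimate_all(num_params: float = PARAMS) -> dict[str, float]:
    """Estimate weight storage for every scheme in SCHEMES."""
    return {
        name: weight_storage_gb(num_params, bits)
        for name, bits in SCHEMES.items()
    }


if __name__ == "__main__":
    for name, gb in estimate_all().items():
        print(f"{name:>16}: {gb:.2f} GB")
```

These estimates cover only a 4B-weight network in isolation, so they come out smaller than the roughly 16GB and roughly 3GB totals quoted in the post. The post does not break those totals down; a plausible but unconfirmed explanation is that the downloads also include other components and some tensors kept at higher precision.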

Post Text (Translated)

The models the PrismML team has just released are a remarkable achievement. They are only ~3GB in size (compared to FLUX.2 Klein 4B, which is ~16GB). Apache-2.0 licensed!

Official Hugging Face collection: https://huggingface.co/collections/prism-ml/bonsai-image
Demo link: https://huggingface.co/spaces/webml-community/bonsai-image-webgpu

Original Text (English)

The PrismML team really cooked with these models. They're only ~3GB in size (compared to FLUX.2 Klein 4B, which is ~16GB). Apache-2.0! Official collection on HF: https://huggingface.co/collections/prism-ml/bonsai-image Link to demo: https://huggingface.co/spaces/webml-community/bonsai-image-webgpu

Open source, image generation, local inference, weight quantization, diffusion models