r/LocalLLaMA • 112 days ago

Fine-Tuning Gemma 4 Locally with 8GB VRAM, Plus Bug Fixes

IMP

8/10

Key Summary

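Unsloth now offers free notebooks for fine-tuning the Gemma 4 E2B and E4B models. Gemma-4-E2B can be trained locally with just 8GB of VRAM, and training runs about 1.5x faster with about 60% less VRAM than FA2 (Flash Attention 2) setups. Unsloth also fixed four key bugs, including exploding loss during training and inference errors, which makes training and inference more stable.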

Translated Post

Hey everyone, you can now fine-tune the Gemma 4 E2B and E4B models in our free Unsloth notebooks! You need 8GB of VRAM to train Gemma-4-E2B locally. Unsloth trains Gemma 4 about 1.5x faster with about 60% less VRAM than FA2 (Flash Attention 2) setups: https://github.com/unslothai/unsloth

We also found and fixed several bugs in Gemma 4 training:

1. Gradient accumulation no longer makes the loss explode - losses that previously spiked to 300 to 400 now stay at the expected 10 to 15 with Unsloth's fix.
2. Index Error during 26B and 31B inference - inference with transformers failed for the 26B and 31B models; this is now fixed.
3. use_cache=False produced gibberish for E2B and E4B - details: https://github.com/huggingface/transformers/issues/45242
4. float16 audio - a value of -1e9 overflowed in float16.
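The post does not say where the -1e9 value appears in the audio path, so the following is only a minimal sketch of why such a value cannot be stored in float16. It uses Python's standard `struct` module, whose `"e"` format code is IEEE 754 half precision. The helper names `fits_in_float16` and `float16_safe_fill` are hypothetical and are not part of Unsloth or transformers.

```python
import struct

# Largest finite magnitude representable in IEEE 754 half precision.
FLOAT16_MAX = 65504.0


def fits_in_float16(value: float) -> bool:
    """Return True if `value` can be packed as a half-precision float."""
    try:
        struct.pack("<e", value)  # "e" = 16-bit half-precision format code
        return True
    except OverflowError:
        # Raised when the magnitude is too large for float16.
        return False


def float16_safe_fill(value: float) -> float:
    """Clamp `value` into the finite float16 range (hypothetical helper)."""
    return max(-FLOAT16_MAX, min(FLOAT16_MAX, value))


if __name__ == "__main__":
    print(fits_in_float16(-1e9))                     # too large in magnitude
    print(fits_in_float16(float16_safe_fill(-1e9)))  # clamped value fits
```

A common way to avoid this class of problem is to derive the fill value from the dtype's own limits rather than hard-coding a large constant; whether that is what the Unsloth fix does is not stated in the post.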

You can also train the 26B-A4B and 31B models, or train through a UI with Unsloth Studio. Studio and the notebooks support vision, text, audio, and inference.

For bug fix details and tips and tricks, read our blog/guide: https://unsloth.ai/docs/models/gemma-4/train

Free Colab notebooks:

E4B + E2B (Studio web UI)	E4B (Vision + Text)	E4B (Audio)	E2B (Run + Text)

Thanks!


Original Post (English)

Hey guys, you can now fine-tune Gemma 4 E2B and E4B in our free Unsloth notebooks! You need **8GB VRAM to train Gemma-4-E2B** locally. Unsloth trains Gemma 4 **\~1.5x faster with \~60% less VRAM** than FA2 setups: [https://github.com/unslothai/unsloth](https://github.com/unslothai/unsloth)

We also found and did bug fixes for Gemma 4 training:

1. Grad accumulation no longer causes losses to explode - before you might see losses of 300 to 400 - it should be 10 to 15 - Unsloth has this fixed.
2. Index Error for 26B and 31B for inference - this will fail inference for 26B and 31B when using transformers - we fixed it.
3. `use_cache=False` had gibberish for E2B, E4B - see [https://github.com/huggingface/transformers/issues/45242](https://github.com/huggingface/transformers/issues/45242)
4. float16 audio -1e9 overflows on float16

You can also train 26B-A4B and 31B or train via a UI with [Unsloth Studio](https://unsloth.ai/docs/models/gemma-4/train#quickstart). Studio and the notebooks work for Vision, Text, Audio and inference.

**For Bug Fix details and tips and tricks, read our blog/guide:** [**https://unsloth.ai/docs/models/gemma-4/train**](https://unsloth.ai/docs/models/gemma-4/train)

Free Colab Notebooks:

|[E4B + E2B (Studio web UI)](https://colab.research.google.com/github/unslothai/unsloth/blob/main/studio/Unsloth_Studio_Colab.ipynb)|[E4B (Vision + Text)](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma4_(E4B)-Vision.ipynb)|[E4B (Audio)](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma4_(E4B)-Audio.ipynb)|[E2B (Run + Text)](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma4_(E2B)-Text.ipynb)|
|:-|:-|:-|:-|

Thanks guys!

Gemma-4 Fine-Tuning Open-Source VRAM-Optimization Unsloth

Turning Screen-Observed Workflows into Skills Automatically with Gemma 4

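AgentHandover, an open-source Mac menu bar app, uses a local Gemma 4 model (via Ollama) to observe the screen and automatically turn repetitive workflows into structured Skill files. Through MCP it connects immediately to any agent, such as Claude Code or Cursor, and the entire process runs on-device and encrypted, which gives it strong privacy.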

Agents Local-Models Workflow-Automation

r/LocalLLaMA • 112 days ago

IMP 8

KL Divergence Ranking of Gemma 4 31B GGUF Quantizations

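The user oobabooga compared the quality of 52 Gemma 4 31B GGUF quantizations from major Hugging Face uploaders using KL divergence as the metric. By the Pareto-optimality criterion, unsloth's UD- series gave the best quality for a given file size, and quality degraded more on long contexts and non-Latin text than on coding and science content. The ranking is a useful guide for local LLM users choosing the best quantization for their available memory.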
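The summary does not describe the exact evaluation procedure, so the following is only a minimal sketch of the two ideas it names: the standard per-token KL divergence between a reference model's next-token distribution and a quantized model's, and a Pareto filter over (file size, KL) pairs. The function names, the toy logits, and the quantization labels are all hypothetical, not values from the comparison.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in nats; P is the reference distribution."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def mean_kl(ref_logits, quant_logits):
    """Average KL over token positions (one logit vector per position)."""
    per_token = [
        kl_divergence(softmax(r), softmax(q))
        for r, q in zip(ref_logits, quant_logits)
    ]
    return sum(per_token) / len(per_token)


def pareto_front(candidates):
    """Keep (name, size_gb, kl) entries not dominated on both size and KL."""
    front = []
    for name, size, kl in candidates:
        dominated = any(
            s <= size and k <= kl and (s < size or k < kl)
            for _, s, k in candidates
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    # Toy logits for two token positions over a 3-token vocabulary (made up).
    reference = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
    quantized = [[1.8, 1.1, 0.2], [0.6, 2.2, 0.4]]
    print(mean_kl(reference, quantized))

    # Hypothetical quantizations: (label, file size in GB, mean KL).
    toy = [("quant_a", 10, 0.5), ("quant_b", 12, 0.2),
           ("quant_c", 12, 0.6), ("quant_d", 20, 0.1)]
    print(pareto_front(toy))
```

A lower mean KL means the quantized model's predictions stay closer to the reference model's, and a quantization is on the Pareto front when no other candidate is both no larger and no worse.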

로컬-LLM 양자화 Gemma-4