r/LocalLLaMA • 97일 전

Rust 기반 로컬 만화 번역기, LLM 내장

IMP

7/10

핵심 요약

오픈소스 로컬 만화 번역기 'Koharu'가 공개되었습니다. llama.cpp를 통합해 시각적 LLM OCR과 객체 탐지, 인페인팅을 결합한 고성능 파이프라인을 제공합니다. 번역 결과를 폰트와 색상 등 미세 조정할 수 있는 내장 에디터도 포함되어 있어 실무 번역 작업에 즉시 활용할 수 있다는 점이 중요합니다.

번역된 본문

안녕하세요, LocalLLaMA 여러분.

몇 주 전에 글을 올렸었는데, 이번에는 프로젝트가 훨씬 안정적이고 사용하기 쉬워졌습니다.

이것은 만화 번역기이며, 모든 이미지의 번역에도 사용할 수 있습니다. 객체 탐지, 시각 LLM 기반 OCR, 레이아웃 분석, 파인튜닝된 인페인팅 모델을 결합해 사용합니다. 제 생각에 만화 번역을 위한 가장 성능이 좋고 사용하기 쉬운 파이프라인입니다.

LLM 부분은 llama.cpp를 애플리케이션에 통합했습니다. Gemma 4 계열과 Qwen3.5 계열을 지원하며, 검열 해제 및 파인튜닝 모델도 포함됩니다. 또한 OpenAI API 호환 API도 지원하여 LM Studio나 OpenRouter 등을 사용할 수 있습니다.

데모 영상이 워크플로우를 잘 설명해 줍니다. 기본적으로 버튼을 누르면 파이프라인이 실행됩니다. 결과를 교정 및 편집하고, 글꼴, 크기, 색상 등을 변경할 수도 있습니다. 일종의 미니 포토샵 에디터입니다.

관심 있으신 분들을 위해 완전한 오픈소스입니다: https://github.com/mayocream/koharu

원문 보기

원문 보기 (영어)

Hi LocalLLaMA, I created a post a few weeks ago, but this time this project has become more reliable and easier to use. This is a manga translator that can also be used to translate any image. It uses a combination of object detection, visual LLM-based OCR, layout analysis, and fine-tuned inpainting models. I believe it is the most performant and easy-to-use pipeline for manga translation. For the LLM part, I have integrated llama.cpp into this application; it supports the Gemma 4 family and the Qwen3.5 family, and also includes uncensored and fine-tuned models. It also supports OpenAPI-compatible API, so you can use LM Studio or OpenRouter, etc. I think the demo video explains the workflow a lot, basiclly you just click a button and it will run the pipeline for you. You can also proofread and edit the result, changing the font, size, color, etc. It's a mini Photoshop editor. For who may have interest on this, it's fully open-source: [https://github.com/mayocream/koharu](https://github.com/mayocream/koharu)

오픈소스 만화 번역 시각 LLM 로컬 LLM OCR