r/LocalLLaMA • 75 days ago

VS Code adds local AI support... but a paid plan is required

IMP

7/10

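Key summary

The recently introduced "Agents window" in VS Code now lets you use a range of language models, including local AI models. You can pick a fast model for simple coding and a reasoning model for complex refactoring or architectural decisions, and fine-tune the "thinking effort". However, these features by default require an internet connection and a paid GitHub Copilot subscription, and they may also depend on an administrator's policy settings, so there are constraints.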

Translated body

AI language models in VS Code

Visual Studio Code offers several built-in language models that are optimized for different tasks. You can also bring your own API key (Bring your own key) to use models from other providers. For background on how language models work and their key characteristics, see "Language models concepts". This article describes how to change the language model used for chat or inline suggestions and how to use your own API key.

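Choose the right model for your task

By default, chat uses a base model to provide fast, capable responses for a wide range of tasks, such as coding, summarization, knowledge-based questions, and reasoning. However, you are not limited to this model. You can choose from a selection of language models, each with its own strengths. As a general guideline, use a fast model for quick edits and simple questions, and a reasoning model for complex refactoring, architectural decisions, or multi-step tasks. For a detailed comparison, see "Choosing the right AI model for your task" in the GitHub Copilot documentation. Depending on the agent you are using, the list of available models might differ. For example, in agent mode, the list is limited to models that have good support for tool calling.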
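The guideline above can be read as a simple decision rule. The following is only a minimal sketch of that rule: the `Task` fields, the `many_files` threshold, and the `tool_calling` key are hypothetical names chosen for illustration and are not part of any VS Code or Copilot API.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task descriptor; these fields are illustrative only."""
    description: str
    multi_step: bool = False
    architectural: bool = False
    files_affected: int = 1


def suggest_model_kind(task: Task, many_files: int = 3) -> str:
    """Return "reasoning" for complex work and "fast" for quick edits.

    The many_files threshold is an assumption standing in for
    "complex refactoring"; the source gives no numeric cutoff.
    """
    if task.multi_step or task.architectural or task.files_affected > many_files:
        return "reasoning"
    return "fast"


def models_for_agent(models: list[dict], agent_mode: bool) -> list[dict]:
    """In agent mode, keep only models flagged as supporting tool calling."""
    if not agent_mode:
        return list(models)
    return [m for m in models if m.get("tool_calling", False)]
```

For example, `suggest_model_kind(Task("rename a variable"))` returns `"fast"`, while a task marked `architectural=True` returns `"reasoning"`.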

Note: If you are a Copilot Business or Enterprise user, your administrator needs to enable certain models for your organization by opting in to "Editor Preview Features" in the Copilot policy settings on GitHub.com.

Change the model for chat conversations

Use the language model picker in the chat input field to change the model used for chat conversations and code editing.

Tip: Install the AI Toolkit extension to add more language models and enhance GitHub Copilot's capabilities. For more information, see "Change the chat model".

You can further extend the list of available models by using your own language model API key. If you have a paid Copilot plan, the model picker shows the premium request multiplier for premium models. Learn more about premium requests in the GitHub Copilot documentation.

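Configure thinking effort

Some models support configurable thinking effort. Thinking effort controls how much reasoning the model applies to each request. Use a higher level for complex tasks such as architectural decisions or multi-step debugging, and a lower level for straightforward code generation or simple questions. For background on how thinking and reasoning work, see "Thinking and reasoning".

VS Code sets recommended default thinking effort levels based on evaluations and online performance data, and it enables adaptive reasoning. Adaptive reasoning lets the model dynamically decide when and how deeply to think based on the complexity of each request. For most use cases the defaults work well and do not need to be changed.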

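You can configure the thinking effort directly from the model picker:

Open the model picker in the chat input field and select a reasoning model.
Select the > arrow that appears next to the model name to open the "Thinking Effort" submenu. Note: Non-reasoning models, such as GPT-4.1 and GPT-4o, do not show the thinking effort submenu.
Select an effort level. The model picker label updates to show the selected effort level (for example, "Claude Sonnet 4.6 · High"). The effort level persists across conversations for the same model.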
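The behavior described in these steps (a per-model effort level that persists, a label of the form "model · level", and no submenu for non-reasoning models) can be modeled in a few lines. This is a hedged sketch only: `EffortPicker` and its methods are hypothetical names, and which models count as reasoning models is supplied by the caller rather than taken from VS Code.

```python
class EffortPicker:
    """Toy model of the picker behavior; not VS Code's implementation."""

    def __init__(self, reasoning_models):
        # Assumption: the caller knows which models support thinking effort.
        self._reasoning = set(reasoning_models)
        # Effort is stored per model, so it survives across conversations.
        self._effort = {}

    def has_effort_submenu(self, model: str) -> bool:
        """Non-reasoning models do not expose the submenu."""
        return model in self._reasoning

    def set_effort(self, model: str, level: str) -> None:
        if not self.has_effort_submenu(model):
            raise ValueError(f"{model} does not support thinking effort")
        self._effort[model] = level

    def label(self, model: str) -> str:
        """Return the picker label, e.g. 'Claude Sonnet 4.6 · High'."""
        level = self._effort.get(model)
        return f"{model} · {level}" if level else model
```

Usage: after `picker = EffortPicker({"Claude Sonnet 4.6"})` and `picker.set_effort("Claude Sonnet 4.6", "High")`, the label for that model includes the level, while `picker.label("GPT-4.1")` stays a bare model name.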

Note: The github.copilot.chat.anthropic.thinking.effort and github.copilot.chat.responsesApiReasoningEffort settings are deprecated. Configure thinking effort directly through the language model picker instead.

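Auto model selection

Note: Auto model selection is available as of VS Code release 1.104.

With auto model selection, VS Code automatically selects a model to ensure optimal performance and to reduce rate limits caused by excessive use of particular language models. It detects degraded model performance and uses the best model at that point in time.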

View original (English)

AI language models in VS Code

Visual Studio Code offers different built-in language models that are optimized for different tasks. You can also bring your own language model API key to use models from other providers. For background on how language models work and their key characteristics, see Language models concepts.

This article describes how to change the language model for chat or inline suggestions and how to use your own API key.

Choose the right model for your task

By default, chat uses a base model to provide fast, capable responses for a wide range of tasks, such as coding, summarization, knowledge-based questions, reasoning, and more. However, you are not limited to using only this model. You can choose from a selection of language models, each with its own particular strengths.

As a general guideline, use a fast model for quick edits and simple questions, and a reasoning model for complex refactoring, architectural decisions, or multi-step tasks. For a detailed comparison, see Choosing the right AI model for your task in the GitHub Copilot documentation.

Depending on the agent you are using, the list of available models might be different. For example, in agent mode, the list of models is limited to those that have good support for tool calling.

Note: If you are a Copilot Business or Enterprise user, your administrator needs to enable certain models for your organization by opting in to Editor Preview Features in the Copilot policy settings on GitHub.com.

Change the model for chat conversations

Use the language model picker in the chat input field to change the model that is used for chat conversations and code editing.

Tip: Install the AI Toolkit extension to add more language models to enhance GitHub Copilot capabilities. For more information, see Change the chat model.

You can further extend the list of available models by using your own language model API key.
If you have a paid Copilot plan, the model picker shows the premium request multiplier for premium models. Learn more about premium requests in the GitHub Copilot documentation.

Configure thinking effort

Some models support configurable thinking effort. Thinking effort controls how much reasoning the model applies to each request. Use a higher effort level for complex tasks like architectural decisions or multi-step debugging, and a lower level for straightforward code generation or simple questions. For background on how thinking and reasoning work, see Thinking and reasoning.

VS Code sets recommended default effort levels based on evaluations and online performance data, and has adaptive reasoning enabled. Adaptive reasoning lets the model dynamically determine when and how much to think based on the complexity of each request. For most use cases, the defaults work well and you don't need to change them.

You can configure the thinking effort directly from the model picker:

Open the model picker in the chat input field and select a reasoning model.
Select the > arrow that appears next to the model name to open the Thinking Effort submenu. Note: Non-reasoning models, such as GPT-4.1 and GPT-4o, do not show the thinking effort submenu.
Select an effort level. The model picker label updates to show the selected effort level, for example "Claude Sonnet 4.6 · High". The effort level persists across conversations for the same model.

Note: The github.copilot.chat.anthropic.thinking.effort and github.copilot.chat.responsesApiReasoningEffort settings are deprecated. You should configure thinking effort directly via the language model picker.

Auto model selection

Note: Auto model selection is available as of VS Code release 1.104.
With auto model selection, VS Code automatically selects a model to ensure that you get the optimal performance and reduce rate limits due to excessive usage of particular language models. It detects degraded model performance and uses the best model at that point in time. We continue to improve this feature to pick the most suitable model for your needs.

To use auto model selection, select Auto from the model picker in chat.

Currently, auto chooses between Claude Sonnet 4, GPT-5, GPT-5 mini and other models. If your organization has opted out of certain models, auto will not select those models. If none of these models are available or you run out of premium requests, auto will fall back to a model at 0x multiplier.

Important: Starting April 20, 2026, new sign-ups for Copilot Pro, Copilot Pro+, and student plans are temporarily paused. Additionally, we are tightening weekly usage limits. If you hit a weekly limit and you have premium requests remaining, you can continue using Copilot with auto model selection. See GitHub Copilot usage limits.

Multiplier discounts

When using auto model selection, VS Code uses a variable model multiplier, based on the selected model. If you are a paid user, auto will apply a request discount. At any time, you can see which model and model multiplier are used by hovering over the chat response.

Manage language models

You can use the language models editor to view all available models, choose which models are shown in the model picker, and add more models by adding from built-in providers or from extension-provided model providers.

To open the Language Models editor, open the model picker in the Chat view and select Manage Models or run the Chat: Manage Language Models command from the Command Palette. The Language Models editor opens by default in a modal overlay on top of the editor area.
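The auto model selection rules described above (skip models the organization has opted out of, avoid degraded models, and fall back to a 0x-multiplier model when premium requests run out) can be sketched roughly as follows. This is an illustration under stated assumptions, not how VS Code actually picks a model: the `Model` class, the `healthy` flag, the "first in preference order" rule, and the multiplier values used in the example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    multiplier: float      # premium request multiplier; 0 means a 0x model
    healthy: bool = True   # hypothetical stand-in for degradation detection


def auto_select(candidates, opted_out, premium_requests_left) -> Optional[Model]:
    """Pick a model from candidates, listed in preference order (assumed)."""
    allowed = [m for m in candidates if m.name not in opted_out and m.healthy]

    # Prefer a premium model while premium requests remain.
    premium = [m for m in allowed if m.multiplier > 0]
    if premium and premium_requests_left > 0:
        return premium[0]

    # Otherwise fall back to a model at 0x multiplier, if one is allowed.
    free = [m for m in allowed if m.multiplier == 0]
    return free[0] if free else None
```

With illustrative candidates such as `Model("Claude Sonnet 4", 1.0)`, `Model("GPT-5", 1.0)`, and `Model("GPT-5 mini", 0.0)`, opting out of the first model shifts the choice to the second, and running out of premium requests shifts it to the 0x entry.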
The editor lists all models available to you, showing key information such as the model capabilities, context size, billing details, and visibility status. By default, models are grouped by provider, but you can also group them by visibility.

You can search and filter models by using the following options:

Text search with the search box
Provider: @provider:"OpenAI"
Capability: @capability:tools, @capability:vision, @capability:agent
Visibility: @visible:true/false

Customize the model picker

You can customize which models are shown in the model picker by changing the visibility status of models in the Language Models editor. You can show or hide models from any provider.

Hover over a model in the list and select the eye icon to show or hide the model in the model picker.

Bring your own language model key

Note: If you are a Copilot Business or Enterprise user, your administrator can disable the Bring Your Own Language Model Key policy in the Copilot policy settings on GitHub.com.

GitHub Copilot in VS Code comes with a variety of built-in language models that are optimized for different tasks. If you want to use a model that is not available as a built-in model, you can bring your own language model API key (BYOK) to use models from other providers.

Using your own language model API key in VS Code has several benefits:

Model choice: access hundreds of models from different providers, beyond the built-in models.
Experimentation: experiment with new models or features that are not yet available in the built-in models.
Local compute: use your own compute for one of the models already supported in GitHub Copilot or to run models not yet available.
Greater control: by using your own key, you can bypass the standard rate limits and restrictions imposed on the built-in models.
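The search filters listed earlier for the Language Models editor (`@provider:"OpenAI"`, `@capability:tools`, `@visible:true/false`, plus free text) suggest a small query grammar. The sketch below shows one way such a query could be parsed and applied. It is not the editor's actual parser: the model dictionary shape, case-insensitive provider matching, and substring matching on the model name are assumptions made for illustration.

```python
import re

# Matches @key:"quoted value" or @key:bare_value
_FILTER = re.compile(r'@(\w+):(?:"([^"]*)"|(\S+))')


def parse_query(query: str):
    """Split a query into free text and a dict of filter key -> values."""
    filters = {}
    for key, quoted, bare in _FILTER.findall(query):
        filters.setdefault(key, []).append(quoted or bare)
    text = " ".join(_FILTER.sub(" ", query).split())
    return text, filters


def matches(model: dict, query: str) -> bool:
    """Check a model dict (hypothetical shape) against every filter."""
    text, filters = parse_query(query)
    if text and text.lower() not in model["name"].lower():
        return False
    for provider in filters.get("provider", []):
        if model["provider"].lower() != provider.lower():
            return False
    for capability in filters.get("capability", []):
        if capability not in model["capabilities"]:
            return False
    for visible in filters.get("visible", []):
        if model["visible"] != (visible == "true"):
            return False
    return True
```

For instance, `parse_query('gpt @provider:"OpenAI" @capability:tools')` separates the free text `gpt` from the two filters, and `matches` then requires all of them to hold for a model to be listed.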
VS Code provides different options to add more models:

Use one of the built-in model providers
Install a language model provider extension from the Visual Studio Marketplace, for example, AI Toolkit for VS Code with Foundry Local

Considerations when using bring your own model key

Only applies to the chat experience and doesn't affect inline suggestions or other AI-powered features in VS Code.
Capabilities are model-dependent and might differ from the built-in models, for example, support for tool calling, vision, or thinking.
The Copilot service API is still used for some tasks, such as se

vs-code github-copilot local-ai-models dev-tools agents