r/OpenAI • 76 days ago

OpenAI: "Windows sandboxing falls short... Linux already had it all"

IMP

8/10

Key Summary

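An OpenAI engineer has published a detailed account of the convoluted work required to safely sandbox Codex, the company's AI coding agent, on Windows. Linux and macOS already ship capable isolation tools, but Windows lacks comparable built-in primitives, so the team had to build its own workaround from scratch, spanning everything from firewall control to custom user accounts. It is a telling example of the limits that existing operating system security models run into once semi-autonomous AI agents execute commands directly on a developer's local machine.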

Translated Text

When people talk about AI coding tools like Codex, the conversation usually centers on productivity. Can it write code? Can it debug software? Can it save developers time? What often gets ignored is the scary part: these tools are increasingly trusted to run commands directly on users' computers. That reality was at the heart of a new technical blog post from OpenAI engineer David Wiesen, who detailed the surprisingly messy process of building a secure sandbox for Codex on Windows.

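Frankly, as a Linux user, what stood out to me most was that OpenAI acknowledges Linux already had many of the tools it needed, while Windows forced the company to build a custom solution from scratch. According to the post, Codex can execute shell commands, edit files, run tests, create Git branches, and interact with development environments locally on a machine. In other words, this is not a harmless chatbot sitting in a browser tab. It is software that can act directly on a system with the permissions of a real user.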

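That creates an obvious problem. If an AI coding agent malfunctions, encounters malicious code, or simply makes a bad decision, the damage could extend far beyond a broken script. OpenAI says that on Linux and macOS the operating system already provides useful sandboxing primitives. It specifically names the Linux technologies seccomp and bubblewrap, alongside Apple's Seatbelt framework on macOS. Windows, however, lacked a straightforward equivalent that suited the open-ended workflows Codex requires.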
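To make the Linux side concrete, here is a minimal sketch of how a launcher might assemble a bubblewrap command line. The article only names seccomp and bubblewrap; it does not describe how Codex invokes them, so the policy below (read-only root, one writable workspace, no network) and the helper name `bwrap_argv` are assumptions for illustration. The flags themselves are standard `bwrap` options.

```python
import shlex


def bwrap_argv(workspace, command, allow_network=False):
    """Build a bubblewrap argv for running `command` inside `workspace`.

    Hypothetical policy, not taken from the article: the whole filesystem is
    visible read-only, only the workspace is writable, and the network
    namespace is unshared unless explicitly allowed.
    """
    ws = str(workspace)
    argv = [
        "bwrap",
        "--ro-bind", "/", "/",   # everything visible, nothing writable
        "--dev", "/dev",         # minimal device nodes
        "--proc", "/proc",       # fresh procfs for the sandbox
        "--tmpfs", "/tmp",       # private scratch space
        "--bind", ws, ws,        # the one writable directory
        "--chdir", ws,
        "--die-with-parent",     # kill the sandbox if the launcher exits
    ]
    if not allow_network:
        argv.append("--unshare-net")  # new, empty network namespace
    return argv + list(command)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works anywhere.
    print(shlex.join(bwrap_argv("/home/dev/project", ["pytest", "-q"])))
```

The point of the sketch is that a handful of flags expresses the whole policy, which is roughly what "the primitives already exist" means in practice.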

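That sent OpenAI down a rabbit hole. The company evaluated existing Windows technologies such as AppContainer, Windows Sandbox, and Mandatory Integrity Control labeling, but rejected each for various reasons. Some approaches were too restrictive, others broke compatibility with real developer environments, and some required heavyweight virtualization that would have hurt usability.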

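OpenAI's first attempt at solving the problem was what it called an "unelevated sandbox." The setup relied on synthetic SIDs, write-restricted tokens, ACL (access control list) manipulation, and even environment variable tricks to try to block outbound internet access. OpenAI eventually concluded that this approach was too weak. The company openly admits that programs could bypass the restrictions by ignoring proxy settings or implementing their own networking stack.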
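The weakness of the environment variable approach is easy to illustrate. The sketch below assumes the "trick" amounts to pointing the usual proxy variables at an address where nothing is listening; the article does not give the exact variables or address, so `PROXY_VARS` and `DEAD_PROXY` are hypothetical. A cooperative tool reads those variables and fails to connect, but nothing forces a program to read them.

```python
import urllib.request

# Hypothetical blackhole address: nothing is expected to listen here.
DEAD_PROXY = "http://127.0.0.1:9"

# Conventional proxy variable names; the article does not list which were used.
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


def offline_env(base_env):
    """Return a copy of base_env with every proxy variable pointed at DEAD_PROXY."""
    env = dict(base_env)
    for name in PROXY_VARS:
        env[name] = DEAD_PROXY
    return env


def proxy_seen_by_cooperative_tool(env):
    """A well-behaved HTTPS client would pick this value up and then fail to connect."""
    return env.get("https_proxy") or env.get("HTTPS_PROXY")


def opener_that_ignores_proxies():
    """An uncooperative client: an empty mapping disables proxy auto-detection."""
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))


if __name__ == "__main__":
    env = offline_env({"PATH": "/usr/bin"})
    print("cooperative tool would use:", proxy_seen_by_cooperative_tool(env))
    print("uncooperative opener built:", opener_that_ignores_proxies())
```

Because the restriction lives in the child's environment rather than in the operating system, one line of client code opts out of it, which matches the bypass the post describes.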

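The final implementation became far more elaborate. According to OpenAI, the current Windows sandbox creates dedicated local users named CodexSandboxOffline and CodexSandboxOnline, applies outbound firewall restrictions, launches commands through restricted tokens, and relies on multiple helper executables to manage privilege boundaries correctly. At one point the company even had to create a dedicated command runner binary, simply because Windows token handling had become too complicated to manage cleanly from the main Codex process.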
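As a rough illustration of what a per-user outbound restriction can look like, the sketch below builds a PowerShell `New-NetFirewallRule` invocation that blocks outbound traffic for a single local account. This is an assumption about one plausible mechanism, not OpenAI's actual implementation: the article does not say which firewall API is used, and the rule name and SID here are placeholders. The `-LocalUser` parameter takes an SDDL string that identifies the account.

```python
def block_outbound_for_user(display_name, user_sid):
    """Build a PowerShell command that blocks outbound traffic for one account.

    Hypothetical sketch: the rule name and SID are supplied by the caller,
    and nothing here reflects how Codex actually configures its firewall.
    """
    # SDDL form accepted by -LocalUser: an allow ACE naming the account's SID.
    sddl = f"D:(A;;CC;;;{user_sid})"
    script = (
        "New-NetFirewallRule "
        f"-DisplayName '{display_name}' "
        "-Direction Outbound -Action Block "
        f"-LocalUser '{sddl}'"
    )
    return ["powershell.exe", "-NoProfile", "-Command", script]


if __name__ == "__main__":
    # Placeholder SID; a real one would be looked up for the sandbox account.
    fake_sid = "S-1-5-21-1111111111-2222222222-3333333333-1001"
    # Printed rather than executed: creating rules requires an elevated shell.
    print(block_outbound_for_user("Sandbox offline user: block outbound", fake_sid))
```

Unlike the proxy variables, a rule of this kind is enforced by the operating system, so a process running as the restricted account cannot opt out of it by ignoring its environment.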

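Honestly, the most fascinating part of the entire writeup is how clearly it exposes the collision between modern AI agents and traditional operating system security models. AI coding tools are quickly evolving into semi-autonomous software operators. They read files, spawn subprocesses, manipulate repositories, install packages, and potentially interact with sensitive information. That changes the security conversation dramatically.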

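OpenAI is effectively saying that Linux already had many of the sandboxing tools AI agents need, while Windows required a complicated workaround involving firewall rules, custom identities, restricted tokens, and extensive permission management. That is not the kind of accidental praise Linux gets every day.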

Written by Brian Fagioli, technology journalist and founder of NERDS.xyz

Original Text (English)

When folks talk about AI coding tools like Codex, the conversation usually revolves around productivity. Can it write code? Can it debug software? Can it save developers time? What often gets ignored is the scary part: these tools are increasingly being trusted to run commands directly on people’s computers. That reality was front and center in a new technical post from OpenAI engineer David Wiesen, who detailed the surprisingly messy process of building a secure sandbox for Codex on Windows.

And frankly, one thing stood out immediately to me as a Linux guy: OpenAI basically admits Linux already had many of the tools it needed, while Windows forced the company to engineer a custom solution from scratch. According to the post, Codex can execute shell commands, edit files, launch tests, create Git branches, and interact with development environments locally on a machine. In other words, this is not some harmless chatbot sitting in a browser tab. This is software capable of acting on a system with the permissions of a real user.

That creates an obvious problem. If an AI coding agent goes rogue, encounters malicious code, or simply makes a bad decision, the damage could extend far beyond a broken script. On Linux and macOS, OpenAI says operating systems already provide useful sandboxing primitives. The company specifically mentions Linux technologies like seccomp and bubblewrap, alongside Apple’s Seatbelt framework on macOS. Windows, however, apparently lacked a straightforward equivalent that worked well for the sort of open-ended workflows Codex requires.

That sent OpenAI down a rabbit hole. The company evaluated existing Windows technologies like AppContainer, Windows Sandbox, and Mandatory Integrity Control labeling, but rejected each for various reasons. Some approaches were too restrictive, others broke compatibility with real developer environments, and some required heavyweight virtualization that would have hurt usability.

OpenAI’s first attempt at solving the problem involved what it called an “unelevated sandbox.” The setup relied on synthetic SIDs, write-restricted tokens, ACL manipulation, and even environment variable tricks to try blocking outbound internet access. But OpenAI eventually concluded the approach was too weak. The company openly admits that programs could bypass the restrictions by ignoring proxy settings or implementing their own networking stack. That is not exactly the sort of sentence you expect to read from an AI company trying to reassure people about autonomous agents running local commands.

The final implementation became much more elaborate. OpenAI says the current Windows sandbox now creates dedicated local users called CodexSandboxOffline and CodexSandboxOnline, applies outbound firewall restrictions, launches commands through restricted tokens, and relies on multiple helper executables to manage privilege boundaries correctly. At one point, the company even had to create a dedicated command runner binary simply because Windows token handling got too complicated to manage cleanly from the main Codex process.

Honestly, the most fascinating part of the entire writeup is how clearly it exposes the collision between modern AI agents and traditional operating system security models. AI coding tools are quickly evolving into semi-autonomous software operators. They read files, spawn subprocesses, manipulate repositories, install packages, and potentially interact with sensitive information. That changes the security conversation dramatically.

OpenAI effectively says Linux already had many of the sandboxing tools AI agents needed, while Windows required a complicated workaround involving firewall rules, custom identities, restricted tokens, and extensive permission management. You know what? That is not the kind of accidental praise Linux gets every day.

Written by Brian Fagioli, technology journalist and founder of NERDS.xyz. A former BetaNews writer, he has spent over a decade covering Linux, hardware, software, cybersecurity, and AI with a no-nonsense approach for real nerds.

Coding agents · Security and sandboxing · OpenAI Codex · Operating systems · Linux and Windows