MarkTechPost • 100 days ago

OpenAI Releases GPT-5.4-Cyber for Security Professionals

IMP

8/10

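Key Summary

OpenAI has introduced GPT-5.4-Cyber, a dedicated model for legitimate defensive cybersecurity work, and announced a large-scale expansion of its Trusted Access for Cyber (TAC) program, which grants access to identity-verified security professionals. The model reduces the excessive refusals of earlier AI models and supports advanced security workflows such as binary reverse engineering without source code, substantially lowering research friction for practitioners. Malicious activity such as malware creation and data exfiltration remains strictly prohibited, and all use is still subject to the Usage Policies.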

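Translated Text

Cybersecurity has always had a dual-use problem: the technical knowledge that helps defenders find vulnerabilities can also help attackers exploit them. In AI systems, that tension is higher than ever. In the past, restrictions meant to prevent harm created friction for good-faith security research, and it was very difficult to tell whether a given cyber action was defensive or malicious. As a structural solution to this problem, OpenAI is proposing verified identity, tiered access, and a dedicated model for defenders.

OpenAI announced that it is expanding its Trusted Access for Cyber (TAC) program to thousands of individual defenders and hundreds of teams charged with defending critical software. The centerpiece of the expansion is the introduction of GPT-5.4-Cyber, a variant of GPT-5.4 fine-tuned for defensive cybersecurity use cases.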

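What Is GPT-5.4-Cyber and How Does It Differ From Standard Models?

AI engineers and data scientists who have used large language models (LLMs) for security work have probably had the frustrating experience of a model refusing to analyze malware or explain how a buffer overflow works, even in a clearly research-oriented context. GPT-5.4-Cyber is designed to remove that friction for verified users. Unlike standard GPT-5.4, which applies blanket refusals to many security-related queries, GPT-5.4-Cyber is, in OpenAI's words, 'cyber-permissive': its refusal threshold is deliberately set lower for prompts that serve a legitimate defensive purpose.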

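This includes binary reverse engineering, which lets security professionals analyze compiled software without its source code to assess malware potential, vulnerabilities, and security robustness. Binary reverse engineering without source code is a significant capability unlock. In the field, defenders often have to analyze closed-source binaries, such as firmware on embedded devices, third-party libraries, or suspected malware samples, without access to the original code. The model is described as a GPT-5.4 variant purposely fine-tuned for additional cyber capabilities, with fewer capability restrictions and support for advanced defensive workflows, including binary reverse engineering without source code.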

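There are also clear limits. Even users with trusted access must comply with OpenAI's Usage Policies and Terms of Use. The approach is designed to reduce friction for defenders while preventing prohibited behavior such as data exfiltration, malware creation and deployment, and destructive or unauthorized testing. This distinction matters: TAC lowers the refusal boundary for legitimate work, but it does not suspend policy for any user.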

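There are deployment constraints as well. Use in zero-data-retention environments is limited, because in those configurations OpenAI has less visibility into the user, the environment, and intent. The company frames this constraint as a necessary control surface in a tiered-access model. Development teams accustomed to running API calls in zero-data-retention mode should account for this implementation constraint before building pipelines on GPT-5.4-Cyber.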

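The Tiered Access Framework: How TAC Actually Works

TAC is not a simple checkbox feature; it is an identity- and trust-based access framework with multiple tiers. Understanding this structure matters if you or your organization plan to integrate these capabilities. Access runs through two paths. Individual users can verify their identity at chatgpt.com/cyber. Enterprises can request trusted access for their team through an OpenAI representative. Customers approved through either path are granted a set of permissions.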
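To make the tier structure concrete, here is a minimal Python sketch of how an account's verification state might map to an access tier. The three tiers follow the article's own description (baseline access, trusted access to existing models, and a more permissive tier for vetted defenders). Every name in the code (`AccessTier`, `Account`, `resolve_tier`, and both boolean flags) is hypothetical and chosen only for illustration; OpenAI has not published how TAC is implemented.

```python
from dataclasses import dataclass
from enum import IntEnum


class AccessTier(IntEnum):
    """Hypothetical labels for the three practical tiers the article describes."""
    BASELINE = 0        # general models, no identity verification
    TRUSTED = 1         # TAC-approved: existing models with less accidental friction
    CYBER_DEFENDER = 2  # vetted defenders: more permissive, specialized access


@dataclass(frozen=True)
class Account:
    # Assumption: verification via either path (individual or enterprise)
    # collapses into a single flag.
    identity_verified: bool
    # Assumption: a separate flag records the additional vetting step.
    vetted_cyber_defender: bool


def resolve_tier(account: Account) -> AccessTier:
    """Return the highest tier the account qualifies for in this toy model."""
    if not account.identity_verified:
        # Without verified identity, vetting status is irrelevant.
        return AccessTier.BASELINE
    if account.vetted_cyber_defender:
        return AccessTier.CYBER_DEFENDER
    return AccessTier.TRUSTED
```

The ordering is the design point of the sketch: identity verification gates everything else, so an unverified account stays at the baseline tier regardless of any other flag.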


Original Text (English)

Cybersecurity has always had a dual-use problem: the same technical knowledge that helps defenders find vulnerabilities can also help attackers exploit them. For AI systems, that tension is sharper than ever. Restrictions intended to prevent harm have historically created friction for good-faith security work, and it can be genuinely difficult to tell whether any particular cyber action is intended for defensive usage or to cause harm. OpenAI is now proposing a concrete structural solution to that problem: verified identity, tiered access, and a purpose-built model for defenders.

The OpenAI team announced that it is scaling up its Trusted Access for Cyber (TAC) program to thousands of verified individual defenders and hundreds of teams responsible for defending critical software. The main focus of this expansion is the introduction of GPT-5.4-Cyber, a variant of GPT-5.4 fine-tuned specifically for defensive cybersecurity use cases.

What Is GPT-5.4-Cyber and How Does It Differ From Standard Models?

If you're an AI engineer or data scientist who has worked with large language models on security tasks, you're likely familiar with the frustrating experience of a model refusing to analyze a piece of malware or explain how a buffer overflow works, even in a clearly research-oriented context. GPT-5.4-Cyber is designed to eliminate that friction for verified users. Unlike standard GPT-5.4, which applies blanket refusals to many dual-use security queries, GPT-5.4-Cyber is described by OpenAI as 'cyber-permissive', meaning it has a deliberately lower refusal threshold for prompts that serve a legitimate defensive purpose.
That includes binary reverse engineering, enabling security professionals to analyze compiled software for malware potential, vulnerabilities, and security robustness without access to the source code. Binary reverse engineering without source code is a significant capability unlock. In practice, defenders routinely need to analyze closed-source binaries (firmware on embedded devices, third-party libraries, or suspected malware samples) without having access to the original code. That model was described as a GPT-5.4 variant purposely fine-tuned for additional cyber capabilities, with fewer capability restrictions and support for advanced defensive workflows including binary reverse engineering without source code.

There are also hard limits. Users with trusted access must still abide by OpenAI's Usage Policies and Terms of Use. The approach is designed to reduce friction for defenders while preventing prohibited behavior, including data exfiltration, malware creation or deployment, and destructive or unauthorized testing. This distinction matters: TAC lowers the refusal boundary for legitimate work, but does not suspend policy for any user.

There are also deployment constraints. Use in zero-data-retention environments is limited, given that OpenAI has less visibility into the user, environment, and intent in those configurations, a tradeoff the company frames as a necessary control surface in a tiered-access model. For dev teams accustomed to running API calls in Zero-Data-Retention mode, this is an important implementation constraint to plan around before building pipelines on top of GPT-5.4-Cyber.

The Tiered Access Framework: How TAC Actually Works

TAC is not a checkbox feature; it is an identity-and-trust-based access framework with multiple tiers. Understanding the structure matters if you or your organization plans to integrate these capabilities. The access process runs through two paths. Individual users can verify their identity at chatgpt.com/cyber.
Enterprises can request trusted access for their team through an OpenAI representative. Customers approved through either path gain access to model versions with reduced friction around safeguards that might otherwise trigger on dual-use cyber activity. Approved uses include security education, defensive programming, and responsible vulnerability research.

TAC customers who want to go further and authenticate as cyber defenders can express interest in additional access tiers, including GPT-5.4-Cyber. Deployment of the more permissive model is starting with a limited, iterative rollout to vetted security vendors, organizations, and researchers.

That means OpenAI is now drawing at least three practical lines instead of one: there is baseline access to general models; there is trusted access to existing models with less accidental friction for legitimate security work; and there is a higher tier of more permissive, more specialized access for vetted defenders who can justify it.

The framework is grounded in three explicit principles. The first is democratized access: using objective criteria and methods, including strong KYC and identity verification, to determine who can access more advanced capabilities, with the goal of making those capabilities available to legitimate actors of all sizes, including those protecting critical infrastructure and public services. The second is iterative deployment: OpenAI updates models and safety systems as it learns more about the benefits and risks of specific versions, including improving resilience to jailbreaks and adversarial attacks. The third is ecosystem resilience, which includes targeted grants, contributions to open-source security initiatives, and tools like Codex Security.

How the Safety Stack Is Built: From GPT-5.2 to GPT-5.4-Cyber

It's worth understanding how OpenAI has structured its safety architecture across model versions, because TAC is built on top of that architecture, not instead of it.
OpenAI began cyber-specific safety training with GPT-5.2, then expanded it with additional safeguards through GPT-5.3-Codex and GPT-5.4. A critical milestone in that progression: GPT-5.3-Codex is the first model OpenAI is treating as High cybersecurity capability under its Preparedness Framework, which requires additional safeguards. These safeguards include training the model to refuse clearly malicious requests like stealing credentials.

The Preparedness Framework is OpenAI's internal evaluation rubric for classifying how dangerous a given capability level could be. Reaching 'High' under that framework is what triggered the full cybersecurity safety stack being deployed: not just model-level training, but an additional automated monitoring layer. In addition to safety training, automated classifier-based monitors detect signals of suspicious cyber activity and route high-risk traffic to a less cyber-capable model, GPT-5.2. In other words, if a request looks suspicious enough to exceed a threshold, the platform doesn't just refuse; it silently reroutes the traffic to a safer fallback model. This is a key architectural detail: safety is enforced not only inside model weights, but also at the infrastructure routing layer. GPT-5.4-Cyber extends this stack further upward: more permissive for verified defenders, but wrapped in stronger identity and deployment controls to compensate.

Key Takeaways

TAC is an access-control solution, not just a model launch. OpenAI's Trusted Access for Cyber program uses verified identity, trust signals, and tiered access to determine who gets enhanced cyber capabilities, shifting the safety boundary away from prompt-level refusal filters toward a full deployment architecture.

GPT-5.4-Cyber is purpose-built for defenders, not general users.
It is a fine-tuned variant of GPT-5.4 with a deliberately lower refusal boundary for legitimate security work, including binary reverse engineering without source code, a capability that directly addresses how real incident response and malware triage actually happen.

Safety is enforced in layers, not just in the model weights. GPT-5.3-Codex —
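
The routing layer described in the safety stack section above (classifier-based monitors that send high-risk traffic to the less cyber-capable GPT-5.2) might be sketched roughly as follows. This is an illustration under stated assumptions, not OpenAI's implementation: the model strings are plain labels rather than confirmed API identifiers, the 0.8 threshold is invented because no real value is published, and `route_request`, `RoutingDecision`, and the keyword-based `toy_classifier` are hypothetical stand-ins for a trained monitor.

```python
from dataclasses import dataclass
from typing import Callable

PRIMARY_MODEL = "gpt-5.4"   # label only; not a confirmed API model ID
FALLBACK_MODEL = "gpt-5.2"  # the less cyber-capable model named in the article
RISK_THRESHOLD = 0.8        # hypothetical value; the real threshold is not public


@dataclass(frozen=True)
class RoutingDecision:
    model: str
    risk_score: float
    rerouted: bool


def route_request(
    prompt: str,
    classifier: Callable[[str], float],
    threshold: float = RISK_THRESHOLD,
) -> RoutingDecision:
    """Pick a model from a risk score in [0, 1] produced by the classifier."""
    score = classifier(prompt)
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"risk score out of range: {score}")
    # High-risk traffic is rerouted to the fallback model rather than refused.
    rerouted = score >= threshold
    model = FALLBACK_MODEL if rerouted else PRIMARY_MODEL
    return RoutingDecision(model=model, risk_score=score, rerouted=rerouted)


def toy_classifier(prompt: str) -> float:
    """Stand-in scorer: keyword matching, purely for demonstration."""
    suspicious_terms = ("steal credentials", "exfiltrate")
    hits = sum(term in prompt.lower() for term in suspicious_terms)
    return min(1.0, hits * 0.9)
```

In this sketch the caller receives a `RoutingDecision` either way, mirroring the article's point that enforcement happens at the routing layer as well as inside the model weights.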

Security, Cyber Defense, GPT-5.4-Cyber, Model Fine-Tuning, Access Control