The Decoder • 105 days ago

UK AISI: Claude model autonomously hacks a weakly defended corporate network

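Key Summary

In an evaluation by the UK's AI Security Institute (AISI), Anthropic's Claude Mythos Preview became the first AI model to autonomously compromise a weakly defended corporate network end-to-end. The model achieved a 73% success rate on expert-level Capture the Flag (CTF) challenges and completed a 32-step full network takeover simulation in 3 of 10 attempts. The test environment had no active defenders or security monitoring, however, so it remains unknown whether the model would perform comparably against well-protected real-world systems.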

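Translated Article

The UK's AI Security Institute (AISI) tested the cyberattack capabilities of Anthropic's Claude Mythos Preview. For the first time, an AI model autonomously completed a full attack simulation against a corporate network, provided the network was small and weakly defended.

According to AISI, Mythos Preview represents a significant leap in AI cyber capabilities. Just two years ago, even the best available models could barely handle beginner-level cyber tasks. In controlled evaluations, Mythos Preview executed multi-stage attacks on vulnerable networks, identifying and exploiting security vulnerabilities on its own when given explicit instructions and network access. AISI says these tasks would take human security experts days to complete.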

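Capture the Flag (CTF): 73% success rate at expert level

In Capture the Flag (CTF) challenges, AI models must find and exploit vulnerabilities in target systems to uncover hidden flags. According to AISI, Mythos Preview achieved about 85% on apprentice tasks and roughly 95% on beginner-level technical non-expert tasks, both with a 2.5 million token budget. That places it in the top tier alongside GPT-5.4, Codex 5.3, and Claude Opus 4.6.

With a larger compute budget (50 million tokens), Mythos Preview scored around 93% on practitioner-level tasks and 73% on expert-level tasks. The expert-level result is particularly notable: according to AISI, no model could solve expert-level tasks before April 2025.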

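Anthropic's Claude Mythos can autonomously hack corporate networks

AISI explains that CTF challenges only test individual skills in isolation, whereas real cyberattacks require chaining dozens of steps across multiple hosts and network segments. To measure that kind of complexity, the institute developed a simulation called "The Last Ones" (TLO): a 32-step attack against a simulated corporate network, running from initial reconnaissance to full network takeover. AISI estimates that this would take human experts around 20 hours. Full details are available in the accompanying paper.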

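Claude Mythos Preview is the first model to complete TLO end-to-end. It achieved a full network takeover in 3 of 10 attempts and completed 22 of the 32 steps on average. The next best model, Claude Opus 4.6, averaged 16 steps. AISI expects performance to keep improving as more inference compute is provided. Testing used a budget of 100 million tokens, and performance continued to scale all the way to that limit. A separate blog post on inference scaling for cyber tasks covers this trend in more detail.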
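To put the TLO figures in proportion, the arithmetic can be sketched in a few lines of Python. This is a reader's illustration, not part of the AISI evaluation: the Wilson score interval is a standard statistical tool chosen here as an assumption for expressing how uncertain a rate based on only 10 attempts is, and the names `wilson_interval`, `TLO_STEPS`, and the variables below are hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half_width, centre + half_width


# Figures reported in the article; everything else is illustrative.
TLO_STEPS = 32
takeover_rate = 3 / 10             # full takeovers per attempt
mythos_progress = 22 / TLO_STEPS   # average share of steps completed by Mythos Preview
opus_progress = 16 / TLO_STEPS     # average share of steps completed by Claude Opus 4.6

low, high = wilson_interval(3, 10)

print(f"Full takeover rate: {takeover_rate:.0%} (interval {low:.0%} to {high:.0%})")
print(f"Average progress: Mythos Preview {mythos_progress:.1%}, Opus 4.6 {opus_progress:.1%}")
```

With only 10 attempts, the interval around the 30% takeover rate is wide, spanning roughly 11% to 60%, so the headline rate is better read as an order of magnitude than as a precise estimate.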

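Mythos Preview did show clear limits, however.

The model failed to complete a separate AISI attack simulation targeting industrial control technology (operational technology, or OT), the kind used in power plants and factories. According to AISI, that does not necessarily mean the model would fail on the OT components themselves: it stalled in the IT network portion of the simulation and never reached the OT stage.

AISI flagged several caveats. The test environments had no active defenders, no security monitoring tools, and no security software. Whether the same attacks would succeed in a well-protected, real-world corporate environment therefore remains an open question.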

Original article (English)

Claude Mythos can autonomously compromise weakly defended enterprise networks end-to-end

Matthias Bastian, Apr 14, 2026

Key Points

The British AI Security Institute (AISI) has evaluated Anthropic's Claude Mythos Preview for its cyber-attack capabilities, finding that the model achieved a 73 percent success rate in expert-level capture-the-flag challenges.

Mythos Preview is the first AI model to complete a full 32-step attack simulation on a simulated corporate network, successfully taking over the entire network in 3 out of 10 attempts.

However, the AISI notes significant limitations in the testing setup: the simulated environments lacked active defenders and security monitoring, leaving open whether the model could perform similarly against well-protected, real-world systems.

The UK's AI Security Institute tested Anthropic's Claude Mythos Preview for cyber capabilities. For the first time, an AI model autonomously completed a full attack simulation against a corporate network, as long as the network was small and weakly defended.

According to AISI, Mythos Preview represents a significant leap in AI cyber capabilities. Just two years ago, the best available models could barely handle beginner-level cyber tasks. In controlled evaluations, Mythos Preview executed multi-stage attacks on vulnerable networks, identifying and exploiting security holes autonomously when given explicit instructions and network access. These are tasks that would take human security experts days to complete, the AISI says.

Capture the flag: 73 percent success rate at expert level

In capture-the-flag (CTF) challenges, AI models must find and exploit vulnerabilities in target systems to uncover hidden flags. According to AISI, Mythos Preview achieves about 85 percent on apprentice tasks and roughly 95 percent on beginner-level technical non-expert tasks (with a 2.5 million token budget). That places it in the top tier alongside GPT-5.4, Codex 5.3, and Claude Opus 4.6.

With a larger compute budget (50 million tokens), Mythos Preview scores around 93 percent on practitioner tasks and 73 percent on expert-level challenges. That expert-level number is particularly notable: according to AISI, no model could solve expert-level tasks before April 2025.

Anthropic's Claude Mythos can autonomously hack corporate networks

CTF challenges only test individual skills in isolation, but real cyberattacks require chaining dozens of steps across multiple hosts and network segments, the AISI says. To measure that kind of complexity, the institute developed a simulation called "The Last Ones" (TLO): a 32-step attack against a simulated corporate network, from initial reconnaissance to full network takeover. AISI estimates this would take human experts around 20 hours. Full details are available in the accompanying paper.

Claude Mythos Preview is the first model to complete TLO end-to-end. It achieved a full takeover in 3 out of 10 attempts. On average, the model completed 22 of the 32 steps. The next best model, Claude Opus 4.6, averaged 16. AISI expects performance to continue improving with more inference compute. Testing used a budget of 100 million tokens, and performance scaled all the way to that limit. A separate blog post on inference scaling for cyber tasks covers this trend in more detail.

Mythos Preview did show limits, however. The model failed to complete a separate AISI attack simulation targeting industrial control technology (operational technology, or OT), the kind used in power plants and factories. According to AISI, that doesn't necessarily mean the model would fail on the OT components themselves. It never reached that stage because it stalled in the simulation's IT network during earlier steps.

AISI flags some caveats: the test environments had no active defenders, no security tooling, and no consequences for actions that would trigger alarms on a real network. Based on these results alone, there's no way to tell whether Mythos Preview could successfully breach a well-defended system. That said, the model is at least capable of "autonomously attacking small, weakly defended and vulnerable enterprise systems where access to a network has been gained," according to AISI. The institute plans to conduct future evaluations in hardened environments with active monitoring, endpoint detection, and real-time incident response.

AI cyber capabilities raise the stakes for basic security hygiene

The results underscore the importance of cybersecurity fundamentals, according to AISI: regular patching, strong access controls, secure configurations, and thorough logging. Other models with comparable capabilities are likely not far behind.

At the same time, the institute notes that AI cyber capabilities are dual-use. While they pose security risks, they could also significantly strengthen cyber defense. In a joint blog post with the UK's National Cyber Security Centre (NCSC), AISI outlines how defenders can prepare for and leverage frontier AI.

AISI has been tracking AI cyber capabilities since 2023 and has steadily raised the bar on its evaluations: from chat-based queries to capture-the-flag challenges to complex multi-stage attack simulations.

Is Mythos really too dangerous to release?

Anthropic officially launched Claude Mythos in early April. The model is currently available to only about 50 companies, reportedly because of cybersecurity concerns. The AISI results at least partly support that decision: the model can autonomously attack weakly protected networks in controlled environments.

Critics argue the restrictions are overblown, just like in 2019, when OpenAI deemed GPT-2 too dangerous to release. The performance gains over previous models aren't large enough to justify limiting access this heavily. Some say it's mainly a marketing play or that Anthropic simply doesn't have the compute capacity to offer the model more broadly.

But that's all speculation for now. We'll know for sure when your computer breaks, or doesn't, after Mythos-level AI models have been released to the public.

Source: AISI

Cybersecurity, AI safety, Claude, AISI, hacking simulation