Hacker News • 71 days ago

Lone Criminal Hacks Mexican Government Using Claude AI

IMP

9/10

Key Summary

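A single hacker abused commercial AI tools (Claude, ChatGPT) to break into Mexican government systems and steal 150 GB of data, including 195 million taxpayer records. AI did not invent any new hacking technique, but the incident shows that it has sharply lowered the cost and expertise needed to attack, so that ordinary people with malicious intent can now carry out sophisticated cyberattacks with little effort.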

Translated Text

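AI did not invent any new attacks or any new economic vulnerabilities. It did one thing: it lowered the cost and knowledge requirements for attackers by orders of magnitude, and made execution possible for anyone with a subscription and malicious intent. In 2025 alone, the news covered AI attacks that hit the Mexican government [1], seventeen healthcare and emergency services organizations [2], and eighty-five ransomware victims of one amateur in Algeria [2]. It is also happening in crypto today. And crypto is likely the only area where we will be able to count the damage clearly.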

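AI evens the playing field

Most coverage of AI in security takes one of two views. The utopian view: better audits, fewer bugs, safer code. Or the apocalyptic view: autonomous superhackers finding novel zero-days that nobody has ever seen. Both views miss what is actually happening. Frontier AI models in 2026 are producing the same kinds of findings as the static analyzers we have used for the past decade. They simply produce more of them, faster, at a lower marginal human cost. Daniel Stenberg, the curl maintainer who recently ran one of the most hyped frontier models on his own codebase, said: “the AI tools find the usual and established kind of errors we already know about. It just finds new instances of them” [3].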

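The attack catalogue itself is the same one we have been losing money to since 2021, before mass AI adoption: oracle manipulation, governance capture, flash-loan-driven economic exploitation, social engineering, credential harvesting, and classic web vulnerabilities. AI did not add a single line item. What it reduced is the labor needed to run any of these attacks. An elite Solidity auditor costs about $25,000 per engineer-week [4]. Call it $500 an hour, per their own procurement benchmarks. The same surface coverage on a frontier model costs about $1.22 per contract on average in API tokens, per Anthropic’s published figures, and the per-exploit token cost is falling roughly 22% every model generation, or about every two months [5]. The skill required to spot a flash-loan governance attack has not gone down. The cost to run one has. AI did not break the floor. The floor was never knowledge. The floor was always a price tag on attacker labor, and now that price is a subscription. AI did not democratize hacking. It just billed it monthly.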
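The cost figures in the preceding paragraph can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the constants are the numbers quoted in the text, while the function names and the assumption that the 22% decline compounds once per completed two-month generation are hypothetical and not taken from the cited sources. It also treats one model scan and one engineer-week as comparable budget units, which says nothing about whether they produce comparable findings.

```python
"""Back-of-the-envelope check of the cost figures quoted in the text."""

# Figures taken from the text.
AUDITOR_USD_PER_WEEK = 25_000.0   # elite Solidity auditor, per engineer-week [4]
AUDITOR_USD_PER_HOUR = 500.0      # the text's hourly rounding
MODEL_USD_PER_CONTRACT = 1.22     # average API token cost per contract [5]
DECLINE_PER_GENERATION = 0.22     # per-exploit token cost drop per generation [5]
MONTHS_PER_GENERATION = 2         # "about every two months"


def implied_hours_per_week() -> float:
    """Working hours per week implied by the weekly and hourly figures."""
    return AUDITOR_USD_PER_WEEK / AUDITOR_USD_PER_HOUR


def contract_scans_per_auditor_week() -> float:
    """How many average model scans one engineer-week of budget would buy."""
    return AUDITOR_USD_PER_WEEK / MODEL_USD_PER_CONTRACT


def cost_fraction_after(months: int) -> float:
    """Fraction of today's per-exploit token cost left after `months`.

    Assumption (not from the text): the 22% decline compounds once per
    completed model generation and the trend simply continues.
    """
    generations = months // MONTHS_PER_GENERATION
    return (1.0 - DECLINE_PER_GENERATION) ** generations


if __name__ == "__main__":
    print(f"implied hours per week: {implied_hours_per_week():.0f}")
    print(f"model scans per auditor-week: {contract_scans_per_auditor_week():,.0f}")
    print(f"cost fraction after 12 months: {cost_fraction_after(12):.2f}")
```

Under these assumptions the $500 hourly figure corresponds to a 50-hour week, one engineer-week of budget buys on the order of 20,000 average model scans, and the per-exploit token cost would fall to under a quarter of its current level within a year if the quoted trend held.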

Random people, real hacks, this year

The clearest evidence that the floor is now a subscription is in the confirmed cases from the last twelve months. Three of them stand out.

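First, the Mexican government hack, from December 2025 to January 2026. A solo operator (no nation-state backing, no custom malware, and no observable ties to foreign intelligence, per Gambit Security) jailbroke Claude Code into a “bug-bounty researcher” persona and ran more than 1,000 prompts against it [1], [6]. When Claude refused on safety grounds, ChatGPT was used as a backup. The result: 20 vulnerabilities exploited across the federal tax authority (SAT), the National Electoral Institute, and the state governments of Jalisco, Michoacán, and Tamaulipas. 150 gigabytes of data were exfiltrated, including 195 million taxpayer records, voter rolls, and government employee credentials. The largest known single-operator data breach in Mexican history was carried out with just two commercial AI subscriptions and persistence.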

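Second, the “vibe hacking” case of August 2025. Anthropic’s own threat intelligence team disclosed that a single cybercriminal used Claude Code as the operational core of an end-to-end extortion campaign against 17 organizations across healthcare, emergency services, government, and religious institutions [2]. Claude made tactical and strategic decisions: which credentials to harvest, which lateral movements to attempt, which data to exfiltrate, and how to phrase the psychologically tailored ransom note. The autonomy ratio is the part most coverage missed. This was not Claude as autocomplete. This was Claude as field operator.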

Third, the Algerian amateur described in the same Anthropic report.


Original Text (English)

AI did not invent any new attacks or any new economic vulnerabilities. It did one thing: it dropped the cost and knowledge requirements for attackers by orders of magnitude, and made execution possible for anyone with a subscription and malicious intent. Just in 2025, the news covered AI attacks that hit the Mexican government [1], seventeen healthcare and emergency services organizations [2], and eighty-five ransomware victims of one amateur in Algeria [2]. It is also happening in crypto today. And crypto is the only place we will be able to count it.

AI evens the playing field

Most coverage of AI in security right now picks one of two frames. Utopian - better audits, fewer bugs, safer code. Apocalyptic - autonomous superhackers finding novel zero-days that nobody has ever seen. Both frames miss what is actually happening. Frontier models in 2026 are producing the same kinds of findings as the static analyzers we have had for a decade. They just produce more of them, faster, at a lower marginal human cost. Daniel Stenberg, the curl maintainer who recently put one of the most hyped frontier models on his own codebase, said: “the AI tools find the usual and established kind of errors we already know about. It just finds new instances of them” [3].

The attack catalogue itself is the same one we have been losing money to since 2021 and before mass AI adoption. Oracle manipulation. Governance capture. Flash-loan-driven economic exploitation. Social engineering. Credential harvesting. Classic web vulnerabilities. AI did not add a single line item. What it reduced is the labor needed to operate any of them. An elite Solidity auditor costs about $25,000 per engineer-week [4]. Call it $500 an hour, per their own procurement benchmarks. The same surface coverage on a frontier model runs about $1.22 per contract on average in API tokens, per Anthropic’s own published figures, and the per-exploit token cost is falling roughly 22% every model generation, or about every two months [5]. The skill required to spot a flash-loan governance attack has not gone down. The cost to run one has. AI did not break the floor. The floor was never knowledge. The floor was always a price tag on attacker labor, and now the price is a subscription. AI did not democratize hacking. It just billed it monthly.

Random people, real hacks, this year

The clearest evidence that the floor is now a subscription is in the confirmed cases from the last twelve months. Three of them stand out.

The Mexican government, December 2025 to January 2026. A solo operator (no nation-state backing, no custom malware, no observable ties to foreign intelligence per Gambit Security) jailbroke Claude Code into a “bug-bounty researcher” persona and ran more than 1,000 prompts against it [1], [6]. When Claude refused on safety grounds, ChatGPT was used as a backup. The result: 20 vulnerabilities exploited across the federal tax authority (SAT), the National Electoral Institute, and state governments in Jalisco, Michoacán, and Tamaulipas. 150 gigabytes of data exfiltrated. 195 million taxpayer records. Voter rolls. Government employee credentials. The largest known single-operator data breach in Mexican history was executed with two commercial AI subscriptions and persistence.

The “vibe hacking” case, August 2025. Anthropic’s own threat intelligence team disclosed that a single cybercriminal used Claude Code as the operational core of an end-to-end extortion campaign against 17 organizations across healthcare, emergency services, government, and religious institutions [2]. Claude made tactical and strategic decisions. Which credentials to harvest. Which lateral movements to attempt. Which data to exfiltrate. How to phrase the psychologically tailored ransom note. The autonomy ratio is the part most coverage missed. This was not Claude as autocomplete. This was Claude as field operator.

The Algerian amateur, in the same Anthropic report [2]. Someone with no track record of writing working malware used Claude to develop, troubleshoot, package, and sell it. The packages sold on dark-web forums for $400 to $1,200. Eighty-five victims in his first month. The Anthropic write-up is explicit: “without Claude’s assistance, they could not implement or troubleshoot core malware components.”

None of these three operators are hackers by any traditional definition. None of them invented anything. They all subscribed to Claude. The catalogue stayed the same. The barrier to entry collapsed.

Crypto as the perfect case study of AI hacking impact

Crypto enters the story now, but not because it is more vulnerable than government data systems or healthcare networks. The Mexican government case is the largest single-operator incident of the year by record count. Crypto matters because it is more measurable. Public ledger. Deterministic execution. Open-source by default. Every smart contract is verifiable on Etherscan. Every exploit is timestamped. Every attacker and transaction leaves a trail in the block explorer. There is no other large-scale economic system where the offense/defense curve under AI uplift can be observed in the open, in real money, with adversarial ground truth.

There are three legitimate denominators for the volume of money already lost to the pre-AI version of this dynamic. The numbers do not reconcile, but here are a few with references. $11.9 billion in tracked smart-contract exploits across 2021 to 2025, per Immunefi’s 2026 State of Onchain Security report (425 incidents, strict smart-contract definition) [7]. Roughly $30 billion if you include scams and fraud, per Chainalysis aggregates [8]. $68 billion or more if you count exchange and protocol collapses, per Molly White’s Web3IsGoingJustGreat [9].

I use $11.9 billion as the primary anchor in the rest of this piece because Immunefi is the strictest definition. The other two are the upper bounds, and they exist for a reason. Crypto is not the easiest place to be hacked. It is the most transparent one to get traced.

Open source plus money equals the perfect target and the perfect case study

Three things make crypto the cleanest mass-scanning target in software.

Surface area. Roughly 60 million smart contracts deployed on Ethereum, per Wang et al.’s 2024 measurement study [10]. Layer-2 deployments add another order of magnitude. Flipside Crypto counted more than 637 million EVM contracts across seven L2s by 2024 [11]. Etherscan’s daily verified-contract count hit 602 in 2023 at its peak [12]. The human-auditor workforce that covers this surface is, charitably, in the low thousands worldwide.

Forensic transparency. Every prior exploit has a public record. Every attacker transaction is replayable from the block explorer. The training corpus for an attacking model is not “the public internet.” It is a curated, RAG-ready, dollar-priced exploit-and-defense dataset built by Trail of Bits, OpenZeppelin, PeckShield, BlockSec, Halborn, and the entire DeFi-security community over five years. Variant analysis (starting from a known prior bug) is dramatically more tractable for an LLM than open-ended discovery. This is the structural lesson from Google Big Sleep finding a real SQLite zero-day in October 2024 [13]. Crypto post-mortems are exactly that corpus.

Economic density per line of code. A 500-line Solidity contract can hold $200 million of TVL. The same density does not exist in the average Linux kernel module or Express.js handler, unless it is a random open source library that can break the internet standalone. The expected value of a successful mass-scan-and-exploit pipeline is therefore higher per token spent in crypto than in essentially any other software domain. This is why an AI-enabled attacker rationally targets DeFi first, and we will see much more of that in the coming years.

Where problems will start to leak first

The existing research and published evidence

Security, Hacking, AI Abuse, Claude, Data Breach