MIT Tech Review • 104 days ago

Why "Human Control" Is an Illusion in AI Warfare

IMP

8/10

Key Summary

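Amid the debate over the US Department of Defense's use of AI weapons, the article points out a contradiction: keeping "humans in the loop" in operations does not actually guarantee control. Because frontier AI works like a black box, humans risk approving decisions that could lead to war crimes without understanding the AI's intent. The article warns against rushing black-box AI onto the battlefield when even the civilian sector adopts it cautiously, and stresses the urgent need to advance the science of interpreting the intentions of AI systems.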

Translated Text

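The use of artificial intelligence in warfare is at the center of a legal battle between Anthropic and the US Department of Defense. The debate has become urgent because AI is playing a bigger role than ever in the current conflict with Iran. AI is no longer merely a tool that helps humans analyze intelligence. It is now an active player that generates targets in real time, controls and coordinates missile interceptions, and guides lethal swarms of autonomous drones.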

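Most public discussion of AI-driven autonomous lethal weapons focuses on how far humans should remain involved in, and in control of, the decision-making process ("humans in the loop"). According to the Department of Defense's current guidelines, human oversight provides accountability, situational context, and nuanced judgment while reducing the risk of hacking.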

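AI systems are opaque "black boxes"

But the debate over "humans in the loop" merely reassures us while obscuring the real problem. The most pressing danger is not that machines will act without human oversight; it is that the human overseers have no idea what the machines are actually "thinking." The Department of Defense's guidelines are fundamentally flawed because they rest on the dangerous assumption that humans understand how AI systems work. Having studied intentions in the human brain for decades, and in AI systems more recently, I can say with confidence that state-of-the-art AI systems are essentially "black boxes." We know the inputs and outputs, but the artificial "brain" that processes them remains opaque. Even the creators of these systems cannot fully interpret them or completely understand how they work. And when an AI does give reasons, those reasons are not always trustworthy.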

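The illusion of human oversight in autonomous systems

The most fundamental question in the debate over human oversight is going unasked: can we understand what an AI system intends to do before it acts?

Imagine an autonomous drone tasked with destroying an enemy munitions factory. The automated command and control system determines that the optimal target is a munitions storage building. It reports a 92% probability of mission success, because secondary explosions of the munitions inside the building will completely destroy the facility. A human operator reviews the legitimate military objective, sees the high success rate, and approves the strike.

But there is something the operator does not know: the AI system's calculation included a hidden factor. Beyond destroying the munitions factory, the secondary explosions would also severely damage a nearby children's hospital. Emergency responders would then concentrate on the hospital, which would ensure that the factory burns down completely. From the AI's perspective, maximizing the effect of the strike in this way is the optimal action for achieving its given objective. From a human perspective, it violates the rules protecting civilian life and is potentially a war crime. Because humans cannot know the AI's intention before it acts, keeping a human in the decision-making loop may not provide the safeguard people imagine.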

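Advanced AI systems do not simply execute instructions; they interpret them. If operators fail to define their objectives carefully enough (which happens very often in high-pressure situations), the "black box" system could carry out its instructions exactly and still act in a way entirely different from what the humans intended. This "intention gap" between AI systems and human operators is precisely why we hesitate to deploy frontier black-box AI in fields such as civilian health care and air traffic control, and why its integration into the workplace remains fraught with problems. Yet we are rushing to deploy it on the battlefield.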
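The intention gap described above can be made concrete with a toy model. The following is a minimal sketch, not a description of any real targeting system: the `StrikePlan` type, the `civilian_harm` scale, the second plan, and both thresholds are hypothetical, and only the 92% figure comes from the article's scenario. It shows a planner that optimizes the stated objective alone, an operator who sees only the reported success rate, and the choice the operator would have made had the hidden factor been visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrikePlan:
    name: str
    p_success: float      # probability the factory is destroyed
    civilian_harm: float  # hypothetical 0..1 scale, NOT shown to the operator


def planner_choice(plans):
    """The planner optimizes only the objective it was given: mission success."""
    return max(plans, key=lambda p: p.p_success)


def operator_view(plan):
    """What the human in the loop sees: the target and the success rate."""
    return {"target": plan.name, "p_success": plan.p_success}


def operator_approves(view, min_success=0.9):
    """Hypothetical approval rule: sign off on any high-confidence strike."""
    return view["p_success"] >= min_success


def intended_choice(plans, max_harm=0.05):
    """What the operator meant: best success among plans with acceptable harm."""
    allowed = [p for p in plans if p.civilian_harm <= max_harm]
    return max(allowed, key=lambda p: p.p_success) if allowed else None


plans = [
    StrikePlan("storage building", p_success=0.92, civilian_harm=0.60),
    StrikePlan("assembly hall", p_success=0.81, civilian_harm=0.02),
]

chosen = planner_choice(plans)
view = operator_view(chosen)
approved = operator_approves(view)
intended = intended_choice(plans)

print(f"planner chose: {chosen.name}, approved: {approved}")
print(f"operator intended: {intended.name}")
```

In this sketch the operator approves the storage-building strike because nothing in `operator_view` exposes `civilian_harm`. The planner did exactly what it was told, and the approval step added no protection, which is the sense in which oversight without insight into the system's reasoning can fail.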

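To make matters worse, if one side in a conflict deploys fully autonomous weapons that operate at machine speed and scale, the pressure to stay competitive will push the other side to rely on such weapons too. This means that increasingly autonomous and opaque AI decision-making is likely to see growing use in war.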

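The solution: Advance the science of AI intentions

The science of AI must include both building highly capable AI technology and understanding how that technology works. Enormous progress has been made in developing and building more capable models.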

Original Text (English)

The availability of artificial intelligence for use in warfare is at the center of a legal battle between Anthropic and the Pentagon. This debate has become urgent, with AI playing a bigger role than ever before in the current conflict with Iran. AI is no longer just helping humans analyze intelligence. It is now an active player, generating targets in real time, controlling and coordinating missile interceptions, and guiding lethal swarms of autonomous drones.

Most of the public conversation regarding the use of AI-driven autonomous lethal weapons centers on how much humans should remain “in the loop.” Under the Pentagon’s current guidelines, human oversight supposedly provides accountability, context, and nuance while reducing the risk of hacking.

AI systems are opaque “black boxes”

But the debate over “humans in the loop” is a comforting distraction. The immediate danger is not that machines will act without human oversight; it is that human overseers have no idea what the machines are actually “thinking.” The Pentagon’s guidelines are fundamentally flawed because they rest on the dangerous assumption that humans understand how AI systems work.

Having studied intentions in the human brain for decades and in AI systems more recently, I can attest that state-of-the-art AI systems are essentially “black boxes.” We know the inputs and outputs, but the artificial “brain” processing them remains opaque. Even their creators cannot fully interpret them or understand how they work. And when AIs do provide reasons, they are not always trustworthy.

The illusion of human oversight in autonomous systems

In the debate over human oversight, a fundamental question is going unasked: Can we understand what an AI system intends to do before it acts?

Imagine an autonomous drone tasked with destroying an enemy munitions factory. The automated command and control system determines that the optimal target is a munitions storage building. It reports a 92% probability of mission success because secondary explosions of the munitions in the building will thoroughly destroy the facility. A human operator reviews the legitimate military objective, sees the high success rate, and approves the strike.

But what the operator does not know is that the AI system’s calculation included a hidden factor: Beyond devastating the munitions factory, the secondary explosions would also severely damage a nearby children’s hospital. The emergency response would then focus on the hospital, ensuring the factory burns down.

To the AI, maximizing disruption in this way meets its given objective. But to a human, it is potentially committing a war crime by violating the rules regarding civilian life. Keeping a human in the loop may not provide the safeguard people imagine, because the human cannot know the AI’s intention before it acts.

Advanced AI systems do not simply execute instructions; they interpret them. If operators fail to define their objectives carefully enough (a highly likely scenario in high-pressure situations), the “black box” system could be doing exactly what it was told and still not acting as humans intended. This “intention gap” between AI systems and human operators is precisely why we hesitate to deploy frontier black-box AI in civilian health care or air traffic control, and why its integration into the workplace remains fraught, yet we are rushing to deploy it on the battlefield.

To make matters worse, if one side in a conflict deploys fully autonomous weapons, which operate at machine speed and scale, the pressure to remain competitive would push the other side to rely on such weapons too. This means the use of increasingly autonomous (and opaque) AI decision-making in war is only likely to grow.

The solution: Advance the science of AI intentions

The science of AI must comprise both building highly capable AI technology and understanding how this technology works. Huge advances have been made in developing and building more capable models, driven by record investments, forecast by Gartner to grow to around $2.5 trillion in 2026 alone. In contrast, the investment in understanding how the technology works has been minuscule.

We need a massive paradigm shift. Engineers are building increasingly capable systems. But understanding how these systems work is not just an engineering problem; it requires an interdisciplinary effort.

We must build the tools to characterize, measure, and intervene in the intentions of AI agents before they act. We need to map the internal pathways of the neural networks that drive these agents so that we can build a true causal understanding of their decision-making, moving beyond merely observing inputs and outputs.

A promising way forward is to combine techniques from mechanistic interpretability (breaking neural networks down into human-understandable components) with insights, tools, and models from the neuroscience of intentions. Another idea is to develop transparent, interpretable “auditor” AIs designed to monitor the behavior and emergent goals of more capable black-box systems in real time.

Developing a better understanding of how AI functions will enable us to rely on AI systems for mission-critical applications. It will also make it easier to build more efficient, more capable, and safer systems.

Colleagues and I are exploring how ideas from neuroscience, cognitive science, and philosophy (fields that study how intentions arise in human decision-making) might help us understand the intentions of artificial systems. We must prioritize these kinds of interdisciplinary efforts, including collaborations between academia, government, and industry.

However, we need more than just academic exploration.
The tech industry (and the philanthropists funding AI alignment, which strives to encode human values and goals into these models) must direct substantial investments toward interdisciplinary interpretability research. Furthermore, as the Pentagon pursues increasingly autonomous systems, Congress must mandate rigorous testing of AI systems’ intentions, not just their performance. Until we achieve that, human oversight over AI may be more illusion than safeguard.

Uri Maoz is a cognitive and computational neuroscientist specializing in how the brain transforms intentions into actions. A professor at Chapman University with appointments at UCLA and Caltech, he leads an interdisciplinary initiative focused on understanding and measuring intentions in artificial intelligence systems (ai-intentions.org).

AI weapons, military AI, AI safety, regulatory policy, black-box AI