MIT Tech Review • 92 days ago

The gap between AI hype and actual profit

IMP

8/10

Key summary

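The AI industry has built the technology and promised transformation, but it lacks a concrete plan (Step 2) for how to get there. Recent studies indicate that AI still falls short on complex real-world work, and that combining it with existing workflows can actually reduce efficiency. The takeaway: be wary of blind faith in AI and think carefully about how to integrate it into real working environments.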

Translated text

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this first, sign up here. In February, I picked up a flyer at an anti-AI march in London. I can't say for sure whether its writers meant to parody South Park's underpants gnomes. But if they did, they nailed it. The flyer read: "Step 1: Grow a digital super mind. Step 2: ? Step 3: ?" Produced by Pause AI, an international activist group that co-organized the protest, it ended with this plea to the reader: "Pause AI until we know what the hell Step 2 is."

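In the South Park episode "Gnomes," which first aired in 1998, Kenny, Kyle, Cartman, and Stan discover a community of gnomes that sneak out at night to steal underpants from dressers. Why? The gnomes present their pitch deck: "Phase 1: Collect underpants. Phase 2: ? Phase 3: Profit."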

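The gnomes' business plan has since become one of the classic internet memes, used to satirize everything from startup strategies to policy proposals. Even Elon Musk, a master of memes, once invoked it when talking about how he planned to fund a mission to Mars.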

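Right now, the meme perfectly captures the state of AI. Companies have built the technology (Step 1) and promised to transform the world (Step 3). How they get there is still a big question mark. Pause AI argues that Step 2 must involve some kind of regulation. But exactly what regulation to call for and who will enforce it are still up for debate.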

AI boosters, on the other hand, are convinced that Step 3 is salvation and tend to gloss over the steps in between. They see us racing toward sunny uplands on the back of an "economically transformative technology," as OpenAI's chief scientist, Jakub Pachocki, put it to me a few weeks ago. They know roughly where they want to go. It is just still a little hazy and some way off. But everyone is taking a different route. Will they all make it? Will anyone?

For every grand claim about the future, there is a more sober assessment that points to practical limits and quells the hype. Consider two recent studies.

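First, a study from Anthropic predicted which types of jobs large language models (LLMs) will affect most. (A takeaway: managers, architects, and people in the media should prepare for change; groundskeepers, construction workers, and those in hospitality, not so much.) But these predictions are really just guesses, based on what kinds of tasks LLMs seem to be good at rather than how they actually perform in the workplace.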

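Second, another study, published in February by researchers at Mercor, an AI hiring startup, tested several AI agents powered by top-tier models from OpenAI, Anthropic, and Google DeepMind on 480 workplace tasks frequently carried out by human bankers, consultants, and lawyers. Every agent tested failed to complete most of its tasks.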

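Why is there such wide disagreement? There are a number of factors. For a start, it is important to consider who is making the claims (and why). Anthropic has skin in the game. What's more, most of the people telling us that something big is about to happen have reached that conclusion on the basis of how fast AI coding tools are improving. But not every task can be solved with coding. Other studies have found, for example, that LLMs are bad at making strategic judgment calls.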

More important, these tools are not simply dropped into a controlled cleanroom. They have to work in messy environments where people and existing workflows are entangled. Sometimes adding AI will make things worse. Sure, for AI to be truly transformative, those workflows may need to be torn up and rebuilt around the new technology. But that will take time and guts.

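That big hole is right where Step 2 should be. The lack of agreement on exactly what is about to happen, and how, creates an information vacuum that gets filled by the latest wild, unfounded claims. We are so far removed from any real understanding of what is coming and how it will be deployed that a single social media post can (and does) shake markets.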

Original text (English)

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

In February, I picked up a flyer at an anti-AI march in London. I can’t say for sure whether or not its writers meant to riff on South Park’s underpants gnomes. But if they did, they nailed it: “Step 1: Grow a digital super mind,” it read. “Step 2: ? Step 3: ?” Produced by Pause AI, an international activist group that co-organized the protest, it ended with this plea to the reader: “Pause AI until we know what the hell Step 2 is.”

In the South Park episode “Gnomes,” which first aired in 1998, Kenny, Kyle, Cartman, and Stan discover a community of gnomes that sneak out at night to steal underpants from dressers. Why? The gnomes present their pitch deck. “Phase 1: Collect underpants. Phase 2: ? Phase 3: Profit.”

The gnomes’ business plan has since become one of the greats among internet memes, used to satirize everything from startup strategies to policy proposals. Memelord in chief Elon Musk once invoked it in a talk about how he planned to fund a mission to Mars.

Right now, it captures the state of AI. Companies have built the tech (Step 1) and promised transformation (Step 3). How they get there is still a big question mark. As far as Pause AI is concerned, Step 2 must involve some kind of regulation. But exactly what it will call for and who will enforce it are up for debate.

AI boosters, on the other hand, are convinced that Step 3 is salvation and tend to glaze over the middle bit. They see us racing toward sunny uplands on the back of an “economically transformative technology,” as OpenAI’s chief scientist, Jakub Pachocki, put it to me a few weeks ago. They know where they want to go, more or less: It’s hazy up there and still some way off. But everyone’s taking a different route. Will they all make it? Will anyone?

For every big claim about the future, there is a more sober assessment of how the rubber meets the road, one that quells the hype. Consider two recent studies.

One, from Anthropic, predicted what types of jobs are going to be most affected by LLMs. (A takeaway: Managers, architects, and people in the media should prepare for change; groundskeepers, construction workers, and those in hospitality, not so much.) But their predictions are really just guesses, based on what kinds of tasks LLMs seem to be good at rather than how they really perform in the workplace.

Another study, put out in February by researchers at Mercor, an AI hiring startup, tested several AI agents powered by top-tier models from OpenAI, Anthropic, and Google DeepMind on 480 workplace tasks frequently carried out by human bankers, consultants, and lawyers. Every agent they tested failed to complete most of its duties.

Why is there such wide disagreement? There are a number of factors. For a start, it’s crucial to consider who is making the claims (and why). Anthropic has skin in the game. What’s more, most of the people telling us that something big is about to happen have reached that conclusion largely on the basis of how fast AI coding tools are getting. But not all tasks can be hacked with coding. Other studies have found that LLMs are bad at making strategic judgment calls, for example.

What’s more, when they’re deployed, the tools aren’t just dropped into a cleanroom. They need to work in places contaminated with people and existing workflows. And sometimes adding AI will make things worse. Sure, maybe those workflows need to be torn up and refashioned around the new technology for it to achieve transformative status, but that will take time (and guts).

That big hole? It’s right where Step 2 should be. The lack of agreement on exactly what’s about to happen (and how) creates an information vacuum that gets filled by the latest wild claim of the week, evidence be damned.
We’re so unmoored from any real understanding of what’s coming and how it will be deployed that a single social media post can (and does) shake markets.

We need fewer guesses and more evidence. But that’s going to require transparency from the model makers, coordination between researchers and businesses, and new ways to evaluate this technology that tell us what really happens when it’s rolled out in the real world.

The tech industry (and with it the world’s economy) rests on the held-out promise that AI really will be transformative. But that is not yet a sure bet. Next time you hear bold claims about the future, remember that most businesses are still figuring out what to do with their underpants.

AI trends, business strategy, LLM limitations, AI regulation, hype