TechCrunch AI • 105 days ago

A Startup's Gambit: Tokenmaxxing Its Way to Becoming the Next Cloud Powerhouse

IMP

7/10

Key Summary

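Parasail, a startup focused on delivering the "fast and cheap tokens" that AI developers are asking for, has raised a $32 million Series A. Rather than depending on chips of its own, the company mainly rents GPU time at data centers around the world and orchestrates it flexibly, providing infrastructure that lowers the cost of inference, which is essential to building on open-source models and AI agents.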

Translated Text

"Give me tokens. Just give me tokens. I want them fast. I want them cheap. I want them now." That is the mantra of developers building software on generative AI models, or at least what Parasail CEO Mike Henry hears. Parasail, which provides a cloud computing service to companies running AI model inference, says it generates 500 billion tokens a day. How's that for tokenmaxxing?

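Henry is a former executive at Groq, the LLM-focused chipmaker, where he built the company's cloud offering after recognizing early that developers building software on AI models would want cloud processing specialized to their needs. Now Parasail, which came out of stealth a year ago, has raised a $32 million Series A to expand that kind of service at scale.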

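Henry has a background in physical chip design, but Parasail is not fixated on owning its own chips. It owns some GPUs, but it mainly rents processing time at 40 data centers in 15 countries and buys more on liquidity markets, orchestrating all of it behind the scenes to drive down the cost of inference requests. By allocating workloads intelligently and avoiding demand peaks, it aims to compete with companies that own their own silicon and may be constrained by existing customer commitments and workloads.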

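The company's potential depends on the continued proliferation of open-source models and AI agents outside the large AI labs that build frontier models. Parasail's executives and investors say this trend is driven by the growing cost and friction of using offerings from companies like Anthropic and OpenAI. Instead, a hybrid architecture is emerging, according to Andreas Stuhlmüller, CEO of Elicit, a startup that has raised a $22 million Series A to build a research assistant for scientific literature. His customers at top pharmaceutical companies use the LLM-based tool to review and analyze data from tens of thousands of scientific papers.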

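"We've moved more towards open models because it's pretty rough sending 100,000s of requests to an API endpoint," Stuhlmüller told TechCrunch, adding that this is especially true now that Elicit relies on agents to improve its offering, splitting up tasks and working more strategically over longer time horizons. Open models handle the initial screening to drive down the cost of the work, and a more capable frontier model then provides the final answer.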
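The two-stage pattern Stuhlmüller describes can be sketched roughly as follows. This is a minimal illustration, not Elicit's or Parasail's actual implementation: the function names, the keyword-based screening rule, and the sample papers are all hypothetical stand-ins for real model calls.

```python
from typing import Callable, List, Tuple


def open_model_is_relevant(paper: str, question: str) -> bool:
    """Hypothetical cheap screening step.

    A simple keyword match stands in for a call to an inexpensive open model.
    """
    paper_lower = paper.lower()
    return any(word.lower() in paper_lower for word in question.split())


def frontier_model_answer(papers: List[str], question: str) -> str:
    """Hypothetical expensive final step.

    A formatted string stands in for a call to a more capable frontier model.
    """
    return f"Answer to {question!r} based on {len(papers)} screened paper(s)."


def hybrid_pipeline(
    papers: List[str],
    question: str,
    screen: Callable[[str, str], bool] = open_model_is_relevant,
    answer: Callable[[List[str], str], str] = frontier_model_answer,
) -> Tuple[str, int]:
    """Screen every paper with the cheap model, then send only the
    shortlist to the expensive model for the final answer."""
    shortlisted = [paper for paper in papers if screen(paper, question)]
    return answer(shortlisted, question), len(shortlisted)


if __name__ == "__main__":
    sample_papers = [
        "Statin dosage in elderly patients",
        "Deep learning for protein folding",
        "A trial of statin therapy",
    ]
    final_answer, shortlisted_count = hybrid_pipeline(sample_papers, "statin dosage")
    print(shortlisted_count, final_answer)
```

The point of the design is that the expensive model is called once on a small shortlist, while the high-volume screening requests go to the cheaper model.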

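As agents become an increasingly common part of software development, model queries are multiplying rapidly, which is driving investment in companies like Parasail that provide infrastructure for cheap inference. Samir Kumar, a partner at Touring Capital, which co-led the round, expects inference to account for at least 20% of the cost of building software in the future.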

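How much of that market could Parasail capture? In the crowded cloud computing market, Henry argues that his company's focus on inference alone (it does not handle model training) and its willingness to take on startup customers without long-term contracts set it apart from large cloud computing companies focused on enterprise business and from better-funded cloud inference competitors such as Fireworks AI and Baseten. Of course, a different kind of risk arises when all of your customers are seed and Series B startups.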

Original Text (English)

"Give me tokens. Just give me tokens. I want them fast. I want them cheap. I want them now." That's the mantra for developers building software on generative AI models, or at least what Parasail CEO Mike Henry hears. Parasail provides a cloud computing service to companies running AI models for inference, and Henry told TechCrunch it generates 500 billion tokens a day. How's that for tokenmaxxing?

Henry was an executive at Groq, the LLM-focused chipmaker, where he built the company's cloud offering, an early recognition that developers building software on AI models would want cloud processing specialized to their needs. Now, after coming out of stealth a year ago, Parasail has raised a $32 million Series A to do that at scale.

Henry has a background in physical chip design, but Parasail isn't committed to owning its own chips. While some of its GPUs are its own, the company mainly rents processing time at 40 data centers in 15 countries around the globe, and buys more from liquidity markets, orchestrating that all behind the scenes to drive down the cost of inference requests. By allocating workloads cleverly and avoiding demand peaks, the company aims to compete with firms that own their own silicon and might be constrained by existing customer commitments and workloads.

The company's potential relies on the continued proliferation of open-source models and agents outside of frontier labs. Parasail's executives and investors say this is driven by the growing cost and friction of using offerings from companies like Anthropic and OpenAI. Instead, a hybrid architecture is emerging, according to Andreas Stuhlmüller, the CEO of Elicit, a startup that has raised a $22 million Series A to develop a research assistant for scientific literature. His customers at top pharmaceutical companies use the LLM-based tool to review and analyze data from tens of thousands of scientific papers.

"We've moved more towards open models because it's pretty rough sending 100,000s of requests to an API endpoint," Stuhlmüller told TechCrunch, especially now that the company is relying on agents to improve its offering, splitting up tasks and working more strategically over longer time horizons. Open models handle the initial screening to drive down the cost of the work, before a more capable frontier model provides a final answer.

The proliferation of model queries, as agents become an increasingly common part of software development, is driving the investment in companies like Parasail that provide the infrastructure for cheap inference. Samir Kumar, a partner at Touring Capital who co-led this round, told TechCrunch he expects inference to be at least 20% of the cost of building software in the future.

How much of that market could be Parasail's? In the crowded cloud compute space, Henry argues that his firm's focus on inference (no training allowed) and willingness to take on startup customers without long-term commitments sets his offering apart from larger cloud-computing companies focused on enterprise business, and even better-funded competitors in the cloud inference space, like Fireworks AI and Baseten. Of course, there's a different kind of risk when all of your customers are seed and Series B startups in the unpredictable AI sector.

Steve Jang, a partner at Kindred Ventures, the other co-leader in this fundraising, says the economics of deploying models will demand the kind of compute brokerage Parasail provides. And that's before widespread use of models for content generation and robotics. "Everyone thought there was an AI bubble. There's no AI bubble," he told TechCrunch. "Inference demand is far outstripping supply."

Topics: AI, Exclusive, Parasail
Tim Fernholz

Infrastructure, Cloud Computing, AI Inference, Open Source, Startups