메뉴
HN
Hacker News 43일 전

Fil-C 간소화 모델: 메모리 안전한 C/C++ 구현 원리

IMP
7/10
핵심 요약

최근 화제가 된 C/C++의 메모리 안전 구현체인 Fil-C의 작동 방식을 이해하기 쉽게 소개한 글입니다. 포인터 변수마다 메모리 할당 기록(AllocationRecord)을 추적하여, 컴파일러가 자동으로 바운드 검사(bounds check) 코드를 삽입하는 원리를 코드 변환 예시로 설명합니다. 이를 통해 개발자는 기존 C/C++ 코드를 크게 수정하지 않고도 메모리 안전성을 확보할 수 있습니다.

번역된 본문

최근 메모리 안전한 C/C++ 구현체를 표방하는 Fil-C에 대한 이야기를 많이 접했습니다. 어떻게 이를 달성했는지에 대한 복잡한 세부 사항을 읽어볼 수도 있지만, 처음 접하는 사람들에게는 단순화된 버전을 보여주는 것이 가치 있다고 생각합니다. 단순화된 버전을 이해하고 나면, 실제 프로덕션 수준의 버전을 이해하는 것도 정신적으로 훨씬 수월해지기 때문입니다.

실제 Fil-C는 LLVM IR을 재작성하는 컴파일러 패스(pass)를 사용하지만, 이 간소화된 모델은 C/C++ 소스 코드의 자동 재작성 방식입니다. 즉, 안전하지 않은 코드(unsafe code)가 안전한 코드(safe code)로 변환됩니다.

첫 번째 변환은 모든 함수 내에서 포인터 타입의 모든 지역 변수에 AllocationRecord* 타입의 동반 지역 변수가 추가되는 것입니다. 예를 들어 다음과 같습니다.

[원본 소스코드] void f() { T1* p1; T2* p2; uint64_t x; ...

[Fil-C 변환 후] void f() { T1* p1; AllocationRecord* p1ar = NULL; T2* p2; AllocationRecord* p2ar = NULL; uint64_t x; ...

여기서 AllocationRecord는 다음과 같은 형태입니다: struct AllocationRecord { char * visible_bytes; char * invisible_bytes; size_t length; };

포인터 타입 지역 변수에 대한 사소한 연산들은 AllocationRecord*도 함께 이동하도록 재작성됩니다.

[원본 소스코드] p1 = p2; p1 = p2 + 10; p1 = (T1*)x; x = (uintptr_t)p1;

[Fil-C 변환 후] p1 = p2, p1ar = p2ar; p1 = p2 + 10, p1ar = p2ar; p1 = (T1*)x, p1ar = NULL; x = (uintptr_t)p1;

포인터가 함수로 전달되거나 함수에서 반환될 때, 코드는 원래 포인터와 함께 AllocationRecord*를 포함하도록 재작성됩니다. 특정 표준 라이브러리 함수 호출은 추가로 해당 함수의 Fil-C 버전을 호출하도록 재작성됩니다.

이를 종합하면 다음과 같습니다.

[원본 소스코드] p1 = malloc(x); ... free(p1);

[Fil-C 변환 후] {p1, p1ar} = filc_malloc(x); ... filc_free(p1, p1ar);

filc_malloc의 (간소화된) 구현은 실제로 요청된 할당 하나 대신 세 가지 개별 할당을 수행합니다: void * filc_malloc(size_t length) { AllocationRecord* ar = malloc(sizeof(AllocationRecord)); ar->visible_bytes = malloc(length); ar->invisible_bytes = calloc(length, 1); ar->length = length; return {ar->visible_bytes, ar}; }

포인터 변수가 역참조(dereference)될 때, 동반되는 AllocationRecord*가 바운드 검사(bounds check)를 수행하는 데 사용됩니다:

[원본 소스코드] x = *p1; ... *p2 = x;

[Fil-C 변환 후] assert(p1ar != NULL); uint64_t i = (char*)p1 - p1ar->visible_bytes; assert(i < p1ar->length); assert((p1ar->length - i) >= sizeof(*p1)); x = p1; ... assert(p2ar != NULL); uint64_t i = (char)p2 - p2ar->visible_bytes; assert(i < p2ar->length); assert((p2ar->length - i) >= sizeof(*p2)); *p2 = x;

저장되거나 로드되는 값 자체가 포인터인 경우 상황은 더 흥미로워집니다. 이미 본 것처럼, 포인터 타입의 지역 변수에는 컴파일러에 의해 동반되는 AllocationRecord* 변수가 삽입됩니다. 컴파일러는 모든 지역 변수에 대한 완전한 제어权和 가시성을 가지고 있기 때문에 이 작업을 수행할 수 있습니다.

포인터가 지역 변수가 아닌 힙(heap)에 존재하게 되면 상황은 더 어려워지지만, 이때 invisible_bytes가 사용됩니다. visible_bytes + i에 포인터가 있다면, 그에 동반되는 AllocationRecord*invisible_bytes + i에 위치하게 됩니다. 다시 말해, invisible_bytes는 요소 타입이 AllocationRecord*인 배열입니다. 이 배열에 대한 정상적인 접근을 보장하기 위해 isizeof(AllocationRecord*)의 배수여야 합니다. 이를 위한 추가 로직은 다음과 같습니다.

[원본 소스코드] p2 = *p1; ... *p1 = p2;

[Fil-C 변환 후] assert(p1ar != NULL); uint64_t i = (char*)p1 - p1ar->visible_bytes; assert(i < p1ar->length); assert((p1ar->length - i) >= sizeof(p1)); assert((i % sizeof(AllocationRecord)) == 0); p2 = p1; p2ar = (AllocationRecord)(p1ar->invisible_bytes + i); ... assert(p1ar != NULL); uint64_t i = (char*)p1 - p1ar->visible_bytes; assert(i < p1ar->length); assert((p1ar->length - i) >= sizeof(p1)); assert((i % sizeof(AllocationRecord)) == 0); p1 = p2; (AllocationRecord)(p1ar->invisible_bytes + i) = p2ar;

아직 살펴보지 않은 한 가지는 fi...(원문 일부 누락)

원문 보기
원문 보기 (영어)
I've seen lots of chatter about Fil-C recently, which pitches itself as a memory safe implementation of C/C++. You can read the gritty details of how this is achieved, but for people coming across it for the first time, I think there is value in showing a simplified version, as once you've understood the simplified version it becomes a smaller mental step to then understand the production-quality version. The real Fil-C has a compiler pass which rewrites LLVM IR, whereas the simplified model is an automated rewrite of C/C++ source code: unsafe code is transformed into safe code. The first rewrite is that within every function, every local variable of pointer type gains an accompanying local variable of AllocationRecord* type, for example: Original Source After Fil-C Transform void f () { T1* p1; T2* p2; uint64_t x; ... void f () { T1* p1; AllocationRecord* p1ar = NULL ; T2* p2; AllocationRecord* p2ar = NULL ; uint64_t x; ... Where AllocationRecord is something like: struct AllocationRecord { char * visible_bytes; char * invisible_bytes; size_t length; }; Trivial operations on local variables of pointer type are rewritten to also move around the AllocationRecord* : Original Source After Fil-C Transform p1 = p2; p1 = p2, p1ar = p2ar; p1 = p2 + 10; p1 = p2 + 10, p1ar = p2ar; p1 = (T1*)x; p1 = (T1*)x, p1ar = NULL; x = (uintptr_t)p1; x = (uintptr_t)p1; When pointers are passed-to or returned-from functions, the code is rewritten to include the AllocationRecord* as well as the original pointer. Calls to particular standard library functions are additionally rewritten to call Fil-C versions of those functions. Putting this together, we get: Original Source After Fil-C Transform p1 = malloc (x); ... free (p1); {p1, p1ar} = filc_malloc (x); ... filc_free (p1, p1ar); The (simplified) implementation of filc_malloc actually performs three distinct allocations rather than just the requested one: void * filc_malloc(size_t length) { AllocationRecord* ar = malloc ( sizeof (AllocationRecord)); ar->visible_bytes = malloc (length); ar->invisible_bytes = calloc (length, 1 ); ar->length = length; return {ar->visible_bytes, ar}; } When a pointer variable is dereferenced, the accompanying AllocationRecord* is used to perform bounds checks: Original Source After Fil-C Transform x = *p1; ... *p2 = x; assert(p1ar != NULL); uint64_t i = ( char *)p1 - p1ar->visible_bytes; assert(i < p1ar->length); assert((p1ar->length - i) >= sizeof (*p1)); x = *p1; ... assert(p2ar != NULL); uint64_t i = ( char *)p2 - p2ar->visible_bytes; assert(i < p2ar->length); assert((p2ar->length - i) >= sizeof (*p2)); *p2 = x; Things become more interesting when the value being stored or loaded is itself a pointer. As already seen, local variables of pointer type have their accompanying AllocationRecord* variable inserted by the compiler, which the compiler can do because it has full control and visibility of all local variables. Once pointers exist in the heap rather than just in local variables, things become harder, but this is where invisible_bytes comes in: if there is a pointer at visible_bytes + i , then its accompanying AllocationRecord* is at invisible_bytes + i . In other words, invisible_bytes is an array with element type AllocationRecord* . To ensure sane access to this array, i must be a multiple of sizeof(AllocationRecord*) . The extra logic for this is highlighted in green: Original After Fil-C Transform p2 = *p1; ... *p1 = p2; assert(p1ar != NULL); uint64_t i = ( char *)p1 - p1ar->visible_bytes; assert(i < p1ar->length); assert((p1ar->length - i) >= sizeof (*p1)); assert((i % sizeof (AllocationRecord*)) == 0 ); p2 = *p1; p2ar = *(AllocationRecord**)(p1ar->invisible_bytes + i); ... assert(p1ar != NULL); uint64_t i = ( char *)p1 - p1ar->visible_bytes; assert(i < p1ar->length); assert((p1ar->length - i) >= sizeof (*p1)); assert((i % sizeof (AllocationRecord*)) == 0 ); *p1 = p2; *(AllocationRecord**)(p1ar->invisible_bytes + i) = p2ar; One thing we've not yet seen is filc_free , which does something like: void filc_free ( void * p, AllocationRecord* par) { if (p != NULL) { assert(par != NULL); assert(p == par->visible_bytes); free (par->visible_bytes); free (par->invisible_bytes); par->visible_bytes = NULL; par->invisible_bytes = NULL; par->length = 0 ; } } The eagle-eyed will note that filc_malloc made three allocations, but filc_free only frees two of them: the AllocationRecord object isn't freed by filc_free . This gap gets covered by the addition of a garbage collector (GC). You heard that right - this is C/C++ with a GC. The production-quality Fil-C has a parallel concurrent incremental collector , but a stop-the-world collector suffices for a simple model. The collector traces through AllocationRecord objects, and frees any unreachable ones. It also does two more things: Upon freeing an unreachable AllocationRecord , call filc_free on it. If an AllocationRecord has length 0, any pointers to that AllocationRecord will be changed to point at a single canonical AllocationRecord with length 0. Point 1 means that if you're using Fil-C, forgetting to call free is no longer a memory leak: the memory will be automatically freed by the GC. That isn't to say that calling free is useless, as it allows memory to be freed earlier than the GC might otherwise choose to. Point 2 means that after calling free on something, the accompanying AllocationRecord will eventually become unreachable, and thus itself eventually be freed. Once a GC is present, it becomes tempting to use it more. One such use is making it safe to take the address of local variables, even if the resultant pointer is used after the local variable goes out of scope. If the compiler sees that a local variable has its address taken, and cannot prove that the address doesn't escape beyond the lifetime of the local variable, then the Fil-C transform will promote that local variable to be heap-allocated via malloc rather than stack-allocated. A matching free doesn't need to be inserted, as the GC will pick it up. The final thing I want to highlight is the Fil-C version of memmove . This function from the C standard library manipulates arbitrary memory, and the compiler has no knowledge of what pointers might be present in that memory. To get past this problem, a reasonable heuristic is used: any pointers within arbitrary memory need to be completely within arbitrary memory, and need to be correctly aligned. This has the interesting consequence that memmove of eight aligned bytes behaves differently to eight separate 1-byte memmove s of the constituent bytes: the former will also memmove the corresponding range of invisible_bytes , whereas the latter will not. That wraps up the simplified model. Some of the additional complications in the production-quality version include: Threads: Concurrency makes the GC more complex. It also means that filc_free can't immediately free anything, as the free-ing thread might be racing with a different thread trying to access the underlying memory. Atomic operations on pointers also need some extra magic, as the default rewriting of a pointer load or store is to two loads or stores, which breaks atomicity. Function pointers: An additional piece of metadata in AllocationRecord is used to denote that the visible_bytes pointer is a pointer to executable code rather than regular data. Calls through a function pointer p1 check that p1 == p1ar->visible_bytes and that p1ar denotes a function pointer. To avoid type confusion attacks on function pointers, the function calling ABI also needs to verify that the type signature is correct. One way of doing this is to make all functions take the same type signature: all parameters are passed as if they were packed into a structure and passed through memory, and at ABI boundaries, every function expects to receive just a single AllocationRecord corresponding to that structure. Memory usage optimization: It is very tempting to have filc_malloc avoid immediately allocatin