Gc
Simple, zero-dependency garbage collection for C
Install / Use
/learn @mkirchner/GcREADME
gc: mark & sweep garbage collection for C
gc is an implementation of a conservative, thread-local, mark-and-sweep
garbage collector. The implementation provides a fully functional replacement
for the standard POSIX malloc(), calloc(), realloc(), and free() calls.
The focus of gc is to provide a conceptually clean implementation of
a mark-and-sweep GC, without delving into the depths of architecture-specific
optimization (see e.g. the [Boehm GC][boehm] for such an undertaking). It
should be particularly suitable for learning purposes and is open for all kinds
of optimization (PRs welcome!).
The original motivation for gc is my desire to write [my own LISP][stutter]
in C, entirely from scratch - and that required garbage collection.
Acknowledgements
This work would not have been possible without the ability to read the work of others, most notably the [Boehm GC][boehm], orangeduck's [tgc][tgc] (which also follows the ideals of being tiny and simple), and [The Garbage Collection Handbook][garbage_collection_handbook].
Table of contents
Documentation Overview
- Read the quickstart below to see how to get started quickly
- The concepts section describes the basic concepts and design
decisions that went into the implementation of
gc. - Interleaved with the concepts, there are implementation sections that detail the implementation of the core components, see hash map implementation, dumping registers on the stack, finding roots, and depth-first, recursive marking.
Quickstart
Download, compile and test
$ git clone git@github.com:mkirchner/gc.git
$ cd gc
To compile using the clang compiler:
$ make test
To use the GNU Compiler Collection (GCC):
$ make test CC=gcc
The tests should complete successfully. To create the current coverage report:
$ make coverage
Basic usage
...
#include "gc.h"
...
void some_fun() {
...
int* my_array = gc_calloc(&gc, 1024, sizeof(int));
for (size_t i=0; i<1024; ++i) {
my_array[i] = 42;
}
...
// look ma, no free!
}
int main(int argc, char* argv[]) {
gc_start(&gc, &argc);
...
some_fun();
...
gc_stop(&gc);
return 0;
}
Core API
This describes the core API, see gc.h for more details and the low-level API.
Starting, stopping, pausing, resuming and running GC
In order to initialize and start garbage collection, use the gc_start()
function and pass a bottom-of-stack address:
void gc_start(GarbageCollector* gc, void* bos);
The bottom-of-stack parameter bos needs to point to a stack-allocated
variable and marks the low end of the stack from where root
finding (scanning) starts.
Garbage collection can be stopped, paused and resumed with
void gc_stop(GarbageCollector* gc);
void gc_pause(GarbageCollector* gc);
void gc_resume(GarbageCollector* gc);
and manual garbage collection can be triggered with
size_t gc_run(GarbageCollector* gc);
Memory allocation and deallocation
gc supports malloc(), calloc()and realloc()-style memory allocation.
The respective function signatures mimick the POSIX functions (with the
exception that we need to pass the garbage collector along as the first
argument):
void* gc_malloc(GarbageCollector* gc, size_t size);
void* gc_calloc(GarbageCollector* gc, size_t count, size_t size);
void* gc_realloc(GarbageCollector* gc, void* ptr, size_t size);
It is possible to pass a pointer to a destructor function through the extended interface:
void* dtor(void* obj) {
// do some cleanup work
obj->parent->deregister();
obj->db->disconnect()
...
// no need to free obj
}
...
SomeObject* obj = gc_malloc_ext(gc, sizeof(SomeObject), dtor);
...
gc supports static allocations that are garbage collected only when the
GC shuts down via gc_stop(). Just use the appropriate helper function:
void* gc_malloc_static(GarbageCollector* gc, size_t size, void (*dtor)(void*));
Static allocation expects a pointer to a finalization function; just set to
NULL if finalization is not required.
Note that gc currently does not guarantee a specific ordering when it
collects static variables, If static vars need to be deallocated in a
particular order, the user should call gc_free() on them in the desired
sequence prior to calling gc_stop(), see below.
It is also possible to trigger explicit memory deallocation using
void gc_free(GarbageCollector* gc, void* ptr);
Calling gc_free() is guaranteed to (a) finalize/destruct on the object
pointed to by ptr if applicable and (b) to free the memory that ptr points to
irrespective of the current scheduling for garbage collection and will also
work if GC has been paused using gc_pause() above.
Helper functions
gc also offers a strdup() implementation that returns a garbage-collected
copy:
char* gc_strdup (GarbageCollector* gc, const char* s);
Basic Concepts
The fundamental idea behind garbage collection is to automate the memory allocation/deallocation cycle. This is accomplished by keeping track of all allocated memory and periodically triggering deallocation for memory that is still allocated but unreachable.
Many advanced garbage collectors also implement their own approach to memory
allocation (i.e. replace malloc()). This often enables them to layout memory
in a more space-efficient manner or for faster access but comes at the price of
architecture-specific implementations and increased complexity. gc sidesteps
these issues by falling back on the POSIX *alloc() implementations and keeping
memory management and garbage collection metadata separate. This makes gc
much simpler to understand but, of course, also less space- and time-efficient
than more optimized approaches.
Data Structures
The core data structure inside gc is a hash map that maps the address of
allocated memory to the garbage collection metadata of that memory:
The items in the hash map are allocations, modeled with the Allocation
struct:
typedef struct Allocation {
void* ptr; // mem pointer
size_t size; // allocated size in bytes
char tag; // the tag for mark-and-sweep
void (*dtor)(void*); // destructor
struct Allocation* next; // separate chaining
} Allocation;
Each Allocation instance holds a pointer to the allocated memory, the size of
the allocated memory at that location, a tag for mark-and-sweep (see below), an
optional pointer to the destructor function and a pointer to the next
Allocation instance (for separate chaining, see below).
The allocations are collected in an AllocationMap
typedef struct AllocationMap {
size_t capacity;
size_t min_capacity;
double downsize_factor;
double upsize_factor;
double sweep_factor;
size_t sweep_limit;
size_t size;
Allocation** allocs;
} AllocationMap;
that, together with a set of static functions inside gc.c, provides hash
map semantics for the implementation of the public API.
The AllocationMap is the central data structure in the GarbageCollector
struct which is part of the public API:
typedef struct GarbageCollector {
struct AllocationMap* allocs;
bool paused;
void *bos;
size_t min_size;
} GarbageCollector;
With the basic data structures in place, any gc_*alloc() memory allocation
request is a two-step procedure: first, allocate the memory through system (i.e.
standard malloc()) functionality and second, add or update the associated
metadata to the hash map.
For gc_free(), use the pointer to locate the metadata in the hash map,
determine if the deallocation requires a destructor call, call if required,
free the managed memory and delete the metadata entry from the hash map.
These data structures and the associated interfaces enable the management of the metadata required to build a garbage collector.
Garbage collection
gc triggers collection under two circumstances: (a) when any of the calls to
the system allocation fail (in the hope to deallocate sufficient memory to
fulfill the current request); and (b) when the number of entries in the hash
map passes a dynamically adjusted high water mark.
If either of these cases occurs, gc stops the world and starts a
mark-and-sweep garbage collection run over all current allocations. This
functionality is implemented in the gc_run() function which is part of the
public API and delegates all work to the gc_mark() and gc_sweep() functions
that are part of the private API.
gc_mark() has the task of finding roots and tagging all
known
Related Skills
node-connect
337.3kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
83.2kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
337.3kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
commit-push-pr
83.2kCommit, push, and open a PR
