This material is largely based on discussions with whitequark, who is (among other things) a friend of mine and a notable compiler engineer.
Why reference counting?
1) One goal of PoML is to explore new ideas. Tracing garbage collection has been far more extensively developed than reference counting, because naive reference counting is much slower than simple tracing GC, due to the large number of increments and decrements required. However, relatively easy optimizations can make reference counting competitive with at least simple tracing GC, so I consider that an underexplored approach.
2) Concurrent reference counting is easier to implement than concurrent tracing GC.
deferred reference updates
These optimizations are loosely based on this paper. This approach eliminates reference count updates from short-lived objects, which most objects are.
See also this paper for an approach to reducing pause times.
All pointers and refcounts have a
bit used as a "dirty/clean" flag.
Pointers also have a bit for a
"counted/uncounted" flag, which
indicates if the target is
refcounted.
The compiler maintains a circular
buffer B1 of (pointers to) pointers
to process during collection, and a
circular buffer B2 of objects to
potentially delete.
New
objects are created in a special
"nursery" memory region.
When
a pointer inside the nursery is
mutated, nothing else happens.
When a dirty pointer is mutated,
nothing else happens.
When a
clean pointer is mutated in an
object outside the nursery:
The
mutated pointer is marked dirty and
added to B1.
If the old pointer
is counted, its target has its refcount decremented
immediately. If this causes a clean
refcount to hit 0, that refcount is
marked dirty and its object is added
to B2.
At collection time:
1) The nursery is scanned. Each
object not marked as expired is
copied outside the nursery, and has
all its counted pointers checked. If a
pointer target is refcounted, that
refcount is incremented; otherwise,
the pointer is marked uncounted.
2) Every pointer in B1 is checked.
If its source object is not marked
as expired: the pointer is marked
clean, and if the pointer is
counted, the refcount of its target
is incremented. (B1 is then
cleared.)
3) The refcount of
every object in B2 is checked. If
>0, the refcount is marked clean.
Otherwise, the object is marked as
expired, the targets of its
uncounted pointers are marked as
expired, and the refcounts of all
its clean counted pointers are decremented.
If this causes a clean refcount to
hit 0, that object is marked as
expired. Then, this process recurses
for any objects thus marked as
expired.
(B2 is then cleared.)
To
reduce pause lengths, it's possible
to only process and clear part of
B2, then process more of B2 when
time is available.
It's also
possible to scan the nursery more
often than B1 and B2 are processed.
Depth-first reclaiming of tree
structures requires less GC memory;
this could be done by recursively
processing the first expired pointer
target in each object, and
prepending others to B2 by inserting
them before the B2 index then moving
that index backwards.
inherited references
Suppose we have an array A1
containing pointers to some objects,
and another array A2 containing many
pointers to objects in A1.
If
the programmer knows that A1 can be
allocated and reclaimed as a whole,
then we only need a single reference
count for all of A1, which is only
modified when A2 is created and
destroyed. If every object in A1 was
reference-counted separately, we
would need to update object
reference counts every time a
pointer in A2 was modified.
In PoML, we can declare that A1 is
reference-counted, but not the
objects in A1. This means that the
compiler will normally not allow
pointers to those A1 objects to be
copied.
Because pointers to
non-reference-counted targets are
unique, the compiler knows that
children of A1 are "owned" by the
nearest reference counted node, which is
A1.
When we define A2, we can
include an unused immutable pointer
to A1. All children of A2 then
inherit that reference. (But that
inheritance can be broken by
reference-counted nodes in a chain.)
We can do this by putting A2 in a
tuple: (A2, $A1).
Now, A2
elements inheriting that A1
reference can contain pointers to
objects in A1, without causing
reference count updates. When
copying a pointer to a
non-reference-counted target object,
the compiler simply checks if the
pointer-containing object inherits
an immutable reference to the owner
of the target object. The compiler
is pessimistic: if it's possible
that a pointer goes somewhere not
reference counted that doesn't have
an inherited reference, copying that
pointer is not allowed.
As of
the time of writing (Nov. 23 2019),
this is (I believe) a novel
optimization of reference counting
garbage collection.