This adapts the allocation process to start from either end of the heap
when searching for free space. The default behavior is identical to the
existing behavior where it starts with the lowest block and looks higher.
Now it can also look from the highest block and lower depending on the
long_lived parameter to gc_alloc. As the heap fills, the two sections may
overlap. When they overlap, a collect may be triggered in order to keep
the long lived section compact. However, free space is always eligable
for each type of allocation.
By starting from either of the end of the heap we have ability to separate
short lived objects from long lived ones. This separation reduces heap
fragmentation because long lived objects are easy to densely pack.
Most objects are short lived initially but may be made long lived when
they are referenced by a type or module. This involves copying the
memory and then letting the collect phase free the old portion.
QSTR pools and chunks are always long lived because they are never freed.
The reallocation, collection and free processes are largely unchanged. They
simply also maintain an index to the highest free block as well as the lowest.
These indices are used to speed up the allocation search until the next collect.
In practice, this change may slightly slow down import statements with the
benefit that memory is much less fragmented afterwards. For example, a test
import into a 20k heap that leaves ~6k free previously had the largest
continuous free space of ~400 bytes. After this change, the largest continuous
free space is over 3400 bytes.
This is mostly a workaround for forceful rebuilding of mpy-cross on every
codebase change. If this file has debug logging enabled (by patching),
mpy-cross build failed.
There can be stray pointers in memory blocks that are not properly zero'd
after allocation. This patch adds a new config option to always zero all
allocated memory (via gc_alloc and gc_realloc) and hence help to eliminate
stray pointers.
See issue #2195.
Previous to this patch all interned strings lived in their own malloc'd
chunk. On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quanta of the memory allocator (16 bytes on 32
bit archs).
With this patch interned strings are concatenated into the same malloc'd
chunk when possible. Such chunks are enlarged inplace when possible,
and shrunk to fit when a new chunk is needed.
RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned). New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.
Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
This patch consolidates all global variables in py/ core into one place,
in a global structure. Root pointers are all located together to make
GC tracing easier and more efficient.
gc.enable/disable are now the same as CPython: they just control whether
automatic garbage collection is enabled or not. If disabled, you can
still allocate heap memory, and initiate a manual collection.
It seems most sensible to use size_t for measuring "number of bytes" in
malloc and vstr functions (since that's what size_t is for). We don't
use mp_uint_t because malloc and vstr are not Micro Python specific.
Blanket wide to all .c and .h files. Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
Also add some more debugging output to gc_dump_alloc_table().
Now that newly allocated heap is always zero'd, maybe we just make this
a policy for the uPy API to keep it simple (ie any new implementation of
memory allocation must zero all allocations). This follows the D
language philosophy.
Before this patch, a previously used memory block which had pointers in
it may still retain those pointers if the new user of that block does
not actually use the entire block. Eg, if I want 5 blocks worth of
heap, I actually get 8 (round up to nearest 4). Then I never use the
last 3, so they keep their old values, which may be pointers pointing to
the heap, hence preventing GC.
In rare (or maybe not that rare) cases, this leads to long, unintentional
"linked lists" within the GC'd heap, filling it up completely. It's
pretty rare, because you have to reuse exactly that memory which is part
of this "linked list", and reuse it in just the right way.
This should fix issue #522, and might have something to do with
issue #510.
Previously, a failed malloc/realloc would throw an exception, which was
not caught. I think it's better to keep the parser free from NLR
(exception throwing), hence this patch.