This simplifies allocating outside of the VM because the VM doesn't
take up all remaining memory by default.
On the ESP ports we delegate to the IDF for allocations. For all other
ports, we use TLSF to manage an outer "port" heap. The IDF uses TLSF
internally, and we use their fork of it for the other ports.
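For illustration, the outer port heap can be driven through the standard
TLSF pool API, along these lines (a sketch; the port_* wrapper names are
hypothetical and real pool setup differs per port):

    #include <stddef.h>
    #include "tlsf.h"

    static tlsf_t port_heap;  // TLSF control structure for the outer heap

    // Create the outer "port" heap inside a fixed memory region.
    void port_heap_init(void *region, size_t bytes) {
        port_heap = tlsf_create_with_pool(region, bytes);
    }

    // Allocations outside the VM come from here, not from the GC heap.
    void *port_malloc(size_t bytes) {
        return tlsf_malloc(port_heap, bytes);
    }

    void port_free(void *ptr) {
        tlsf_free(port_heap, ptr);
    }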
This also removes the dynamic C stack sizing. It wasn't often used
and is not possible with a fixed outer heap.
Fixes #8512. Fixes #7334.
When set, the split heap is automatically extended with new areas on
demand, and shrunk if a heap area becomes empty during a GC pass or soft
reset.
To save code size, the allocation size for a new heap block (including
metadata) is estimated at 103% of the failed allocation, rather than
derived from the more complex algorithm in gc_try_add_heap(). This appears
to work well except in the extreme limit case when almost all RAM is
exhausted (the last few hundred bytes). However, in this case some
allocation is likely to fail soon anyhow.
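In effect the estimate is a fixed ~3% headroom over the failed allocation,
i.e. something like this hypothetical helper:

    #include <stddef.h>

    // Estimate the size of a new heap area as ~103% of the allocation
    // that just failed, leaving headroom for GC block metadata.
    static size_t estimate_new_area_size(size_t failed_alloc) {
        return failed_alloc + failed_alloc / 32;  // +3.125%, roughly 103%
    }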
Currently there is no API to manually add a block of a given size to the
heap, although that could easily be added if necessary.
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
This commit adds a new option MICROPY_GC_SPLIT_HEAP (disabled by default)
which, when enabled, allows the GC heap to be split over multiple memory
areas/regions. The first area is added with gc_init() and subsequent areas
can be added with gc_add(). New areas can be added at runtime. Areas are
stored internally as a linked list, and calls to gc_alloc() can be
satisfied from any area.
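For example, a port with two disjoint RAM regions could set up its heap as
follows (a sketch; region names and sizes are illustrative):

    #include "py/gc.h"

    static char region1[16 * 1024];  // e.g. internal SRAM
    static char region2[8 * 1024];   // e.g. a second, disjoint RAM bank

    void port_gc_setup(void) {
        gc_init(region1, region1 + sizeof(region1));  // first area
        #if MICROPY_GC_SPLIT_HEAP
        gc_add(region2, region2 + sizeof(region2));   // additional area
        #endif
    }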
This feature has the following use-cases (among others):
- The ESP32 has a fragmented OS heap, so to use all (or more) of it the
GC heap must be split.
- Other MCUs may have disjoint RAM regions and are now able to use them
all for the GC heap.
- The user could explicitly increase the size of the GC heap.
- Support a dynamic heap while running on an OS, adding more heap when
necessary.
This lets the BLE stack run through the wait period after a VM run, when
it may be waiting for more writes due to an auto-reload. User BLE
functionality will have its events stopped. Scanning and advertising are
also stopped.
The older "bool has_finaliser" gets recast as GC_ALLOC_FLAG_HAS_FINALISER=1
so this is a backwards compatible change to the signature. Since bool gets
implicitly converted to 1 this patch doesn't include conversion of all
calls.
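In outline, the reshaped interface looks like this (simplified):

    // New signature: a flags word instead of a bare bool.
    enum {
        GC_ALLOC_FLAG_HAS_FINALISER = 1,
    };

    void *gc_alloc(size_t n_bytes, unsigned int alloc_flags);

    // Existing call sites keep working because true converts to 1:
    //   gc_alloc(n, true);  // same as gc_alloc(n, GC_ALLOC_FLAG_HAS_FINALISER)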
This patch adds the gc_sweep_all() function, which does a garbage
collection without tracing any root pointers: it frees all the memory and,
most importantly, runs any remaining finalisers.
This helps primarily for soft reset: it will close any open files, any open
sockets, and help to get the system back to a clean state upon soft reset.
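A typical call site is a port's soft-reset path, along these lines
(illustrative; the exact sequence varies per port):

    #include "py/gc.h"
    #include "py/runtime.h"

    void port_soft_reset(void) {
        gc_sweep_all();  // run remaining finalisers: closes files, sockets
        mp_deinit();     // then tear down the interpreter state
    }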
This adapts the allocation process to start from either end of the heap
when searching for free space. The default behavior is identical to the
existing behavior where it starts with the lowest block and looks higher.
Now it can also look from the highest block and lower depending on the
long_lived parameter to gc_alloc. As the heap fills, the two sections may
overlap. When they overlap, a collect may be triggered in order to keep
the long lived section compact. However, free space is always eligible
for each type of allocation.
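Call sites then look something like this (a sketch; it assumes the existing
has_finaliser argument keeps its place before the new parameter):

    // Short lived: search starts at the low end of the heap.
    void *scratch = gc_alloc(64, false, /* long_lived */ false);

    // Long lived (e.g. QSTR pool data): search starts at the high end.
    void *pool = gc_alloc(256, false, /* long_lived */ true);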
By starting from either end of the heap, we gain the ability to separate
short lived objects from long lived ones. This separation reduces heap
fragmentation because long lived objects are easy to pack densely.
Most objects are short lived initially but may be made long lived when
they are referenced by a type or module. This involves copying the
memory and then letting the collect phase free the old portion.
QSTR pools and chunks are always long lived because they are never freed.
The reallocation, collection and free processes are largely unchanged. They
simply also maintain an index to the highest free block as well as the lowest.
These indices are used to speed up the allocation search until the next collect.
In practice, this change may slightly slow down import statements with the
benefit that memory is much less fragmented afterwards. For example, a test
import into a 20k heap that leaves ~6k free previously had a largest
contiguous free space of ~400 bytes. After this change, the largest
contiguous free space is over 3400 bytes.
The code conventions suggest using header guards, but do not define how
those should look and instead point to existing files. However, not
all existing files follow the same scheme, sometimes omitting header guards
altogether, sometimes using non-standard names, making it easy to
accidentally pick a "wrong" example.
This commit ensures that all header files of the MicroPython project (that
were not simply copied from somewhere else) follow the same pattern, that
was already present in the majority of files, especially in the py folder.
The rules are as follows.
Naming convention:
* start with the words MICROPY_INCLUDED
* contain the full path to the file
* replace special characters with _
In addition, there are no empty lines before #ifndef, between #ifndef and
#define, and one empty line before #endif. #endif is followed by a comment
containing the name of the guard macro.
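For example, the guard for py/obj.h looks like this:

    #ifndef MICROPY_INCLUDED_PY_OBJ_H
    #define MICROPY_INCLUDED_PY_OBJ_H

    // ... declarations ...

    #endif // MICROPY_INCLUDED_PY_OBJ_H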
py/grammar.h cannot use header guards by design, since it has to be
included multiple times in a single C file. Several other files also do not
need header guards as they are only used internally and guaranteed to be
included only once:
* MICROPY_MPHALPORT_H
* mpconfigboard.h
* mpconfigport.h
* mpthreadport.h
* pin_defs_*.h
* qstrdefs*.h
Currently, the only place that clears the bit is in gc_collect. So if a
block with a finaliser is allocated, subsequently freed, and then
reallocated without a finaliser, the bit remains set.
This could also be fixed by having gc_alloc clear the bit, but free is
called far less often than alloc, so doing it in free is more efficient.
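A self-contained toy illustrating the stale bit and the fix (the real code
clears the bit in gc_free via the FTB_CLEAR macro in py/gc.c):

    #include <stddef.h>
    #include <stdint.h>

    #define NUM_BLOCKS 1024
    static uint8_t finaliser_table[NUM_BLOCKS / 8];  // one bit per block

    static void ftb_clear(size_t block) {
        finaliser_table[block / 8] &= (uint8_t)~(1u << (block % 8));
    }

    // Freeing a block now also clears its finaliser bit, so a later
    // reallocation of the same block without a finaliser doesn't inherit
    // a stale bit from the previous owner.
    void toy_gc_free(size_t block) {
        ftb_clear(block);
        // ... mark the block (and its tail blocks) as free ...
    }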
Prior to this patch all interned strings lived in their own malloc'd
chunk. On average this wastes N/2 bytes per interned string, where N is
the number of bytes in a quantum of the memory allocator (16 bytes on
32-bit archs).
With this patch, interned strings are concatenated into the same malloc'd
chunk when possible. Such chunks are enlarged in place when possible,
and shrunk to fit when a new chunk is needed.
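A self-contained sketch of the chunking idea (illustrative only; the real
logic lives in py/qstr.c and additionally tries to grow the current chunk
in place before starting a new one; error handling omitted):

    #include <stdlib.h>
    #include <string.h>

    #define CHUNK_ALLOC 128  // bytes reserved per new chunk (illustrative)

    static char *chunk;       // write position inside the current chunk
    static size_t chunk_rem;  // bytes still free in the current chunk

    // Copy string data into the shared chunk instead of giving each
    // string its own malloc'd block, avoiding per-string rounding waste.
    static const char *intern_str_data(const char *s, size_t len) {
        if (chunk == NULL || chunk_rem < len + 1) {
            size_t n = (len + 1 > CHUNK_ALLOC) ? len + 1 : CHUNK_ALLOC;
            chunk = malloc(n);
            chunk_rem = n;
        }
        char *dest = chunk;
        memcpy(dest, s, len + 1);  // include the terminating NUL
        chunk += len + 1;
        chunk_rem -= len + 1;
        return dest;
    }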
RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned). The new version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.
Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
This patch consolidates all global variables of the py core into one place,
in a global structure. Root pointers are all located together to make
GC tracing easier and more efficient.
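Schematically (a heavily simplified sketch; the real definitions are in
py/mpstate.h and the field shown is illustrative):

    #include "py/obj.h"  // for mp_obj_t

    typedef struct _mp_state_vm_t {
        // Root pointers are grouped here so the GC can trace the whole
        // struct as a single contiguous range.
        mp_obj_t last_exception;  // illustrative root pointer
        // ... many more root pointers ...
    } mp_state_vm_t;

    typedef struct _mp_state_ctx_t {
        mp_state_vm_t vm;
        // ... memory-manager and thread state ...
    } mp_state_ctx_t;

    extern mp_state_ctx_t mp_state_ctx;

    // Accessor macro used throughout the core:
    #define MP_STATE_VM(x) (mp_state_ctx.vm.x)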
gc.enable/disable are now the same as CPython: they just control whether
automatic garbage collection is enabled or not. If disabled, you can
still allocate heap memory, and initiate a manual collection.
Blanket-wide to all .c and .h files. Some files originating from ST are
difficult to deal with (license-wise), so the header was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.