113 Commits

Author SHA1 Message Date
Damien George
6bd78741c1 py/gc: When GC threshold is hit don't unnecessarily collect twice.
Without this, if GC threshold is hit and there is not enough memory left to
satisfy the request, gc_collect() will run a second time and the search for
memory will happen again and will fail again.

Thanks to @adritium for pointing out this issue, see #3786.
2018-05-21 13:36:21 +10:00
Damien George
749b16174b py/mpstate.h: Adjust start of root pointer section to exclude non-ptrs.
This patch moves the start of the root pointer section in mp_state_ctx_t
so that it skips entries that are not pointers and don't need scanning.

Previously, the start of the root pointer section was at the very beginning
of the mp_state_ctx_t struct (which is the beginning of mp_state_thread_t).
This was the original assembler version of the NLR code was hard-coded to
have the nlr_top pointer at the start of this state structure.  But now
that the NLR code is partially written in C there is no longer this
restriction on the location of nlr_top (and a comment to this effect has
been removed in this patch).

So now the root pointer section starts part way through the
mp_state_thread_t structure, after the entries which are not root pointers.

This patch also moves the non-pointer entries for MICROPY_ENABLE_SCHEDULER
outside the root pointer section.

Moving non-pointer entries out of the root pointer section helps to make
the GC more precise and should help to prevent some cases of collectable
garbage being kept.

This patch also has a measurable improvement in performance of the
pystone.py benchmark: on unix x86-64 and stm32 there was an improvement of
roughly 0.6% (tested with both gcc 7.3 and gcc 8.1).
2018-05-13 22:53:28 +10:00
Damien George
2a0cbc0d38 py/gc: Update comment now that gc_drain_stack is called gc_mark_subtree. 2018-02-19 16:08:20 +11:00
Ayke van Laethem
736faef223 py/gc: Make GC stack pointer a local variable.
This saves a bit in code size, and saves some precious .bss RAM:

                 .text  .bss
minimal CROSS=1: -28    -4
unix (64-bit):   -64    -8
2018-02-19 16:05:46 +11:00
Ayke van Laethem
5c9e5618e0 py/gc: Rename gc_drain_stack to gc_mark_subtree and pass it first block.
This saves a bit in code size:

minimal CROSS=1: -44
unix:            -96
2018-02-19 16:00:59 +11:00
Ayke van Laethem
ea7cf2b738 py/gc: Reduce code size by specialising VERIFY_MARK_AND_PUSH macro.
This macro is written out explicitly in the two locations that it is used
and then the code is optimised, opening possibilities for further
optimisations and reducing code size:

unix:            -48
minimal CROSS=1: -32
stm32:           -32
2018-02-19 15:58:49 +11:00
Damien George
02d830c035 py: Introduce a Python stack for scoped allocation.
This patch introduces the MICROPY_ENABLE_PYSTACK option (disabled by
default) which enables a "Python stack" that allows to allocate and free
memory in a scoped, or Last-In-First-Out (LIFO) way, similar to alloca().

A new memory allocation API is introduced along with this Py-stack.  It
includes both "local" and "nonlocal" LIFO allocation.  Local allocation is
intended to be equivalent to using alloca(), whereby the same function must
free the memory.  Nonlocal allocation is where another function may free
the memory, so long as it's still LIFO.

Follow-up patches will convert all uses of alloca() and VLA to the new
scoped allocation API.  The old behaviour (using alloca()) will still be
available, but when MICROPY_ENABLE_PYSTACK is enabled then alloca() is no
longer required or used.

The benefits of enabling this option are (or will be once subsequent
patches are made to convert alloca()/VLA):
- Toolchains without alloca() can use this feature to obtain correct and
  efficient scoped memory allocation (compared to using the heap instead
  of alloca(), which is slower).
- Even if alloca() is available, enabling the Py-stack gives slightly more
  efficient use of stack space when calling nested Python functions, due to
  the way that compilers implement alloca().
- Enabling the Py-stack with the stackless mode allows for even more
  efficient stack usage, as well as retaining high performance (because the
  heap is no longer used to build and destroy stackless code states).
- With Py-stack and stackless enabled, Python-calling-Python is no longer
  recursive in the C mp_execute_bytecode function.

The micropython.pystack_use() function is included to measure usage of the
Python stack.
2017-12-11 13:49:09 +11:00
Paul Sokolovsky
dea3fb93c7 py/gc: In sweep debug output, print pointer as a pointer.
Or it will be truncated on a 64-bit platform.
2017-12-09 01:54:01 +02:00
Paul Sokolovsky
5453d88d5d py/gc: Factor out a macro to trace GC mark operations.
To allow easier override it for custom tracing.
2017-12-09 01:48:26 +02:00
Paul Sokolovsky
9ef4be8b41 py/gc: Add CLEAR_ON_SWEEP option to debug mis-traced objects.
Accessing them will crash immediately instead still working for some time,
until overwritten by some other data, leading to much less deterministic
crashes.
2017-12-08 00:10:44 +02:00
Damien George
74fad3536b py/gc: In gc_realloc, convert pointer sanity checks to assertions.
These checks are assumed to be true in all cases where gc_realloc is
called with a valid pointer, so no need to waste code space and time
checking them in a non-debug build.
2017-11-29 17:17:08 +11:00
Damien George
a3dc1b1957 all: Remove inclusion of internal py header files.
Header files that are considered internal to the py core and should not
normally be included directly are:
    py/nlr.h - internal nlr configuration and declarations
    py/bc0.h - contains bytecode macro definitions
    py/runtime0.h - contains basic runtime enums

Instead, the top-level header files to include are one of:
    py/obj.h - includes runtime0.h and defines everything to use the
        mp_obj_t type
    py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h,
        and defines everything to use the general runtime support functions

Additional, specific headers (eg py/objlist.h) can be included if needed.
2017-10-04 12:37:50 +11:00
Stefan Naumann
ace9fb5405 py: Add verbose debug compile-time flag MICROPY_DEBUG_VERBOSE.
It enables all the DEBUG_printf outputs in the py/ source code.
2017-08-15 11:53:36 +10:00
Alexander Steffen
55f33240f3 all: Use the name MicroPython consistently in comments
There were several different spellings of MicroPython present in comments,
when there should be only one.
2017-07-31 18:35:40 +10:00
Damien George
12d4fa9b37 py/gc: Refactor assertions in gc_free function.
gc_free() expects either NULL or a valid pointer into the heap, so the
checks for a valid pointer can be turned into assertions.
2017-07-12 12:17:38 +10:00
Damien George
c7e8c6f7de py/gc: Execute finaliser code in a protected environment.
If a finaliser raises an exception then it must not propagate through the
GC sweep function.  This patch protects against such a thing by running
finaliser code via the mp_call_function_1_protected call.

This patch also adds scheduler lock/unlock calls around the finaliser
execution to further protect against any possible reentrancy issues: the
memory manager is already locked when doing a collection, but we also don't
want to allow any scheduled code to run, KeyboardInterrupts to interupt the
code, nor threads to switch.
2017-04-12 13:52:04 +10:00
Damien George
5ffe1d8dc0 py/gc: Add MICROPY_GC_CONSERVATIVE_CLEAR option to always zero memory.
There can be stray pointers in memory blocks that are not properly zero'd
after allocation.  This patch adds a new config option to always zero all
allocated memory (via gc_alloc and gc_realloc) and hence help to eliminate
stray pointers.

See issue #2195.
2016-08-26 15:35:26 +10:00
Paul Sokolovsky
93e353e384 py/gc: Implement GC running by allocation threshold.
Currently, MicroPython runs GC when it could not allocate a block of memory,
which happens when heap is exhausted. However, that policy can't work well
with "inifinity" heaps, e.g. backed by a virtual memory - there will be a
lot of swap thrashing long before VM will be exhausted. Instead, in such
cases "allocation threshold" policy is used: a GC is run after some number of
allocations have been made. Details vary, for example, number or total amount
of allocations can be used, threshold may be self-adjusting based on GC
outcome, etc.

This change implements a simple variant of such policy for MicroPython. Amount
of allocated memory so far is used for threshold, to make it useful to typical
finite-size, and small, heaps as used with MicroPython ports. And such GC policy
is indeed useful for such types of heaps too, as it allows to better control
fragmentation. For example, if a threshold is set to half size of heap, then
for an application which usually makes big number of small allocations, that
will (try to) keep half of heap memory in a nice defragmented state for an
occasional large allocation.

For an application which doesn't exhibit such behavior, there won't be any
visible effects, except for GC running more frequently, which however may
affect performance. To address this, the GC threshold is configurable, and
by default is off so far. It's configured with gc.threshold(amount_in_bytes)
call (can be queries without an argument).
2016-07-21 00:37:30 +03:00
Paul Sokolovsky
749cbaca7f py/gc: Calculate (and report) maximum contiguous free block size.
Just as maximum allocated block size, it's reported in allocation units
(not bytes).
2016-07-01 00:09:55 +03:00
Paul Sokolovsky
6a6e0b7e05 py/gc: Be sure to count last allocated block at heap end in stats.
Previously, if there was chain of allocated blocks ending with the last
block of heap, it wasn't included in number of 1/2-block or max block
size stats.
2016-06-30 12:56:21 +03:00
Damien George
a1c93a62b1 py: Don't use gc or qstr mutex when the GIL is enabled.
There is no need since the GIL already makes gc and qstr operations
atomic.
2016-06-28 11:28:50 +01:00
Damien George
3653f5144a py/gc: Fix GC+thread bug where ptr gets lost because it's not computed.
GC_EXIT() can cause a pending thread (waiting on the mutex) to be
scheduled right away.  This other thread may trigger a garbage
collection.  If the pointer to the newly-allocated block (allocated by
the original thread) is not computed before the switch (so it's just left
as a block number) then the block will be wrongly reclaimed.

This patch makes sure the pointer is computed before allowing any thread
switch to occur.
2016-06-28 11:28:49 +01:00
Damien George
e33806aaff py/gc: Fix 2 cases of concurrent access to ATB and FTB. 2016-06-28 11:28:49 +01:00
Damien George
c93d9caa8b py/gc: Make memory manager and garbage collector thread safe.
By using a single, global mutex, all memory-related functions (alloc,
free, realloc, collect, etc) are made thread safe.  This means that only
one thread can be in such a function at any one time.
2016-06-28 11:28:49 +01:00
Damien George
330165a2cc py: Add MP_STATE_THREAD to hold state specific to a given thread. 2016-06-28 11:09:31 +01:00
Paul Sokolovsky
68a7a92cec py/gc: gc_dump_alloc_table(): Dump heap offset instead of actual address.
Address printed was truncated anyway and in general confusing to outsider.
A line which dumps it is still left in the source, commented, for peculiar
cases when it may be needed (e.g. when running under debugger).
2016-05-13 00:16:38 +03:00
Paul Sokolovsky
9a8751b006 gc: gc_dump_alloc_table(): Use '=' char for tail blocks.
'=' is pretty natural character for tail, and gives less dense picture
where it's easier to see what object types are actually there.
2016-05-13 00:16:38 +03:00
Paul Sokolovsky
bc04dc277e py/gc: Make (byte)array type dumping conditional on these types being enabled. 2016-05-11 19:21:53 +03:00
Paul Sokolovsky
3d7f3f00e0 py/gc: gc_dump_alloc_table(): Show byte/str and (byte)array objects.
These are typical consumers of large chunks of memory, so it's useful to
see at least their number (how much memory isn't clearly shown, as the data
for these objects is allocated elsewhere).
2016-05-11 19:00:15 +03:00
Paul Sokolovsky
3ea03a1188 py/gc: Improve mark/sweep debug output.
Previously, mark operation weren't logged at all, while it's quite useful
to see cascade of marks in case of over-marking (and in other cases too).
Previously, sweep was logged for each block of object in memory, but that
doesn't make much sense and just lead to longer output, harder to parse
by a human. Instead, log sweep only once per object. This is similar to
other memory manager operations, e.g. an object is allocated, then freed.
Or object is allocated, then marked, otherwise swept (one log entry per
operation, with the same memory address in each case).
2015-12-27 20:40:36 +02:00
Damien George
acaccb37ec py/gc: When printing info, use %u instead of UINT_FMT for size_t args.
Ideally we'd use %zu for size_t args, but that's unlikely to be supported
by all runtimes, and we would then need to implement it in mp_printf.
So simplest and most portable option is to use %u and cast the argument
to uint(=unsigned int).

Note: reason for the change is that UINT_FMT can be %llu (size suitable
for mp_uint_t) which is wider than size_t and prints incorrect results.
2015-12-18 12:52:45 +00:00
Damien George
d977d268e8 py/gc: Use size_t instead of mp_uint_t to count things related to heap.
size_t is the correct type to use to count things related to the size of
the address space.  Using size_t (instead of mp_uint_t) is important for
the efficiency of ports that configure mp_uint_t to larger than the
machine word size.
2015-12-16 20:09:11 -05:00
Damien George
f7782f8082 py/gc: For finaliser, interpret a pointer into the heap as concrete obj. 2015-12-16 19:45:42 -05:00
Damien George
969e4bbe6a py/gc: Scan GC blocks as an array of pointers, not an array of objects.
The GC should search for pointers within the heap.  This patch makes a
difference when an object is larger than a pointer (eg 64-bit NaN
boxing).
2015-12-16 19:41:37 -05:00
Paul Sokolovsky
75feece208 py/gc: Make GC block size be configurable. 2015-12-03 01:40:52 +02:00
Damien George
999cedb90f py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.

This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
2015-11-29 14:25:35 +00:00
Damien George
94fe6e523d py/gc: Move away from using mp_uint_t, instead use uintptr_t and size_t.
The GC works with concrete pointers and so the types should reflect this.
2015-11-29 14:25:04 +00:00
Dave Hylands
7f3c0d1ea8 py: Clear finalizer flag when calling gc_free.
Currently, the only place that clears the bit is in gc_collect.
So if a block with a finalizer is allocated, and subsequently
freed, and then the block is reallocated with no finalizer then
the bit remains set.

This could also be fixed by having gc_alloc clear the bit, but
I'm pretty sure that free is called way less than alloc, so doing
it in free is more efficient.
2015-11-07 14:26:11 +00:00
Damien George
3a2171e406 py: Eliminate some cases which trigger unused parameter warnings. 2015-09-04 16:53:46 +01:00
Damien George
ade9a05236 py: Improve allocation policy of qstr data.
Previous to this patch all interned strings lived in their own malloc'd
chunk.  On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quanta of the memory allocator (16 bytes on 32
bit archs).

With this patch interned strings are concatenated into the same malloc'd
chunk when possible.  Such chunks are enlarged inplace when possible,
and shrunk to fit when a new chunk is needed.

RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned).  New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.

Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
2015-07-14 22:56:32 +01:00
Damien George
e72cda99fd py: Convert occurrences of non-debug printf to mp_printf. 2015-04-16 14:30:16 +00:00
Damien George
12ab9eda8d py: Make heap printing compatible with 16-bit word size. 2015-04-03 14:11:13 +01:00
Damien George
e1e359ff59 py: Put mp_sys_path, mp_sys_argv and gc_collected in mp_state_ctx_t.
Without mp_sys_path and mp_sys_argv in the root pointer section of the
state, their memory was being incorrectly collected by GC.
2015-02-07 17:24:10 +00:00
Damien George
abc1959e2c py, unix, lib: Allow to compile with -Wold-style-definition. 2015-01-12 22:34:38 +00:00
Damien George
ec21405821 py: Add (commented out) code to gc_dump_alloc_table for qstr info. 2015-01-11 14:37:06 +00:00
stijn
afd6c8e1d2 Remove obsolete bss-related code/build features
GC for unix/windows builds doesn't make use of the bss section anymore,
so we do not need the (sometimes complicated) build features and code related to it
2015-01-08 15:29:44 +01:00
Damien George
b4b10fd350 py: Put all global state together in state structures.
This patch consolidates all global variables in py/ core into one place,
in a global structure.  Root pointers are all located together to make
GC tracing easier and more efficient.
2015-01-07 20:33:00 +00:00
Damien George
fd40a9c38e py: Make GC's STACK_SIZE definition a proper MICROPY_ config variable. 2015-01-01 22:04:46 +00:00
Damien George
51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George
7860c2a68a py: Fix some macros defines; cleanup some includes. 2014-11-05 21:16:41 +00:00