This commit makes gc_lock_depth have one counter per thread, instead of one
global counter. This makes threads properly independent with respect to
the GC, in particular threads can now independently lock the GC for
themselves without locking it for other threads. It also means a given
thread can run a hard IRQ without temporarily locking the GC for all other
threads and potentially making them have MemoryError exceptions at random
locations (this really only occurs on MCUs with multiple cores and no GIL,
eg on the rp2 port).
The commit also removes protection of the GC lock/unlock functions, which
is no longer needed when the counter is per thread (and this also fixes the
cas where a hard IRQ calling gc_lock() may stall waiting for the mutex).
It also puts the check for `gc_lock_depth > 0` outside the GC mutex in
gc_alloc, gc_realloc and gc_free, to potentially prevent a hard IRQ from
waiting on a mutex if it does attempt to allocate heap memory (and putting
the check outside the GC mutex is now safe now that there is a
gc_lock_depth per thread).
Signed-off-by: Damien George <damien@micropython.org>
This was added a long time ago in 75abee206d
when USB host support was added to the stm (now stm32) port, and when this
pyexec code was actually part of the stm port. It's unlikely to work as
intended anymore. If it is needed in the future then generic hook macros
can be added in pyexec.
Background: the friendly/normal REPL is intended for human use whereas the
raw REPL is for computer use/automation. Raw REPL is used for things like
pyboard.py script_to_run.py. The normal REPL has built-in flow control
because it echos back the characters. That's not so with raw REPL and flow
control is just implemented by rate limiting the amount of data that goes
in. Currently it's fixed at 256 byte chunks every 10ms. This is sometimes
too fast for slow MCUs or systems with small stdin buffers. It's also too
slow for a lot of higher-end MCUs, ie it could be a lot faster.
This commit adds a new raw REPL mode which includes flow control: the
device will echo back a character after a certain number of bytes are sent
to the host, and the host can use this to regulate the data going out to
the device. The amount of characters is controlled by the device and sent
to the host before communication starts. This flow control allows getting
the maximum speed out of a serial link, regardless of the link or the
device at the other end.
Also, this new raw REPL mode parses and compiles the incoming data as it
comes in. It does this by creating a "stdin reader" object which is then
passed to the lexer. The lexer requests bytes from this "stdin reader"
which retrieves bytes from the host, and does flow control. What this
means is that no memory is used to store the script (in the existing raw
REPL mode the device needs a big buffer to read in the script before it can
pass it on to the lexer/parser/compiler). The only memory needed on the
device is enough to parse and compile.
Finally, it would be possible to extend this new raw REPL to allow bytecode
(.mpy files) to be sent as well as text mode scripts (but that's not done
in this commit).
Some results follow. The test was to send a large 33k script that contains
mostly comments and then prints out the heap, run via pyboard.py large.py.
On PYBD-SF6, prior to this PR:
$ ./pyboard.py large.py
stack: 524 out of 23552
GC: total: 392192, used: 34464, free: 357728
No. of 1-blocks: 12, 2-blocks: 2, max blk sz: 2075, max free sz: 22345
GC memory layout; from 2001a3f0:
00000: h=hhhh=======================================hhBShShh==h=======h
00400: =====hh=B........h==h===========================================
00800: ================================================================
00c00: ================================================================
01000: ================================================================
01400: ================================================================
01800: ================================================================
01c00: ================================================================
02000: ================================================================
02400: ================================================================
02800: ================================================================
02c00: ================================================================
03000: ================================================================
03400: ================================================================
03800: ================================================================
03c00: ================================================================
04000: ================================================================
04400: ================================================================
04800: ================================================================
04c00: ================================================================
05000: ================================================================
05400: ================================================================
05800: ================================================================
05c00: ================================================================
06000: ================================================================
06400: ================================================================
06800: ================================================================
06c00: ================================================================
07000: ================================================================
07400: ================================================================
07800: ================================================================
07c00: ================================================================
08000: ================================================================
08400: ===============================================.....h==.........
(349 lines all free)
(the big blob of used memory is the large script).
Same but with this PR:
$ ./pyboard.py large.py
stack: 524 out of 23552
GC: total: 392192, used: 1296, free: 390896
No. of 1-blocks: 12, 2-blocks: 3, max blk sz: 40, max free sz: 24420
GC memory layout; from 2001a3f0:
00000: h=hhhh=======================================hhBShShh==h=======h
00400: =====hh=h=B......h==.....h==....................................
(381 lines all free)
The only thing in RAM is the compiled script (and some other unrelated
items).
Time to download before this PR: 1438ms, data rate: 230,799 bits/sec.
Time to download with this PR: 119ms, data rate: 2,788,991 bits/sec.
So it's more than 10 times faster, and uses significantly less RAM.
Results are similar on other boards. On an stm32 board that connects via
UART only at 115200 baud, the data rate goes from 80kbit/sec to
113kbit/sec, so gets close to saturating the UART link without loss of
data.
The new raw REPL mode also supports a single ctrl-C to break out of this
flow-control mode, so that a ctrl-C can always get back to a known state.
It's also backwards compatible with the original raw REPL mode, which is
still supported with the same sequence of commands. The new raw REPL
mode is activated by ctrl-E, which gives an error on devices that do not
support the new mode.
Signed-off-by: Damien George <damien@micropython.org>
mp_irq_init() is useful when the IRQ object is allocated by the caller.
The mp_irq_methods_t.init method is not used anywhere so has been removed.
Signed-off-by: Damien George <damien@micropython.org>
In relatively unusual circumstances, such as entering `l = 17 ** 17777`
at the REPL, you could hit ctrl-c, but not get KeyboardInterrupt.
This can lead to a condition where the display would stop updating (#2689).
So it can be unconditionally included in a port's build even if certain
configurations in that port do not use its features, to simplify the
Makefile.
Signed-off-by: Damien George <damien@micropython.org>
With only `sp_func_proto_paren = remove` set there are some cases where
uncrustify misses removing a space between the function name and the
opening '('. This sets all of the related options to `force` as well.
No functionality change is intended with this commit, it just consolidates
the separate implementations of GC helper code to the lib/utils/ directory
as a general set of helper functions useful for any port. This reduces
duplication of code, and makes it easier for future ports or embedders to
get the GC implementation correct.
Ports should now link against gchelper_native.c and either gchelper_m0.s or
gchelper_m3.s (currently only Cortex-M is supported but other architectures
can follow), or use the fallback gchelper_generic.c which will work on
x86/x64/ARM.
The gc_helper_get_sp function from gchelper_m3.s is not really GC related
and was only used by cc3200, so it has been moved to that port and renamed
to cortex_m3_get_sp.
Note: the uncrustify configuration is explicitly set to 'add' instead of
'force' in order not to alter the comments which use extra spaces after //
as a means of indenting text for clarity.
This is a more logical place to clear the KeyboardInterrupt traceback,
right before it is set as a pending exception. The clearing is also
optimised from a function call to a simple store of NULL.
This function is tightly coupled to the state and behaviour of the
scheduler, and is a core part of the runtime: to schedule a pending
exception. So move it there.
Pending exceptions would otherwise be handled later on where there may not
be an NLR handler in place.
A similar fix is also made to the unix port's REPL handler.
Fixes issues #4921 and #5488.
By simply reordering the enums for pyexec_mode_kind_t it eliminates a data
variable which costs ROM to initialise it. And the minimal build now has
nothing in the data section.
It seems the compiler is smart enough so that the generated code for
if-logic which tests these enum values is unchanged.
For the 3 ports that already make use of this feature (stm32, nrf and
teensy) this doesn't make any difference, it just allows to disable it from
now on.
For other ports that use pyexec, this decreases code size because the debug
printing code is dead (it can't be enabled) but the compiler can't deduce
that, so code is still emitted.
Protocols are nice, but there is no way for C code to verify whether
a type's "protocol" structure actually implements some particular
protocol. As a result, you can pass an object that implements the
"vfs" protocol to one that expects the "stream" protocol, and the
opposite of awesomeness ensues.
This patch adds an OPTIONAL (but enabled by default) protocol identifier
as the first member of any protocol structure. This identifier is
simply a unique QSTR chosen by the protocol designer and used by each
protocol implementer. When checking for protocol support, instead of
just checking whether the object's type has a non-NULL protocol field,
use `mp_proto_get` which implements the protocol check when possible.
The existing protocols are now named:
protocol_framebuf
protocol_i2c
protocol_pin
protocol_stream
protocol_spi
protocol_vfs
(most of these are unused in CP and are just inherited from MP; vfs and
stream are definitely used though)
I did not find any crashing examples, but here's one to give a flavor of what
is improved, using `micropython_coverage`. Before the change,
the vfs "ioctl" protocol is invoked, and the result is not intelligible
as json (but it could have resulted in a hard fault, potentially):
>>> import uos, ujson
>>> u = uos.VfsPosix('/tmp')
>>> ujson.load(u)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: syntax error in JSON
After the change, the vfs object is correctly detected as not supporting
the stream protocol:
>>> ujson.load(p)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: stream operation not supported
This PR refines the _bleio API. It was originally motivated by
the addition of a new CircuitPython service that enables reading
and modifying files on the device. Moving the BLE lifecycle outside
of the VM motivated a number of changes to remove heap allocations
in some APIs.
It also motivated unifying connection initiation to the Adapter class
rather than the Central and Peripheral classes which have been removed.
Adapter now handles the GAP portion of BLE including advertising, which
has moved but is largely unchanged, and scanning, which has been enhanced
to return an iterator of filtered results.
Once a connection is created (either by us (aka Central) or a remote
device (aka Peripheral)) it is represented by a new Connection class.
This class knows the current connection state and can discover and
instantiate remote Services along with their Characteristics and
Descriptors.
Relates to #586
mp_compile no longer takes an emit_opt argument, rather this setting is now
provided by the global default_emit_opt variable.
Now, when -X emit=native is passed as a command-line option, the emitter
will be set for all compiled modules (included imports), not just the
top-level script.
In the future there could be a way to also set this variable from a script.
Fixes issue #4267.