We don't want to add a feature flag to .mpy files that indicate float
support because it will get complex and difficult to use. Instead the .mpy
is built using whatever precision it chooses (float or double) and the
native glue API will convert between this choice and what the host runtime
actually uses.
This commit adds a new tool called mpy_ld.py which is essentially a linker
that builds .mpy files directly from .o files. A new header file
(dynruntime.h) and makefile fragment (dynruntime.mk) are also included
which allow building .mpy files from C source code. Such .mpy files can
then be dynamically imported as though they were a normal Python module,
even though they are implemented in C.
Converting .o files directly (rather than pre-linked .elf files) allows the
resulting .mpy to be more efficient because it has more control over the
relocations; for example it can skip PLT indirection. Doing it this way
also allows supporting more architectures, such as Xtensa which has
specific needs for position-independent code and the GOT.
The tool supports targets of x86, x86-64, ARM Thumb and Xtensa (windowed
and non-windowed). BSS, text and rodata sections are supported, with
relocations to all internal sections and symbols, as well as relocations to
some external symbols (defined by dynruntime.h), and linking of qstrs.
Implements text, rodata and bss generalised relocations, as well as generic
qstr-object linking. This allows importing dynamic native modules on all
supported architectures in a unified way.
With the memcpy() call placed last it avoids the effects of registers
clobbering. It's definitely effective in non-inlined functions, but even
here it is still making a small difference. For example, on stm32, this
saves an extra `ldr` instruction to load `o->vstr` after the memcpy()
returns.
The string length being longer than the allowed qstr length can happen in
many locations, for example in the parser with very long variable names.
Without an explicit check that the length is within range (as done in this
patch) the code would exhibit crashes and strange behaviour with truncated
strings.
This commit adds a sys.implementation.mpy entry when the system supports
importing .mpy files. This entry is a 16-bit integer which encodes two
bytes of information from the header of .mpy files that are supported by
the system being run: the second and third bytes, .mpy version, and flags
and native architecture. This allows determining the supported .mpy file
dynamically by code, and also for the user to find it out by inspecting
this value. It's further possible to dynamically detect if the system
supports importing .mpy files by `hasattr(sys.implementation, 'mpy')`.
Replace the is_running field with a tri-state variable to indicate
running/not-running/pending-exception.
Update tests to cover the various cases.
This allows cancellation in uasyncio even if the coroutine hasn't been
executed yet. Fixes#5242
This wasn't necessary as the wrapped function already has a reference to
its globals. But it had a dual purpose of tracking whether the function
was currently running, so replace it with a bool.
runtime0.h is part of the MicroPython ABI so it's simpler if it's
independent of config options, like MICROPY_PY_REVERSE_SPECIAL_METHODS.
What's effectively done here is to move MP_BINARY_OP_DIVMOD and
MP_BINARY_OP_CONTAINS up in the enum, then remove the #if
MICROPY_PY_REVERSE_SPECIAL_METHODS conditional.
Without this change .mpy files would need to have a feature flag for
MICROPY_PY_REVERSE_SPECIAL_METHODS (when embedding native code that uses
this enum).
This commit has no effect when MICROPY_PY_REVERSE_SPECIAL_METHODS is
disabled. With this option enabled this commit reduces code size by about
60 bytes.
For consistency with "umachine". Now that weak links are enabled
by default for built-in modules, this should be a no-op, but allows
extension of the bluetooth module by user code.
Also move registration of ubluetooth to objmodule rather than
port-specific.
This commit implements automatic module weak links for all built-in
modules, by searching for "ufoo" in the built-in module list if "foo"
cannot be found. This means that all modules named "ufoo" are always
available as "foo". Also, a port can no longer add any other weak links,
which makes strict the definition of a weak link.
It saves some code size (about 100-200 bytes) on ports that previously had
lots of weak links.
Some changes from the previous behaviour:
- It doesn't intern the non-u module names (eg "foo" is not interned),
which saves code size, but will mean that "import foo" creates a new qstr
(namely "foo") in RAM (unless the importing module is frozen).
- help('modules') no longer lists non-u module names, only the u-variants;
this reduces duplication in the help listing.
Weak links are effectively the same as having a set of symbolic links on
the filesystem that is searched last. So an "import foo" will search
built-in modules first, then all paths in sys.path, then weak links last,
importing "ufoo" if it exists. Thus a file called "foo.py" somewhere in
sys.path will still have precedence over the weak link of "foo" to "ufoo".
See issues: #1740, #4449, #5229, #5241.
When loading a manifest file, e.g. by include(), it will chdir first to the
directory of that manifest. This means that all file operations within a
manifest are relative to that manifest's location.
As a consequence of this, additional environment variables are needed to
find absolute paths, so the following are added: $(MPY_LIB_DIR),
$(PORT_DIR), $(BOARD_DIR). And rename $(MPY) to $(MPY_DIR) to be
consistent.
Existing manifests are updated to match.
This introduces a new build variable FROZEN_MANIFEST which can be set to a
manifest listing (written in Python) that describes the set of files to be
frozen in to the firmware.
Instead of encoding 4 zero bytes as placeholders for the simple_name and
source_file qstrs, and storing the qstrs after the bytecode, store the
qstrs at the location of these 4 bytes. This saves 4 bytes per bytecode
function stored in a .mpy file (for example lcd160cr.mpy drops by 232
bytes, 4x 58 functions). And resulting code size is slightly reduced on
ports that use this feature.
In which case place the native function prelude in a bytes object, linked
from the const_table of that function. An architecture should define
N_PRELUDE_AS_BYTES_OBJ to 1 before including py/emitnative.c to emit
correct machine code, then enable MICROPY_EMIT_NATIVE_PRELUDE_AS_BYTES_OBJ
so the runtime can correctly handle the prelude being in a bytes object.
Such that args/return regs for the parent are different to args/return regs
for child calls. For an architecture to use this feature it should define
the REG_PARENT_xxx macros before including py/emitnative.c.
Prior to this commit, when unwinding through an active finally the stack
was not being correctly popped/folded, which resulting in the VM crashing
for complicated unwinding of nested finallys.
This should be fixed with this commit, and more tests for return/break/
continue within a finally have been added to exercise this.
As of 7d58a197cf, `NULL` should no longer be
here because it's allowed (MP_QSTRnull took its place). This entry was
preventing the use of MP_QSTR_NULL to mean "NULL" (although this is not
currently used).
A blacklist should not be needed because it should be possible to intern
all strings.
Fixes issue #5140.
This check follows CPython's behaviour, because 'import *' always populates
the globals with the imported names, not locals.
Since it's safe to do this (doesn't lead to a crash or undefined behaviour)
the check is only enabled for MICROPY_CPYTHON_COMPAT.
Fixes issue #5121.
This patch compresses the second part of the bytecode prelude which
contains the source file name, function name, source-line-number mapping
and cell closure information. This part of the prelude now begins with a
single varible length unsigned integer which encodes 2 numbers, being the
byte-size of the following 2 sections in the header: the "source info
section" and the "closure section". After decoding this variable unsigned
integer it's possible to skip over one or both of these sections very
easily.
This scheme saves about 2 bytes for most functions compared to the original
format: one in the case that there are no closure cells, and one because
padding was eliminated.
The start of the bytecode prelude contains 6 numbers telling the amount of
stack needed for the Python values and exceptions, and the signature of the
function. Prior to this patch these numbers were all encoded one after the
other (2x variable unsigned integers, then 4x bytes), but using so many
bytes is unnecessary.
An entropy analysis of around 150,000 bytecode functions from the CPython
standard library showed that the optimal Shannon coding would need about
7.1 bits on average to encode these 6 numbers, compared to the existing 48
bits.
This patch attempts to get close to this optimal value by packing the 6
numbers into a single, varible-length unsigned integer via bit-wise
interleaving. The interleaving scheme is chosen to minimise the average
number of bytes needed, and at the same time keep the scheme simple enough
so it can be implemented without too much overhead in code size or speed.
The scheme requires about 10.5 bits on average to store the 6 numbers.
As a result most functions which originally took 6 bytes to encode these 6
numbers now need only 1 byte (in 80% of cases).
From the beginning of this project the RAISE_VARARGS opcode was named and
implemented following CPython, where it has an argument (to the opcode)
counting how many args the raise takes:
raise # 0 args (re-raise previous exception)
raise exc # 1 arg
raise exc from exc2 # 2 args (chained raise)
In the bytecode this operation therefore takes 2 bytes, one for
RAISE_VARARGS and one for the number of args.
This patch splits this opcode into 3, where each is now a single byte.
This reduces bytecode size by 1 byte for each use of raise. Every byte
counts! It also has the benefit of reducing code size (on all ports except
nanbox).
To make progress towards MicroPython supporting Python 3.5, adding the
matmul operator is important because it's a really "low level" part of the
language, being a new token and modifications to the grammar.
It doesn't make sense to make it configurable because 1) it would make the
grammar and lexer complicated/messy; 2) no other operators are
configurable; 3) it's not a feature that can be "dynamically plugged in"
via an import.
And matmul can be useful as a general purpose user-defined operator, it
doesn't have to be just for numpy use.
Based on work done by Jim Mussared.
Prior to this patch mp_opcode_format would calculate the incorrect size of
the MP_BC_UNWIND_JUMP opcode, missing the additional byte. But, because
opcodes below 0x10 are unused and treated as bytes in the .mpy load/save
and freezing code, this bug did not show any symptoms, since nested unwind
jumps would rarely (if ever) reach a depth of 16 (so the extra byte of this
opcode would be between 0x01 and 0x0f and be correctly loaded/saved/frozen
simply as an undefined opcode).
This patch fixes this bug by correctly accounting for the additional byte.
.
With this patch alignment is done relative to the start of the buffer that
is being unpacked, not the raw pointer value, as per CPython.
Fixes issue #3314.
This commit adds support for sys.settrace, allowing to install Python
handlers to trace execution of Python code. The interface follows CPython
as closely as possible. The feature is disabled by default and can be
enabled via MICROPY_PY_SYS_SETTRACE.
Prior to this patch the line number for a lambda would be "line 1" if the
body of the lambda contained only a simple expression (with no line number
stored in the parse node). Now the line number is always reported
correctly.
mp_compile no longer takes an emit_opt argument, rather this setting is now
provided by the global default_emit_opt variable.
Now, when -X emit=native is passed as a command-line option, the emitter
will be set for all compiled modules (included imports), not just the
top-level script.
In the future there could be a way to also set this variable from a script.
Fixes issue #4267.
With this patch exceptions that are re-raised have improved tracebacks
(less confusing, match CPython), and it makes re-raise slightly more
efficient (in time and RAM) because they no longer need to add a traceback.
Also general VM performance is not measurably affected.
Partially fixes issue #2928.
With this patch exception tracebacks that go through a finally are improved
(less confusing, match CPython), and it makes finally's slightly more
efficient (in time and RAM) because they no longer need to add a traceback.
Partially fixes issue #2928.
It's really an opcode that's not implemented, so use "opcode" instead of
"byte code". And remove the redundant "not implemented" text because that
is already implied by the exception type. There's no need to have a long
error message for an exception that is almost never encountered. Saves
about 20 bytes of code size on most ports.
Enabled via MICROPY_PY_URE_DEBUG, disabled by default (but enabled on unix
coverage build). This is a rarely used feature that costs a lot of code
(500-800 bytes flash). Debugging of regular expressions can be done
offline with other tools.
Recent versions of gcc perform optimisations which can lead to the
following code from the MP_NLR_JUMP_HEAD macro being omitted:
top->ret_val = val; \
MP_NLR_RESTORE_PYSTACK(top); \
*_top_ptr = top->prev; \
This is noticeable (at least) in the unix coverage on x86-64 built with gcc
9.1.0. This is because the nlr_jump function is marked as no-return, so
gcc deduces that the above code has no effect.
Adding MP_UNREACHABLE tells the compiler that the asm code may branch
elsewhere, and so it cannot optimise away the code.
As per PEP 485, this function appeared in for Python 3.5. Configured via
MICROPY_PY_MATH_ISCLOSE which is disabled by default, but enabled for the
ports which already have MICROPY_PY_MATH_SPECIAL_FUNCTIONS enabled.
Prior to this patch the amount of free space in an array (including
bytearray) was not being maintained correctly for the case of slice
assignment which changed the size of the array. Under certain cases (as
encoded in the new test) it was possible that the array could grow beyond
its allocated memory block and corrupt the heap.
Fixes issue #4127.