uint types in viper mode can now be used for all binary operators except
floor-divide and modulo.
Fixes issue #1847 and issue #6177.
Signed-off-by: Damien George <damien@micropython.org>
The loop searches backwards for a target, but doesn't stop after finding
the first result, meaning that it'll always end up at the outermost
exception handler.
This previously made the native emitter incompatible with the bytecode
emitter, and mp_resume (and subsequently mp_obj_generator_resume) expects
the bytecode emitter behavior (i.e. throw==NULL).
In which case place the native function prelude in a bytes object, linked
from the const_table of that function. An architecture should define
N_PRELUDE_AS_BYTES_OBJ to 1 before including py/emitnative.c to emit
correct machine code, then enable MICROPY_EMIT_NATIVE_PRELUDE_AS_BYTES_OBJ
so the runtime can correctly handle the prelude being in a bytes object.
Such that args/return regs for the parent are different to args/return regs
for child calls. For an architecture to use this feature it should define
the REG_PARENT_xxx macros before including py/emitnative.c.
This patch compresses the second part of the bytecode prelude which
contains the source file name, function name, source-line-number mapping
and cell closure information. This part of the prelude now begins with a
single varible length unsigned integer which encodes 2 numbers, being the
byte-size of the following 2 sections in the header: the "source info
section" and the "closure section". After decoding this variable unsigned
integer it's possible to skip over one or both of these sections very
easily.
This scheme saves about 2 bytes for most functions compared to the original
format: one in the case that there are no closure cells, and one because
padding was eliminated.
The start of the bytecode prelude contains 6 numbers telling the amount of
stack needed for the Python values and exceptions, and the signature of the
function. Prior to this patch these numbers were all encoded one after the
other (2x variable unsigned integers, then 4x bytes), but using so many
bytes is unnecessary.
An entropy analysis of around 150,000 bytecode functions from the CPython
standard library showed that the optimal Shannon coding would need about
7.1 bits on average to encode these 6 numbers, compared to the existing 48
bits.
This patch attempts to get close to this optimal value by packing the 6
numbers into a single, varible-length unsigned integer via bit-wise
interleaving. The interleaving scheme is chosen to minimise the average
number of bytes needed, and at the same time keep the scheme simple enough
so it can be implemented without too much overhead in code size or speed.
The scheme requires about 10.5 bits on average to store the 6 numbers.
As a result most functions which originally took 6 bytes to encode these 6
numbers now need only 1 byte (in 80% of cases).
mpy-cross uses MICROPY_DYNAMIC_COMPILER and MICROPY_EMIT_NATIVE but does
not actually need to execute native functions, and does not need
mp_fun_table. This commit makes it so mp_fun_table and all its entries are
not built when MICROPY_DYNAMIC_COMPILER is enabled, significantly reducing
the size of the mpy-cross executable and allowing it to be built on more
machines/OS's.
Prior to this commit, building the unix port with `DEBUG=1` and
`-finstrument-functions` the compilation would fail with an error like
"control reaches end of non-void function". This change fixes this by
removing the problematic "if (0)" branches. Not all branches affect
compilation, but they are all removed for consistency.
This commit adds support for saving and loading .mpy files that contain
native code (native, viper and inline-asm). A lot of the ground work was
already done for this in the form of removing pointers from generated
native code. The changes here are mainly to link in qstr values to the
native code, and change the format of .mpy files to contain native code
blocks (possibly mixed with bytecode).
A top-level summary:
- @micropython.native, @micropython.viper and @micropython.asm_thumb/
asm_xtensa are now allowed in .py files when compiling to .mpy, and they
work transparently to the user.
- Entire .py files can be compiled to native via mpy-cross -X emit=native
and for the most part the generated .mpy files should work the same as
their bytecode version.
- The .mpy file format is changed to 1) specify in the header if the file
contains native code and if so the architecture (eg x86, ARMV7M, Xtensa);
2) for each function block the kind of code is specified (bytecode,
native, viper, asm).
- When native code is loaded from a .mpy file the native code must be
modified (in place) to link qstr values in, just like bytecode (see
py/persistentcode.c:arch_link_qstr() function).
In addition, this now defines a public, native ABI for dynamically loadable
native code generated by other languages, like C.
The new compile-time option is MICROPY_DEBUG_MP_OBJ_SENTINELS, disabled by
default. This is to allow finer control of whether this debugging feature
is enabled or not (because, for example, this setting must be the same for
mpy-cross and the MicroPython main code when using native code generation).
POP_BLOCK and POP_EXCEPT are now the same, and are always followed by a
JUMP. So this optimisation reduces code size, and RAM usage of bytecode by
two bytes for each try-except handler.
So these constant objects can be loaded by dereferencing the REG_FUN_TABLE
pointer instead of loading immediate values. This reduces the size of
generated native code (when such constants are used), and means that
pointers to these constants are no longer stored in the assembly code.
The maximum index into mp_fun_table is currently less than 1024 and should
stay that way to keep things efficient for all architectures, so there is
no need to handle loading the pointer directly via a literal in this
function.
All architectures now have a dedicated register to hold the pointer to the
native function table mp_fun_table, and so they all need to load this
register at the start of the native function. This commit makes the
loading of this register uniform across architectures by passing the
pointer in the constant table for the native function, and then loading the
register from the constant table. Doing it this way means that the pointer
is not stored in the assembly code, helping to make the code more portable.
Instead of storing the function pointer directly in the assembly code.
This makes the generated code more independent of the runtime (so easier to
relocate the code), and reduces the generated code size.