This also adds a bit of code everywhere we DISPATCH(), but the net is
+232 bytes free on Feather M0 Adalogger.
Key assumption: All of the offsets in mp_execute_bytecode fit in 16 bits;
it is not clear whether the compiler will verify this assumption (e.g.,
by warning that a constant will be truncated)
This string is recognised by uncrustify, to disable formatting in the
region marked by these comments. This is necessary in the qstrdef*.h files
to prevent modification of the strings within the Q(...). In other places
it is used to prevent excessive reformatting that would make the code less
readable.
Introduces a way to place CircuitPython code and data into
tightly coupled memory (TCM) which is accessible by the CPU in a
single cycle. It also frees up room in the corresponding cache for
intermittent data. Loading from external flash is slow!
The data cache is also now enabled.
Adds support for the iMX RT 1021 chip. Adds three new boards:
* iMX RT 1020 EVK
* iMX RT 1060 EVK
* Teensy 4.0
Related to #2492, #2472 and #2477. Fixes#2475.
From the beginning of this project the RAISE_VARARGS opcode was named and
implemented following CPython, where it has an argument (to the opcode)
counting how many args the raise takes:
raise # 0 args (re-raise previous exception)
raise exc # 1 arg
raise exc from exc2 # 2 args (chained raise)
In the bytecode this operation therefore takes 2 bytes, one for
RAISE_VARARGS and one for the number of args.
This patch splits this opcode into 3, where each is now a single byte.
This reduces bytecode size by 1 byte for each use of raise. Every byte
counts! It also has the benefit of reducing code size (on all ports except
nanbox).
POP_BLOCK and POP_EXCEPT are now the same, and are always followed by a
JUMP. So this optimisation reduces code size, and RAM usage of bytecode by
two bytes for each try-except handler.
This patch allows the following code to run without allocating on the heap:
super().foo(...)
Before this patch such a call would allocate a super object on the heap and
then load the foo method and call it right away. The super object is only
needed to perform the lookup of the method and not needed after that. This
patch makes an optimisation to allocate the super object on the C stack and
discard it right after use.
Changes in code size due to this patch are:
bare-arm: +128
minimal: +232
unix x64: +416
unix nanbox: +364
stmhal: +184
esp8266: +340
cc3200: +128
With the previous patch combining 3 emit functions into 1, it now makes
sense to also combine the corresponding VM opcodes, which is what this
patch does. This eliminates 2 opcodes which simplifies the VM and reduces
code size, in bytes: bare-arm:44, minimal:64, unix(NDEBUG,x86-64):272,
stmhal:92, esp8266:200. Profiling (with a simple script that creates many
list/dict/set comprehensions) shows no measurable change in performance.
Fixes#1684 and makes "not" match Python semantics. The code is also
simplified (the separate MP_BC_NOT opcode is removed) and the patch saves
68 bytes for bare-arm/ and 52 bytes for minimal/.
Previously "not x" was implemented as !mp_unary_op(x, MP_UNARY_OP_BOOL),
so any given object only needs to implement MP_UNARY_OP_BOOL (and the VM
had a special opcode to do the ! bit).
With this patch "not x" is implemented as mp_unary_op(x, MP_UNARY_OP_NOT),
but this operation is caught at the start of mp_unary_op and dispatched as
!mp_obj_is_true(x). mp_obj_is_true has special logic to test for
truthness, and is the correct way to handle the not operation.
Previous to this patch each time a bytes object was referenced a new
instance (with the same data) was created. With this patch a single
bytes object is created in the compiler and is loaded directly at execute
time as a true constant (similar to loading bignum and float objects).
This saves on allocating RAM and means that bytes objects can now be
used when the memory manager is locked (eg in interrupts).
The MP_BC_LOAD_CONST_BYTES bytecode was removed as part of this.
Generated bytecode is slightly larger due to storing a pointer to the
bytes object instead of the qstr identifier.
Code size is reduced by about 60 bytes on Thumb2 architectures.
Hashing is now done using mp_unary_op function with MP_UNARY_OP_HASH as
the operator argument. Hashing for int, str and bytes still go via
fast-path in mp_unary_op since they are the most common objects which
need to be hashed.
This lead to quite a bit of code cleanup, and should be more efficient
if anything. It saves 176 bytes code space on Thumb2, and 360 bytes on
x86.
The only loss is that the error message "unhashable type" is now the
more generic "unsupported type for __hash__".
Previous to this patch, a big-int, float or imag constant was interned
(made into a qstr) and then parsed at runtime to create an object each
time it was needed. This is wasteful in RAM and not efficient. Now,
these constants are parsed straight away in the parser and turned into
objects. This allows constants with large numbers of digits (so
addresses issue #1103) and takes us a step closer to #722.
This patch makes the MICROPY_PY_BUILTINS_SLICE compile-time option
fully disable the builtin slice operation (when set to 0). This
includes removing the slice sytanx from the grammar. Now, enabling
slice costs 4228 bytes on unix x64, and 1816 bytes on stmhal.
This patch makes MICROPY_PY_BUILTINS_SET compile-time option fully
disable the builtin set object (when set to 0). This includes removing
set constructor/comprehension from the grammar, the compiler and the
emitters. Now, enabling set costs 8168 bytes on unix x64, and 3576
bytes on stmhal.
There is a lot potential in compress bytecodes and make more use of the
coding space. This patch introduces "multi" bytecodes which have their
argument included in the bytecode (by addition).
UNARY_OP and BINARY_OP now no longer take a 1 byte argument for the
opcode. Rather, the opcode is included in the first byte itself.
LOAD_FAST_[0,1,2] and STORE_FAST_[0,1,2] are removed in favour of their
multi versions, which can take an argument between 0 and 15 inclusive.
The majority of LOAD_FAST/STORE_FAST codes fit in this range and so this
saves a byte for each of these.
LOAD_CONST_SMALL_INT_MULTI is used to load small ints between -16 and 47
inclusive. Such ints are quite common and now only need 1 byte to
store, and now have much faster decoding.
In all this patch saves about 2% RAM for typically bytecode (1.8% on
64-bit test, 2.5% on pyboard test). It also reduces the binary size
(because bytecodes are simplified) and doesn't harm performance.
Blanket wide to all .c and .h files. Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
3 emitter functions are needed only for emitcpy, and so we can #if them
out when compiling with emitcpy support.
Also remove unused SETUP_LOOP bytecode.