When just the bytecode emitter is needed there is no need to have a
dynamic method table for the emitter back-end, and we can instead
directly call the mp_emit_bc_XXX functions. This gives a significant
reduction in code size and a very slight performance boost for the
compiler.
This patch saves 1160 bytes code on Thumb2 and 972 bytes on x86, when
native emitters are disabled.
Overall savings in code over the last 3 commits are:
bare-arm: 1664 bytes.
minimal: 2136 bytes.
stmhal: 584 bytes (it has native emitter enabled).
cc3200: 1736 bytes.
First pass for the compiler is computing the scope (eg if an identifier
is local or not) and originally had an entire table of methods dedicated
to this, most of which did nothing. With changes from previous commit,
this set of methods can be removed and the methods from the bytecode
emitter used instead, with very little modification -- this is what is
done in this commit.
This factoring has little to no impact on the speed of the compiler
(tested by compiling 3763 Python scripts and timing it).
This factoring reduces code size by about 270-300 bytes on Thumb2 archs,
and 400 bytes on x86.
mp_obj_t internal representation doesn't have to be a pointer to object,
it can be anything.
There's also a support for back-conversion in the form of MP_OBJ_UNCAST.
This is kind of optimization/status quo preserver to minimize patching the
existing code and avoid doing potentially expensive MP_OBJ_CAST over and
over. But then one may imagine implementations where MP_OBJ_UNCAST is very
expensive. But such implementations are unlikely interesting in practice.
Despite initial guess, this code factoring does not hamper performance.
In fact it seems to improve speed by a little: running pystone(1.2) on
pyboard (which gives a very stable result) this patch takes pystones
from 1729.51 up to 1742.16. Also, pystones on x64 increase by around
the same proportion (but it's much noisier).
Taking a look at the generated machine code, stack usage with this patch
is unchanged, and call is tail-optimised with all arguments in
registers. Code size decreases by about 50 bytes on Thumb2 archs.
"Base" should rather refer to "base type"."Base object for attribute
lookup" should rather be just "object".
Also, a case of common subexpression elimination.
Given that there's already support for "fixed table" maps, which are
essentially ordered maps, the implementation of OrderedDict just extends
"fixed table" maps by adding an "is ordered" flag and add/remove
operations, and reuses 95% of objdict code, just making methods tolerant
to both dict and OrderedDict.
Some things are missing so far, like CPython-compatible repr and comparison.
OrderedDict is Disabled by default; enabled on unix and stmhal ports.
These allow to fine-tune the compiler to select whether it optimises
tuple assignments of the form a, b = c, d and a, b, c = d, e, f.
Sensible defaults are provided.
This is rarely used feature which takes enough code to implement, so is
controlled by MICROPY_PY_ARRAY_SLICE_ASSIGN config setting, default off.
But otherwise it may be useful, as allows to update arbitrary-sized data
buffers in-place.
Slice is yet to implement, and actually, slice assignment implemented in
such a way that RHS of assignment should be array of the exact same item
typecode as LHS. CPython has it more relaxed, where RHS can be any sequence
of compatible types (e.g. it's possible to assign list of int's to a
bytearray slice).
Overall, when all "slice write" features are implemented, it may cost ~1KB
of code.
This makes exception traceback info self contained (ie doesn't rely on
list object, which was a bit of a hack), reduces code size, and reduces
RAM footprint of exception by eliminating the list object.
Addresses part of issue #1126.
The implementation of these functions is very large (order 4k) and they
are rarely used, so we don't enable them by default.
They are however enabled in stmhal and unix, since we have the room.
Most of printing infrastructure now uses streams, but mp_obj_print() used
libc's printf(), which led to weird buffering issues in output. So, switch
mp_obj_print() to streams too, even though it may make sense to move it to
a separate file, as it is purely a debugging function now.
Relative imports are based of a package, so we're currently at a module
within a package, we should get to package first.
Also, factor out path travsering operation, but this broke testing for
boundary errors with relative imports. TODO: reintroduce them, together
with proper tests.
Traceback allocation for exception will now never lead to recursive
MemoryError exception - if there's no memory for traceback, it simply
won't be created.
Pushing same NLR record twice would lead to "infinite loop" in nlr_jump
(but more realistically, it will crash as soon as NLR record on stack is
overwritten).