It's much more efficient in RAM and code size to do implicit literal string
concatenation in the lexer, as opposed to the compiler.
RAM usage is reduced because the concatenation can be done right away in the
tokeniser by just accumulating the string/bytes literals into the lexer's
vstr. Prior to this patch, adjacent string/bytes literals would each create a
parse-tree node, and then in the compiler a whole new chunk of memory was
allocated to store the concatenated string, which used more than double the
memory compared with just accumulating in the lexer.
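For reference, this is the Python-level behaviour that the lexer now handles
directly (plain Python, nothing MicroPython-specific):

    # Adjacent string/bytes literals are concatenated into a single literal.
    MSG = ("implicit concatenation "
           "joins these two parts")       # one string object
    DATA = b"\x01\x02" b"\x03\x04"        # equal to b"\x01\x02\x03\x04"
    print(MSG)
    print(DATA)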
This patch also significantly reduces code size (figures in bytes):
bare-arm: -204
minimal: -204
unix x64: -328
stmhal: -208
esp8266: -284
cc3200: -224
Previous to this patch there was an explicit check for errors with line
continuation (where backslash was not immediately followed by a newline).
But this check is not necessary: if there is an error then the remaining
logic of the tokeniser will reject the backslash and correctly produce a
syntax error.
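To illustrate the rule in question: a backslash must be immediately followed
by a newline to continue a logical line, and anything else after it is now
rejected by the normal tokenising logic rather than a special-case check:

    # Valid: the backslash is immediately followed by a newline.
    total = 1 + \
            2
    print(total)  # 3

    # Invalid: any character (even a space) after the backslash is a
    # syntax error, e.g.:
    #     total = 1 + \ 2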
Since the table of keywords is sorted, we can use strcmp to do the search
and stop part way through if the comparison result is less than zero.
Because all tokens that are names are subject to this search, this
optimisation will improve the overall speed of the lexer when processing
a script.
The change also decreases code size by a little bit because we now use
strcmp instead of the custom str_strn_equal function.
Keywords only need to be searched for if the token is an MP_TOKEN_NAME, so
we can move the search to the part of the code that does the tokenising for
MP_TOKEN_NAME.
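A rough sketch of the idea (illustrative Python, not the lexer's C code): the
table is sorted, so the scan can stop as soon as a table entry compares
greater than the candidate name, and the check is only done for name tokens.

    # Illustrative early-exit keyword search over a sorted table.
    KEYWORDS = sorted(["False", "None", "True", "and", "break", "def", "else",
                       "for", "if", "import", "return", "while"])

    def is_keyword(name):
        for kw in KEYWORDS:
            if kw == name:
                return True
            if kw > name:      # like strcmp < 0: name cannot appear later
                return False
        return False

    print(is_keyword("for"))   # True
    print(is_keyword("foo"))   # False; stops early at "for" since "for" > "foo"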
Grammar rules have 2 variants: ones that are attached to a specific
compile function which is called to compile that grammar node, and ones
that don't have a compile function and are instead just inspected to see
what form they take.
In the compiler there is a table of all grammar rules, with each entry
having a pointer to the associated compile function. Those rules with no
compile function have a null pointer. There are 120 such rules, so that's
120 words of essentially wasted code space.
By grouping together the compile vs no-compile rules we can put all the
no-compile rules at the end of the list of rules, and then we don't need
to store the null pointers. We just have a truncated table and it's
guaranteed that when indexing this table we only index the first half,
the half with populated pointers.
This patch implements such a grouping by having a specific macro for the
compile vs no-compile grammar rules (DEF_RULE vs DEF_RULE_NC). It saves
around 460 bytes of code on 32-bit archs.
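The saving comes purely from table layout; here is a minimal sketch of the
indexing idea (illustrative Python, not the actual compiler tables or macros):

    # Rules with a compile function are grouped at the start, so the function
    # table only needs entries for that prefix; any rule index at or beyond
    # the table length is known to be a "no-compile" rule.
    def compile_if_stmt(node):
        print("compile", node)

    def compile_while_stmt(node):
        print("compile", node)

    COMPILE_FUNCS = [compile_if_stmt, compile_while_stmt]  # truncated table

    def handle_rule(rule_id, node):
        if rule_id < len(COMPILE_FUNCS):
            COMPILE_FUNCS[rule_id](node)   # has a compile function
        else:
            print("inspect", node)         # no-compile rule: just inspected

    handle_rule(0, "if-node")      # dispatched through the table
    handle_rule(5, "atom-node")    # beyond the table: no null pointer stored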
Allows iterating over the following without allocating on the heap (see the
example below):
- tuple
- list
- string, bytes
- bytearray, array
- dict (not dict.keys, dict.values, dict.items)
- set, frozenset
Allows calling the following without allocating heap memory:
- all, any, min, max, sum
TODO: still need to allocate stack memory in bytecode for iter_buf.
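Concretely, the kinds of Python-level code covered by the lists above:

    # Iteration and builtin calls that can now run without heap allocation.
    for x in (3, 1, 4, 1, 5):               # tuple
        pass

    print(all(b"\x01\x02"))                 # bytes        -> True
    print(any([0, 0, 1]))                   # list         -> True
    print(min({"a": 3, "b": 1}))            # dict (keys)  -> "a"
    print(max({2, 7, 5}))                   # set          -> 7
    print(sum(bytearray(b"\x01\x02\x03")))  # bytearray    -> 6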
This improves efficiency of GIL release within the VM, by only doing the
release after a fixed number of jump-opcodes have executed in the current
thread.
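As a loose conceptual analogy (plain Python using _thread, not the VM's C
code), the pattern is to release and re-acquire the lock only on every N-th
iteration rather than on every one; N here is a made-up constant:

    import _thread

    GIL = _thread.allocate_lock()
    N = 32                          # hypothetical threshold, illustration only

    def run(jumps):
        GIL.acquire()
        countdown = N
        for _ in range(jumps):
            # ... execute one jump-opcode ...
            countdown -= 1
            if countdown == 0:      # only on every N-th jump:
                GIL.release()       # give other threads a chance to run
                GIL.acquire()
                countdown = N
        GIL.release()

    run(100)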
It's more efficient to use the system's mutexes instead of synthetic ones
built on a busy-wait loop. The system can do proper scheduling and blocking
of the threads waiting on the mutex.
Previous to this patch, for large chunks of bytecode that originated from
a single source-code line, the bytecode-line mapping would generate
something like (for 42 bytecode bytes and 1 line):
BC_SKIP=31 LINE_SKIP=1
BC_SKIP=11 LINE_SKIP=0
This would mean that any errors in the last 11 bytecode bytes would be
reported on the following line. This patch fixes it to generate instead:
BC_SKIP=31 LINE_SKIP=0
BC_SKIP=11 LINE_SKIP=1
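A small illustrative decoder (a sketch of the mapping logic, not the actual C
code) shows why: pairs are consumed while the bytecode offset is at least
BC_SKIP, adding LINE_SKIP each time, so an offset in the last 11 bytes picked
up the stray LINE_SKIP=1 under the old encoding.

    def source_line(pairs, start_line, bc_offset):
        line = start_line
        for bc_skip, line_skip in pairs:
            if bc_offset < bc_skip:
                break               # the offset falls inside this chunk
            bc_offset -= bc_skip
            line += line_skip
        return line

    old = [(31, 1), (11, 0)]        # encoding before this patch
    new = [(31, 0), (11, 1)]        # encoding after this patch
    # 42 bytecode bytes, all from source line 10; error at bytecode offset 35:
    print(source_line(old, 10, 35))   # 11 -- wrongly reported on the next line
    print(source_line(new, 10, 35))   # 10 -- correctly reported on line 10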
This patch implements support for the special methods __delattr__ and
__setattr__ for customising attribute access. It is controlled by the config
option MICROPY_PY_DELATTR_SETATTR and is disabled by default.
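With the option enabled, standard Python code along these lines works:

    class Logged:
        def __setattr__(self, name, value):
            print("set", name, "=", value)
            super().__setattr__(name, value)   # actually store the attribute

        def __delattr__(self, name):
            print("del", name)
            super().__delattr__(name)

    obj = Logged()
    obj.x = 1       # prints: set x = 1
    del obj.x       # prints: del x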
It seems that the gcc toolchain on the Raspberry Pi likes %progbits instead
of @progbits. I verified that %progbits also works under x86, so this should
fix #2848 and #2842. I verified that unix and mpy-cross both compile on my
Raspberry Pi and on my x64 machine.
The internal map/set functions now use size_t exclusively for computing
addresses. size_t is enough to reach all of available memory when
computing addresses so is the right type to use. In particular, for
nanbox builds it saves quite a bit of code size and RAM compared to the
original use of mp_uint_t (which is 64-bits on nanbox builds).
For archs that have 16-bit pointers, the asmxtensa.h file can give compiler
warnings about left-shift being greater than the width of the type (due to
the inline functions in this header file). Explicitly casting the
constants to uint32_t stops these warnings.
This patch fixes two main things:
- dicts can be printed directly using '%s' % dict
- %-formatting should not crash when a non-dict is passed to a format string
  that uses a mapping key, eg '%(foo)s'
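Examples of both behaviours (the exception type shown matches CPython's):

    # A dict operand can be formatted directly with '%s'.
    print('%s' % {'a': 1})          # {'a': 1}

    # A non-dict operand with a mapping key must raise an error, not crash.
    try:
        '%(foo)s' % 5
    except TypeError as e:
        print('TypeError:', e)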
Updated modbuiltin.c to add conditional support for 3-arg calls to pow(),
controlled by the MICROPY_PY_BUILTINS_POW3 config option. Added support in
objint_mpz.c for an optimised implementation.
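With MICROPY_PY_BUILTINS_POW3 enabled, modular exponentiation works as in
CPython:

    # 3-arg pow computes (base ** exp) % mod; the big-int (mpz) case is the
    # one that gets the optimised implementation mentioned above.
    print(pow(3, 4, 7))       # 81 % 7 == 4
    print(pow(3, 100, 7))     # 4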