It makes much more sense to do constant folding in the parser while the
parse tree is being built. This eliminates the need to create parse
nodes that will just be folded away. The code is slightly simpler and a
bit smaller as well.
Constant folding now has a configuration option,
MICROPY_COMP_CONST_FOLDING, which is enabled by default.
With this patch parse nodes are allocated sequentially in chunks. This
reduces fragmentation of the heap and prevents waste at the end of
individually allocated parse nodes.
Saves roughly 20% of RAM during parse stage.
4 spaces are added at start of line to match previous indent, and if
previous line ended in colon.
Backspace deletes 4 space if only spaces begin a line.
Configurable via MICROPY_REPL_AUTO_INDENT. Disabled by default.
unix-cpy was originally written to get semantic equivalent with CPython
without writing functional tests. When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct. The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly. And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.
But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:
1. It outputs CPython 3.3 compatible bytecode. CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5. There's no point in changing the bytecode output to match
CPython anymore.
2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.
3. The bytecode tests are not run. They were never part of Travis and
are not run locally anymore.
4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.
5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode. Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
This patch makes configurable, via MICROPY_QSTR_BYTES_IN_HASH, the
number of bytes used for a qstr hash. It was originally fixed at 2
bytes, and now defaults to 2 bytes. Setting it to 1 byte will save
ROM and RAM at a small expense of hash collisions.
Previous to this patch all interned strings lived in their own malloc'd
chunk. On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quanta of the memory allocator (16 bytes on 32
bit archs).
With this patch interned strings are concatenated into the same malloc'd
chunk when possible. Such chunks are enlarged inplace when possible,
and shrunk to fit when a new chunk is needed.
RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned). New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.
Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
The TimeoutError is useful for some modules, specially the the
socket module. TimeoutError can then be alised to socket.timeout
and then Python code can differentiate between socket.error and
socket.timeout.
Previous to this patch a call such as list.append(1, 2) would lead to a
seg fault. This is because list.append is a builtin method and the first
argument to such methods is always assumed to have the correct type.
Now, when a builtin method is extracted like this it is wrapped in a
checker object which checks the the type of the first argument before
calling the builtin function.
This feature is contrelled by MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG and
is enabled by default.
See issue #1216.
From https://docs.python.org/3/library/constants.html#NotImplemented :
"Special value which should be returned by the binary special methods
(e.g. __eq__(), __lt__(), __add__(), __rsub__(), etc.) to indicate
that the operation is not implemented with respect to the other type;
may be returned by the in-place binary special methods (e.g. __imul__(),
__iand__(), etc.) for the same purpose. Its truth value is true."
Some people however appear to abuse it to mean "no value" when None is
a legitimate value (don't do that).
The implementation is very basic and non-compliant and provided solely for
CPython compatibility. The function itself is bad Python2 heritage, its
usage is discouraged.
Adds support for the following Thumb2 VFP instructions, via the option
MICROPY_EMIT_INLINE_THUMB_FLOAT:
vcmp
vsqrt
vneg
vcvt_f32_to_s32
vcvt_s32_to_f32
vmrs
vmov
vldr
vstr
vadd
vsub
vmul
vdiv
Previous to this patch the printing mechanism was a bit of a tangled
mess. This patch attempts to consolidate printing into one interface.
All (non-debug) printing now uses the mp_print* family of functions,
mainly mp_printf. All these functions take an mp_print_t structure as
their first argument, and this structure defines the printing backend
through the "print_strn" function of said structure.
Printing from the uPy core can reach the platform-defined print code via
two paths: either through mp_sys_stdout_obj (defined pert port) in
conjunction with mp_stream_write; or through the mp_plat_print structure
which uses the MP_PLAT_PRINT_STRN macro to define how string are printed
on the platform. The former is only used when MICROPY_PY_IO is defined.
With this new scheme printing is generally more efficient (less layers
to go through, less arguments to pass), and, given an mp_print_t*
structure, one can call mp_print_str for efficiency instead of
mp_printf("%s", ...). Code size is also reduced by around 200 bytes on
Thumb2 archs.
splitlines() occurs ~179 times in CPython3 standard library, so was
deemed worthy to implement. The method has subtle semantic differences
from just .split("\n"). It is also defined as working for any end-of-line
combination, but this is currently not implemented - it works only with
LF line-endings (which should be OK for text strings on any platforms,
but not OK for bytes).
I.e. in this mode, C stack will never be used to call a Python function,
but if there's no free heap for a call, it will be reported as
RuntimeError (as expected), not MemoryError.
Given that there's already support for "fixed table" maps, which are
essentially ordered maps, the implementation of OrderedDict just extends
"fixed table" maps by adding an "is ordered" flag and add/remove
operations, and reuses 95% of objdict code, just making methods tolerant
to both dict and OrderedDict.
Some things are missing so far, like CPython-compatible repr and comparison.
OrderedDict is Disabled by default; enabled on unix and stmhal ports.
These allow to fine-tune the compiler to select whether it optimises
tuple assignments of the form a, b = c, d and a, b, c = d, e, f.
Sensible defaults are provided.
This is rarely used feature which takes enough code to implement, so is
controlled by MICROPY_PY_ARRAY_SLICE_ASSIGN config setting, default off.
But otherwise it may be useful, as allows to update arbitrary-sized data
buffers in-place.
Slice is yet to implement, and actually, slice assignment implemented in
such a way that RHS of assignment should be array of the exact same item
typecode as LHS. CPython has it more relaxed, where RHS can be any sequence
of compatible types (e.g. it's possible to assign list of int's to a
bytearray slice).
Overall, when all "slice write" features are implemented, it may cost ~1KB
of code.
The implementation of these functions is very large (order 4k) and they
are rarely used, so we don't enable them by default.
They are however enabled in stmhal and unix, since we have the room.
pyexec_friendly_repl_process_char() and friends, useful for ports which
integrate into existing cooperative multitasking system.
Unlike readline() refactor before, this was implemented in less formal,
trial&error process, minor functionality regressions are still known
(like soft&hard reset support). So, original loop-based pyexec_friendly_repl()
is left intact, specific implementation selectable by config setting.
Native code has GC-heap pointers in it so it must be scanned. But on
unix port memory for native functions is mmap'd, and so it must have
explicit code to scan it for root pointers.
This new config option sets how many fixed-number-of-bytes to use to
store the length of each qstr. Previously this was hard coded to 2,
but, as per issue #1056, this is considered overkill since no-one
needs identifiers longer than 255 bytes.
With this patch the number of bytes for the length is configurable, and
defaults to 1 byte. The configuration option filters through to the
makeqstrdata.py script.
Code size savings going from 2 to 1 byte:
- unix x64 down by 592 bytes
- stmhal down by 1148 bytes
- bare-arm down by 284 bytes
Also has RAM savings, and will be slightly more efficient in execution.
Compiler optimises lookup of module.CONST when enabled (an existing
feature). Disabled by default; enabled for unix, windows, stmhal.
Costs about 100 bytes ROM on stmhal.
This allows to enable mem-info functions in micropython module, even if
MICROPY_MEM_STATS is not enabled. In this case, you get mem_info and
qstr_info but not mem_{total,current,peak}.
This is a simple optimisation inspired by JITing technology: we cache in
the bytecode (using 1 byte) the offset of the last successful lookup in
a map. This allows us next time round to check in that location in the
hash table (mp_map_t) for the desired entry, and if it's there use that
entry straight away. Otherwise fallback to a normal map lookup.
Works for LOAD_NAME, LOAD_GLOBAL, LOAD_ATTR and STORE_ATTR opcodes.
On a few tests it gives >90% cache hit and greatly improves speed of
code.
Disabled by default. Enabled for unix and stmhal ports.
This patch consolidates all global variables in py/ core into one place,
in a global structure. Root pointers are all located together to make
GC tracing easier and more efficient.