This new compile-time option allows to make the bytecode compiler
configurable at runtime by setting the fields in the mp_dynamic_compiler
structure. By using this feature, the compiler can generate bytecode
that targets any MicroPython runtime/VM, regardless of the host and
target compile-time settings.
Options so far that fall under this dynamic setting are:
- maximum number of bits that a small int can hold;
- whether caching of lookups is used in the bytecode;
- whether to use unicode strings or not (lexer behaviour differs, and
therefore generated string constants differ).
These can be used to insert arbitrary checks, polling, etc into the VM.
They are left general because the VM is a highly tuned loop and it should
be up to a given port how that port wants to modify the VM internals.
One common use would be to insert a polling check, but only done after
a certain number of opcodes were executed, so as not to slow down the VM
too much. For example:
#define MICROPY_VM_HOOK_COUNT (30)
#define MICROPY_VM_HOOK_INIT static uint vm_hook_divisor = MICROPY_VM_HOOK_COUNT
#define MICROPY_VM_HOOK_POLL if (--vm_hook_divisor == 0) { \
vm_hook_divisor = MICROPY_VM_HOOK_COUNT;
extern void vm_hook_function(void);
vm_hook_function();
}
#define MICROPY_VM_HOOK_LOOP MICROPY_VM_HOOK_POLL
#define MICROPY_VM_HOOK_RETURN MICROPY_VM_HOOK_POLL
For these 3 bitwise operations there are now fast functions for
positive-only arguments, and general functions for arbitrary sign
arguments (the fast functions are the existing implementation).
By default the fast functions are not used (to save space) and instead
the general functions are used for all operations.
Enable MICROPY_OPT_MPZ_BITWISE to use the fast functions for positive
arguments.
Functions added are:
- randint
- randrange
- choice
- random
- uniform
They are enabled with configuration variable
MICROPY_PY_URANDOM_EXTRA_FUNCS, which is disabled by default. It is
enabled for unix coverage build and stmhal.
Seedable and reproducible pseudo-random number generator. Implemented
functions are getrandbits(n) (n <= 32) and seed().
The algorithm used is Yasmarang by Ilya Levin:
http://www.literatecode.com/yasmarang
POSIX doesn't guarantee something like that to work, but it works on any
system with careful signal implementation. Roughly, the requirement is
that signal handler is executed in the context of the process, its main
thread, etc. This is true for Linux. Also tested to work without issues
on MacOSX.
This makes all tests pass again for 64bit windows builds which would
previously fail for anything printing ranges (builtin_range/unpack1)
because they were printed as range( ld, ld ).
This is done by reusing the mp_vprintf implementation for MICROPY_OBJ_REPR_D
for 64bit windows builds (both msvc and mingw-w64) since the format specifier
used for 64bit integers is also %lld, or %llu for the unsigned version.
Note these specifiers used to be fetched from inttypes.h, which is the
C99 way of working with printf/scanf in a portable way, but mingw-w64
wants to be backwards compatible with older MS C runtimes and uses
the non-portable %I64i instead of %lld in inttypes.h, so remove the use
of said header again in mpconfig.h and define the specifiers manually.
MICROPY_ENABLE_COMPILER can be used to enable/disable the entire compiler,
which is useful when only loading of pre-compiled bytecode is supported.
It is enabled by default.
MICROPY_PY_BUILTINS_EVAL_EXEC controls support of eval and exec builtin
functions. By default they are only included if MICROPY_ENABLE_COMPILER
is enabled.
Disabling both options saves about 40k of code size on 32-bit x86.
To use, put the following in mpconfigport.h:
#define MICROPY_OBJ_REPR (MICROPY_OBJ_REPR_D)
#define MICROPY_FLOAT_IMPL (MICROPY_FLOAT_IMPL_DOUBLE)
typedef int64_t mp_int_t;
typedef uint64_t mp_uint_t;
#define UINT_FMT "%llu"
#define INT_FMT "%lld"
Currently does not work with native emitter enabled.
- add mp_int_t/mp_uint_t typedefs in mpconfigport.h
- fix integer suffixes/formatting in mpconfig.h and mpz.h
- use MICROPY_NLR_SETJMP=1 in Makefile since the current nlrx64.S
implementation causes segfaults in gc_free()
- update README
MICROPY_PERSISTENT_CODE must be enabled, and then enabling
MICROPY_PERSISTENT_CODE_LOAD/SAVE (either or both) will allow loading
and/or saving of code (at the moment just bytecode) from/to a .mpy file.
Main changes when MICROPY_PERSISTENT_CODE is enabled are:
- qstrs are encoded as 2-byte fixed width in the bytecode
- all pointers are removed from bytecode and put in const_table (this
includes const objects and raw code pointers)
Ultimately this option will enable persistence for not just bytecode but
also native code.
This patch adds/subtracts a constant from the 30-bit float representation
so that str/qstr representations are favoured: they now have all the high
bits set to zero. This makes encoding/decoding qstr strings more
efficient (and they are used more often than floats, which are now
slightly less efficient to encode/decode).
Saves about 300 bytes of code space on Thumb 2 arch.
This new object representation puts floats into the object word instead
of on the heap, at the expense of reducing their precision to 30 bits.
It only makes sense when the word size is 32-bits.
Cortex-M0, M0+ and M1 only have ARMv6-M Thumb/Thumb2 instructions. M3,
M4 and M7 have a superset of these, named ARMv7-M. This patch adds a
config option to enable support of the superset of instructions.
It makes much more sense to do constant folding in the parser while the
parse tree is being built. This eliminates the need to create parse
nodes that will just be folded away. The code is slightly simpler and a
bit smaller as well.
Constant folding now has a configuration option,
MICROPY_COMP_CONST_FOLDING, which is enabled by default.
With this patch parse nodes are allocated sequentially in chunks. This
reduces fragmentation of the heap and prevents waste at the end of
individually allocated parse nodes.
Saves roughly 20% of RAM during parse stage.
4 spaces are added at start of line to match previous indent, and if
previous line ended in colon.
Backspace deletes 4 space if only spaces begin a line.
Configurable via MICROPY_REPL_AUTO_INDENT. Disabled by default.
unix-cpy was originally written to get semantic equivalent with CPython
without writing functional tests. When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct. The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly. And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.
But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:
1. It outputs CPython 3.3 compatible bytecode. CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5. There's no point in changing the bytecode output to match
CPython anymore.
2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.
3. The bytecode tests are not run. They were never part of Travis and
are not run locally anymore.
4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.
5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode. Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
This patch makes configurable, via MICROPY_QSTR_BYTES_IN_HASH, the
number of bytes used for a qstr hash. It was originally fixed at 2
bytes, and now defaults to 2 bytes. Setting it to 1 byte will save
ROM and RAM at a small expense of hash collisions.
Previous to this patch all interned strings lived in their own malloc'd
chunk. On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quanta of the memory allocator (16 bytes on 32
bit archs).
With this patch interned strings are concatenated into the same malloc'd
chunk when possible. Such chunks are enlarged inplace when possible,
and shrunk to fit when a new chunk is needed.
RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned). New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.
Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
The TimeoutError is useful for some modules, specially the the
socket module. TimeoutError can then be alised to socket.timeout
and then Python code can differentiate between socket.error and
socket.timeout.
Previous to this patch a call such as list.append(1, 2) would lead to a
seg fault. This is because list.append is a builtin method and the first
argument to such methods is always assumed to have the correct type.
Now, when a builtin method is extracted like this it is wrapped in a
checker object which checks the the type of the first argument before
calling the builtin function.
This feature is contrelled by MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG and
is enabled by default.
See issue #1216.
From https://docs.python.org/3/library/constants.html#NotImplemented :
"Special value which should be returned by the binary special methods
(e.g. __eq__(), __lt__(), __add__(), __rsub__(), etc.) to indicate
that the operation is not implemented with respect to the other type;
may be returned by the in-place binary special methods (e.g. __imul__(),
__iand__(), etc.) for the same purpose. Its truth value is true."
Some people however appear to abuse it to mean "no value" when None is
a legitimate value (don't do that).
The implementation is very basic and non-compliant and provided solely for
CPython compatibility. The function itself is bad Python2 heritage, its
usage is discouraged.
Adds support for the following Thumb2 VFP instructions, via the option
MICROPY_EMIT_INLINE_THUMB_FLOAT:
vcmp
vsqrt
vneg
vcvt_f32_to_s32
vcvt_s32_to_f32
vmrs
vmov
vldr
vstr
vadd
vsub
vmul
vdiv
Previous to this patch the printing mechanism was a bit of a tangled
mess. This patch attempts to consolidate printing into one interface.
All (non-debug) printing now uses the mp_print* family of functions,
mainly mp_printf. All these functions take an mp_print_t structure as
their first argument, and this structure defines the printing backend
through the "print_strn" function of said structure.
Printing from the uPy core can reach the platform-defined print code via
two paths: either through mp_sys_stdout_obj (defined pert port) in
conjunction with mp_stream_write; or through the mp_plat_print structure
which uses the MP_PLAT_PRINT_STRN macro to define how string are printed
on the platform. The former is only used when MICROPY_PY_IO is defined.
With this new scheme printing is generally more efficient (less layers
to go through, less arguments to pass), and, given an mp_print_t*
structure, one can call mp_print_str for efficiency instead of
mp_printf("%s", ...). Code size is also reduced by around 200 bytes on
Thumb2 archs.
splitlines() occurs ~179 times in CPython3 standard library, so was
deemed worthy to implement. The method has subtle semantic differences
from just .split("\n"). It is also defined as working for any end-of-line
combination, but this is currently not implemented - it works only with
LF line-endings (which should be OK for text strings on any platforms,
but not OK for bytes).
I.e. in this mode, C stack will never be used to call a Python function,
but if there's no free heap for a call, it will be reported as
RuntimeError (as expected), not MemoryError.
Given that there's already support for "fixed table" maps, which are
essentially ordered maps, the implementation of OrderedDict just extends
"fixed table" maps by adding an "is ordered" flag and add/remove
operations, and reuses 95% of objdict code, just making methods tolerant
to both dict and OrderedDict.
Some things are missing so far, like CPython-compatible repr and comparison.
OrderedDict is Disabled by default; enabled on unix and stmhal ports.
These allow to fine-tune the compiler to select whether it optimises
tuple assignments of the form a, b = c, d and a, b, c = d, e, f.
Sensible defaults are provided.
This is rarely used feature which takes enough code to implement, so is
controlled by MICROPY_PY_ARRAY_SLICE_ASSIGN config setting, default off.
But otherwise it may be useful, as allows to update arbitrary-sized data
buffers in-place.
Slice is yet to implement, and actually, slice assignment implemented in
such a way that RHS of assignment should be array of the exact same item
typecode as LHS. CPython has it more relaxed, where RHS can be any sequence
of compatible types (e.g. it's possible to assign list of int's to a
bytearray slice).
Overall, when all "slice write" features are implemented, it may cost ~1KB
of code.
The implementation of these functions is very large (order 4k) and they
are rarely used, so we don't enable them by default.
They are however enabled in stmhal and unix, since we have the room.
pyexec_friendly_repl_process_char() and friends, useful for ports which
integrate into existing cooperative multitasking system.
Unlike readline() refactor before, this was implemented in less formal,
trial&error process, minor functionality regressions are still known
(like soft&hard reset support). So, original loop-based pyexec_friendly_repl()
is left intact, specific implementation selectable by config setting.
Native code has GC-heap pointers in it so it must be scanned. But on
unix port memory for native functions is mmap'd, and so it must have
explicit code to scan it for root pointers.
This new config option sets how many fixed-number-of-bytes to use to
store the length of each qstr. Previously this was hard coded to 2,
but, as per issue #1056, this is considered overkill since no-one
needs identifiers longer than 255 bytes.
With this patch the number of bytes for the length is configurable, and
defaults to 1 byte. The configuration option filters through to the
makeqstrdata.py script.
Code size savings going from 2 to 1 byte:
- unix x64 down by 592 bytes
- stmhal down by 1148 bytes
- bare-arm down by 284 bytes
Also has RAM savings, and will be slightly more efficient in execution.
Compiler optimises lookup of module.CONST when enabled (an existing
feature). Disabled by default; enabled for unix, windows, stmhal.
Costs about 100 bytes ROM on stmhal.
This allows to enable mem-info functions in micropython module, even if
MICROPY_MEM_STATS is not enabled. In this case, you get mem_info and
qstr_info but not mem_{total,current,peak}.
This is a simple optimisation inspired by JITing technology: we cache in
the bytecode (using 1 byte) the offset of the last successful lookup in
a map. This allows us next time round to check in that location in the
hash table (mp_map_t) for the desired entry, and if it's there use that
entry straight away. Otherwise fallback to a normal map lookup.
Works for LOAD_NAME, LOAD_GLOBAL, LOAD_ATTR and STORE_ATTR opcodes.
On a few tests it gives >90% cache hit and greatly improves speed of
code.
Disabled by default. Enabled for unix and stmhal ports.
This patch consolidates all global variables in py/ core into one place,
in a global structure. Root pointers are all located together to make
GC tracing easier and more efficient.
This patch adds a configuration option (MICROPY_CAN_OVERRIDE_BUILTINS)
which, when enabled, allows to override all names within the builtins
module. A builtins override dict is created the first time the user
assigns to a name in the builtins model, and then that dict is searched
first on subsequent lookups. Note that this implementation doesn't
allow deleting of names.
This patch also does some refactoring of builtins code, creating the
modbuiltins.c file.
Addresses issue #959.
Going from MICROPY_ERROR_REPORTING_NORMAL to
MICROPY_ERROR_REPORTING_TERSE now saves 2020 bytes ROM for ARM Thumb2,
and 2200 bytes ROM for 32-bit x86.
This is about a 2.5% code size reduction for bare-arm.
With this patch a port can enable module weak link support and provide
a dict of qstr->module mapping. This mapping is looked up only if an
import fails to find the requested module in the filesystem.
This allows to have the builtin module named, eg, usocket, and provide
a weak link of "socket" to the same module, but this weak link can be
overridden if a file by the name "socket.py" is found in the import
path.
Type representing signed size doesn't have to be int, so use special value
which defaults to SSIZE_MAX, but as it's not defined by C standard (but rather
by POSIX), allow ports to set it.
Because (for Thumb) a function pointer has the LSB set, pointers to
dynamic functions in RAM (eg native, viper or asm functions) were not
being traced by the GC. This patch is a comprehensive fix for this.
Addresses issue #820.
The user code should call micropython.alloc_emergency_exception_buf(size)
where size is the size of the buffer used to print the argument
passed to the exception.
With the test code from #732, and a call to
micropython.alloc_emergenncy_exception_buf(100) the following error is
now printed:
```python
>>> import heartbeat_irq
Uncaught exception in Timer(4) interrupt handler
Traceback (most recent call last):
File "0://heartbeat_irq.py", line 14, in heartbeat_cb
NameError: name 'led' is not defined
```
Such mechanism is important to get stable Python functioning, because Python
function calling is handled with C stack. The idea is to sprinkle
STACK_CHECK() calls in places where there can be C recursion.
TODO: Add more STACK_CHECK()'s.
This completes non-automatic interning of strings in the parser, so that
doc strings don't take up RAM. It complicates the parser and compiler,
and bloats stmhal by about 300 bytes. It's complicated because now
there are 2 kinds of parse-nodes that can be strings: interned leaves
and non-interned structs.
io.FileIO is binary I/O, ans actually optional. Default file type is
io.TextIOWrapper, which provides str results. CPython3 explicitly describes
io.TextIOWrapper as buffered I/O, but we don't have buffering support yet
anyway.
You can now do:
X = const(123)
Y = const(456 + X)
and the compiler will replace X and Y with their values.
See discussion in issue #266 and issue #573.
There are 2 locations in parser, and 1 in compiler, where memory
allocation is not precise. In the parser it's the rule stack and result
stack, in the compiler it's the array for the identifiers in the current
scope. All other mallocs are exact (ie they don't allocate more than is
needed).
This patch adds tuning options (MP_ALLOC_*) to mpconfig.h for these 3
inexact allocations.
The inexact allocations in the parser should actually be close to
logarithmic: you need an exponentially larger script (absent pathological
cases) to use up more room on the rule and result stacks. As such, the
default allocation policy for these is now to start with a modest sized
stack, but grow only in small increments.
For the identifier arrays in the compiler, these now start out quite
small (4 entries, since most functions don't have that many ids), and
grow incrementally by 6 (since if you have more ids than 4, you probably
have quite a few more, but it wouldn't be exponentially more).
Partially addresses issue #560.
Blanket wide to all .c and .h files. Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
On stmhal, computed gotos make the binary about 1k bigger, but makes it
run faster, and we have the room, so why not. All tests pass on
pyboard using computed gotos.