mp_lexer_t type is exposed, mp_token_t type is removed, and simple lexer
functions (like checking current token kind) are now inlined.
This saves 784 bytes ROM on 32-bit unix, 348 bytes on stmhal, and 460
bytes on bare-arm. It also saves a tiny bit of RAM since mp_lexer_t
is a bit smaller. Also will run a bit more efficiently.
Behaviour of array initialisation is subtly different for bytes,
bytearray and array.array when argument has buffer protocol. This patch
gets us CPython conformant (except we allow initialisation of
array.array by buffer with length not a multiple of typecode).
By using the buffer protocol for these array operations, we now allow
addition of memoryview objects, and objects with "incompatible"
typecodes (in this case it just adds bytes naively). This is an
extension to CPython which seems sensible. It also reduces the code
size.
Before, __repl_print__() used libc printf(), while print() used uPy streams
and own printf() implementation. This led to subtle, but confusing
differences in output when just doing "foo" vs "print(foo)" on interactive
prompt.
Currently compilation sporadically fails, because the automatic
dependency gets created *during* the compilation of objects.
OBJ is a auperset of PY_O and the dependencies apply to all objects.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Going from MICROPY_ERROR_REPORTING_NORMAL to
MICROPY_ERROR_REPORTING_TERSE now saves 2020 bytes ROM for ARM Thumb2,
and 2200 bytes ROM for 32-bit x86.
This is about a 2.5% code size reduction for bare-arm.
When compiler optimization has been turned on, gcc knows that this code
block is not going to be executed. But with -O0 it complains about
path_items being used uninitialized.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
This turns failing assertions to type exceptions for things like
b"123".find(...). We still don't support operations like this on bytes
objects (unlike CPython), but at least it no longer crashes.
Eg b"123" + bytearray(2) now works. This patch actually decreases code
size while adding functionality: 32-bit unix down by 128 bytes, stmhal
down by 84 bytes.
Uninitialised struct members get a default value of 0/false, so this is
not strictly needed. But it actually decreases code size because when
all members are initialised the compiler doesn't need to insert a call
to memset to clear everything. In other words, setting 1 extra member
to 0 uses less code than calling memset.
ROM savings in bytes: 32-bit unix: 100; bare-arm: 44; stmhal: 52.
gc.enable/disable are now the same as CPython: they just control whether
automatic garbage collection is enabled or not. If disabled, you can
still allocate heap memory, and initiate a manual collection.
msvc does not treat 1L a 64bit integer hence all occurences of shifting it left or right
result in undefined behaviour since the maximum allowed shift count for 32bit ints is 31.
Forcing the correct type explicitely, stored in MPZ_LONG_1, solves this.
It should be fair to say that almost in all cases where some API call
expects string, it should be also possible to pass byte string. For example,
it should be open/delete/rename file with name as bytestring. Note that
similar change was done quite a long ago to mp_obj_str_get_data().
Support for packages as argument not implemented, but otherwise error and
exit handling should be correct. This for example will allow to do:
pip-micropython install micropython-test.pystone
micropython -m test.pystone
This allows to implement KeyboardInterrupt on unix, and a much safer
ctrl-C in stmhal port. First ctrl-C is a soft one, with hope that VM
will notice it; second ctrl-C is a hard one that kills anything (for
both unix and stmhal).
One needs to check for a pending exception in the VM only for jump
opcodes. Others can't produce an infinite loop (infinite recursion is
caught by stack check).
There is a lot potential in compress bytecodes and make more use of the
coding space. This patch introduces "multi" bytecodes which have their
argument included in the bytecode (by addition).
UNARY_OP and BINARY_OP now no longer take a 1 byte argument for the
opcode. Rather, the opcode is included in the first byte itself.
LOAD_FAST_[0,1,2] and STORE_FAST_[0,1,2] are removed in favour of their
multi versions, which can take an argument between 0 and 15 inclusive.
The majority of LOAD_FAST/STORE_FAST codes fit in this range and so this
saves a byte for each of these.
LOAD_CONST_SMALL_INT_MULTI is used to load small ints between -16 and 47
inclusive. Such ints are quite common and now only need 1 byte to
store, and now have much faster decoding.
In all this patch saves about 2% RAM for typically bytecode (1.8% on
64-bit test, 2.5% on pyboard test). It also reduces the binary size
(because bytecodes are simplified) and doesn't harm performance.
This saves a lot of RAM for 2 reasons:
1. For functions that don't have default values, var args or var kw
args (which is a large number of functions in the general case), the
mp_obj_fun_bc_t type now fits in 1 GC block (previously needed 2 because
of the extra pointer to point to the arg_names array). So this saves 16
bytes per function (32 bytes on 64-bit machines).
2. Combining separate memory regions generally saves RAM because the
unused bytes at the end of the GC block are saved for 1 of the blocks
(since that block doesn't exist on its own anymore). So generally this
saves 8 bytes per function.
Tested by importing lots of modules:
- 64-bit Linux gave about an 8% RAM saving for 86k of used RAM.
- pyboard gave about a 6% RAM saving for 31k of used RAM.