By using the buffer protocol for these array operations, we now allow
addition of memoryview objects, and objects with "incompatible"
typecodes (in this case it just adds bytes naively). This is an
extension to CPython which seems sensible. It also reduces the code
size.
Before, __repl_print__() used libc printf(), while print() used uPy streams
and own printf() implementation. This led to subtle, but confusing
differences in output when just doing "foo" vs "print(foo)" on interactive
prompt.
Currently compilation sporadically fails, because the automatic
dependency gets created *during* the compilation of objects.
OBJ is a auperset of PY_O and the dependencies apply to all objects.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Going from MICROPY_ERROR_REPORTING_NORMAL to
MICROPY_ERROR_REPORTING_TERSE now saves 2020 bytes ROM for ARM Thumb2,
and 2200 bytes ROM for 32-bit x86.
This is about a 2.5% code size reduction for bare-arm.
When compiler optimization has been turned on, gcc knows that this code
block is not going to be executed. But with -O0 it complains about
path_items being used uninitialized.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
This turns failing assertions to type exceptions for things like
b"123".find(...). We still don't support operations like this on bytes
objects (unlike CPython), but at least it no longer crashes.
Eg b"123" + bytearray(2) now works. This patch actually decreases code
size while adding functionality: 32-bit unix down by 128 bytes, stmhal
down by 84 bytes.
Uninitialised struct members get a default value of 0/false, so this is
not strictly needed. But it actually decreases code size because when
all members are initialised the compiler doesn't need to insert a call
to memset to clear everything. In other words, setting 1 extra member
to 0 uses less code than calling memset.
ROM savings in bytes: 32-bit unix: 100; bare-arm: 44; stmhal: 52.
gc.enable/disable are now the same as CPython: they just control whether
automatic garbage collection is enabled or not. If disabled, you can
still allocate heap memory, and initiate a manual collection.
msvc does not treat 1L a 64bit integer hence all occurences of shifting it left or right
result in undefined behaviour since the maximum allowed shift count for 32bit ints is 31.
Forcing the correct type explicitely, stored in MPZ_LONG_1, solves this.
It should be fair to say that almost in all cases where some API call
expects string, it should be also possible to pass byte string. For example,
it should be open/delete/rename file with name as bytestring. Note that
similar change was done quite a long ago to mp_obj_str_get_data().
Support for packages as argument not implemented, but otherwise error and
exit handling should be correct. This for example will allow to do:
pip-micropython install micropython-test.pystone
micropython -m test.pystone
This allows to implement KeyboardInterrupt on unix, and a much safer
ctrl-C in stmhal port. First ctrl-C is a soft one, with hope that VM
will notice it; second ctrl-C is a hard one that kills anything (for
both unix and stmhal).
One needs to check for a pending exception in the VM only for jump
opcodes. Others can't produce an infinite loop (infinite recursion is
caught by stack check).
There is a lot potential in compress bytecodes and make more use of the
coding space. This patch introduces "multi" bytecodes which have their
argument included in the bytecode (by addition).
UNARY_OP and BINARY_OP now no longer take a 1 byte argument for the
opcode. Rather, the opcode is included in the first byte itself.
LOAD_FAST_[0,1,2] and STORE_FAST_[0,1,2] are removed in favour of their
multi versions, which can take an argument between 0 and 15 inclusive.
The majority of LOAD_FAST/STORE_FAST codes fit in this range and so this
saves a byte for each of these.
LOAD_CONST_SMALL_INT_MULTI is used to load small ints between -16 and 47
inclusive. Such ints are quite common and now only need 1 byte to
store, and now have much faster decoding.
In all this patch saves about 2% RAM for typically bytecode (1.8% on
64-bit test, 2.5% on pyboard test). It also reduces the binary size
(because bytecodes are simplified) and doesn't harm performance.
This saves a lot of RAM for 2 reasons:
1. For functions that don't have default values, var args or var kw
args (which is a large number of functions in the general case), the
mp_obj_fun_bc_t type now fits in 1 GC block (previously needed 2 because
of the extra pointer to point to the arg_names array). So this saves 16
bytes per function (32 bytes on 64-bit machines).
2. Combining separate memory regions generally saves RAM because the
unused bytes at the end of the GC block are saved for 1 of the blocks
(since that block doesn't exist on its own anymore). So generally this
saves 8 bytes per function.
Tested by importing lots of modules:
- 64-bit Linux gave about an 8% RAM saving for 86k of used RAM.
- pyboard gave about a 6% RAM saving for 31k of used RAM.
This makes open() and _io.FileIO() more CPython compliant.
The mode kwarg is fully iplemented.
The encoding kwarg is allowed but not implemented; mainly to allow
the tests to specify encoding for CPython, see #874
Also, usocket.readinto(). Known issue is that .readinto() should be available
only for binary files, but micropython uses single method table for both
binary and text files.
Just like they handled in other read*(). Note that behavior of readline()
in case there's no data when it's called is underspecified in Python lib
spec, implemented to behave as read() - return None.
With this patch a port can enable module weak link support and provide
a dict of qstr->module mapping. This mapping is looked up only if an
import fails to find the requested module in the filesystem.
This allows to have the builtin module named, eg, usocket, and provide
a weak link of "socket" to the same module, but this weak link can be
overridden if a file by the name "socket.py" is found in the import
path.
This has benefits all round: code factoring for parse/compile/execute,
proper context save/restore for exec, allow to sepcify globals/locals
for eval, and reduced ROM usage by >100 bytes on stmhal and unix.
Also, the call to mp_parse_compile_execute is tail call optimised for
the import code, so it doesn't increase stack memory usage.
In CPython IOError (and EnvironmentError) is deprecated and aliased to
OSError. All modules that used to raise IOError now raise OSError (or a
derived exception).
In Micro Python we never used IOError (except 1 place, incorrectly) and
so don't need to keep it.
See http://legacy.python.org/dev/peps/pep-3151/ for background.
Viper can now do the following:
def store(p:ptr8, c:int):
p[0] = c
This does a store of c to the memory pointed to by p using a machine
instructions inline in the code.
It seems most sensible to use size_t for measuring "number of bytes" in
malloc and vstr functions (since that's what size_t is for). We don't
use mp_uint_t because malloc and vstr are not Micro Python specific.
mp_parse_node_free now frees the memory associated with non-interned
strings. And the parser calls mp_parse_node_free when discarding a
non-used node (such as a doc string).
Also, the compiler now frees the parse tree explicitly just before it
exits (as opposed to relying on the caller to do this).
Addresses issue #708 as best we can.
Stack is full descending and must be 8-byte aligned. It must start off
pointing to just above the last byte of RAM.
Previously, stack started pointed to last byte of RAM (eg 0x2001ffff)
and so was not 8-byte aligned. This caused a bug in combination with
alloca.
This patch also updates some debug printing code.
Addresses issue #872 (among many other undiscovered issues).
Heap RAM was being allocated to print dicts and do some other types of
iterating. Now these iterations use 1 word of state on the stack.
Deleting elements from a dict was not allowing the value to be reclaimed
by the GC. This is now fixed.
sys.exit always raises SystemExit so doesn't need a special
implementation for each port. If C exit() is really needed, use the
standard os._exit function.
Also initialise mp_sys_path and mp_sys_argv in teensy port.
Eventually, viper wants to be able to use raw pointers to strings and
arrays for efficient access. But for now, let's just load strings as a
Python object so they can be used as normal. This will anyway be
compatible with eventual intended viper behaviour.
Addresses issue #857.
Type representing signed size doesn't have to be int, so use special value
which defaults to SSIZE_MAX, but as it's not defined by C standard (but rather
by POSIX), allow ports to set it.
Previously, mpz was restricted to using at most 15 bits in each digit,
where a digit was a uint16_t.
With this patch, mpz can use all 16 bits in the uint16_t (improvement
to mpn_div was required). This gives small inprovements in speed and
RAM usage. It also yields savings in ROM code size because all of the
digit masking operations become no-ops.
Also, mpz can now use a uint32_t as the digit type, and hence use 32
bits per digit. This will give decent improvements in mpz speed on
64-bit machines.
Test for big integer division added.
Code-info size, block name, source name, n_state and n_exc_stack now use
variable length encoded uints. This saves 7-9 bytes per bytecode
function for most functions.