With this patch parse nodes are allocated sequentially in chunks. This
reduces fragmentation of the heap and prevents waste at the end of
individually allocated parse nodes.
Saves roughly 20% of RAM during parse stage.
This patch adds more fine grained error message control for errors when
parsing integers (now has terse, normal and detailed). When detailed is
enabled, the error now escapes bytes when printing them so they can be
more easily seen.
When creating constant mpz's, the length of the mpz must be exactly how
many digits are used (not allocated) otherwise these numbers are not
compatible with dynamically allocated numbers.
Addresses issue #1448.
The start of a new line is indented to match the previous line's indent,
with 4 more spaces added if the previous line ended in a colon.
Backspace deletes 4 spaces if only spaces begin a line.
Configurable via MICROPY_REPL_AUTO_INDENT. Disabled by default.
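The indent rule above can be sketched in Python roughly as follows. This
is only an illustration of the behaviour, not the REPL's C code. The
helper names next_indent and backspace_width are hypothetical, and the
sketch assumes a trailing colon adds one 4-space level on top of the
matched indent.

```python
INDENT = 4  # width of one indent level, as described above


def next_indent(prev_line):
    """Return how many spaces to insert at the start of the next line."""
    # Count the leading spaces of the previous line.
    indent = len(prev_line) - len(prev_line.lstrip(" "))
    # Assumed rule: a trailing colon opens a new block, so indent further.
    if prev_line.rstrip().endswith(":"):
        indent += INDENT
    return indent


def backspace_width(line):
    """Return how many characters one backspace should delete."""
    if not line:
        return 0
    # Only spaces so far: remove a whole indent level (or what is left).
    if line.strip(" ") == "":
        return min(INDENT, len(line))
    return 1
```

For example, next_indent("    if x:") gives 8 and backspace_width("    x")
gives 1.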
This optimises (in speed and code size) for the common case where the
binary op for the bool object is supported. Unsupported binary ops
still behave the same.
Function annotations are only needed when the native emitter is enabled
and when the current scope is emitted in viper mode. All other times
the annotations can be skipped completely.
Fetch the current usb mode and return a string representation when
pyb.usb_mode() is called with no args. The possible string values are interned
as qstr's. None will be returned if an incorrect mode is set.
Indeed, this flag effectively selects the architecture target, and must
consistently apply to all compiles and links, including 3rd-party
libraries, unlike CFLAGS, which have MicroPython-specific settings.
unix-cpy was originally written to get semantic equivalence with CPython
without writing functional tests. When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct. The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly. And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.
But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:
1. It outputs CPython 3.3 compatible bytecode. CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5. There's no point in changing the bytecode output to match
CPython anymore.
2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.
3. The bytecode tests are not run. They were never part of Travis and
are not run locally anymore.
4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.
5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode. Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
Previous to this patch there were some cases where line numbers for
errors were 0 (unknown). Now the compiler attempts to give a better
line number where possible, in some cases giving the line number of the
closest statement, and other cases the line number of the inner-most
scope of the error (eg the line number of the start of the function).
This helps to give good (and sometimes exact) line numbers for
ViperTypeError exceptions.
This patch also makes sure that the first compile error (eg SyntaxError)
that is encountered is reported (previously it was the last one that was
reported).
When looking to see if the REPL input needs to be continued on the next
line, don't look inside strings for unmatched ()[]{} ''' or """.
Addresses issue #1387.
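A simplified Python sketch of such a check is given below. It is not
the REPL's actual code: the function name needs_continuation is
hypothetical, comments are not handled, and an unterminated
single-quoted string is assumed to be left for the compiler to report.

```python
def needs_continuation(src):
    """Return True if src has unclosed brackets or an open triple-quoted string."""
    depth = 0
    i = 0
    n = len(src)
    while i < n:
        ch = src[i]
        if ch in "'\"":
            # Work out whether this opens a triple-quoted or a normal string.
            triple = src[i:i + 3] == ch * 3
            quote = ch * 3 if triple else ch
            i += len(quote)
            end = -1
            # Scan for the closing quote, skipping backslash escapes.
            # Brackets seen here are inside the string and are ignored.
            while i < n:
                if src[i] == "\\":
                    i += 2
                    continue
                if src.startswith(quote, i):
                    end = i
                    break
                i += 1
            if end < 0:
                # Only an open triple-quoted string asks for more input.
                return triple
            i = end + len(quote)
            continue
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        i += 1
    return depth > 0
```

With this sketch, "print('(')" does not request another line, while
"f(" and "x = '''abc" do.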
ViperTypeError now includes filename and function name where the error
occurred. The line number is the line number of the start of the
function definition, which is the best that can be done without a lot
more work.
Partially addresses issue #1381.
This patch makes configurable, via MICROPY_QSTR_BYTES_IN_HASH, the
number of bytes used for a qstr hash. It was originally fixed at 2
bytes, and now defaults to 2 bytes. Setting it to 1 byte will save
ROM and RAM at a small expense of hash collisions.
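The trade-off can be illustrated with a short Python sketch. The hash
function here is a djb2-style hash picked for illustration, which is an
assumption since the text above does not name the function, and the
hash_bytes parameter stands in for MICROPY_QSTR_BYTES_IN_HASH.

```python
def qstr_hash(data, hash_bytes=2):
    """Hash a bytes object and keep only the low hash_bytes bytes."""
    h = 5381
    for b in data:
        h = ((h * 33) ^ b) & 0xFFFFFFFF
    return h & ((1 << (8 * hash_bytes)) - 1)


def count_collisions(names, hash_bytes):
    """Count names whose truncated hash was already produced by an earlier name."""
    seen = set()
    collisions = 0
    for name in names:
        h = qstr_hash(name.encode(), hash_bytes)
        if h in seen:
            collisions += 1
        seen.add(h)
    return collisions


# 300 distinct names cannot fit into 256 one-byte hash values without
# collisions, so the 1-byte setting must collide at least 44 times here.
names = ["name%d" % i for i in range(300)]
one_byte = count_collisions(names, 1)
two_bytes = count_collisions(names, 2)
```

A collision only costs an extra string comparison during lookup, which
is why the smaller hash can be an acceptable trade for ROM and RAM.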
Previous to this patch all interned strings lived in their own malloc'd
chunk. On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quantum of the memory allocator (16 bytes on 32
bit archs).
With this patch interned strings are concatenated into the same malloc'd
chunk when possible. Such chunks are enlarged in place when possible,
and shrunk to fit when a new chunk is needed.
RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned). New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.
Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
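The rounding waste can be estimated with a little arithmetic, sketched
here in Python. The model is deliberately simplified and is an
assumption: it ignores per-string headers and any limit on chunk size,
and only rounds allocations up to the 16-byte quantum mentioned above.

```python
QUANTUM = 16  # allocator block size in bytes on a 32-bit arch


def rounded(nbytes):
    """Round an allocation size up to a whole number of allocator blocks."""
    return -(-nbytes // QUANTUM) * QUANTUM


def separate_chunks(sizes):
    """Heap bytes used when every string has its own allocation."""
    return sum(rounded(s) for s in sizes)


def shared_chunk(sizes):
    """Heap bytes used when the strings are concatenated into one chunk."""
    return rounded(sum(sizes))


sizes = [5, 9, 12, 3, 20]          # example string sizes in bytes
before = separate_chunks(sizes)    # 16 + 16 + 16 + 16 + 32 = 96
after = shared_chunk(sizes)        # 49 rounded up to 64
```

In this made-up example the shared chunk uses 64 bytes instead of 96.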
I checked the entire codebase, and every place that vstr_init_len
was called, there was a call to mp_obj_new_str_from_vstr after it.
mp_obj_new_str_from_vstr always tries to reallocate a new buffer
1 byte larger than the original to store the terminating null
character.
In many cases, if we allocated the initial buffer to be 1 byte
longer, we can prevent this extra allocation, and just reuse
the originally allocated buffer.
Asking to read 256 bytes and only getting 100 will still cause
the extra allocation, but if you ask to read 256 and get 256
then the extra allocation will be optimized away.
Yes - the reallocation is optimized in the heap to try and reuse
the buffer if it can, but it takes quite a few cycles to figure
this out.
Note by Damien: vstr_init_len should now be considered as a
string-init convenience function and used only when creating
null-terminated objects.
Previous to this patch, if "abcd" and "ab" were possible completions
to tab-completing "a", then tab would expand to "abcd" straight away
if this identifier appeared first in the dict.
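The behaviour one would expect instead is to expand only as far as all
candidates agree. A minimal Python sketch of that idea follows. The
function name complete is hypothetical and this is not the REPL's C
implementation.

```python
import os.path


def complete(prefix, names):
    """Return the text that a tab press should expand prefix to."""
    matches = [n for n in names if n.startswith(prefix)]
    if not matches:
        return prefix
    # Expand only to the longest prefix shared by every candidate.
    return os.path.commonprefix(matches)


expanded = complete("a", ["abcd", "ab", "xyz"])  # "ab"
```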
The TimeoutError is useful for some modules, especially the
socket module. TimeoutError can then be aliased to socket.timeout
and then Python code can differentiate between socket.error and
socket.timeout.
When "micropython -m pkg.mod" command was used, relative imports in pkg.mod
didn't work, because pkg.mod.__name__ was set to __main__, and the fact that
it's a package submodule was missed. This is an original workaround to this
issue. TODO: investigate and compare how CPython deals with this issue.
Previous to this patch each time a bytes object was referenced a new
instance (with the same data) was created. With this patch a single
bytes object is created in the compiler and is loaded directly at execute
time as a true constant (similar to loading bignum and float objects).
This saves on allocating RAM and means that bytes objects can now be
used when the memory manager is locked (eg in interrupts).
The MP_BC_LOAD_CONST_BYTES bytecode was removed as part of this.
Generated bytecode is slightly larger due to storing a pointer to the
bytes object instead of the qstr identifier.
Code size is reduced by about 60 bytes on Thumb2 architectures.
Previous to this patch a call such as list.append(1, 2) would lead to a
seg fault. This is because list.append is a builtin method and the first
argument to such methods is always assumed to have the correct type.
Now, when a builtin method is extracted like this it is wrapped in a
checker object which checks the type of the first argument before
calling the builtin function.
This feature is controlled by MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG and
is enabled by default.
See issue #1216.
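A small Python check of the resulting behaviour is sketched below. The
wrapper function name is hypothetical. With the self-argument check
enabled the call is expected to raise TypeError, as it does in CPython.

```python
def call_with_wrong_self():
    """Call list.append with a non-list first argument and report the outcome."""
    try:
        list.append(1, 2)  # first argument is an int, not a list
    except TypeError:
        return "TypeError"
    return "no error"


result = call_with_wrong_self()
```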
mpconfigport.mk contains configuration options which affect the way
MicroPython is linked. In this regard, it's "stronger" configuration
dependency than even mpconfigport.h, so if we rebuild everything on
mpconfigport.h change, we certainly should do that on mpconfigport.mk
change too.
If heap allocation for the Python-stack of a function fails then we may
as well allocate the Python-stack on the C stack. This will allow to
run more code without using the heap.
This allows to do "ar[i]" and "ar[i] = val" in viper when ar is a Python
object and i and/or val are native viper types (eg ints).
Patch also includes tests for this feature.
This patch converts Q(abc) to "Q(abc)" to protect the abc from the
C preprocessor, then converts back after the preprocessor is finished.
So now we can safely put includes in mpconfig(port).h, and also
preprocess qstrdefsport.h (latter is now done also in this patch).
Addresses issue #1252.
C's printf will pad nan/inf differently to CPython. Our implementation
originally conformed to C, now it conforms to CPython's way.
Tests for this are also added in this patch.
This drops the size of unicode_isxdigit from 0x1e + 0x02 filler to
0x14 bytes (so net code reduction of 12 bytes) and will make
unicode_is_xdigit perform slightly faster.
This allows using (almost) the same code for printing floats everywhere,
removes the dependency on sprintf and uses just snprintf and
applies an msvc-specific fix for snprintf in a single place so
nan/inf are now printed correctly.
mp_obj_get_int_truncated will raise a TypeError if the argument is not
an integral type. Use mp_obj_int_get_truncated only when you know the
argument is a small or big int.
Hashing is now done using mp_unary_op function with MP_UNARY_OP_HASH as
the operator argument. Hashing for int, str and bytes still go via
fast-path in mp_unary_op since they are the most common objects which
need to be hashed.
This led to quite a bit of code cleanup, and should be more efficient
if anything. It saves 176 bytes code space on Thumb2, and 360 bytes on
x86.
The only loss is that the error message "unhashable type" is now the
more generic "unsupported type for __hash__".
Unfortunately, MP_OBJ_STOP_ITERATION doesn't have means to pass an associated
value, so we can't optimize StopIteration exception with (non-None) argument
to MP_OBJ_STOP_ITERATION.
When generator raises exception, it is automatically terminated (by setting
its code_state.ip to 0), which interferes with this check.
Triggered in particular by CPython's test_pep380.py.
Exceptions in .close() should be ignored (dumped to sys.stderr, not
propagated), but in uPy, they are propagated. Fix would require
nlr-wrapping .close() call, which is expensive. But on the other hand,
.close() is not called often, so maybe that's not too bad (depends,
if it's finally called and that causes stack overflow, there's nothing
good in that). And yet on another hand, .close() can be implemented to
catch exceptions on its side, and that should be the right choice.
The code was apparently broken after 9988618e0e
"py: Implement full func arg passing for native emitter.". This attempts to
propagate those changes to ARM emitter.
User instances are hashable by default (using __hash__ inherited from
"object"). But if __eq__ is defined and __hash__ not defined in particular
class, instance is not hashable.
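The rule can be demonstrated with three small classes. The class and
helper names below are made up for this example.

```python
class Plain:
    """No __eq__ and no __hash__: hash is inherited from object."""


class EqOnly:
    """Defines __eq__ only, so instances are not hashable."""

    def __eq__(self, other):
        return isinstance(other, EqOnly)


class EqAndHash:
    """Defines both, so instances are hashable again."""

    def __eq__(self, other):
        return isinstance(other, EqAndHash)

    def __hash__(self):
        return 1


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True
```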
Having NotImplemented as MP_OBJ_SENTINEL turned out to be problematic
(it needs to be checked for in a lot of places, otherwise it'll crash
as would pass MP_OBJ_IS_OBJ()), so made a proper singleton value like
Ellipsis, both of them sharing the same type.
From https://docs.python.org/3/library/constants.html#NotImplemented :
"Special value which should be returned by the binary special methods
(e.g. __eq__(), __lt__(), __add__(), __rsub__(), etc.) to indicate
that the operation is not implemented with respect to the other type;
may be returned by the in-place binary special methods (e.g. __imul__(),
__iand__(), etc.) for the same purpose. Its truth value is true."
Some people however appear to abuse it to mean "no value" when None is
a legitimate value (don't do that).
Can complete names in the global namespace, as well as a chain of
attributes, eg pyb.Pin.board.<tab> will give a list of all board pins.
Costs 700 bytes ROM on Thumb2 arch, but greatly increases usability of
REPL prompt.
This doesn't handle case of enclosed except blocks, but once again,
sys.exc_info() support is a workaround for software which uses it
instead of properly catching exceptions via variable in except clause.
The implementation is very basic and non-compliant and provided solely for
CPython compatibility. The function itself is bad Python2 heritage, its
usage is discouraged.
Before this patch a "with" block needed to create a bound method object
on the heap for the __exit__ call. Now it doesn't because we use
load_method instead of load_attr, and save the method+self on the stack.
This fixes a long standing problem that viper code generation gave
terrible error messages, and actually no errors on pyboard where
assertions are disabled.
Now all compile-time errors are raised as proper Python exceptions, and
are of type ViperTypeError.
Addresses issue #940.
Adds support for the following Thumb2 VFP instructions, via the option
MICROPY_EMIT_INLINE_THUMB_FLOAT:
vcmp
vsqrt
vneg
vcvt_f32_to_s32
vcvt_s32_to_f32
vmrs
vmov
vldr
vstr
vadd
vsub
vmul
vdiv
Previous to this patch the printing mechanism was a bit of a tangled
mess. This patch attempts to consolidate printing into one interface.
All (non-debug) printing now uses the mp_print* family of functions,
mainly mp_printf. All these functions take an mp_print_t structure as
their first argument, and this structure defines the printing backend
through the "print_strn" function of said structure.
Printing from the uPy core can reach the platform-defined print code via
two paths: either through mp_sys_stdout_obj (defined per port) in
conjunction with mp_stream_write; or through the mp_plat_print structure
which uses the MP_PLAT_PRINT_STRN macro to define how strings are printed
on the platform. The former is only used when MICROPY_PY_IO is defined.
With this new scheme printing is generally more efficient (fewer layers
to go through, fewer arguments to pass), and, given an mp_print_t*
structure, one can call mp_print_str for efficiency instead of
mp_printf("%s", ...). Code size is also reduced by around 200 bytes on
Thumb2 archs.
In particular, numbers which are less than 1.0 but which
round up to 1.0.
This also makes those numbers which round up to 1.0 to
print with e+00 rather than e-00 for those formats which
print exponents.
Addresses issue #1178.
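The kind of value affected can be shown with Python's % formatting. The
values below are chosen for illustration only and the expected strings
are what CPython produces.

```python
# Each value is below 1.0 but rounds up to 1.0 at the requested precision.
examples = [
    "%.0e" % 0.99,    # '1e+00'
    "%.1e" % 0.999,   # '1.0e+00'
    "%.1f" % 0.99,    # '1.0'
]
```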
This simplifies the API for objects and reduces code size (by around 400
bytes on Thumb2, and around 2k on x86). Performance impact was measured
with Pystone score, but change was barely noticeable.
Fixes msvc linker warnings about mismatching sizes between the mp_obj_fdfile_t
struct defined in file.c and the mp_uint_t declarations found in modsys.c and modbuiltins.c
This patch gets full function argument passing working with native
emitter. Includes named args, keyword args, default args, var args
and var keyword args. Fully Python compliant.
It reuses the bytecode mp_setup_code_state function to do all the hard
work. This function is slightly adjusted to accommodate native calls,
and the native emitter is forced a bit to emit similar prelude and
code-info as bytecode.
splitlines() occurs ~179 times in CPython3 standard library, so was
deemed worthy to implement. The method has subtle semantic differences
from just .split("\n"). It is also defined as working for any end-of-line
combination, but this is currently not implemented - it works only with
LF line-endings (which should be OK for text strings on any platforms,
but not OK for bytes).
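The difference from .split("\n") shows up with a trailing newline and
with the empty string, as in this short example using LF line endings
only:

```python
text = "one\ntwo\n"

lines_a = text.splitlines()   # ['one', 'two']
lines_b = text.split("\n")    # ['one', 'two', '']

empty_a = "".splitlines()     # []
empty_b = "".split("\n")      # ['']
```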
I.e. in this mode, C stack will never be used to call a Python function,
but if there's no free heap for a call, it will be reported as
RuntimeError (as expected), not MemoryError.
When just the bytecode emitter is needed there is no need to have a
dynamic method table for the emitter back-end, and we can instead
directly call the mp_emit_bc_XXX functions. This gives a significant
reduction in code size and a very slight performance boost for the
compiler.
This patch saves 1160 bytes code on Thumb2 and 972 bytes on x86, when
native emitters are disabled.
Overall savings in code over the last 3 commits are:
bare-arm: 1664 bytes.
minimal: 2136 bytes.
stmhal: 584 bytes (it has native emitter enabled).
cc3200: 1736 bytes.
First pass for the compiler is computing the scope (eg if an identifier
is local or not) and originally had an entire table of methods dedicated
to this, most of which did nothing. With changes from previous commit,
this set of methods can be removed and the methods from the bytecode
emitter used instead, with very little modification -- this is what is
done in this commit.
This factoring has little to no impact on the speed of the compiler
(tested by compiling 3763 Python scripts and timing it).
This factoring reduces code size by about 270-300 bytes on Thumb2 archs,
and 400 bytes on x86.
mp_obj_t internal representation doesn't have to be a pointer to object,
it can be anything.
There's also a support for back-conversion in the form of MP_OBJ_UNCAST.
This is kind of optimization/status quo preserver to minimize patching the
existing code and avoid doing potentially expensive MP_OBJ_CAST over and
over. But then one may imagine implementations where MP_OBJ_UNCAST is very
expensive. But such implementations are unlikely to be interesting in practice.
Despite initial guess, this code factoring does not hamper performance.
In fact it seems to improve speed by a little: running pystone(1.2) on
pyboard (which gives a very stable result) this patch takes pystones
from 1729.51 up to 1742.16. Also, pystones on x64 increase by around
the same proportion (but it's much noisier).
Taking a look at the generated machine code, stack usage with this patch
is unchanged, and call is tail-optimised with all arguments in
registers. Code size decreases by about 50 bytes on Thumb2 archs.
"Base" should rather refer to "base type". "Base object for attribute
lookup" should rather be just "object".
Also, a case of common subexpression elimination.
Given that there's already support for "fixed table" maps, which are
essentially ordered maps, the implementation of OrderedDict just extends
"fixed table" maps by adding an "is ordered" flag and add/remove
operations, and reuses 95% of objdict code, just making methods tolerant
to both dict and OrderedDict.
Some things are missing so far, like CPython-compatible repr and comparison.
OrderedDict is disabled by default; enabled on unix and stmhal ports.
These allow to fine-tune the compiler to select whether it optimises
tuple assignments of the form a, b = c, d and a, b, c = d, e, f.
Sensible defaults are provided.
This is rarely used feature which takes enough code to implement, so is
controlled by MICROPY_PY_ARRAY_SLICE_ASSIGN config setting, default off.
But otherwise it may be useful, as allows to update arbitrary-sized data
buffers in-place.
Slice is yet to implement, and actually, slice assignment implemented in
such a way that RHS of assignment should be array of the exact same item
typecode as LHS. CPython has it more relaxed, where RHS can be any sequence
of compatible types (e.g. it's possible to assign list of int's to a
bytearray slice).
Overall, when all "slice write" features are implemented, it may cost ~1KB
of code.
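A short example of in-place slice assignment where the right-hand side
has the same typecode as the left-hand side, which is the case
described above as supported. The variable names are arbitrary.

```python
from array import array

# bytearray slice assigned from another bytearray of the same length.
buf = bytearray(b"hello world")
buf[0:5] = bytearray(b"HELLO")

# array.array slice assigned from an array with the same typecode.
arr = array("h", [1, 2, 3, 4])
arr[1:3] = array("h", [20, 30])
```

CPython additionally accepts something like buf[0:2] = [104, 105], which
is the more relaxed form mentioned above.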
This makes exception traceback info self contained (ie doesn't rely on
list object, which was a bit of a hack), reduces code size, and reduces
RAM footprint of exception by eliminating the list object.
Addresses part of issue #1126.
The implementation of these functions is very large (order 4k) and they
are rarely used, so we don't enable them by default.
They are however enabled in stmhal and unix, since we have the room.
Most of printing infrastructure now uses streams, but mp_obj_print() used
libc's printf(), which led to weird buffering issues in output. So, switch
mp_obj_print() to streams too, even though it may make sense to move it to
a separate file, as it is purely a debugging function now.
Relative imports are based off a package, so if we're currently at a module
within a package, we should get to the package first.
Also, factor out path traversing operation, but this broke testing for
boundary errors with relative imports. TODO: reintroduce them, together
with proper tests.
Traceback allocation for exception will now never lead to recursive
MemoryError exception - if there's no memory for traceback, it simply
won't be created.
Pushing same NLR record twice would lead to "infinite loop" in nlr_jump
(but more realistically, it will crash as soon as NLR record on stack is
overwritten).
Previous to this patch, a big-int, float or imag constant was interned
(made into a qstr) and then parsed at runtime to create an object each
time it was needed. This is wasteful in RAM and not efficient. Now,
these constants are parsed straight away in the parser and turned into
objects. This allows constants with large numbers of digits (so
addresses issue #1103) and takes us a step closer to #722.
To enable parsing constants more efficiently, mp_parse should be allowed
to raise an exception, and mp_compile can already raise a MemoryError.
So these functions need to be protected by an nlr push/pop block.
This patch adds that feature in all places. This allows to simplify how
mp_parse and mp_compile are called: they now raise an exception if they
have an error and so explicit checking is not needed anymore.
This cleans up vstr so that it's a pure "variable buffer", and the user
can decide whether they need to add a terminating null byte. In most
places where vstr is used, the vstr did not need to be null terminated
and so this patch saves code size, a tiny bit of RAM, and makes vstr
usage more efficient. When null termination is needed it must be
done explicitly using vstr_null_terminate.
Eg, "() + 1" now tells you that __add__ is not supported for tuple and
int types (before it just said the generic "binary operator"). We reuse
the table of names for slot lookup because it would be a waste of code
space to store the pretty name for each operator.
- namedtuple was wrongly using MP_OBJ_QSTR_VALUE instead of mp_obj_str_get_qstr,
so when passed a non-interned string it would segfault; fix this by using mp_obj_str_get_qstr
- store the namedtuple field names as qstrs so it is not needed to use mp_obj_str_get_qstr
every time the field name has to be accessed. This also slightly increases performance when
fetching attributes
There was really weird warning (promoted to error) when building Windows
port. Exact cause is still unknown, but it uncovered another issue:
8-bit and unicode str_make_new implementations should be mutually exclusive,
and not built at the same time. What we had is that bytes_decode() pulled
8-bit str_make_new() even for unicode build.
With this patch str/bytes construction is streamlined. Always use a
vstr to build a str/bytes object. If the size is known beforehand then
use vstr_init_len to allocate only required memory. Otherwise use
vstr_init and the vstr will grow as needed. Then use
mp_obj_new_str_from_vstr to create a str/bytes object using the vstr
memory.
Saves code ROM: 68 bytes on stmhal, 108 bytes on bare-arm, and 336 bytes
on unix x64.
This patch allows to reuse vstr memory when creating str/bytes object.
This improves memory usage.
Also saves code ROM: 128 bytes on stmhal, 92 bytes on bare-arm, and 88
bytes on unix x64.
pyexec_friendly_repl_process_char() and friends, useful for ports which
integrate into existing cooperative multitasking system.
Unlike readline() refactor before, this was implemented in less formal,
trial&error process, minor functionality regressions are still known
(like soft&hard reset support). So, original loop-based pyexec_friendly_repl()
is left intact, specific implementation selectable by config setting.
Bytecode also needs a pass to compute the stack size. This is because
the state size of the bytecode function is encoded as a variable uint,
so we must know the value of this uint before we encode it (otherwise
the size of the generated code changes from one pass to the next).
Having an entire pass for this seems wasteful (in time). Alternative is
to allocate fixed space for the state size (would need 3-4 bytes to be
general, when 1 byte is usually sufficient) which uses a bit of extra
RAM per bytecode function, and makes the code less elegant in places
where this uint is encoded/decoded.
So, for now, opt for an extra pass.
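Why the encoded size depends on the value can be seen from a sketch of
a variable-length uint encoding in Python. The exact bit layout below
(7 bits per byte, most significant group first, high bit set on every
byte except the last) is an assumption for illustration, and the
function names are hypothetical.

```python
def encode_uint(value):
    """Encode a non-negative int using 7 bits of payload per byte."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = [value & 0x7F]          # last byte: continuation bit clear
    value >>= 7
    while value:
        out.append(0x80 | (value & 0x7F))  # earlier bytes: continuation bit set
        value >>= 7
    return bytes(reversed(out))


def decode_uint(data):
    """Decode one variable-length uint from the start of data."""
    value = 0
    for b in data:
        value = (value << 7) | (b & 0x7F)
        if not (b & 0x80):
            break
    return value


small = encode_uint(5)     # 1 byte
large = encode_uint(300)   # 2 bytes
```

A state size below 128 takes one byte and a larger one takes more, so
the total bytecode length is only known once the state size is known.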
Native code has GC-heap pointers in it so it must be scanned. But on
unix port memory for native functions is mmap'd, and so it must have
explicit code to scan it for root pointers.
Previous to this patch all constant string/bytes objects were
interned by the compiler, and this led to crashes when the qstr was too
long (noticeable now that qstr length storage defaults to 1 byte).
With this patch, long string/bytes objects are never interned, and are
referenced directly as constant objects within generated code using
load_const_obj.
This new config option sets how many fixed-number-of-bytes to use to
store the length of each qstr. Previously this was hard coded to 2,
but, as per issue #1056, this is considered overkill since no-one
needs identifiers longer than 255 bytes.
With this patch the number of bytes for the length is configurable, and
defaults to 1 byte. The configuration option filters through to the
makeqstrdata.py script.
Code size savings going from 2 to 1 byte:
- unix x64 down by 592 bytes
- stmhal down by 1148 bytes
- bare-arm down by 284 bytes
Also has RAM savings, and will be slightly more efficient in execution.
Previous patch c38dc3ccc7 allowed any
object to be compared with any other, using pointer comparison for a
fallback. As such, existing code which checked for this case is no
longer needed.
Compiler optimises lookup of module.CONST when enabled (an existing
feature). Disabled by default; enabled for unix, windows, stmhal.
Costs about 100 bytes ROM on stmhal.
This allows to enable mem-info functions in micropython module, even if
MICROPY_MEM_STATS is not enabled. In this case, you get mem_info and
qstr_info but not mem_{total,current,peak}.
GC for unix/windows builds doesn't make use of the bss section anymore,
so we do not need the (sometimes complicated) build features and code related to it
This is a simple optimisation inspired by JITing technology: we cache in
the bytecode (using 1 byte) the offset of the last successful lookup in
a map. This allows us next time round to check in that location in the
hash table (mp_map_t) for the desired entry, and if it's there use that
entry straight away. Otherwise fallback to a normal map lookup.
Works for LOAD_NAME, LOAD_GLOBAL, LOAD_ATTR and STORE_ATTR opcodes.
On a few tests it gives >90% cache hit and greatly improves speed of
code.
Disabled by default. Enabled for unix and stmhal ports.
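The idea can be modelled in Python as below. This is a toy model under
stated assumptions, not the VM code: Table is a made-up open-addressing
hash table standing in for mp_map_t, and CachedLoad stands in for one
bytecode instruction carrying its 1-byte cache.

```python
class Table:
    """Tiny open-addressing hash table with linear probing (no delete, no resize)."""

    def __init__(self, size=8):
        self.slots = [None] * size

    def _probe(self, key):
        size = len(self.slots)
        start = hash(key) % size
        for step in range(size):
            yield (start + step) % size

    def store(self, key, value):
        for idx in self._probe(key):
            entry = self.slots[idx]
            if entry is None or entry[0] == key:
                self.slots[idx] = (key, value)
                return idx
        raise MemoryError("table full")

    def lookup(self, key):
        """Normal (slow path) lookup: returns (slot index, value)."""
        for idx in self._probe(key):
            entry = self.slots[idx]
            if entry is None:
                break
            if entry[0] == key:
                return idx, entry[1]
        raise KeyError(key)


class CachedLoad:
    """One load instruction that remembers where its key was last found."""

    def __init__(self, key):
        self.key = key
        self.cached_idx = 0   # the 1-byte cache stored in the bytecode
        self.hits = 0

    def execute(self, table):
        idx = self.cached_idx
        entry = table.slots[idx] if idx < len(table.slots) else None
        if entry is not None and entry[0] == self.key:
            self.hits += 1            # fast path: entry is where we left it
            return entry[1]
        idx, value = table.lookup(self.key)   # fall back to a normal lookup
        self.cached_idx = idx & 0xFF          # only 1 byte of cache is available
        return value
```

Executing the same CachedLoad repeatedly against an unchanged table
takes the fast path from the second execution onwards.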
This patch consolidates all global variables in py/ core into one place,
in a global structure. Root pointers are all located together to make
GC tracing easier and more efficient.
This is consistent with how BC_JUMP was handled before. We never show jumps
destinations relative to jump instruction itself, only relative to beginning
of function. Another useful way is to show them as absolute (real memory
address), and this change makes result expected and consistent with how
BC_JUMP is shown.
The compiler treats `if (MICROPY_ERROR_REPORTING == MICROPY_ERROR_REPORTING_TERSE)` as
a normal statement and generates assembly for it in debug mode as if MICROPY_ERROR_REPORTING
is an actual symbol instead of a preprocessor definition.
As such linking fails because mp_arg_error_terse_mismatch is not defined when
MICROPY_ERROR_REPORTING is detailed or normal.
We are not word-for-word compatible with CPython exceptions, so we are
free to make them short but informative in order to reduce code size.
Also, try to make messages the same as existing ones where possible.
This fixes conversion when float type has more mantissa bits than small int,
and float value has small exponent. This is for example the case of 32-bit
platform using doubles, and converting value of time.time(). Conversion of
floats with large exponent is still not handled correctly.
This is for efficiency, so we don't need to subtract 1 from the ip
before storing it to code_state->ip. It saves a lot of ROM bytes on
unix and stmhal.
Mirroring ip to a volatile memory variable for each opcode is an expensive
operation. For quite a lot of often executed opcodes like stack manipulation
or jumps, exceptions cannot actually happen. So, record ip only for opcode
where that's possible.
This patch makes the MICROPY_PY_BUILTINS_SLICE compile-time option
fully disable the builtin slice operation (when set to 0). This
includes removing the slice syntax from the grammar. Now, enabling
slice costs 4228 bytes on unix x64, and 1816 bytes on stmhal.
This patch makes MICROPY_PY_BUILTINS_SET compile-time option fully
disable the builtin set object (when set to 0). This includes removing
set constructor/comprehension from the grammar, the compiler and the
emitters. Now, enabling set costs 8168 bytes on unix x64, and 3576
bytes on stmhal.
This optimisation reduces the VM exception stack element (mp_exc_stack_t)
by 1 word, by using bit 1 of a pointer to store whether the opcode was a
FINALLY or WITH opcode. This optimisation was pending, waiting for
maturity of the exception handling code, which has now proven itself.
Saves 1 machine word RAM for each exception (4->3 words per exception).
Increases stmhal code by 4 bytes, and decreases unix x64 code by 32
bytes.
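The pointer-tagging trick can be modelled with plain integers in
Python. The sketch assumes the pointer is at least 4-byte aligned, so
its two low bits are always zero and free to carry flags. The function
names tag and untag are hypothetical.

```python
FLAG_BIT = 1 << 1   # bit 1 of the pointer carries the FINALLY/WITH flag
LOW_BITS = 0b11     # bits guaranteed to be zero in an aligned pointer


def tag(ptr, is_finally_or_with):
    """Pack a boolean flag into bit 1 of an aligned pointer value."""
    if ptr & LOW_BITS:
        raise ValueError("pointer is not 4-byte aligned")
    return ptr | (FLAG_BIT if is_finally_or_with else 0)


def untag(tagged):
    """Recover the original pointer and the flag from a tagged value."""
    return tagged & ~LOW_BITS, bool(tagged & FLAG_BIT)


tagged = tag(0x20001000, True)
ptr, flag = untag(tagged)
```

Because the flag lives inside the pointer word, the separate field that
previously held it is no longer needed, hence the 1 word saving.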
This patch gives proper SyntaxError exceptions for bad global/nonlocal
declarations. It also reduces code size: 304 bytes on unix x64, 132
bytes on stmhal.
You can now assign to the range end variable and the for-loop still
works correctly. This fully addresses issue #565.
Also fixed a bug with the stack not being fully popped when breaking out
of an optimised for-loop (and it's actually impossible to write a test
for this case!).
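The case being fixed looks like the following in Python. The function
name is made up. The loop bound is evaluated once, so assigning to it
inside the loop must not change the number of iterations.

```python
def loop_with_modified_end():
    """Assign to the range end variable inside the loop body."""
    n = 3
    seen = []
    for i in range(n):
        n = 10          # must not affect the loop, range(n) was already evaluated
        seen.append(i)
    return seen


result = loop_with_modified_end()  # [0, 1, 2]
```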
This patch adds a configuration option (MICROPY_CAN_OVERRIDE_BUILTINS)
which, when enabled, allows to override all names within the builtins
module. A builtins override dict is created the first time the user
assigns to a name in the builtins module, and then that dict is searched
first on subsequent lookups. Note that this implementation doesn't
allow deleting of names.
This patch also does some refactoring of builtins code, creating the
modbuiltins.c file.
Addresses issue #959.
The function is modeled after traceback.print_exception(), but unbloated,
and put into existing module to save overhead on adding another module.
Compliant traceback.print_exception() is intended to be implemented in
micropython-lib in terms of sys.print_exception().
This change required refactoring mp_obj_print_exception() to take pfenv_t
interface arguments.
Addresses #751.
mp_obj_int_get_truncated is used as a "fast path" int accessor that
doesn't check for overflow and returns the int truncated to the machine
word size, ie mp_int_t.
Use mp_obj_int_get_truncated to fix struct.pack when packing maximum word
sized values.
Addresses issues #779 and #998.
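What "truncated to the machine word size" means can be sketched in
Python. A 32-bit word is assumed here, and truncate_to_word is a
hypothetical helper, not the C function itself.

```python
import struct


def truncate_to_word(value, bits=32):
    """Keep the low `bits` bits and reinterpret them as a signed integer."""
    mask = (1 << bits) - 1
    value &= mask
    if value >> (bits - 1):      # sign bit set: wrap into the negative range
        value -= 1 << bits
    return value


wrapped = truncate_to_word(0xFFFFFFFF)   # -1
packed = struct.pack("<I", 0xFFFFFFFF)   # maximum unsigned 32-bit value
```

Packing the maximum word-sized value only needs the low 32 bits, so an
accessor that truncates rather than checking for overflow is enough.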
mp_lexer_t type is exposed, mp_token_t type is removed, and simple lexer
functions (like checking current token kind) are now inlined.
This saves 784 bytes ROM on 32-bit unix, 348 bytes on stmhal, and 460
bytes on bare-arm. It also saves a tiny bit of RAM since mp_lexer_t
is a bit smaller. Also will run a bit more efficiently.
Behaviour of array initialisation is subtly different for bytes,
bytearray and array.array when argument has buffer protocol. This patch
gets us CPython conformant (except we allow initialisation of
array.array by buffer with length not a multiple of typecode).
By using the buffer protocol for these array operations, we now allow
addition of memoryview objects, and objects with "incompatible"
typecodes (in this case it just adds bytes naively). This is an
extension to CPython which seems sensible. It also reduces the code
size.
Before, __repl_print__() used libc printf(), while print() used uPy streams
and own printf() implementation. This led to subtle, but confusing
differences in output when just doing "foo" vs "print(foo)" on interactive
prompt.
Currently compilation sporadically fails, because the automatic
dependency gets created *during* the compilation of objects.
OBJ is a superset of PY_O and the dependencies apply to all objects.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Going from MICROPY_ERROR_REPORTING_NORMAL to
MICROPY_ERROR_REPORTING_TERSE now saves 2020 bytes ROM for ARM Thumb2,
and 2200 bytes ROM for 32-bit x86.
This is about a 2.5% code size reduction for bare-arm.
When compiler optimization has been turned on, gcc knows that this code
block is not going to be executed. But with -O0 it complains about
path_items being used uninitialized.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
This turns failing assertions to type exceptions for things like
b"123".find(...). We still don't support operations like this on bytes
objects (unlike CPython), but at least it no longer crashes.
Eg b"123" + bytearray(2) now works. This patch actually decreases code
size while adding functionality: 32-bit unix down by 128 bytes, stmhal
down by 84 bytes.
Uninitialised struct members get a default value of 0/false, so this is
not strictly needed. But it actually decreases code size because when
all members are initialised the compiler doesn't need to insert a call
to memset to clear everything. In other words, setting 1 extra member
to 0 uses less code than calling memset.
ROM savings in bytes: 32-bit unix: 100; bare-arm: 44; stmhal: 52.
gc.enable/disable are now the same as CPython: they just control whether
automatic garbage collection is enabled or not. If disabled, you can
still allocate heap memory, and initiate a manual collection.
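A short usage sketch of the behaviour described above, using only the
standard gc module calls (the list of bytearrays is just filler data for
the example):

```python
import gc

gc.disable()                                  # only automatic collection is turned off
data = [bytearray(16) for _ in range(100)]    # heap allocation still works
gc.collect()                                  # a manual collection can still be requested
gc.enable()                                   # restore automatic collection
```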
msvc does not treat 1L as a 64-bit integer, hence all occurrences of shifting it left or right
result in undefined behaviour since the maximum allowed shift count for 32-bit ints is 31.
Forcing the correct type explicitly, stored in MPZ_LONG_1, solves this.
It should be fair to say that in almost all cases where some API call
expects a string, it should also be possible to pass a byte string. For example,
it should be possible to open/delete/rename a file with its name given as a bytestring. Note that
similar change was done quite a long ago to mp_obj_str_get_data().
Support for packages as argument not implemented, but otherwise error and
exit handling should be correct. This for example will allow to do:
pip-micropython install micropython-test.pystone
micropython -m test.pystone
This allows to implement KeyboardInterrupt on unix, and a much safer
ctrl-C in stmhal port. First ctrl-C is a soft one, with hope that VM
will notice it; second ctrl-C is a hard one that kills anything (for
both unix and stmhal).
One needs to check for a pending exception in the VM only for jump
opcodes. Others can't produce an infinite loop (infinite recursion is
caught by stack check).
There is a lot of potential in compressing bytecodes and making more use of the
coding space. This patch introduces "multi" bytecodes which have their
argument included in the bytecode (by addition).
UNARY_OP and BINARY_OP now no longer take a 1 byte argument for the
opcode. Rather, the opcode is included in the first byte itself.
LOAD_FAST_[0,1,2] and STORE_FAST_[0,1,2] are removed in favour of their
multi versions, which can take an argument between 0 and 15 inclusive.
The majority of LOAD_FAST/STORE_FAST codes fit in this range and so this
saves a byte for each of these.
LOAD_CONST_SMALL_INT_MULTI is used to load small ints between -16 and 47
inclusive. Such ints are quite common and now only need 1 byte to
store, and now have much faster decoding.
In all this patch saves about 2% RAM for typical bytecode (1.8% on
64-bit test, 2.5% on pyboard test). It also reduces the binary size
(because bytecodes are simplified) and doesn't harm performance.
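The "argument included in the bytecode by addition" idea can be sketched
like this for the small-int case. The opcode base value 0x70 is an
assumption made for the example; the message does not state the real
value.

```python
# Hypothetical opcode base; the real value is not given in the commit message.
LOAD_CONST_SMALL_INT_MULTI = 0x70
SMALL_INT_MIN, SMALL_INT_MAX = -16, 47   # range quoted above (64 values)


def encode_small_int(n):
    """Fold a small int into a single opcode byte by addition."""
    if not SMALL_INT_MIN <= n <= SMALL_INT_MAX:
        raise ValueError("does not fit in a multi opcode")
    return LOAD_CONST_SMALL_INT_MULTI + (n - SMALL_INT_MIN)


def decode_small_int(opcode):
    """Recover the int with one subtraction; no extra operand byte is read."""
    return opcode - LOAD_CONST_SMALL_INT_MULTI + SMALL_INT_MIN


print(hex(encode_small_int(0)), decode_small_int(encode_small_int(0)))
```

Decoding is a single subtraction, which fits the claim that these ints
now decode much faster than a separately encoded operand.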
This saves a lot of RAM for 2 reasons:
1. For functions that don't have default values, var args or var kw
args (which is a large number of functions in the general case), the
mp_obj_fun_bc_t type now fits in 1 GC block (previously needed 2 because
of the extra pointer to point to the arg_names array). So this saves 16
bytes per function (32 bytes on 64-bit machines).
2. Combining separate memory regions generally saves RAM because the
unused bytes at the end of the GC block are saved for 1 of the blocks
(since that block doesn't exist on its own anymore). So generally this
saves 8 bytes per function.
Tested by importing lots of modules:
- 64-bit Linux gave about an 8% RAM saving for 86k of used RAM.
- pyboard gave about a 6% RAM saving for 31k of used RAM.
This makes open() and _io.FileIO() more CPython compliant.
The mode kwarg is fully implemented.
The encoding kwarg is allowed but not implemented; mainly to allow
the tests to specify encoding for CPython, see #874
Also, usocket.readinto(). Known issue is that .readinto() should be available
only for binary files, but micropython uses single method table for both
binary and text files.
Just like they are handled in other read*(). Note that behavior of readline()
in case there's no data when it's called is underspecified in Python lib
spec, implemented to behave as read() - return None.
With this patch a port can enable module weak link support and provide
a dict of qstr->module mapping. This mapping is looked up only if an
import fails to find the requested module in the filesystem.
This allows to have the builtin module named, eg, usocket, and provide
a weak link of "socket" to the same module, but this weak link can be
overridden if a file by the name "socket.py" is found in the import
path.
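A rough Python model of the resolution order described above: files on
the import path win, and the weak-link table is consulted only as a
fallback. resolve_module and its parameters are hypothetical names used
for illustration, not the actual import code.

```python
import os


def resolve_module(name, search_path, weak_links, exists=os.path.isfile):
    """Return ("file", path) if found on the path, else fall back to weak links."""
    for directory in search_path:
        candidate = os.path.join(directory, name + ".py")
        if exists(candidate):
            return ("file", candidate)
    # Only reached when the filesystem lookup failed.
    if name in weak_links:
        return ("builtin", weak_links[name])
    raise ImportError("no module named " + name)


print(resolve_module("socket", [], {"socket": "usocket"}))
```

The exists parameter is injectable only so the sketch can be exercised
without touching the real filesystem.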
This has benefits all round: code factoring for parse/compile/execute,
proper context save/restore for exec, allow to specify globals/locals
for eval, and reduced ROM usage by >100 bytes on stmhal and unix.
Also, the call to mp_parse_compile_execute is tail call optimised for
the import code, so it doesn't increase stack memory usage.
In CPython IOError (and EnvironmentError) is deprecated and aliased to
OSError. All modules that used to raise IOError now raise OSError (or a
derived exception).
In Micro Python we never used IOError (except 1 place, incorrectly) and
so don't need to keep it.
See http://legacy.python.org/dev/peps/pep-3151/ for background.
Viper can now do the following:
    def store(p:ptr8, c:int):
        p[0] = c
This does a store of c to the memory pointed to by p using machine
instructions inline in the code.
It seems most sensible to use size_t for measuring "number of bytes" in
malloc and vstr functions (since that's what size_t is for). We don't
use mp_uint_t because malloc and vstr are not Micro Python specific.
mp_parse_node_free now frees the memory associated with non-interned
strings. And the parser calls mp_parse_node_free when discarding a
non-used node (such as a doc string).
Also, the compiler now frees the parse tree explicitly just before it
exits (as opposed to relying on the caller to do this).
Addresses issue #708 as best we can.
Stack is full descending and must be 8-byte aligned. It must start off
pointing to just above the last byte of RAM.
Previously, the stack started off pointing to the last byte of RAM (eg 0x2001ffff)
and so was not 8-byte aligned. This caused a bug in combination with
alloca.
This patch also updates some debug printing code.
Addresses issue #872 (among many other undiscovered issues).
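The alignment problem is easy to check with the example address given
above. This is plain arithmetic; is_aligned is just a helper name for
the sketch.

```python
def is_aligned(addr, alignment=8):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


old_sp = 0x2001ffff      # last byte of RAM (example address from above)
new_sp = old_sp + 1      # just above the last byte of RAM

print(hex(old_sp), is_aligned(old_sp))
print(hex(new_sp), is_aligned(new_sp))
```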
Heap RAM was being allocated to print dicts and do some other types of
iterating. Now these iterations use 1 word of state on the stack.
Deleting elements from a dict was not allowing the value to be reclaimed
by the GC. This is now fixed.
sys.exit always raises SystemExit so doesn't need a special
implementation for each port. If C exit() is really needed, use the
standard os._exit function.
Also initialise mp_sys_path and mp_sys_argv in teensy port.
Eventually, viper wants to be able to use raw pointers to strings and
arrays for efficient access. But for now, let's just load strings as a
Python object so they can be used as normal. This will anyway be
compatible with eventual intended viper behaviour.
Addresses issue #857.
Type representing signed size doesn't have to be int, so use special value
which defaults to SSIZE_MAX, but as it's not defined by C standard (but rather
by POSIX), allow ports to set it.
Previously, mpz was restricted to using at most 15 bits in each digit,
where a digit was a uint16_t.
With this patch, mpz can use all 16 bits in the uint16_t (improvement
to mpn_div was required). This gives small improvements in speed and
RAM usage. It also yields savings in ROM code size because all of the
digit masking operations become no-ops.
Also, mpz can now use a uint32_t as the digit type, and hence use 32
bits per digit. This will give decent improvements in mpz speed on
64-bit machines.
Test for big integer division added.
Code-info size, block name, source name, n_state and n_exc_stack now use
variable length encoded uints. This saves 7-9 bytes per bytecode
function for most functions.
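For readers unfamiliar with variable length uints, a minimal sketch
follows. The exact bit layout is not given in this message; the sketch
assumes 7 payload bits per byte, with the top bit as a continuation flag
and the most significant group first.

```python
def encode_uint(value):
    """Encode a non-negative int using 7 bits per byte (assumed layout)."""
    if value < 0:
        raise ValueError("only non-negative values can be encoded")
    groups = [value & 0x7F]               # last byte: continuation bit clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)   # earlier bytes: continuation bit set
        value >>= 7
    return bytes(reversed(groups))


def decode_uint(data, pos=0):
    """Return (value, next_position) for the uint starting at data[pos]."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


print(encode_uint(5), encode_uint(300))
```

Values below 128 take a single byte, which is where the per-function
saving comes from when most of these fields are small.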
This way, the native glue code is only compiled if native code is
enabled (which makes complete sense; thanks to Paul Sokolovsky for
the idea).
Should fix issue #834.
The heap allocation is now exactly as it was before the "faster gc
alloc" patch, but it's still nearly as fast. It is fixed by being
careful to always update the "last free block" pointer whenever the heap
changes (eg free or realloc).
Tested on all tests by enabling EXTENSIVE_HEAP_PROFILING in py/gc.c:
old and new allocator have exactly the same behaviour, just the new one
is much faster.
Recent speed up of GC allocation made the GC have a fragmented heap.
This patch restores "original fragmentation behaviour" whilst still
retaining relatively fast allocation. This patch works because there is
always going to be a single block allocated now and then, which advances
the gc_last_free_atb_index pointer often enough so that the whole heap
doesn't need scanning.
Should address issue #836.
With a file with 1 line (and an error on that line), used to show the
line as number 0. Now shows it correctly as line number 1.
But, when line numbers are disabled, it now prints line number 1 for any
line that has an error (instead of 0 as previously). This might end up
being confusing, but requires extra RAM and/or hack logic to make it
print something special in the case of no line numbers.
These functions are generally 1 machine instruction, and are used in
critical code, so makes sense to have them inline.
Also leave these functions uninverted (ie 0 means enable, 1 means
disable) and provide macro constants if you really need to distinguish
the states. This makes for smaller code as well (combined with
inlining).
Applied to teensy port as well.
Because (for Thumb) a function pointer has the LSB set, pointers to
dynamic functions in RAM (eg native, viper or asm functions) were not
being traced by the GC. This patch is a comprehensive fix for this.
Addresses issue #820.
This simple patch gives a very significant speed up for memory allocation
with the GC.
Eg, on PYBv1.0:
tests/basics/dict_del.py: 3.55 seconds -> 1.19 seconds
tests/misc/rge_sm.py: 15.3 seconds -> 2.48 seconds
Multiplication of a tuple, list, str or bytes now yields an empty
sequence (instead of crashing). Addresses issue #799
Also added ability to mult bytes on LHS by integer.
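The message does not spell out which multiplier used to crash; assuming
it was a zero or negative count, the CPython behaviour being matched
looks like this:

```python
# Zero or negative repeat counts give an empty sequence of the same type.
print([1, 2] * -1)
print((1, 2) * 0)
print("ab" * -3)
print(b"ab" * -1)

# bytes on the left-hand side multiplied by an integer.
print(b"ab" * 2)
```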
Can now index ranges with integers and slices, and reverse ranges
(although reversing is not very efficient).
Not sure how useful this stuff is, but gets us closer to having all of
Python's builtins.
reversed function now implemented, and works for tuple, list, str, bytes
and user objects with __len__ and __getitem__.
Renamed mp_builtin_len to mp_obj_len to make it publicly available (eg
for reversed).
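A small example of the user-object case: reversed() only needs __len__
and __getitem__. The Countdown class is made up for the example.

```python
class Countdown:
    """Sequence-like object exposing only __len__ and __getitem__."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return self.n - i


print(list(reversed(Countdown(3))))
print(list(reversed("abc")))
```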
This happens for example for zero-size arrays. As .get_buffer() method now
has explicit return value, it's enough to distinguish success vs failure
of getting buffer.
This was a nasty bug to track down. It only had consequences when the
heap size was just the right size to expose the rounding error in the
calculation of the finaliser table size. And, a script had to allocate
a small (1 or 2 cell) object at the very end of the heap. And, this
object must not have a finaliser. And, the initial state of the heap
must have been all bits set to 1. All these conspire on the pyboard,
but only if you run the script fresh (so unused memory is all 1's),
and if your script allocates a lot of small objects (eg 2-char strings
that are not interned).
qstr_init is always called exactly before mp_init, so makes sense to
just have mp_init call it. Similarly with
mp_init_emergency_exception_buf. Doing this makes the ports simpler and
less error prone (ie they can no longer forget to call these).
Reduces by about a factor of 10 on average the amount of RAM needed to
store the line-number to bytecode map in the bytecode prelude.
Using CPython3.4's stdlib for statistics: previously, an average of
13 bytes were used per (bytecode offset, line-number offset) pair, and
now with this improvement, that's down to 1.3 bytes on average.
Large RAM usage before was due to some very large steps in line numbers,
both from the start of the first line in a function way down in the
file, and also functions that have big comments and/or big strings in
them (both cases were significant).
Although the savings are large on average for the CPython stdlib, it
won't have such a big effect for small scripts used in embedded
programming.
Addresses issue #648.
This removes mpz_as_int, since that was a terrible function (it
implemented saturating conversion).
Use mpz_as_int_checked and mpz_as_uint_checked. These now work
correctly (they previously had wrong overflow checking, eg
print(chr(10000000000000)) on 32-bit machine would incorrectly convert
this large number to a small int).
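The difference between a saturating and a checked conversion can be
sketched as below. These are Python stand-ins assuming a 32-bit signed
target; they do not reproduce the C signatures of the mpz functions.

```python
def as_int_saturating(value, bits=32):
    """Clamp to the representable range: the overflow is hidden from the caller."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def as_int_checked(value, bits=32):
    """Return (ok, result); ok is False when the value does not fit."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if lo <= value <= hi:
        return True, value
    return False, None


big = 10000000000000
print(as_int_saturating(big))   # silently becomes the largest 32-bit int
print(as_int_checked(big))      # the caller can raise an error instead
```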
Many OSes/CPUs have affinity to put "user" data into lower half of address
space. Take advantage of that and remap such addresses into full small int
range (including negative part).
If address is from upper half, long int will be used. Previously, small
int was returned for lower quarter of address space, and upper quarter. For
2 middle quarters, long int was used, which is clearly worse schedule than
the above.
The user code should call micropython.alloc_emergency_exception_buf(size)
where size is the size of the buffer used to print the argument
passed to the exception.
With the test code from #732, and a call to
micropython.alloc_emergency_exception_buf(100) the following error is
now printed:
```python
>>> import heartbeat_irq
Uncaught exception in Timer(4) interrupt handler
Traceback (most recent call last):
  File "0://heartbeat_irq.py", line 14, in heartbeat_cb
NameError: name 'led' is not defined
```
With unicode enabled, this patch allows reading a fixed number of
characters from text-mode streams; eg file.read(5) will read 5 unicode
chars, which can be made of more than 5 bytes.
For an ASCII stream (ie no chars > 127) it only needs to do 1 read. If
there are lots of non-ASCII chars in a stream, then it needs multiple
reads of the underlying object.
Adds a new test for this case. Enables unicode support by default on
unix and stmhal ports.
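One way to read a fixed number of characters with as few underlying
reads as possible is sketched below. It is an illustration under the
assumption of valid UTF-8 input, not the stream code from this patch;
utf8_char_len and read_chars are hypothetical helpers.

```python
import io


def utf8_char_len(lead):
    """Number of bytes in the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    return 4


def read_chars(stream, n):
    """Read n characters from a binary stream holding valid UTF-8."""
    out = bytearray()
    needed = n                       # lower bound: at least 1 byte per char
    while needed > 0:
        chunk = stream.read(needed)
        if not chunk:                # EOF before n chars were available
            break
        out += chunk
        pos = chars = 0
        while pos < len(out) and chars < n:
            pos += utf8_char_len(out[pos])
            chars += 1
        # Bytes missing from a cut-off char, plus 1 byte per char still to come.
        needed = max(pos - len(out), 0) + (n - chars)
    return out.decode("utf-8")


print(read_chars(io.BytesIO("héllo wörld".encode("utf-8")), 5))
```

For pure ASCII the loop runs once; each extra read only asks for the
minimum number of bytes that could still be missing, so the stream is
never read past the requested characters.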
dummy_data field is accessed as uint value (e.g.
in emit_write_bytecode_byte_ptr), but is not aligned as such, which causes
bus errors or incorrect behavior on any arch requiring strictly aligned
data (ARM pre-v7, MIPS, etc, etc).
Conflicts:
stmhal/pin_named_pins.c
stmhal/readline.c
Renamed HAL_H to MICROPY_HAL_H. Made stmhal/mphal.h which intends to
define the generic Micro Python HAL, which in stmhal sits above the ST
HAL.
Native emitter can now compile try/except blocks using nlr_push/nlr_pop.
It probably only works for 1 level of exception handling. It doesn't
work on Thumb (only x64).
Native emitter can also handle some additional op codes.
With this patch, 198 tests now pass using "-X emit=native" option to
micropython.
- rearrange/add definitions that were not there so it's easier to compare both
- use MICROPY_PY_SYS_PLATFORM in main.c since it's available anyway
- define EWOULDBLOCK, it is missing from mingw32
As stack checking is enabled by default, ports which don't call
stack_ctrl_init() are broken now (report RuntimeError on startup). Save
them trouble and just init stack control framework in interpreter init.
Squashed commit of the following:
commit 99dc21b67a895dc10d3c846bc158d27c839cee48
Author: Chris Angelico <rosuav@gmail.com>
Date: Thu Jun 12 02:18:54 2014 +1000
Optimize as per TODO (thanks Damien!)
commit 5bf0153ecad8348443058d449d74504fc458fe51
Author: Chris Angelico <rosuav@gmail.com>
Date: Tue Jun 10 08:42:06 2014 +1000
Test a default (= UTF-8) encode and decode
commit c962057ac340832c4fde60896f656a3fe3ad78a9
Merge: e2c9782 195de32
Author: Chris Angelico <rosuav@gmail.com>
Date: Tue Jun 10 05:23:03 2014 +1000
Merge branch 'master' into unicode, resolving conflict on py/obj.h
commit e2c9782a65eb57f481d441d40161de427e1940ba
Author: Chris Angelico <rosuav@gmail.com>
Date: Tue Jun 10 05:05:57 2014 +1000
More whitespace fixups
commit 086a2a0f57afbc1f731697fd5d3a0cbbb80e5418
Author: Chris Angelico <rosuav@gmail.com>
Date: Tue Jun 10 05:04:20 2014 +1000
Properly implement string slicing
commit 0d339a143e2b6442366145e7f3d64aada293eaa0
Author: Chris Angelico <rosuav@gmail.com>
Date: Tue Jun 10 02:24:11 2014 +1000
Support slicing in str_index_to_ptr, and fix a bounds error
commit 24371c7267d360e77cf5eabc2e8ce9a73d2ee0da
Author: Chris Angelico <rosuav@gmail.com>
Date: Tue Jun 10 02:10:22 2014 +1000
Break out index-to-pointer calculation into a function
commit 616c24ac014c3ca56008428c506034dd1bfff7a8
Author: Chris Angelico <rosuav@gmail.com>
Date: Tue Jun 10 02:03:11 2014 +1000
Add tests of string slicing, which currently fail
commit a24d19f676fe8cc21dad512d91b826892e162a5b
Author: Chris Angelico <rosuav@gmail.com>
Date: Tue Jun 10 01:56:53 2014 +1000
Change string indexing to not precalculate the charlen, and add test for neg indexing
commit 0bcc7ab89eafb2ae53195e94c9bea42a4e886b64
Author: Chris Angelico <rosuav@gmail.com>
Date: Sun Jun 8 22:09:17 2014 +1000
Clean up constant qstr declarations now that charlen isn't needed
commit 5473e1a1dba2124b7b0c207f2964293cfbe80167
Author: Chris Angelico <rosuav@gmail.com>
Date: Sun Jun 8 07:18:42 2014 +1000
Remove the charlen field from strings, calculating it when required
commit 5c1658ec71aefbdc88c261ce2e57dc7670cdc6ef
Author: Chris Angelico <rosuav@gmail.com>
Date: Sun Jun 8 07:11:27 2014 +1000
Get rid of mp_obj_str_get_data_len() which was used in only one place
commit a019ba968b4e8daf7f3674f63c5cc400e304c509
Author: Chris Angelico <rosuav@gmail.com>
Date: Sun Jun 8 06:58:26 2014 +1000
Add a unichar_charlen() function to calculate length-in-characters from length-in-bytes
commit 44b0d5cff846ba487c526ed95be1b3d1cd3d762a
Author: Chris Angelico <rosuav@gmail.com>
Date: Sun Jun 8 06:32:44 2014 +1000
Use utf8_get/next_char in building up a string's repr
commit 30d1bad33f7af90f1971987c39864c8fcf3f5c21
Author: Chris Angelico <rosuav@gmail.com>
Date: Sun Jun 8 06:10:45 2014 +1000
Make utf8_get_char() and utf8_next_char() actually do what their names say
commit bc990dad9afb8ec112f5e7f7f79d5ab415da0e72
Author: Chris Angelico <rosuav@gmail.com>
Date: Sun Jun 8 02:10:59 2014 +1000
Revert "Add PEP 393-flags to strings and stub usage."
This reverts commit c239f509521d1a0f9563bf9c5de0c4fb9a6a33ba.
commit f9bebb28ad52467f2f2d7a752bb033296b6c2f9b
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 15:41:48 2014 +1000
Whitespace fixes
commit 279de0c8eb3cb186914799ccc5ee94ea97f56de4
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 15:28:35 2014 +1000
Formatting/layout improvements - introduce macros for UTF-8 byte detection, add braces. No functional changes.
commit f1911f53d56da809c97b07245f5728a419e8fb30
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 11:56:02 2014 +1000
Make chr() Unicode-aware
commit f51ad737b48ac04c161197a4012821d50885c4c7
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 11:44:07 2014 +1000
Make a string's repr Unicode-aware
commit 01bd68684611585d437982dccdf05b33cbedc630
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 11:33:43 2014 +1000
Expand the Unicode tests
commit 7bc91904f899f8012089fc14a06495680a51e590
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 11:27:30 2014 +1000
Record byte lengths for byte strings
commit bb132120717cf176dcfb26f87fa309378f76ab5f
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 11:25:06 2014 +1000
Make ord() Unicode-aware
commit 03f0cbe9051b62192be97b59f84f63f9216668bf
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 10:24:35 2014 +1000
Retain characters as UTF-8 encoded Unicode
commit e924659b85c001916a5ff7f4d1d8b3ebe2bf0c2f
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 08:37:27 2014 +1000
Add support for \u and \U escapes, but not \N (with explanatory comment)
commit 231031ac5f0346e4ffcf9c4abec2bd33f566232c
Author: Chris Angelico <rosuav@gmail.com>
Date: Sat Jun 7 05:09:35 2014 +1000
Add character length to qstr
commit 6df1b946fb17d8d5df3d91b21cde627c3d4556a8
Author: Chris Angelico <rosuav@gmail.com>
Date: Fri Jun 6 13:48:36 2014 +1000
Add test of UTF-8 encoded source file resulting in properly formed string
commit 16429b81a8483cf25865ed11afd81a7d9c253c26
Author: Chris Angelico <rosuav@gmail.com>
Date: Fri Jun 6 13:44:15 2014 +1000
Make len(s) return character length (even though creation's still buggy)
commit cd2cf6663cc47831dbc97819ad5c50ad33f939d3
Author: Chris Angelico <rosuav@gmail.com>
Date: Fri Jun 6 13:15:36 2014 +1000
HACK - When indexing a qstr, count its charlen. Stupidly inefficient but POC.
All tests pass now, though string creation is still buggy.
commit 47c234584d3358dfa6b4003d5e7264105d17b8f7
Author: Chris Angelico <rosuav@gmail.com>
Date: Fri Jun 6 13:15:32 2014 +1000
objstr: Record character length separately from byte length
CAUTION: Buggy, may crash stuff - qstr needs equivalent functionality too
commit b0f41c72af27d3b361027146025877b3d7e8785c
Author: Chris Angelico <rosuav@gmail.com>
Date: Fri Jun 6 05:37:36 2014 +1000
Beginnings of UTF-8 support - construct strings from that many UTF-8-encoded chars, and subscript bytes the same way
commit 89452be641674601e9bfce86dc71c17c3140a6cf
Author: Chris Angelico <rosuav@gmail.com>
Date: Fri Jun 6 05:28:47 2014 +1000
Update comments - now aiming for UTF-8 rather than PEP 393 strings
commit c239f509521d1a0f9563bf9c5de0c4fb9a6a33ba
Author: Chris Angelico <rosuav@gmail.com>
Date: Wed Jun 4 05:28:12 2014 +1000
Add PEP 393-flags to strings and stub usage.
The test suite all passes, but nothing has actually been changed.
Such mechanism is important to get stable Python functioning, because Python
function calling is handled with C stack. The idea is to sprinkle
STACK_CHECK() calls in places where there can be C recursion.
TODO: Add more STACK_CHECK()'s.
Expected to be set on command line, with the idea being that for different
targets, there're different smartass ABIs which strive to put unneeded
sections into executables, etc., so let people have flexible way to
strip that.
The option name is similar to previously introduced CFLAGS_EXTRA &
LDFLAGS_EXTRA.
char can be signed, and using signed types is dangerous - it can
lead to negative offsets when doing table lookups. We apparently should just
ban char usage.
This will allow roughly the same behavior as Python3 for non-ASCII strings,
for example, print("<phrase in non-Latin script>".split()) will print list
of words, not weird hex dump (like Python2 behaves). (Of course, that it
will print list of words, if there're "words" in that phrase at all, separated
by ASCII-compatible whitespace; that surely won't apply to every human
language in existence).
Functionality we provide in builtin io module is fairly minimal. Some
code, including CPython stdlib, depends on more functionality. So, there's
a choice to either implement it in C, or move it to _io and implement the other
functionality in Python. The 2nd choice is pursued. This setup matches CPython
too (_io is builtin, io is Python-level).
Benefits: won't crash baremetal targets, will provide Python source location
when not implemented feature used (it will no longer provide C source
location, but just grep for error message).
there are special tweaks and paths to be considered. Just provide some
defaults, in case the values are undefined.
- py-version.sh does not need any bash specific features.
- Use libdl only on Linux for now. FreeBSD provides dl*() calls from libc.
Some small fixes:
- Combine 'x' and 'X' cases in str format code.
- Remove trailing spaces from some lines.
- Make exception messages consistently begin with lower case (then
needed to change those in objarray and objtuple so the same
constant string data could be used).
- Fix bug with exception message having %c instead of %%c.
Add keyword args to dict.update(), and ability to take a dictionary as
argument.
dict() class constructor can now use dict.update() directly.
This patch loses fast path for dict(other_dict), but is that really
needed? And anyway, this idiom will now re-hash the dictionary, so is
arguably more memory efficient.
Addresses issue #647.
This may seem a bit of a risky change, in that it may introduce crazy
bugs with respect to volatile variables in the VM loop. But, I think it
should be fine: code_state points to some external memory, so the
compiler should always read/write to that memory when accessing the
ip/sp variables (ie not put them in registers).
Anyway, it passes all tests and improves on all efficiency fronts: about
2-4% faster (64-bit unix), 16 bytes less stack space per call (64-bit
unix) and slightly less executable size (unix and stmhal).
The reason it's more efficient is save_ip and save_sp were volatile
variables, so were anyway stored on the stack (in memory, not regs).
Thus converting them to code_state->{ip, sp} doesn't cost an extra
memory dereference (except maybe to get code_state, but that can be put
in a register and then made more efficient for other uses of it).
Conflicts:
py/vm.c
Fixed stack underflow check. Use UINT_FMT/INT_FMT where necessary.
Specify maximum VM-stack byte size by multiple of machine word size, so
that on 64 bit machines it has same functionality as 32 bit.
This improves stack usage in callers to mp_execute_bytecode2, and is step
forward towards unifying execution interface for function and generators
(which is important because generators don't even support full forms
of arguments passing (keywords, etc.)).
Needed to pop the iterator object when breaking out of a for loop. Need
also to be careful to unwind exception handler before popping iterator.
Addresses issue #635.
This helps the compiler do its optimisation, makes it clear which
variables are local per opcode and which global, and makes it consistent
when extra variables are needed in an opcode (in addition to old obj1,
obj2 pair, for example).
Could also make unum local, but that's for another time.
This completes non-automatic interning of strings in the parser, so that
doc strings don't take up RAM. It complicates the parser and compiler,
and bloats stmhal by about 300 bytes. It's complicated because now
there are 2 kinds of parse-nodes that can be strings: interned leaves
and non-interned structs.
io.FileIO is binary I/O, and actually optional. Default file type is
io.TextIOWrapper, which provides str results. CPython3 explicitly describes
io.TextIOWrapper as buffered I/O, but we don't have buffering support yet
anyway.
Now schedule is: for native types, we call ->make_new() C-level method, which
should perform actions of __new__ and __init__ (note that this is not
compliant, but is efficient), but for user types, __new__ and __init__ are
called as expected.
Also, make sure we convert scalar attribute value to a bound-pair right in
mp_obj_class_lookup() method, which avoids converting it again and again in
its callers.
__debug__ now resolves to True or False. Its value needs to be set by
mp_set_debug().
TODO: call mp_set_debug in unix/ port.
TODO: optimise away "if False:" statements in compiler.
Updated functions now do proper checking that n_kw==0, and are simpler
because they don't have to explicitly raise an exception. Down side is
that the error messages no longer include the function name, but that's
acceptable.
Saves on the order of 300 text bytes on x64 and ARM.
This is not fully correct re: error handling, because we should check that
types are used consistently (only str's or only bytes), but magically
makes lot of functions support bytes.
Two things are handled here: allow to compare native subtypes of tuple,
e.g. namedtuple (TODO: should compare type too, currently compared
duck-typedly by content). Secondly, allow user subclasses of tuples
(and its subtypes) to be compared too. "Magic" I did previously in
objtype.c covers only one argument (lhs is many), so we're in trouble
when lhs is native type - there's no other option besides handling
rhs in special manner. Fortunately, this patch outlines approach with
fast path for native types.
This was hit when trying to make urlparse.py from stdlib run. Took
quite some time to debug.
TODO: Reconcile bound method creation process better, maybe callable is
too generic a type to bind at all?
Parser shouldn't raise exceptions, so needs to check when memory
allocation fails. This patch does that for the initial set up of the
parser state.
Also, we now put the parser object on the stack. It's small enough to
go there instead of on the heap.
This partially addresses issue #558.
"object" type in MicroPython currently doesn't implement any methods, and
hopefully, we'll try to stay like that for as long as possible. Even if we
have to add something eventually, look up from there might be handled in
adhoc manner, as last resort (that's not compliant with Python3 MRO, but
we're already non-compliant). Hence: 1) no need to spend time trying to
lookup anything in object; 2) no need to allocate subobject when explicitly
inheriting from object; 3) and having multiple bases inheriting from object
is not a case of incompatible multiple inheritance.
This patch simplifies the glue between native emitter and runtime,
and handles viper code like inline assembler: return values are
converted to Python objects.
Fixes issue #531.
You can now do:
X = const(123)
Y = const(456 + X)
and the compiler will replace X and Y with their values.
See discussion in issue #266 and issue #573.
In case of empty non-blocking read()/write(), both return None. read()
cannot return 0, as that means EOF, so returns another value, and then
write() just follows. This is still pretty unexpected, and typical
"if not len:" check would treat this as EOF. Well, non-blocking files
require special handling!
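Caller code therefore has to test for None before using a truthiness
check. A minimal sketch of the three cases (classify_read is a
hypothetical helper):

```python
def classify_read(result):
    """Distinguish 'nothing available yet' from EOF on a non-blocking stream."""
    if result is None:
        return "no data yet"      # would block: try again later
    if len(result) == 0:
        return "EOF"              # empty result means end of stream
    return "data"


print(classify_read(None))
print(classify_read(b""))
print(classify_read(b"abc"))
```

A plain "if not result:" would lump the first two cases together, which
is exactly the trap mentioned above.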
This also kind of makes it depending on POSIX, but well, anything else
should emulate POSIX anyway ;-).
Need to have a policy as to how far we go adding keyword support to
built ins. It's nice to have, and gets better CPython compatibility,
but hurts the micro nature of uPy.
Addresses issue #577.
There are 2 locations in parser, and 1 in compiler, where memory
allocation is not precise. In the parser it's the rule stack and result
stack, in the compiler it's the array for the identifiers in the current
scope. All other mallocs are exact (ie they don't allocate more than is
needed).
This patch adds tuning options (MP_ALLOC_*) to mpconfig.h for these 3
inexact allocations.
The inexact allocations in the parser should actually be close to
logarithmic: you need an exponentially larger script (absent pathological
cases) to use up more room on the rule and result stacks. As such, the
default allocation policy for these is now to start with a modest sized
stack, but grow only in small increments.
For the identifier arrays in the compiler, these now start out quite
small (4 entries, since most functions don't have that many ids), and
grow incrementally by 6 (since if you have more ids than 4, you probably
have quite a few more, but it wouldn't be exponentially more).
Partially addresses issue #560.
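The identifier-array policy can be modelled in a few lines. This is only a
sketch of the growth rule described above (start at 4, grow by 6); the
function name and its return format are made up for illustration and do not
correspond to anything in the compiler.

```python
def grow_capacity(needed, initial=4, increment=6):
    """Return (final capacity, number of reallocations) for `needed` ids."""
    capacity = initial
    reallocs = 0
    while capacity < needed:
        capacity += increment  # fixed-size step, not doubling
        reallocs += 1
    return capacity, reallocs

for ids in (3, 5, 17):
    print(ids, grow_capacity(ids))
```

With a fixed increment the over-allocation is bounded by the increment, at the
cost of more reallocations for very large scopes.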
This will work if MICROPY_DEBUG_PRINTERS is defined, which is only the case
for the unix/windows ports. This makes it convenient to use uPy normally, but
easily get a bytecode dump on the spot if needed, without constant recompiles
back and forth.
TODO: Add more useful debug output, adjust verbosity level on which
specifically bytecode dump happens.
Blanket wide to all .c and .h files. Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
By default mingw outputs 3 digits instead of the standard 2 so all float
tests using printf fail. Using setenv at the start of the program fixes this.
To accommodate calling platform-specific initialization, a
MICROPY_MAIN_INIT_FUNC macro is used, which is called in mp_init().
The original parsing would error out on any C declarations that are not typedefs
or extern variables. This limits what can go in mpconfig.h and mpconfigport.h,
as they are included in qstr.h. For instance even a function declaration would be
rejected and including system headers is a complete no-go.
That seems too limiting for a global config header, so makeqstrdata now
ignores everything that does not match a qstr definition.
alloca() is declared in alloca.h, which also happens to be included by stdlib.h.
On mingw however it resides in malloc.h only.
So if we include alloca.h directly, and add an alloca.h for mingw in its port
directory, we can get rid of the mingw-specific define to include malloc.h
and the other ports are happy as well.
The biggest part of this support is refactoring mp_obj_class_lookup() to return
a standard "bound member" pair (mp_obj_t[2]). Actual support of inherited
native methods is 3 lines then. Some inherited features may not be supported
yet (e.g. native class methods, native properties, etc.). There may
be opportunities for further optimization too.
This implements checking of base types, allocation and basic initialization,
and optimized support for special method lookups. Other features are not yet
supported.
Of course, keywords are turned into lexer tokens in the lexer, so will
never need to be interned (unless you do something like x="def").
As it is now, the following on pyboard makes no new qstrs:
import pyb
pyb.info()
The new way uses slightly less ROM and RAM, should be slightly faster, and,
most importantly, allows catching the error "non-keyword arg following
keyword arg".
Addresses issue #466.
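For reference, the kind of call this error covers can be checked without
running it, by compiling the source only. The sketch below relies on CPython's
built-in compile(); the exact error message differs between implementations,
so only the exception type is tested. The helper name is made up.

```python
def rejects(source):
    """Return True if `source` fails to compile with a SyntaxError."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return True
    return False

print(rejects("f(a=1, 2)"))  # positional arg after a keyword arg
print(rejects("f(1, a=2)"))  # keyword arg after a positional arg
```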
Also add some more debugging output to gc_dump_alloc_table().
Now that newly allocated heap is always zero'd, maybe we just make this
a policy for the uPy API to keep it simple (ie any new implementation of
memory allocation must zero all allocations). This follows the D
language philosophy.
Before this patch, a previously used memory block which had pointers in
it may still retain those pointers if the new user of that block does
not actually use the entire block. Eg, if I want 5 blocks worth of
heap, I actually get 8 (round up to nearest 4). Then I never use the
last 3, so they keep their old values, which may be pointers pointing to
the heap, hence preventing GC.
In rare (or maybe not that rare) cases, this leads to long, unintentional
"linked lists" within the GC'd heap, filling it up completely. It's
pretty rare, because you have to reuse exactly that memory which is part
of this "linked list", and reuse it in just the right way.
This should fix issue #522, and might have something to do with
issue #510.
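A toy model of the problem, assuming a heap represented as a Python list and
an allocation granule of 4 as in the example above. All names here are
hypothetical; the real GC works on raw memory, not lists.

```python
def round_up(n, multiple):
    return -(-n // multiple) * multiple

def alloc(heap, start, wanted, granule=4, zero=True):
    """Hand out `wanted` slots at `start`, rounded up to `granule`."""
    got = round_up(wanted, granule)
    if zero:
        for i in range(start, start + got):
            heap[i] = 0
    return got

def stale_values(heap, start, got, used):
    """Old non-zero values left in the slots the caller never wrote."""
    return [heap[i] for i in range(start + used, start + got) if heap[i] != 0]

old_contents = [0xA0, 0xA1, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7]

# Without zeroing: the caller uses 5 slots, the last 3 keep old "pointers".
heap = list(old_contents)
got = alloc(heap, 0, 5, zero=False)
heap[0:5] = [1] * 5
without_zeroing = stale_values(heap, 0, got, 5)

# With zeroing: nothing stale survives in the unused tail.
heap = list(old_contents)
alloc(heap, 0, 5, zero=True)
heap[0:5] = [1] * 5
with_zeroing = stale_values(heap, 0, got, 5)

print(got, without_zeroing, with_zeroing)
```

A conservative collector scanning the unzeroed tail would treat those three
leftover values as live references.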
3 emitter functions are needed only for emitcpy, and so we can #if them
out when compiling with emitcpy support.
Also remove unused SETUP_LOOP bytecode.
Closed over variables are now passed on the stack, instead of creating a
tuple and passing that. This way memory for the closed over variables
can be allocated within the closure object itself. See issue #510 for
background.
There were typos, various rounding errors trying to do concurrent counting
in bytes vs blocks, complex conditional paths, superfluous variables, etc.,
etc., all leading to obscure segfaults.
These are to assist in writing native C functions that take positional
and keyword arguments. mp_arg_check_num is for just checking the
number of arguments is correct. mp_arg_parse_all is for parsing
positional and keyword arguments with default values.
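To illustrate the division of labour between the two helpers, here is a
Python model of the idea. It is not the C API: the function names,
parameters and error messages below are invented for the sketch, and the real
signatures should be taken from the source.

```python
def check_num(n_args, n_kw, n_min, n_max, takes_kw):
    """Model of a pure argument-count check."""
    if n_kw and not takes_kw:
        raise TypeError("function does not take keyword arguments")
    if not n_min <= n_args <= n_max:
        raise TypeError("wrong number of positional arguments")

def parse_all(args, kwargs, allowed):
    """Model of full parsing; `allowed` is a list of (name, default)."""
    if len(args) > len(allowed):
        raise TypeError("too many positional arguments")
    names = [name for name, _ in allowed]
    values = dict(allowed)  # start from the defaults
    for name, value in zip(names, args):
        values[name] = value
    for name, value in kwargs.items():
        if name not in values:
            raise TypeError("unexpected keyword argument %r" % name)
        if name in names[:len(args)]:
            raise TypeError("multiple values for argument %r" % name)
        values[name] = value
    return values

spec = [("baudrate", 115200), ("bits", 8), ("parity", None)]
parsed = parse_all((9600,), {"parity": 1}, spec)
print(parsed)
```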
When querying an object that supports the buffer protocol, that object
must now return a typecode (as per binary.[ch]). This does not have to
be honoured by the caller, but can be useful for determining element
size.
Test usecase I used is print(time.time()) and print(time.time() - time.time()).
On Linux/Glibc they now give the same output as CPython 3.3. Specifically,
time.time() gives non-exponential output with 7 decimal digits, and subtraction
gives exponential output e-06/e-07.
On stmhal, computed gotos make the binary about 1k bigger, but makes it
run faster, and we have the room, so why not. All tests pass on
pyboard using computed gotos.
This follows the pattern already used for objtuple, etc.: objfun.h's content
is not public - each and every piece of code should not have access to it.
It's not private either - with our architecture and implementation language
(C) it doesn't make sense to keep the implementation of each object strictly
private and maintain cumbersome accessors. It's "local" - intended to be
used by a small set of "friend" (in C++ terms) objects.
Things get tricky when using the nlr code to catch exceptions. Need to
ensure that the variables (stack layout) in the exception handler are
the same as in the bit protected by the exception handler.
Prior to this patch there were a few bugs. 1) The constant
mp_const_MemoryError_obj was being preloaded to a specific location on
the stack at the start of the function. But this location on the stack
was being overwritten in the opcode loop (since it didn't think that
variable would ever be referenced again), and so when an exception
occurred, the variable holding the address of MemoryError was corrupt.
2) The FOR_ITER opcode detection in the exception handler used sp, which
may or may not contain the right value coming out of the main opcode
loop.
With this patch there is a clear separation of variables used in the
opcode loop and in the exception handler (should fix issue (2) above).
Furthermore, nlr_raise is no longer used in the opcode loop. Instead,
it jumps directly into the exception handler. This tells the C compiler
more about the possible code flow, and means that it should have the
same stack layout for the exception handler. This should fix issue (1)
above. Indeed, the generated (ARM) assembler has been checked explicitly,
and with 'goto exception_handler', the problem with &MemoryError is
fixed.
This may now fix problems with rge-sm, and probably many other subtle
bugs yet to show themselves. Incidentally, rge-sm now passes on
pyboard (with a reduced range of integration)!
Main lesson: nlr is tricky. Don't use nlr_push unless you know what you
are doing! Luckily, it's not used in many places. Using nlr_raise/jump
is fine.
The autogenerated header files have been moved about, and an extra
include dir has been added, which means you can give a custom
BUILD=newbuilddir option to make, and everything "just works".
Also tidied up the way the different Makefiles build their include-
directory flags.
That was easy - just avoid erroring out on seeing candidate dir for namespace
package. That's far from being complete though - namespace packages should
support importing portions of package from different sys.path entries, here
we require first matching entry to contain all namespace package's portions.
And yet, that's a way to put parts of the same Python package into multiple
installable package - something we really need for *Micro*Python.
The logic appears to be that (at least the beginning of) sys.version is the
version of the reference Python language implemented, not the version of the
particular implementation.
Also, set the version to 3.4.0, based on @dpgeorge's preference.
Attempt to address issue #386. unique_code_id's have been removed and
replaced with a pointer to the "raw code" information. This pointer is
stored in the actual byte code (aligned, so the GC can trace it), so
that raw code (ie byte code, native code and inline assembler) is kept
only for as long as it is needed. In memory it's now like a tree: the
outer module's byte code points directly to its children's raw code. So
when the outer code gets freed, if there are no remaining functions that
need the raw code, then the children's code gets freed as well.
This is pretty much like CPython does it, except that CPython stores
indexes in the byte code rather than machine pointers. These indices
index the per-function constant table in order to find the relevant
code.
Improved the Thumb assembler back end. Added many more Thumb
instructions to the inline assembler. Improved parsing of assembler
instructions and arguments. Assembler functions can now be passed the
address of any object that supports the buffer protocol (to get the
address of the buffer). Added an example of how to sum numbers from
an array in assembler.
This is necessary to catch all cases where locals are referenced before
assignment. We still keep the _0, _1, _2 versions of LOAD_FAST to help
reduce the byte code size in RAM.
Addresses issue #457.
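The case being caught looks like this in Python. The sketch catches NameError
because CPython raises its subclass UnboundLocalError here; which exact
exception type uPy raises is not stated above, so treat that as an assumption.

```python
def read_before_assign():
    try:
        result = value + 1   # `value` is local, but not yet assigned
    except NameError:        # UnboundLocalError is a subclass of NameError
        return "caught"
    value = 1                # this assignment makes `value` a local
    return result

print(read_before_assign())
```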
I'm pretty sure these are never reached, since NOT_EQUAL is always
converted into EQUAL in mp_binary_op. No one should call
type.binary_op directly, they should always go through mp_binary_op
(or mp_obj_is_equal).
Per https://docs.python.org/3.3/reference/import.html , this is the way to
tell module from package: "Specifically, any module that contains a __path__
attribute is considered a package." And it for sure will be needed to
implement relative imports.
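The rule quoted above is easy to check from Python itself. A small sketch
using CPython's standard library (json is a package, json.decoder is a plain
module inside it); is_package is a made-up helper name.

```python
import json
import json.decoder

def is_package(module):
    """Per the import reference: a module with __path__ is a package."""
    return hasattr(module, "__path__")

print(is_package(json), is_package(json.decoder))
```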
This simplifies the compiler a little, since now it can do 1 pass over
a function declaration, to determine default arguments. I would have
done this originally, but CPython 3.3 somehow had the default keyword
args compiled before the default position args (even though they appear
in the other order in the text of the script), and I thought it was
important to have the same order of execution when evaluating default
arguments. CPython 3.4 has changed the order to the more obvious one,
so we can also change.
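The evaluation order in question can be observed directly. The sketch below
records the order in which the default expressions run at function definition
time; under CPython 3.4 or later the positional default is evaluated first.
The helper names are illustrative only.

```python
order = []

def note(tag):
    """Record that a default expression was evaluated, then return it."""
    order.append(tag)
    return tag

# Defaults are evaluated once, when the def statement executes.
def f(a=note("positional"), *, b=note("keyword-only")):
    return a, b

print(order)
```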
It has (again) a fast path for ints, and a simplified "slow" path for
everything else.
Also simplify the way str indexing is done (now matches tuple and list).
A specific target can define either MP_ENDIANNESS_LITTLE or MP_ENDIANNESS_BIG
to 1. Default is MP_ENDIANNESS_LITTLE.
TODO: Autodetect based on compiler predefined macros?
Working towards trying to support compile-time constants (see discussion
in issue #227), this patch allows the compiler to look inside arbitrary
uPy objects at compile time. The objects to search are given by the
macro MICROPY_EXTRA_CONSTANTS (so they must be constant/ROM objects),
and the constant folding occurs on forms base.attr (both base and attr
must be id's).
It works, but it breaks strict CPython compatibility, since the lookup
will succeed even without importing the namespace.
Previously, a failed malloc/realloc would throw an exception, which was
not caught. I think it's better to keep the parser free from NLR
(exception throwing), hence this patch.
Only calcsize() and unpack() functions are provided so far, for little-endian
byte order. Format strings don't support a repetition spec (like "2b3i").
Unfortunately, dealing with all the various binary type sizes and alignments
will lead to quite bloated "binary" helper functions - if optimizing for
speed. Need to think about whether dynamic parametrized algos make more sense.
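As a reference point, this is what the two functions do in CPython's struct
module for a little-endian format without repetition counts. Whether the uPy
version accepts exactly the same format characters is an assumption.

```python
import struct

fmt = "<bhi"  # little-endian: int8, int16, int32, no padding
size = struct.calcsize(fmt)
fields = struct.unpack(fmt, b"\x01\x02\x00\x03\x00\x00\x00")

print(size, fields)
```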
With the implementation of proper string formatting, code to print a
small int was delegated to mpz_as_str_inpl (after first converting the
small int to an mpz using stack memory). But mpz_as_str_inpl allocates
heap memory to do the conversion, so small ints needed heap memory just
to be printed.
This fix has a separate function to print small ints, which does not
allocate heap, and allocates less stack.
String formatting, printf and pfenv are now large beasts, with some
semi-duplicated code.
These two are apparently the most concise and efficient way to convert
int to/from bytes in Python. The alternatives are the struct and array modules,
but methods using them are more verbose in Python code and less efficient
in memory/cycles.
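A short comparison, assuming "these two" refers to int.from_bytes() and
int.to_bytes() (the message itself does not name them). The struct variant
needs a format string and tuple unpacking to do the same job.

```python
import struct

n = 1000

# Direct conversion on the int type.
raw = n.to_bytes(2, "little")
back = int.from_bytes(raw, "little")

# The same round trip through struct.
raw_struct = struct.pack("<H", n)
(back_struct,) = struct.unpack("<H", raw_struct)

print(raw, back, raw_struct, back_struct)
```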
Full CPython compatibility with this requires actually parsing the
input so far collected, and if it fails parsing due to lack of tokens,
then continue collecting input. It's not worth doing it this way. Not
having compatibility at this level does not hurt the goals of Micro
Python.
stmhal relies on pfenv_* to implement its printf. Thus, it needs a
pfenv_print_int which prints a proper 32-bit integer. With latest
change to pfenv, this function became one that took mp_obj_t, and
extracted the integer value from that object.
To fix this temporarily, pfenv_print_int has been renamed to
pfenv_print_mp_int (to indicate it takes an mp_obj_t for the int), and
pfenv_print_int has been added (which takes a normal C int). Currently,
pfenv_print_int proxies to pfenv_print_mp_int, but this means it loses
the MSB. Need to find a way to fix this, but the only way I can think
of will duplicate lots of code.
Two things: 1) set flags in the copy properly; 2) make mp_map_init() not be too
smart and do something with the requested alloc size. The policy of using prime
numbers for alloc size is a high-level policy which should be applied at
corresponding high levels. Low-level functions should just do what they're
asked to, because they don't have enough context to be smarter than that.
For example, munging the alloc size of course breaks dict copying (as
changing sizes requires rehashing).
Based on the discussion in #433. mp_load_attr() is a critical-path function,
so any extra check will slow down any script. As support for a default value
is required only for the getattr() builtin, move the corresponding implementation
there (still as a separate function due to concerns of maintainability
of such almost-duplicated code instances).
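For context, this is the Python-level behaviour that needs the default value:
only the three-argument form of getattr() suppresses the AttributeError, while
ordinary attribute loads never need the extra check. The class and attribute
names are made up for the example.

```python
class Config:
    baud = 9600

cfg = Config()

present = getattr(cfg, "baud", 115200)  # attribute exists: default ignored
missing = getattr(cfg, "parity", None)  # attribute absent: default returned

def raises_without_default():
    try:
        getattr(cfg, "parity")          # two-argument form still raises
    except AttributeError:
        return True
    return False

print(present, missing, raises_without_default())
```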
This is to reduce ROM usage. stream_p is used in file and socket types
only (at the moment), so seems a good idea to make the protocol
functions a pointer instead of the actual structure.
It saves 308 bytes of ROM in the stmhal/ port, 928 in unix/.
Finishes addressing issue #424.
In the end this was a very neat refactor that now makes things a lot
more consistent across the py code base. It allowed some
simplifications in certain places, now that everything is a dict object.
Also converted builtins tables to dictionaries. This will be useful
when we need to turn builtins into a proper module.
When searching next time, such an entry should just be skipped, not terminate
the search. It's known that the marking technique is not efficient in the presence
of many removes, but namespace usage should not require many deletes, and
as for user dictionaries - well, an open addressing map table with linear
rehashing and load factor of ~1 is not particularly efficient at all ;-).
TODO: May consider "shift other entries in cluster" approach as an
alternative.
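The marking approach can be sketched as a small open-addressing table with
linear probing. This is a simplified model with a fixed size and no resizing;
the class and sentinel names are invented and do not mirror the map
implementation in py/.

```python
EMPTY = object()    # slot never used: terminates a search
DELETED = object()  # slot was removed: skipped, search continues

class ProbeTable:
    def __init__(self, size=8):
        self.slots = [EMPTY] * size

    def _probe(self, key):
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def insert(self, key, value):
        first_free = None
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                if first_free is None:
                    first_free = i
                break
            if slot is DELETED:
                if first_free is None:
                    first_free = i  # remember it, but keep looking for key
                continue
            if slot[0] == key:
                self.slots[i] = (key, value)
                return
        if first_free is None:
            raise MemoryError("table full")
        self.slots[first_free] = (key, value)

    def lookup(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                return None
            if slot is not DELETED and slot[0] == key:
                return slot[1]
        return None

    def remove(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                return False
            if slot is not DELETED and slot[0] == key:
                self.slots[i] = DELETED  # mark, do not empty
                return True
        return False

table = ProbeTable(size=8)
for key, value in [(1, "a"), (9, "b"), (17, "c")]:  # all collide at slot 1
    table.insert(key, value)
table.remove(9)
found_after_remove = table.lookup(17)
print(found_after_remove)
```

If remove() reset the slot to EMPTY instead, the lookup of 17 would stop at
the hole left by 9 and wrongly report the key as missing.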