This patch moves the start of the root pointer section in mp_state_ctx_t
so that it skips entries that are not pointers and don't need scanning.
Previously, the start of the root pointer section was at the very beginning
of the mp_state_ctx_t struct (which is the beginning of mp_state_thread_t).
This was the original assembler version of the NLR code was hard-coded to
have the nlr_top pointer at the start of this state structure. But now
that the NLR code is partially written in C there is no longer this
restriction on the location of nlr_top (and a comment to this effect has
been removed in this patch).
So now the root pointer section starts part way through the
mp_state_thread_t structure, after the entries which are not root pointers.
This patch also moves the non-pointer entries for MICROPY_ENABLE_SCHEDULER
outside the root pointer section.
Moving non-pointer entries out of the root pointer section helps to make
the GC more precise and should help to prevent some cases of collectable
garbage being kept.
This patch also has a measurable improvement in performance of the
pystone.py benchmark: on unix x86-64 and stm32 there was an improvement of
roughly 0.6% (tested with both gcc 7.3 and gcc 8.1).
This patch changes 2 things in the endianness detection:
1. Don't assume that __BYTE_ORDER__ not being __ORDER_LITTLE_ENDIAN__ means
that the machine is big endian, so add an explicit check that this macro
is indeed __ORDER_BIG_ENDIAN__ (same with __BYTE_ORDER, __LITTLE_ENDIAN
and __BIG_ENDIAN). A machine could have PDP endianness.
2. Remove the checks which base their autodetection decision on whether any
little or big endian macros are defined (eg __LITTLE_ENDIAN__ or
__BIG_ENDIAN__). Just because a system defines these does not mean it
has that endianness.
See issue #3760.
For cases where size_t is smaller than mp_int_t (eg nan-boxing builds) the
difference between two size_t's is not sign extended into mp_int_t and so
the result is never negative. This patch fixes this bug by using ssize_t
for the type of the result.
This gives dir() better behaviour when listing the attributes of a user
type that defines __getattr__: it will now not list those attributes for
which __getattr__ raises AttributeError (meaning the attribute is not
supported by the object).
This patch fixes the possibility of a crash of the REPL when tab-completing
an object which raises an exception when its attributes are accessed.
See issue #3729.
This new helper function acts like mp_load_method_maybe but is wrapped in
an NLR handler so it can catch exceptions. It prevents AttributeError from
propagating out, and optionally all other exceptions. This helper can be
used to fully implement hasattr (see follow-up commit), and also for cases
where mp_load_method_maybe is used but it must now raise an exception.
This is a more consistent use of errno codes. For example, it may be that
a stream returns MP_EAGAIN but the mp_is_nonblocking_error() macro doesn't
catch this value because it checks for EAGAIN instead (which may have a
different value than MP_EAGAIN when MICROPY_USE_INTERNAL_ERRNO is enabled).
Most modern systems have EWOULDBLOCK aliased to EAGAIN, ie they have the
same value. But some systems use different values for these errnos and if
a uPy port is using the system errno values (ie not the internal uPy
values) then it's important to be able to distinguish EWOULDBLOCK from
EAGAIN. Eg if a system call returned EWOULDBLOCK it must be possible to
check for this return value, and this patch makes this now possible.
Instead of emitnative.c having configuration code for each supported
architecture, and then compiling this file multiple times with different
macros defined, this patch adds a file per architecture with the necessary
code to configure the native emitter. These files then #include the
emitnative.c file.
This simplifies emitnative.c (which is already very large), and simplifies
the build system because emitnative.c no longer needs special handling for
compilation and qstr extraction.
This patch moves the implementation of stream closure from a dedicated
method to the ioctl of the stream protocol, for each type that implements
closing. The benefits of this are:
1. Rounds out the stream ioctl function, which already includes flush,
seek and poll (among other things).
2. Makes calling mp_stream_close() on an object slightly more efficient
because it now no longer needs to lookup the close method and call it,
rather it just delegates straight to the ioctl function (if it exists).
3. Reduces code size and allows future types that implement the stream
protocol to be smaller because they don't need a dedicated close method.
Code size reduction is around 200 bytes smaller for x86 archs and around
30 bytes smaller for the bare-metal archs.
The LHS passed to mp_obj_int_binary_op() will always be an integer, either
a small int or a big int, so the test for this type doesn't need to include
an "other, unsupported type" case.
Without the compiler enabled the mp_optimise_value is unused, and the
micropython.opt_level() function is not useful, so exclude these from the
build to save RAM and code size.
When pystack is enabled mp_obj_fun_bc_prepare_codestate() will always
return a valid pointer, and if there is no more pystack available then it
will raise an exception (a RuntimeError). So having pystack enabled with
stackless enabled automatically gives strict stackless mode. There is
therefore no need to have code for strict stackless mode when pystack is
enabled, and this patch optimises the VM for such a case.
The VM expects that, if mp_resume() returns MP_VM_RETURN_EXCEPTION, then
the returned value is an exception instance (eg to add a traceback to it).
It's possible that a value passed to a generator's throw() is not an
exception so must be explicitly checked for if the thrown value is not
intercepted by the generator.
Thanks to @jepler for finding the bug.
Prior to this patch the code would crash if a key in a ** dict was anything
other than a str or qstr. This is because mp_setup_code_state() assumes
that keys in kwargs are qstrs (for efficiency).
Thanks to @jepler for finding the bug.
By using pre-compiled regexs, using startswith(), and explicitly checking
for empty lines (of which around 30% of the input lines are), automatic
qstr extraction is speed up by about 10%.
All callers of mp_obj_int_formatted() are expected to pass in a valid int
object, and they do:
- mp_obj_int_print() should always pass through an int object because it is
the print special method for int instances.
- mp_print_mp_int() checks that the argument is an int, and if not converts
it to a small int.
This patch saves around 20-50 bytes of code space.
Prior to this patch, some architectures (eg unix x86) could render floats
with "negative" digits, like ")". For example, '%.23e' % 1e-80 would come
out as "1.0000000000000000/)/(,*0e-80". This patch fixes the known cases.
Prior to this patch, some architectures (eg unix x86) could render floats
with a ":" character in them, eg 1e+39 would come out as ":e+38" (":" is
just after "9" in ASCII so this is like 10e+38). This patch fixes some of
these cases.
Prior to this patch the %f formatting of some FP values could be off by up
to 1, eg '%.0f' % 123 would return "122" (unix x64). Depending on the FP
precision (single vs double) certain numbers would format correctly, but
others wolud not. This patch should fix all cases of rounding for %f.
There's no need to have MP_OBJ_NULL a special case, the code can re-use
the MP_OBJ_STOP_ITERATION value to signal the special case and the VM can
detect this with only one check (for MP_OBJ_STOP_ITERATION).
This patch concerns the handling of an NLR-raised StopIteration, raised
during a call to mp_resume() which is handling the yield from opcode.
Previously, commit 6738c1dded introduced code
to handle this case, along with a test. It seems that it was lucky that
the test worked because the code did not correctly handle the stack pointer
(sp).
Furthermore, commit 79d996a57b improved the
way mp_resume() propagated certain exceptions: it changed raising an NLR
value to returning MP_VM_RETURN_EXCEPTION. This change meant that the
test introduced in gen_yield_from_ducktype.py was no longer hitting the
code introduced in 6738c1dded.
The patch here does two things:
1. Fixes the handling of sp in the VM for the case that yield from is
interrupted by a StopIteration raised via NLR.
2. Introduces a new test to check this handling of sp and re-covers the
code in the VM.
This path for src->deg==NULL is never used because mpz_clone() is always
called with an argument that has a non-zero integer value, and hence has
some digits allocated to it (mpz_clone() is a static function private to
mpz.c all callers of this function first check if the integer value is zero
and if so take a special-case path, bypassing the call to mpz_clone()).
There is some unused and commented-out functions that may actually pass a
zero-valued mpz to mpz_clone(), so some TODOs are added to these function
in case they are needed in the future.
All callers of the asm entry function guarantee that num_locals>=0, so no
need to add an explicit check for it. Use an assertion instead.
Also, the signature of asm_x86_entry is changed to match the other asm
entry functions.
If a port only needs the core files then it can now use the $(PY_CORE_O)
variable instead of $(PY_O). $(PY_EXTMOD_O) contains the list of extmod
files (including some files from lib/). $(PY_O) retains its original
definition as the list of all object file (including those for frozen code)
and is a convenience variable for ports that want everything.
Saves a few bytes of code space, and is more efficient because with
MICROPY_GC_CONSERVATIVE_CLEAR enabled by default all memory is already
cleared when allocated.
Otherwise passing -1 as maxlen will lead to a zero allocation and
subsequent unbound buffer overflow in deque.append() because i_put is
allowed to grow without bound.
So far, implements just append() and popleft() methods, required for
a normal queue. Constructor doesn't accept an arbitarry sequence to
initialize from (am empty deque is always created), so an empty tuple
must be passed as such. Only fixed-size deques are supported, so 2nd
argument (size) is required.
There's also an extension to CPython - if True is passed as 3rd argument,
append(), instead of silently overwriting the oldest item on queue
overflow, will throw IndexError. This behavior is desired in many
cases, where queues should store information reliably, instead of
silently losing some items.