Some examples of improved compliance with CPython that currently
have divergent behavior in CircuitPython are listed below:
* yield from is not allowed in async methods
```
>>> async def f():
... yield from 'abc'
...
Traceback (most recent call last):
File "<stdin>", line 2, in f
SyntaxError: 'yield from' inside async function
```
* await only works on awaitable expressions
```
>>> async def f():
... await 'not awaitable'
...
>>> f().send(None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in f
AttributeError: 'str' object has no attribute '__await__'
```
* only __await__()able expressions are awaitable
Okay this one actually does not work in circuitpython at all today.
This is how CPython works though and pretending __await__ does not
exist will only bite users who write both.
```
>>> class c:
... pass
...
>>> def f(self):
... yield
... yield
... return 'f to pay respects'
...
>>> c.__await__ = f # could just as easily have put it on the class but this shows how it's wired
>>> async def g():
... awaitable_thing = c()
... partial = await awaitable_thing
... return 'press ' + partial
...
>>> q = g()
>>> q.send(None)
>>> q.send(None)
>>> q.send(None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration: press f to pay respects
```
MicroPython's original implementation of __aiter__ was correct for an
earlier (provisional) version of PEP492 (CPython 3.5), where __aiter__ was
an async-def function. But that changed in the final version of PEP492 (in
CPython 3.5.2) where the function was changed to a normal one. See
https://www.python.org/dev/peps/pep-0492/#why-aiter-does-not-return-an-awaitable
See also the note at the end of this subsection in the docs:
https://docs.python.org/3.5/reference/datamodel.html#asynchronous-iterators
And for completeness the BPO: https://bugs.python.org/issue27243
To be consistent with the Python spec as it stands today (and now that
PEP492 is final) this commit changes MicroPython's behaviour to match
CPython: __aiter__ should return an async-iterable object, but is not
itself awaitable.
The relevant tests are updated to match.
See #6267.
coroutines don't have __next__; they also call themselves coroutines.
This does not change the fact that `async def` methods are generators,
but it does make them behave more like CPython.
When adding the ability for boards to turn on the `@micropython.native`, `viper`, and `asm_thumb` decorators it was pointed out that it's somewhat awkward to write libraries and drivers that can take advantage of this since the decorators raise `SyntaxErrors` if they aren't enabled. In the case of `viper` and `asm_thumb` this behavior makes sense as they require writing non-normative code. Drivers could have a normal and viper/thumb implementation and implement them as such:
```python
try:
import _viper_impl as _impl
except SyntaxError:
import _python_impl as _impl
def do_thing():
return _impl.do_thing()
```
For `native`, however, this behavior and the pattern to work around it is less than ideal. Since `native` code should also be valid Python code (although not necessarily the other way around) using the pattern above means *duplicating* the Python implementation and adding `@micropython.native` in the code. This is an unnecessary maintenance burden.
This commit *modifies* the behavior of the `@micropython.native` decorator. On boards with `CIRCUITPY_ENABLE_MPY_NATIVE` turned on it operates as usual. On boards with it turned off it does *nothing*- it doesn't raise a `SyntaxError` and doesn't apply optimizations. This means we can write our drivers/libraries once and take advantage of speedups on boards where they are enabled.
This saves code space in builds which use link-time optimization.
The optimization drops the untranslated strings and replaces them
with a compressed_string_t struct. It can then be decompressed to
a c string.
Builds without LTO work as well but include both untranslated
strings and compressed strings.
This work could be expanded to include QSTRs and loaded strings if
a compress method is added to C. Its tracked in #531.
This patch combines the compiler optimisation code for double and triple
tuple-to-tuple assignment, taking it from two separate if-blocks to one
combined if-block. This can be done because the code for both of these
optimisations has a lot in common. Combining them together reduces code
size for ports that have the triple-tuple optimisation enabled (and doesn't
change code size for ports that have it disabled).
The technique of using alloca is how dotted import names are composed in
mp_import_from and mp_builtin___import__, so use the same technique in the
compiler. This puts less pressure on the heap (only the stack is used if
the qstr already exists, and if it doesn't exist then the standard qstr
block memory is used for the new qstr rather than a separate chunk of the
heap) and reduces overall code size.
Previous to this patch, a label with value "0" was used to indicate an
invalid label, but that meant a wasted word (at slot 0) in the array of
label offsets. This patch adjusts the label indices so the first one
starts at 0, and the maximum value indicates an invalid label.
This patch fixes a bug whereby the Python stack was not correctly reset if
there was a break/continue statement in the else black of an optimised
for-range loop.
For example, in the following code the "j" variable from the inner for loop
was not being popped off the Python stack:
for i in range(4):
for j in range(4):
pass
else:
continue
This is now fixed with this patch.
In CPython 3.4 this raises a SyntaxError. In CPython 3.5+ having a
positional after * is allowed but uPy has the wrong semantics and passes
the arguments in the incorrect order. To prevent incorrect use of a
function going unnoticed it is important to raise the SyntaxError in uPy,
until the behaviour is fixed to follow CPython 3.5+.
This patch allows the following code to run without allocating on the heap:
super().foo(...)
Before this patch such a call would allocate a super object on the heap and
then load the foo method and call it right away. The super object is only
needed to perform the lookup of the method and not needed after that. This
patch makes an optimisation to allocate the super object on the C stack and
discard it right after use.
Changes in code size due to this patch are:
bare-arm: +128
minimal: +232
unix x64: +416
unix nanbox: +364
stmhal: +184
esp8266: +340
cc3200: +128
This patch refactors the handling of the special super() call within the
compiler. It removes the need for a global (to the compiler) state variable
which keeps track of whether the subject of an expression is super. The
handling of super() is now done entirely within one function, which makes
the compiler a bit cleaner and allows to easily add more optimisations to
super calls.
Changes to the code size are:
bare-arm: +12
minimal: +0
unix x64: +48
unix nanbox: -16
stmhal: +4
cc3200: +0
esp8266: -56
With this optimisation enabled the compiler optimises the if-else
expression within a return statement. The optimisation reduces bytecode
size by 2 bytes for each use of such a return-if-else statement. Since
such a statement is not often used, and costs bytes for the code, the
feature is disabled by default.
For example the following code:
def f(x):
return 1 if x else 2
compiles to this bytecode with the optimisation disabled (left column is
bytecode offset in bytes):
00 LOAD_FAST 0
01 POP_JUMP_IF_FALSE 8
04 LOAD_CONST_SMALL_INT 1
05 JUMP 9
08 LOAD_CONST_SMALL_INT 2
09 RETURN_VALUE
and to this bytecode with the optimisation enabled:
00 LOAD_FAST 0
01 POP_JUMP_IF_FALSE 6
04 LOAD_CONST_SMALL_INT 1
05 RETURN_VALUE
06 LOAD_CONST_SMALL_INT 2
07 RETURN_VALUE
So the JUMP to RETURN_VALUE is optimised and replaced by RETURN_VALUE,
saving 2 bytes and making the code a bit faster.
Otherwise the type of parse-node and its kind has to be re-extracted
multiple times. This optimisation reduces code size by a bit (16 bytes on
bare-arm).