This implements (most of) the PEP-498 spec for f-strings, with two
exceptions:
- raw f-strings (`fr` or `rf` prefixes) raise `NotImplementedError`
- one special corner case does not function as specified in the PEP
(more on that in a moment)
This is implemented in the core as a syntax translation, brute-forcing
all f-strings to run through `String.format`. For example, the statement
`x='world'; print(f'hello {x}')` gets translated *at a syntax level*
(injected into the lexer) to `x='world'; print('hello {}'.format(x))`.
While this may lead to weird column results in tracebacks, it seemed
like the fastest, most efficient, and *likely* most RAM-friendly option,
despite being implemented under the hood with a completely separate
`vstr_t`.
Since [string concatenation of adjacent literals is implemented in the
lexer](534b7c368d),
two side effects emerge:
- All strings with at least one f-string portion are concatenated into a
single literal which *must* be run through `String.format()` wholesale,
and:
- Concatenation of a raw string with interpolation characters with an
f-string will cause `IndexError`/`KeyError`, which is both different
from CPython *and* different from the corner case mentioned in the PEP
(which gave an example of the following:)
```python
x = 10
y = 'hi'
assert ('a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e') == 'ab10{c}str< hi >de'
```
The above-linked commit detailed a pretty solid case for leaving string
concatenation in the lexer rather than putting it in the parser, and
undoing that decision would likely be disproportionately costly on
resources for the sake of a probably-low-impact corner case. An
alternative to become complaint with this corner case of the PEP would
be to revert to string concatenation in the parser *only when an
f-string is part of concatenation*, though I've done no investigation on
the difficulty or costs of doing this.
A decent set of tests is included. I've manually tested this on the
`unix` port on Linux and on a Feather M4 Express (`atmel-samd`) and
things seem sane.
This feature is controlled at compile time by MICROPY_PY_URE_SUB, disabled
by default.
Thanks to @dmazzella for the original patch for this feature; see #3770.
This feature is controlled at compile time by
MICROPY_PY_URE_MATCH_SPAN_START_END, disabled by default.
Thanks to @dmazzella for the original patch for this feature; see #3770.
This feature is controlled at compile time by MICROPY_PY_URE_MATCH_GROUPS,
disabled by default.
Thanks to @dmazzella for the original patch for this feature; see #3770.
Commit 95e70cd0ea 'time: Use 1970 epoch' changed epoch for the time
module, but not for other users. This patch does the same for the only
other core timeutils user: extmod/vfs_fat.c:fat_vfs_stat().
Other timeutils users: cc3200, esp8266 and stm32, are not changed.
Ports that don't use long ints, will still get wrong time values from
os.stat().
This behaviour of a NULL write C method on a stream that uses the write
adaptor objects is no longer supported. It was only ever used by the
coverage build for testing the fail path of mp_get_stream_raise().
For i2c.py: the accelerometer now uses the new I2C driver so need to
explicitly init the legacy i2c object to get the test working.
For pyb1.py: the legacy pyb.hid() call will crash if the USB_HID object is
not initialised.
This patch is a code optimisation, trading text bytes for speed. On
pyboard it's an increase of 0.06% in code size for a gain (in pystone
performance) of roughly 6.5%.
The patch optimises load/store/delete of attributes in user defined classes
by not looking up special accessors (@property, __get__, __delete__,
__set__, __setattr__ and __getattr_) if they are guaranteed not to exist in
the class.
Currently, if you do my_obj.foo() then the runtime has to do a few checks
to see if foo is a property or has __get__, and if so delegate the call.
And for stores things like my_obj.foo = 1 has to first check if foo is a
property or has __set__ defined on it.
Doing all those checks each and every time the attribute is accessed has a
performance penalty. This patch eliminates all those checks for cases when
it's guaranteed that the checks will always fail, ie no attributes are
properties nor have any special accessor methods defined on them.
To make this guarantee it checks all attributes of a user-defined class
when it is first created. If any of the attributes of the user class are
properties or have special accessors, or any of the base classes of the
user class have them, then it sets a flag in the class to indicate that
special accessors must be checked for. Then in the load/store/delete code
it checks this flag to see if it can take the shortcut and optimise the
lookup.
It's an optimisation that's pretty widely applicable because it improves
lookup performance for all methods of user defined classes, and stores of
attributes, at least for those that don't have special accessors. And, it
allows to enable descriptors with minimal additional runtime overhead if
they are not used for a particular user class.
There is one restriction on dynamic class creation that has been introduced
by this patch: a user-defined class cannot go from zero special accessors
to one special accessor (or more) after that class has been subclassed. If
the script attempts this an AttributeError is raised (see addition to
tests/misc/non_compliant.py for an example of this case).
The cost in code space bytes for the optimisation in this patch is:
unix x64: +528
unix nanbox: +508
stm32: +192
cc3200: +200
esp8266: +332
esp32: +244
Performance tests that were done:
- on unix x86-64, pystone improved by about 5%
- on pyboard, pystone improved by about 6.5%, from 1683 up to 1794
- on pyboard, bm_chaos (from CPython benchmark suite) improved by about 5%
- on esp32, pystone improved by about 30% (but there are caching effects)
- on esp32, bm_chaos improved by about 11%
This conditional import was only used to get the tests working on the unix
coverage build, which has now switched to use VFS by default so the uos
module alone has the required functionality.
Printing of uPy floats can differ by the floating-point precision on
different architectures (eg 64-bit vs 32-bit x86), so it's not possible to
using printing of floats in some parts of this test. Instead we can just
check for equivalence with what is known to be the correct answer.
Commit e269cabe3e added a check that the
first argument to the to_bytes() method is an integer, and now uPy
follows CPython behaviour and raises a TypeError for this test.
Note: CPython checks the argument types before checking the number of
arguments, but uPy does it the other way around, so they give different
exception messages for this test, but still the same type, a TypeError.