py: Implement partial PEP-498 (f-string) support
This implements (most of) the PEP-498 spec for f-strings, with two
exceptions:
- raw f-strings (`fr` or `rf` prefixes) raise `NotImplementedError`
- one special corner case does not function as specified in the PEP
(more on that in a moment)
This is implemented in the core as a syntax translation, brute-forcing
all f-strings to run through `str.format`. For example, the statement
`x='world'; print(f'hello {x}')` gets translated *at a syntax level*
(injected into the lexer) to `x='world'; print('hello {}'.format(x))`.
While this may lead to weird column results in tracebacks, it seemed
like the fastest, most efficient, and *likely* most RAM-friendly option,
despite being implemented under the hood with a completely separate
`vstr_t`.
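
As a rough illustration of the translation described above, the sketch below rewrites an f-string body into a `str.format` template plus a list of argument expressions. `translate_fstring` is a hypothetical Python helper written only for this example; the actual change lives in the C lexer and is not reproduced here. The sketch assumes no nested braces, no conversions such as `!r`, and no colons inside expressions.

```python
def translate_fstring(body):
    """Split an f-string body into a str.format template and argument expressions.

    Hypothetical helper for illustration only (not the lexer code).
    """
    template = []
    args = []
    i = 0
    while i < len(body):
        if body.startswith('{{', i) or body.startswith('}}', i):
            # Escaped brace: keep it doubled so str.format un-escapes it later.
            template.append(body[i:i + 2])
            i += 2
        elif body[i] == '{':
            # Replacement field: move the expression out, keep any format spec.
            end = body.index('}', i)
            expr, sep, spec = body[i + 1:end].partition(':')
            args.append(expr)
            template.append('{' + sep + spec + '}')
            i = end + 1
        else:
            template.append(body[i])
            i += 1
    return ''.join(template), args


x = 'world'
template, args = translate_fstring('hello {x}')
assert (template, args) == ('hello {}', ['x'])
assert template.format(x) == f'hello {x}'
```

In this sketch a stray `}` is passed through untouched, so it is `str.format` that rejects it at run time with `ValueError`.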
Since [string concatenation of adjacent literals is implemented in the
lexer](https://github.com/micropython/micropython/commit/534b7c368dc2af7720f3aaed0c936ef46d773957),
two side effects emerge:
- All strings with at least one f-string portion are concatenated into a
single literal which *must* be run through `str.format()` wholesale,
and:
- Concatenating a plain (non-f) string that contains interpolation
characters with an f-string raises `IndexError`/`KeyError`, which
differs both from CPython *and* from the behaviour the PEP specifies for
this corner case. The PEP gives the following example:
```python
x = 10
y = 'hi'
assert ('a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e') == 'ab10{c}str< hi >de'
```
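
The failure mode can be reproduced in CPython by applying `str.format` by hand to the literal that lexer-level concatenation would produce. This is a sketch under that assumption, not output captured from MicroPython; the `merged` template is written out manually.

```python
x = 10
y = 'hi'

# The adjacent literals after merging and translation: the f-string fields
# become positional '{}' fields, but the plain '{c}' literal now looks
# exactly like a named replacement field.
merged = 'ab{}{c}str<{:^4}>de'

try:
    merged.format(x, y)
except KeyError as exc:
    error = exc  # str.format looks for a keyword argument named 'c'
else:
    error = None

# A plain literal containing '{}' would instead consume a positional
# argument that was never supplied.
try:
    '{}{}'.format(x)
except IndexError as exc:
    index_error = exc
else:
    index_error = None

assert isinstance(error, KeyError)
assert isinstance(index_error, IndexError)
```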
The above-linked commit detailed a pretty solid case for leaving string
concatenation in the lexer rather than putting it in the parser, and
undoing that decision would likely be disproportionately costly on
resources for the sake of a probably-low-impact corner case. An
alternative to become compliant with this corner case of the PEP would
be to revert to string concatenation in the parser *only when an
f-string is part of concatenation*, though I've done no investigation on
the difficulty or costs of doing this.
A decent set of tests is included. I've manually tested this on the
`unix` port on Linux and on a Feather M4 Express (`atmel-samd`) and
things seem sane.
2019-08-10 21:27:20 -07:00
# Tests against https://www.python.org/dev/peps/pep-0498/

assert f'no interpolation' == 'no interpolation'
assert f"no interpolation" == 'no interpolation'

# Quoth the PEP:
# Backslashes may not appear anywhere within expressions. Comments, using the
# '#' character, are not allowed inside an expression
#
# CPython (3.7.4 on Linux) raises a SyntaxError here:
# >>> f'{#}'
# File "<stdin>", line 1
# SyntaxError: f-string expression part cannot include '#'
# >>> f'{\}'
# File "<stdin>", line 1
# SyntaxError: f-string expression part cannot include a backslash
# >>> f'{\\}'
# File "<stdin>", line 1
# SyntaxError: f-string expression part cannot include a backslash
# >>> f'{\#}'
# File "<stdin>", line 1
# SyntaxError: f-string expression part cannot include a backslash

# Backslashes and comments allowed outside expression
assert f"\\" == "\\"
assert f'#' == '#'

## But not inside
try:
    eval("f'{\}'")
except SyntaxError:
    pass
else:
    raise AssertionError('f-string with backslash in expression did not raise SyntaxError')

try:
    eval("f'{#}'")
except SyntaxError:
    pass
else:
    raise AssertionError('f-string with \'#\' in expression did not raise SyntaxError')

# Quoth the PEP:
# While scanning the string for expressions, any doubled braces '{{' or '}}'
# inside literal portions of an f-string are replaced by the corresponding
# single brace. Doubled literal opening braces do not signify the start of an
# expression. A single closing curly brace '}' in the literal portion of a
# string is an error: literal closing curly braces must be doubled '}}' in
# order to represent a single closing brace.
#
# CPython (3.7.4 on Linux) raises a SyntaxError for the last case:
# >>> f'{{}'
# File "<stdin>", line 1
# SyntaxError: f-string: single '}' is not allowed

assert f'{{}}' == '{}'

try:
    eval("f'{{}'")
except ValueError:
    pass
else:
    raise RuntimeError('Expected ValueError for invalid f-string literal bracing')

x = 1
assert f'{x}' == '1'

# Quoth the PEP:
# The expressions that are extracted from the string are evaluated in the
# context where the f-string appeared. This means the expression has full
# access to local and global variables. Any valid Python expression can be
# used, including function and method calls. Because the f-strings are
# evaluated where the string appears in the source code, there is no additional
# expressiveness available with f-strings. There are also no additional
# security concerns: you could have also just written the same expression, not
# inside of an f-string:

def foo():
    return 20

assert f'result={foo()}' == 'result=20'
assert f'result={foo()}' == 'result={}'.format(foo())
assert f'result={foo()}' == 'result={result}'.format(result=foo())


# Quoth the PEP:
# Adjacent f-strings and regular strings are concatenated. Regular strings are
# concatenated at compile time, and f-strings are concatenated at run time. For
# example, the expression:
#
# >>> x = 10
# >>> y = 'hi'
# >>> 'a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e'
#
# yields the value: 'ab10{c}str< hi >de'
#
# Because strings are concatenated at lexer time rather than parser time in
# MicroPython for mostly RAM efficiency reasons (see
# https://github.com/micropython/micropython/commit/534b7c368dc2af7720f3aaed0c936ef46d773957),
# and because f-strings here are implemented as a syntax translation
# (f'{something}' => '{}'.format(something)), this particular functionality is unimplemented,
# and in the above example, the '{c}' portion will trigger a KeyError on String.format()

x = 10
y = 'hi'
assert (f'h' f'i') == 'hi'
2020-03-09 08:33:47 -05:00
#assert (f'h' 'i') == 'hi'
#assert ('h' f'i') == 'hi'
assert f'{x:^4}' == ' 10 '
#assert ('a' 'b' f'{x}' f'str<{y:^4}>' 'd' 'e') == 'ab10str< hi >de'

# Other tests
assert f'{{{4*10}}}' == '{40}'