This commit removes all parts of code associated with the existing
MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the
-mcache-lookup-bc option to mpy-cross.
This feature originally provided a significant performance boost for Unix,
but wasn't able to be enabled for MCU targets (due to frozen bytecode), and
added significant extra complexity to generating and distributing .mpy
files.
The equivalent performance gain is now provided by the combination of
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has
been enabled on the unix port in the previous commit).
It's hard to provide precise performance numbers, but tests have been run
on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V,
xtensa) and they all generally agree on the qualitative improvements seen
by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and
MICROPY_OPT_MAP_LOOKUP_CACHE.
For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the
change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined
with MAP_LOOKUP_CACHE is:
diff of scores (higher is better)
N=2000 M=2000 bccache -> attrmapcache diff diff% (error%)
bm_chaos.py 13742.56 -> 13905.67 : +163.11 = +1.187% (+/-3.75%)
bm_fannkuch.py 60.13 -> 61.34 : +1.21 = +2.012% (+/-2.11%)
bm_fft.py 113083.20 -> 114793.68 : +1710.48 = +1.513% (+/-1.57%)
bm_float.py 256552.80 -> 243908.29 : -12644.51 = -4.929% (+/-1.90%)
bm_hexiom.py 521.93 -> 625.41 : +103.48 = +19.826% (+/-0.40%)
bm_nqueens.py 197544.25 -> 217713.12 : +20168.87 = +10.210% (+/-3.01%)
bm_pidigits.py 8072.98 -> 8198.75 : +125.77 = +1.558% (+/-3.22%)
misc_aes.py 17283.45 -> 16480.52 : -802.93 = -4.646% (+/-0.82%)
misc_mandel.py 99083.99 -> 128939.84 : +29855.85 = +30.132% (+/-5.88%)
misc_pystone.py 83860.10 -> 82592.56 : -1267.54 = -1.511% (+/-2.27%)
misc_raytrace.py 21490.40 -> 22227.23 : +736.83 = +3.429% (+/-1.88%)
This shows that the new optimisations are at least as good as the existing
inline-bytecode-caching, and are sometimes much better (because the new
ones apply caching to a wider variety of map lookups).
The new optimisations can also benefit code generated by the native
emitter, because they apply to the runtime rather than the generated code.
The improvement for the native emitter when LOAD_ATTR_FAST_PATH and
MAP_LOOKUP_CACHE are enabled is (same Linux environment as above):
diff of scores (higher is better)
N=2000 M=2000 native -> nat-attrmapcache diff diff% (error%)
bm_chaos.py 14130.62 -> 15464.68 : +1334.06 = +9.441% (+/-7.11%)
bm_fannkuch.py 74.96 -> 76.16 : +1.20 = +1.601% (+/-1.80%)
bm_fft.py 166682.99 -> 168221.86 : +1538.87 = +0.923% (+/-4.20%)
bm_float.py 233415.23 -> 265524.90 : +32109.67 = +13.756% (+/-2.57%)
bm_hexiom.py 628.59 -> 734.17 : +105.58 = +16.796% (+/-1.39%)
bm_nqueens.py 225418.44 -> 232926.45 : +7508.01 = +3.331% (+/-3.10%)
bm_pidigits.py 6322.00 -> 6379.52 : +57.52 = +0.910% (+/-5.62%)
misc_aes.py 20670.10 -> 27223.18 : +6553.08 = +31.703% (+/-1.56%)
misc_mandel.py 138221.11 -> 152014.01 : +13792.90 = +9.979% (+/-2.46%)
misc_pystone.py 85032.14 -> 105681.44 : +20649.30 = +24.284% (+/-2.25%)
misc_raytrace.py 19800.01 -> 23350.73 : +3550.72 = +17.933% (+/-2.79%)
In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options:
- are simpler;
- take less code size;
- are faster (generally);
- work with code generated by the native emitter;
- can be used on embedded targets with a small and constant RAM overhead;
- allow the same .mpy bytecode to run on all targets.
See #7680 for further discussion. And see also #7653 for a discussion
about simplifying mpy-cross options.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
The existing inline bytecode caching optimisation, selected by
MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, reserves an extra byte in the
bytecode after certain opcodes, which at runtime stores a map index of the
likely location of this field when looking up the qstr. This scheme is
incompatible with bytecode-in-ROM, and doesn't work with native generated
code. It also stores bytecode in .mpy files which is of a different format
to when the feature is disabled, making generation of .mpy files more
complex.
This commit provides an alternative optimisation via an approach that adds
a global cache for map offsets, then all mp_map_lookup operations use it.
It's less precise than bytecode caching, but allows the cache to be
independent and external to the bytecode that is executing. It also works
for the native emitter and adds a similar performance boost on top of the
gain already provided by the native emitter.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
When the LOAD_ATTR opcode is executed there are quite a few different cases
that have to be handled, but the common case is accessing a member on an
instance type. Typically, built-in types provide methods which is why this
is common.
Fortunately, for this specific case, if the member is found in the member
map then there's no further processing.
This optimisation does a relatively cheap check (type is instance) and then
forwards directly to the member map lookup, falling back to the regular
path if necessary.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This is the beginning of a set of changes to simplify enabling/disabling
features. The goals are:
- Remove redundancy from mpconfigport.h (never set a value to the default
-- make it clear exactly what's being enabled).
- Improve consistency between ports. All "similar" ports (i.e. approx same
flash size) should get the same features.
- Simplify mpconfigport.h -- just get default/sensible options for the size
of the port.
- Make it easy for defining constrained boards (e.g. STM32F0/L0), they can
just set a lower level.
This commit makes a step towards this and defines the "core" level as the
current default feature set, and a "minimal" level to turn off everything.
And a few placeholder levels are added for where the other ports will
roughly land.
This is a no-op change for all ports.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This commit simplifies and optimises the parse tree in-memory
representation of lists of expressions, for tuples and lists, and when
tuples are used on the left-hand-side of assignments and within del
statements. This reduces memory usage of the parse tree when such code is
compiled, and also reduces the size of the compiler.
For example, (1,) was previously the following parse tree:
expr_stmt(5) (n=2)
atom_paren(45) (n=1)
testlist_comp(146) (n=2)
int(1)
testlist_comp_3b(149) (n=1)
NULL
NULL
and with this commit is now:
expr_stmt(5) (n=2)
atom_paren(45) (n=1)
testlist_comp(146) (n=1)
int(1)
NULL
Similarly, (1, 2, 3) was previously:
expr_stmt(5) (n=2)
atom_paren(45) (n=1)
testlist_comp(146) (n=2)
int(1)
testlist_comp_3c(150) (n=2)
int(2)
int(3)
NULL
and is now:
expr_stmt(5) (n=2)
atom_paren(45) (n=1)
testlist_comp(146) (n=3)
int(1)
int(2)
int(3)
NULL
Signed-off-by: Damien George <damien@micropython.org>
The boot counter is a uint8_t single-byte counter stored in the first NVM byte position (`micrcontroller.nvm[0]`). The counter increments by 1 each time the board boots, regardless if it's a hard or soft reset.
Enable the boot counter by adding `#define CIRCUITPY_BOOT_COUNTER 1` to your board's mpconfigboard.h file. Note that an error will be thrown during the build if `CIRCUITPY_INTERNAL_NVM_SIZE` is not also set within mpconfigboard.h.
This commit refactors machine.PWM and creates extmod/machine_pwm.c. The
esp8266, esp32 and rp2 ports all use this and provide implementations of
the required PWM functionality. This helps to reduce code duplication and
keep the same Python API across ports.
This commit does not make any functional changes.
Signed-off-by: Damien George <damien@micropython.org>
The zephyr port doesn't support SoftI2C so it's not enabled, and the legacy
I2C constructor check can be removed.
Signed-off-by: Damien George <damien@micropython.org>
* Reduce the number of supported HID reports of IDs per descriptor.
This saves ~200 bytes in the default HID objects.
* (Not enabled) Compute QSTR attrs on init. This trades 1k RAM for
flash. Flash is the default (1).
Messages like 'command-line option is [not valid] for C++' can result
from the way the preprocessor is invoked by `genlast`. Instead, cause
the files to be preprocessed as though their content is "C". This
should generally be OK, as they'll eventually be _compiled_ as C++.
When preprocessed as C, the file simply needs to generate all the same
QSTRS and TRANSLATEs.
When you wish to use the valgrind memory analysis tool on micropython,
you can arrange to define MICROPY_DEBUG_VALGRIND to enable use of
special valgrind macros. For now, this only fixes `gc_get_ptr`
so that it never emits the diagnostic "Conditional jump or move depends
on uninitialised value(s)".
Signed-off-by: Jeff Epler <jepler@gmail.com>
Convert bitbangio, bitmaptools, _bleio, board, busio, countio, digitalio, framebufferio, frequencyio, gamepadshift, getpass, keypad, math, microcontroller, and msgpack modules to use MP_REGISTER_MODULE.
Related to #5183.
Convert adafruit_bus_device, adafruit_pixelbuf, analogio, atexit, audiobusio, audiocore, audioio, audiomixer, and audiomp3 modules to use MP_REGISTER_MODULE.
Related to #5183.
Convert from using MICROPY_PORT_BUILTIN_MODULES_STRONG_LINKS to using MP_REGISTER_MODULE for displayio, terminalio, and fontio modules.
Related to #5183.
This is a generic API for synchronously bit-banging data on a pin.
Initially this adds a single supported encoding, which supports controlling
WS2812 LEDs.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This was missed in 692d36d779. It's not
strictly necessary as the GC will clean it anyway, but it's good to
pre-emptively gc_free() all the blocks used in lexing/parsing.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This also tweaks the repr for unicode strings to only escape a few
utf-8 code points. This makes emoji show in os.listdir() for
example.
Also, enable exfat support on full builds.
Fixes#5146
This implements (most of) the PEP-498 spec for f-strings and is based on
https://github.com/micropython/micropython/pull/4998 by @klardotsh.
It is implemented in the lexer as a syntax translation to `str.format`:
f"{a}" --> "{}".format(a)
It also supports:
f"{a=}" --> "a={}".format(a)
This is done by extracting the arguments into a temporary vstr buffer,
then after the string has been tokenized, the lexer input queue is saved
and the contents of the temporary vstr buffer are injected into the lexer
instead.
There are four main limitations:
- raw f-strings (`fr` or `rf` prefixes) are not supported and will raise
`SyntaxError: raw f-strings are not supported`.
- literal concatenation of f-strings with adjacent strings will fail
"{}" f"{a}" --> "{}{}".format(a) (str.format will incorrectly use
the braces from the non-f-string)
f"{a}" f"{a}" --> "{}".format(a) "{}".format(a) (cannot concatenate)
- PEP-498 requires the full parser to understand the interpolated
argument, however because this entirely runs in the lexer it cannot
resolve nested braces in expressions like
f"{'}'}"
- The !r, !s, and !a conversions are not supported.
Includes tests and cpydiffs.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This leaves much more space on SAMD21 builds that aren't "full builds".
These are new APIs that we don't need to add to old boards.
Also, tweak two Arduino boards to save space on them.
* The new nonstandard '%S' format takes a pointer to compressed_string_t
and prints it
* The new mp_cprintf and mp_vcprintf take a format string that is a
compressed_string_t
By storing "count of words by length", the long `wends` table can be
replaced with a short `wlencount` table. This saves flash storage space.
Extend the range of string lengths that can be in the dictionary.
Originally it was to 2 to 9; at one point it was changed to 3 to 9.
Putting the lower bound back at 2 has a positive impact on the French
translation (a bunch of them, such as "ch", "\r\n", "%q", are used).
Increasing the maximum length gets 'mpossible', ' doit être ',
and 'CircuitPyth' at the long end. This adds a bit of processing time
to makeqstrdata. The specific 2/11 values are again empirical based on
the French translation on the adafruit_proxlight_trinkey_m0.
Commit 4173950658 removed automatic building
of mpy-cross, which rebuilt it whenever any of its dependent source files
changed.
But needing to build mpy-cross, and not knowing how, is a frequent issue.
This commit aims to help by automatically building mpy-cross only if it
doesn't exist. For Makefiles it uses an order-only prerequisite, while
for CMake it uses a custom command.
If MICROPY_MPYCROSS (which is what makemanifest.py uses to locate the
mpy-cross executable) is defined in the environment then automatic build
will not be attempted, allowing a way to prevent this auto-build if needed.
Thanks to Trammell Hudson aka @osresearch for the original idea; see #5760.
Signed-off-by: Damien George <damien@micropython.org>
Optionally enabled via MICROPY_PY_UJSON_SEPARATORS. Enabled by default.
For dump, make sure mp_get_stream_raise is called after
mod_ujson_separators since CPython does it in this order (if both
separators and stream are invalid, separators will raise an exception
first).
Add separators argument in the docs as well.
Signed-off-by: Peter Züger <zueger.peter@icloud.com>
Signed-off-by: Damien George <damien@micropython.org>
Commit e33bc597 ("py: Remove calls to file reader functions when these
are disabled.") changed the condition for one caller of
do_execute_raw_code() from
MICROPY_PERSISTENT_CODE_LOAD
to
MICROPY_HAS_FILE_READER && MICROPY_PERSISTENT_CODE_LOAD
The condition that enables compiling the function itself needs to be
changed to match.
Signed-off-by: David Lechner <david@pybricks.com>
Previously a subclass of a type that didn't implement unary_op, or didn't
handle MP_UNARY_OP_BOOL, would raise TypeError on bool conversion.
Fixes#5677.
This adds #if MICROPY_PY_USELECT_SELECT around the uselect.select()
function. According to the docs, this function is only for CPython
compatibility and should not normally be used. So we can disable it
and save a few bytes of flash space where possible.
Signed-off-by: David Lechner <david@pybricks.com>
The MP_OBJ_STOP_ITERATION optimisation is a shortcut for creating a
StopIteration() exception object, and means that heap memory does not need
to be allocated for the exception (in cases where it can be used). This
commit allows this optimised object to take an optional argument (before,
it could only have no argument).
The commit also adds some new tests to cover corner cases with
StopIteration and generators that previously did not work.
Signed-off-by: Damien George <damien@micropython.org>
I was puzzled by why the dictionary words were sorted by length.
It was because TextSplitter sorted its parameter, instead of a copy.
This doesn't affect encoding size, but does affect the encoding NUMBER
of the found words. We'll deliberately restore sorting by length next,
for other reasons, but not by spooky action.
Try to accurately measure the costs of including a word in the dictionary
vs the gains from using it in messages.
This saves about 160 bytes on trinket_m0 ja, the fullest translation
for that board. Other translations on the same board all have savings,
ranging from 24 to 228 bytes.
```
Translation Before After Savings
ja 1164 1324 160
de_DE 1260 1396 136
fr 1424 1652 228
zh_Latn_pinyin 1448 1520 72
pt_BR 1584 1736 152
pl 1592 1640 48
es 1724 1816 92
ko 1724 1816 92
fil 1764 1800 36
it_IT 1896 2040 144
nl 1956 2136 180
ID 2072 2180 108
cs 2124 2148 24
sv 2340 2448 108
en_x_pirate 2644 2740 96
en_GB 2652 2752 100
el 2656 2768 112
en_US 2656 2768 112
hi 2656 2768 112
```
By comparing the address of the initial 'name' field instead of the
addresses of the objects themselves, a small amount of type safety is
added back, vs just casting to void.
In the event that some other kind of object is passed in as 't',
which happens to have a 'name' field of the right type, the construct
would be (undesirably) accepted but it would almost certainly evaluate
to false at runtime.
This adds the --tags argument to the git describe command that is used
to define the MICROPY_GIT_TAG macro. This makes it match non-annotated
tags. This is useful for MicroPython derivatives that don't use
annotated tags.
Signed-off-by: David Lechner <david@pybricks.com>
Only include .c and .cpp files explicitly in the list of files passed to
the preprocessor for QSTR extraction. All relevant .h files will be
included in this process by "#include" from the .c(pp) files. In
particular for moduledefs.h, this is included by py/objmodule.c (and
doesn't actually contain any extractable MP_QSTR_xxx, but rather defines
macros with MP_QSTR_xxx's in them which are then part of py/objmodule.c).
The main reason for this change is to simplify the preprocessing step on
the javascript port, which tries to compile .h files as C++ precompiled
headers if they are passed with -E to clang.
Signed-off-by: Damien George <damien@micropython.org>