For the coverage build this reduces the binary size to about 1/4 of its
size, and seems to help gcov/lcov coverage analysis so that it doesn't miss
lines.
Signed-off-by: Damien George <damien@micropython.org>
This commit removes all parts of code associated with the existing
MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the
-mcache-lookup-bc option to mpy-cross.
This feature originally provided a significant performance boost for Unix,
but wasn't able to be enabled for MCU targets (due to frozen bytecode), and
added significant extra complexity to generating and distributing .mpy
files.
The equivalent performance gain is now provided by the combination of
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has
been enabled on the unix port in the previous commit).
It's hard to provide precise performance numbers, but tests have been run
on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V,
xtensa) and they all generally agree on the qualitative improvements seen
by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and
MICROPY_OPT_MAP_LOOKUP_CACHE.
For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the
change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined
with MAP_LOOKUP_CACHE is:
diff of scores (higher is better)
N=2000 M=2000 bccache -> attrmapcache diff diff% (error%)
bm_chaos.py 13742.56 -> 13905.67 : +163.11 = +1.187% (+/-3.75%)
bm_fannkuch.py 60.13 -> 61.34 : +1.21 = +2.012% (+/-2.11%)
bm_fft.py 113083.20 -> 114793.68 : +1710.48 = +1.513% (+/-1.57%)
bm_float.py 256552.80 -> 243908.29 : -12644.51 = -4.929% (+/-1.90%)
bm_hexiom.py 521.93 -> 625.41 : +103.48 = +19.826% (+/-0.40%)
bm_nqueens.py 197544.25 -> 217713.12 : +20168.87 = +10.210% (+/-3.01%)
bm_pidigits.py 8072.98 -> 8198.75 : +125.77 = +1.558% (+/-3.22%)
misc_aes.py 17283.45 -> 16480.52 : -802.93 = -4.646% (+/-0.82%)
misc_mandel.py 99083.99 -> 128939.84 : +29855.85 = +30.132% (+/-5.88%)
misc_pystone.py 83860.10 -> 82592.56 : -1267.54 = -1.511% (+/-2.27%)
misc_raytrace.py 21490.40 -> 22227.23 : +736.83 = +3.429% (+/-1.88%)
This shows that the new optimisations are at least as good as the existing
inline-bytecode-caching, and are sometimes much better (because the new
ones apply caching to a wider variety of map lookups).
The new optimisations can also benefit code generated by the native
emitter, because they apply to the runtime rather than the generated code.
The improvement for the native emitter when LOAD_ATTR_FAST_PATH and
MAP_LOOKUP_CACHE are enabled is (same Linux environment as above):
diff of scores (higher is better)
N=2000 M=2000 native -> nat-attrmapcache diff diff% (error%)
bm_chaos.py 14130.62 -> 15464.68 : +1334.06 = +9.441% (+/-7.11%)
bm_fannkuch.py 74.96 -> 76.16 : +1.20 = +1.601% (+/-1.80%)
bm_fft.py 166682.99 -> 168221.86 : +1538.87 = +0.923% (+/-4.20%)
bm_float.py 233415.23 -> 265524.90 : +32109.67 = +13.756% (+/-2.57%)
bm_hexiom.py 628.59 -> 734.17 : +105.58 = +16.796% (+/-1.39%)
bm_nqueens.py 225418.44 -> 232926.45 : +7508.01 = +3.331% (+/-3.10%)
bm_pidigits.py 6322.00 -> 6379.52 : +57.52 = +0.910% (+/-5.62%)
misc_aes.py 20670.10 -> 27223.18 : +6553.08 = +31.703% (+/-1.56%)
misc_mandel.py 138221.11 -> 152014.01 : +13792.90 = +9.979% (+/-2.46%)
misc_pystone.py 85032.14 -> 105681.44 : +20649.30 = +24.284% (+/-2.25%)
misc_raytrace.py 19800.01 -> 23350.73 : +3550.72 = +17.933% (+/-2.79%)
In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options:
- are simpler;
- take less code size;
- are faster (generally);
- work with code generated by the native emitter;
- can be used on embedded targets with a small and constant RAM overhead;
- allow the same .mpy bytecode to run on all targets.
See #7680 for further discussion. And see also #7653 for a discussion
about simplifying mpy-cross options.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This compiler is unable to optimise out the giant strcmp match generated
by MP_MATCH_COMPRESSED.
See github.com/micropython/micropython/pull/7659#issuecomment-899479793
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This implements (most of) the PEP-498 spec for f-strings and is based on
https://github.com/micropython/micropython/pull/4998 by @klardotsh.
It is implemented in the lexer as a syntax translation to `str.format`:
f"{a}" --> "{}".format(a)
It also supports:
f"{a=}" --> "a={}".format(a)
This is done by extracting the arguments into a temporary vstr buffer,
then after the string has been tokenized, the lexer input queue is saved
and the contents of the temporary vstr buffer are injected into the lexer
instead.
There are four main limitations:
- raw f-strings (`fr` or `rf` prefixes) are not supported and will raise
`SyntaxError: raw f-strings are not supported`.
- literal concatenation of f-strings with adjacent strings will fail
"{}" f"{a}" --> "{}{}".format(a) (str.format will incorrectly use
the braces from the non-f-string)
f"{a}" f"{a}" --> "{}".format(a) "{}".format(a) (cannot concatenate)
- PEP-498 requires the full parser to understand the interpolated
argument, however because this entirely runs in the lexer it cannot
resolve nested braces in expressions like
f"{'}'}"
- The !r, !s, and !a conversions are not supported.
Includes tests and cpydiffs.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Add an optional 'lock' kwarg to callback that locks GC and scheduler. This
allows the callback to be invoked asynchronously in 'interrupt context',
for example as a signal handler.
Also add the 'cfun' member function to callback, that allows retrieving the
C callback function address. This is needed when the callback should be
set to a struct field.
See related #7373.
Signed-off-by: Amir Gonnen <amirgonnen@gmail.com>
It reschedules the BT HCI poll soft timer so that it is called exactly when
the next timer expires.
Signed-off-by: Damien George <damien@micropython.org>
This introduces a new macro to get the main thread and uses it to ensure
that asynchronous exceptions such as KeyboardInterrupt (CTRL+C) are only
scheduled on the main thread. This is more deterministic than being
scheduled on a random thread and is more in line with CPython that only
allow signal handlers to run on the main thread.
Fixes issue #7026.
Signed-off-by: David Lechner <david@pybricks.com>
This moves mp_pending_exception from mp_state_vm_t to mp_state_thread_t.
This allows exceptions to be scheduled on a specific thread.
Signed-off-by: David Lechner <david@pybricks.com>
This fixes error: cast to smaller integer type 'int' from 'pthread_t'.
pthread_t is defined as long, not as int.
Signed-off-by: Pavol Rusnak <pavol@rusnak.io>
This commit fixes the following problems converting to/from Python integers
and ffi types:
- integers of 8 and 16 bits not working on big endian
- integers of 64 bits not working on 32 bits architectures
- unsigned returns were converted to signed Python integers
Fixes issue #7269.
This fixes a bug where double arguments on a 32-bit architecture would not
be passed correctly because they only had 4 bytes of storage (not 8). It
also fixes a compiler warning/error in return_ffi_value on certian
architectures: array subscript 'double[0]' is partly outside array bounds
of 'ffi_arg[1]' {aka 'long unsigned int[1]'}.
Fixes issue #7064.
Signed-off-by: Damien George <damien@micropython.org>
Doing "import <tab>" will now complete/list built-in modules.
Originally at adafruit#4548 and adafruit#4608
Signed-off-by: Artyom Skrobov <tyomitch@gmail.com>
This fixes `error: variable 'subpkg_tried' might be clobbered by 'longjmp'
or 'vfork' [-Werror=clobbered]` when compiling on ppc64le and aarch64 (and
possibly other architectures/toolchains).
Per CPython everything which comes after the command, module or file
argument is not an option for the interpreter itself. Hence the processing
of options should stop when encountering those, and the remainder be passed
as sys.argv. Note the latter was already the case for a module or file but
not for a command.
This fixes issues like 'micropython myfile.py -h' showing the help and
exiting instead of passing '-h' as sys.argv[1], likewise for
'-X <something>' being treated as a special option no matter where it
occurs on the command line.
It's a bit of a pitfall with user C modules that including them in the
build does not automatically enable them. This commit changes the docs and
examples for user C modules to encourage writers of user C modules to
enable them unconditionally. This makes things simpler and covers most use
cases.
See discussion in issue #6960, and also #7086.
Signed-off-by: Damien George <damien@micropython.org>
The "word" referred to by BYTES_PER_WORD is actually the size of mp_obj_t
which is not always the same as the size of a pointer on the target
architecture. So rename this config value to better reflect what it
measures, and also prefix it with MP_.
For uses of BYTES_PER_WORD in setting the stack limit this has been
changed to sizeof(void *), because the stack usually grows with
machine-word sized values (eg an nlr_buf_t has many machine words in it).
Signed-off-by: Damien George <damien@micropython.org>
It practically does the same as qstr_from_str and was only used in one
place, which should actually use the compile-time MP_QSTR_XXX form for
consistency; qstr_from_str is for runtime strings only.
With MICROPY_FLOAT_IMPL_FLOAT the results of utime.time(), gmtime() and
localtime() change only every 129 seconds. As one consequence
tests/extmod/vfs_lfs_mtime.py will fail on a unix port with LFS support.
With this patch these functions only return floats if
MICROPY_FLOAT_IMPL_DOUBLE is used. Otherwise they return integers.
Two issues are tackled:
1. The calculation of the correct length to print is fixed to treat the
precision as a maximum length instead as the exact length.
This is done for both qstr (%q) and for regular str (%s).
2. Fix the incorrect use of mp_printf("%.*s") to mp_print_strn().
Because of the fix of above issue, some testcases that would print
an embedded null-byte (^@ in test-output) would now fail.
The bug here is that "%s" was used to print null-bytes. Instead,
mp_print_strn is used to make sure all bytes are outputted and the
exact length is respected.
Test-cases are added for both %s and %q with a combination of precision
and padding specifiers.
Also known as L2CAP "connection oriented channels". This provides a
socket-like data transfer mechanism for BLE.
Currently only implemented for NimBLE on STM32 / Unix.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This changes stm32 from using PENDSV to run NimBLE to use the MicroPython
scheduler instead. This allows Python BLE callbacks to be invoked directly
(and therefore synchronously) rather than via the ringbuffer.
The NimBLE UART HCI and event processing now happens in a scheduled task
every 128ms. When RX IRQ idle events arrive, it will also schedule this
task to improve latency.
There is a similar change for the unix port where the background thread now
queues the scheduled task.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This requires that the event handlers are called from non-interrupt context
(i.e. the MicroPython scheduler).
This will allow the BLE stack (e.g. NimBLE) to run from the scheduler
rather than an IRQ like PENDSV, and therefore be able to invoke Python
callbacks directly/synchronously. This allows writing Python BLE handlers
for events that require immediate response such as _IRQ_READ_REQUEST (which
was previous a hard IRQ) and future events relating to pairing/bonding.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Prior to this change machine.mem32['foo'] (or using any other non-integer
subscript) could result in a fault due to 'foo' being interpreted as an
integer. And when writing code it's hard to tell if the fault is due to a
bad subscript type, or an integer subscript that specifies an invalid
memory address.
The type of the object used in the subscript is now tested to be an
integer by using mp_obj_get_int_truncated instead of
mp_obj_int_get_truncated. The performance hit of this change is minimal,
and machine.memX objects are more for convenience than performance (there
are many other ways to read/write memory in a faster way),
Fixes issue #6588.
Add working example code to provide a starting point for users with files
that they can just copy, and include the modules in the coverage test to
verify the complete user C module build functionality. The cexample module
uses the code originally found in cmodules.rst, which has been updated to
reflect this and partially rewritten with more complete information.
Support building .cpp files and linking them into the micropython
executable in a way similar to how it is done for .c files. The main
incentive here is to enable user C modules to use C++ files (which are put
in SRC_MOD_CXX by py.mk) since the core itself does not utilize C++.
However, to verify build functionality a unix overage test is added. The
esp32 port already has CXXFLAGS so just add the user modules' flags to it.
For the unix port use a copy of the CFLAGS but strip the ones which are not
usable for C++.
This is a generally useful feature and because it's part of the object
model it cannot be added at runtime by some loadable Python code, so enable
it on the standard unix build.
It requires mp_hal_time_ns() to be provided by a port. This function
allows very accurate absolute timestamps.
Enabled on unix, windows, stm32, esp8266 and esp32.
Signed-off-by: Damien George <damien@micropython.org>
And enable this feature on unix, the coverage variant. The .exp test file
is needed so the test can run on CPython versions prior to "@=" operator
support.
Signed-off-by: Damien George <damien@micropython.org>
To portably get the Epoch. This is simply aliased to localtime() on ports
that are not timezone aware.
Signed-off-by: Damien George <damien@micropython.org>