Go to file
Damien George f2040bfc7e py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing.  They are also
smaller on disk.

But the real benefit of .mpy files comes when they are frozen into the
firmware.  This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device.  These C data structures can be executed in-place, ie directly from
ROM.  This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).

The downside of frozen code is that it requires recompiling and reflashing
the entire firmware.  This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).

This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware.  The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place.  If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.

With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).

The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded.  Instead only a small qstr table needs to be built (and put in RAM)
at import time.  This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory.  Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).

In more detail, in the VM what used to be (schematically):

    qst = DECODE_QSTR_VALUE;

is now (schematically):

    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];

That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values.  Only qstr_table needs to be linked
when the .mpy is loaded.

Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.

The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before

The qstr indirection in the bytecode has only a small impact on VM
performance.  For stm32 on PYBv1.0 the performance change of this commit
is:

diff of scores (higher is better)
N=100 M=100             baseline -> this-commit  diff      diff% (error%)
bm_chaos.py               371.07 ->  357.39 :  -13.68 =  -3.687% (+/-0.02%)
bm_fannkuch.py             78.72 ->   77.49 :   -1.23 =  -1.563% (+/-0.01%)
bm_fft.py                2591.73 -> 2539.28 :  -52.45 =  -2.024% (+/-0.00%)
bm_float.py              6034.93 -> 5908.30 : -126.63 =  -2.098% (+/-0.01%)
bm_hexiom.py               48.96 ->   47.93 :   -1.03 =  -2.104% (+/-0.00%)
bm_nqueens.py            4510.63 -> 4459.94 :  -50.69 =  -1.124% (+/-0.00%)
bm_pidigits.py            650.28 ->  644.96 :   -5.32 =  -0.818% (+/-0.23%)
core_import_mpy_multi.py  564.77 ->  581.49 :  +16.72 =  +2.960% (+/-0.01%)
core_import_mpy_single.py  68.67 ->   67.16 :   -1.51 =  -2.199% (+/-0.01%)
core_qstr.py               64.16 ->   64.12 :   -0.04 =  -0.062% (+/-0.00%)
core_yield_from.py        362.58 ->  354.50 :   -8.08 =  -2.228% (+/-0.00%)
misc_aes.py               429.69 ->  405.59 :  -24.10 =  -5.609% (+/-0.01%)
misc_mandel.py           3485.13 -> 3416.51 :  -68.62 =  -1.969% (+/-0.00%)
misc_pystone.py          2496.53 -> 2405.56 :  -90.97 =  -3.644% (+/-0.01%)
misc_raytrace.py          381.47 ->  374.01 :   -7.46 =  -1.956% (+/-0.01%)
viper_call0.py            576.73 ->  572.49 :   -4.24 =  -0.735% (+/-0.04%)
viper_call1a.py           550.37 ->  546.21 :   -4.16 =  -0.756% (+/-0.09%)
viper_call1b.py           438.23 ->  435.68 :   -2.55 =  -0.582% (+/-0.06%)
viper_call1c.py           442.84 ->  440.04 :   -2.80 =  -0.632% (+/-0.08%)
viper_call2a.py           536.31 ->  532.35 :   -3.96 =  -0.738% (+/-0.06%)
viper_call2b.py           382.34 ->  377.07 :   -5.27 =  -1.378% (+/-0.03%)

And for unix on x64:

diff of scores (higher is better)
N=2000 M=2000        baseline -> this-commit     diff      diff% (error%)
bm_chaos.py          13594.20 ->  13073.84 :  -520.36 =  -3.828% (+/-5.44%)
bm_fannkuch.py          60.63 ->     59.58 :    -1.05 =  -1.732% (+/-3.01%)
bm_fft.py           112009.15 -> 111603.32 :  -405.83 =  -0.362% (+/-4.03%)
bm_float.py         246202.55 -> 247923.81 : +1721.26 =  +0.699% (+/-2.79%)
bm_hexiom.py           615.65 ->    617.21 :    +1.56 =  +0.253% (+/-1.64%)
bm_nqueens.py       215807.95 -> 215600.96 :  -206.99 =  -0.096% (+/-3.52%)
bm_pidigits.py        8246.74 ->   8422.82 :  +176.08 =  +2.135% (+/-3.64%)
misc_aes.py          16133.00 ->  16452.74 :  +319.74 =  +1.982% (+/-1.50%)
misc_mandel.py      128146.69 -> 130796.43 : +2649.74 =  +2.068% (+/-3.18%)
misc_pystone.py      83811.49 ->  83124.85 :  -686.64 =  -0.819% (+/-1.03%)
misc_raytrace.py     21688.02 ->  21385.10 :  -302.92 =  -1.397% (+/-3.20%)

The code size change is (firmware with a lot of frozen code benefits the
most):

       bare-arm:  +396 +0.697%
    minimal x86: +1595 +0.979% [incl +32(data)]
       unix x64: +2408 +0.470% [incl +800(data)]
    unix nanbox: +1396 +0.309% [incl -96(data)]
          stm32: -1256 -0.318% PYBV10
         cc3200:  +288 +0.157%
        esp8266:  -260 -0.037% GENERIC
          esp32:  -216 -0.014% GENERIC[incl -1072(data)]
            nrf:  +116 +0.067% pca10040
            rp2:  -664 -0.135% PICO
           samd:  +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS

As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.

In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place.  Performance is not impacted too much.  Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM.  This will
essentially be able to replace frozen code for most applications.

Signed-off-by: Damien George <damien@micropython.org>
2022-02-24 18:08:43 +11:00
.github github/workflows: Show context for qemu-arm test failures. 2022-01-23 09:25:19 +11:00
docs esp32/esp32_partition: Add support for specifying block_size. 2022-02-22 00:37:25 +11:00
drivers drivers/sdcard: Allow setting the final SPI baudrate. 2022-02-18 14:37:44 +11:00
examples unix/Makefile: Remove explicit addition of -std=c++ flag. 2022-02-18 14:40:16 +11:00
extmod extmod/moduplatform: Move platform PP definitions into a header. 2022-02-22 00:59:14 +11:00
lib lib/stm32lib: Update library for fix to F7 USB HS. 2022-01-06 16:43:22 +11:00
logo all: Use the name MicroPython consistently in comments 2017-07-31 18:35:40 +10:00
mpy-cross py: Rework bytecode and .mpy file format to be mostly static data. 2022-02-24 18:08:43 +11:00
ports py: Rework bytecode and .mpy file format to be mostly static data. 2022-02-24 18:08:43 +11:00
py py: Rework bytecode and .mpy file format to be mostly static data. 2022-02-24 18:08:43 +11:00
shared py: Rework bytecode and .mpy file format to be mostly static data. 2022-02-24 18:08:43 +11:00
tests py: Rework bytecode and .mpy file format to be mostly static data. 2022-02-24 18:08:43 +11:00
tools py: Rework bytecode and .mpy file format to be mostly static data. 2022-02-24 18:08:43 +11:00
.git-blame-ignore-revs top: Update .git-blame-ignore-revs for latest Black formatting commits. 2022-02-02 16:50:14 +11:00
.gitattributes gitattributes: Mark *.a files as binary. 2019-06-03 14:57:50 +10:00
.gitignore gitignore: Ignore macOS desktop metadata files. 2021-05-04 16:56:16 +10:00
.gitmodules gitmodules: Update branch for stm32lib submodule. 2022-02-01 16:21:01 +11:00
ACKNOWLEDGEMENTS ACKNOWLEDGEMENTS: Remove entry as requested by backer. 2019-07-12 12:57:37 +10:00
CODECONVENTIONS.md top: Update contribution and commit guide to include optional sign-off. 2020-06-12 13:32:22 +10:00
CODEOFCONDUCT.md top: Add CODEOFCONDUCT.md document based on the PSF code of conduct. 2019-10-15 16:18:46 +11:00
CONTRIBUTING.md top: Update contribution and commit guide to include optional sign-off. 2020-06-12 13:32:22 +10:00
LICENSE LICENSE,docs: Update copyright year range to include 2022. 2022-01-06 15:50:14 +11:00
README.md README: Update link for ARM embedded toolchain to developer.arm.com. 2022-02-18 14:28:29 +11:00

CI badge codecov

The MicroPython project

MicroPython Logo

This is the MicroPython project, which aims to put an implementation of Python 3.x on microcontrollers and small embedded systems. You can find the official website at micropython.org.

WARNING: this project is in beta stage and is subject to changes of the code-base, including project-wide name changes and API changes.

MicroPython implements the entire Python 3.4 syntax (including exceptions, with, yield from, etc., and additionally async/await keywords from Python 3.5). The following core datatypes are provided: str (including basic Unicode support), bytes, bytearray, tuple, list, dict, set, frozenset, array.array, collections.namedtuple, classes and instances. Builtin modules include sys, time, and struct, etc. Select ports have support for _thread module (multithreading). Note that only a subset of Python 3 functionality is implemented for the data types and modules.

MicroPython can execute scripts in textual source form or from precompiled bytecode, in both cases either from an on-device filesystem or "frozen" into the MicroPython executable.

See the repository http://github.com/micropython/pyboard for the MicroPython board (PyBoard), the officially supported reference electronic circuit board.

Major components in this repository:

  • py/ -- the core Python implementation, including compiler, runtime, and core library.
  • mpy-cross/ -- the MicroPython cross-compiler which is used to turn scripts into precompiled bytecode.
  • ports/unix/ -- a version of MicroPython that runs on Unix.
  • ports/stm32/ -- a version of MicroPython that runs on the PyBoard and similar STM32 boards (using ST's Cube HAL drivers).
  • ports/minimal/ -- a minimal MicroPython port. Start with this if you want to port MicroPython to another microcontroller.
  • tests/ -- test framework and test scripts.
  • docs/ -- user documentation in Sphinx reStructuredText format. Rendered HTML documentation is available at http://docs.micropython.org.

Additional components:

  • ports/bare-arm/ -- a bare minimum version of MicroPython for ARM MCUs. Used mostly to control code size.
  • ports/teensy/ -- a version of MicroPython that runs on the Teensy 3.1 (preliminary but functional).
  • ports/pic16bit/ -- a version of MicroPython for 16-bit PIC microcontrollers.
  • ports/cc3200/ -- a version of MicroPython that runs on the CC3200 from TI.
  • ports/esp8266/ -- a version of MicroPython that runs on Espressif's ESP8266 SoC.
  • ports/esp32/ -- a version of MicroPython that runs on Espressif's ESP32 SoC.
  • ports/nrf/ -- a version of MicroPython that runs on Nordic's nRF51 and nRF52 MCUs.
  • extmod/ -- additional (non-core) modules implemented in C.
  • tools/ -- various tools, including the pyboard.py module.
  • examples/ -- a few example Python scripts.

The subdirectories above may include READMEs with additional info.

"make" is used to build the components, or "gmake" on BSD-based systems. You will also need bash, gcc, and Python 3.3+ available as the command python3 (if your system only has Python 2.7 then invoke make with the additional option PYTHON=python2).

The MicroPython cross-compiler, mpy-cross

Most ports require the MicroPython cross-compiler to be built first. This program, called mpy-cross, is used to pre-compile Python scripts to .mpy files which can then be included (frozen) into the firmware/executable for a port. To build mpy-cross use:

$ cd mpy-cross
$ make

The Unix version

The "unix" port requires a standard Unix environment with gcc and GNU make. x86 and x64 architectures are supported (i.e. x86 32- and 64-bit), as well as ARM and MIPS. Making full-featured port to another architecture requires writing some assembly code for the exception handling and garbage collection. Alternatively, fallback implementation based on setjmp/longjmp can be used.

To build (see section below for required dependencies):

$ cd ports/unix
$ make submodules
$ make

Then to give it a try:

$ ./micropython
>>> list(5 * x + y for x in range(10) for y in [4, 2, 1])

Use CTRL-D (i.e. EOF) to exit the shell. Learn about command-line options (in particular, how to increase heap size which may be needed for larger applications):

$ ./micropython -h

Run complete testsuite:

$ make test

Unix version comes with a builtin package manager called upip, e.g.:

$ ./micropython -m upip install micropython-pystone
$ ./micropython -m pystone

Browse available modules on PyPI. Standard library modules come from micropython-lib project.

External dependencies

Building MicroPython ports may require some dependencies installed.

For Unix port, libffi library and pkg-config tool are required. On Debian/Ubuntu/Mint derivative Linux distros, install build-essential (includes toolchain and make), libffi-dev, and pkg-config packages.

Other dependencies can be built together with MicroPython. This may be required to enable extra features or capabilities, and in recent versions of MicroPython, these may be enabled by default. To build these additional dependencies, in the port directory you're interested in (e.g. ports/unix/) first execute:

$ make submodules

This will fetch all the relevant git submodules (sub repositories) that the port needs. Use the same command to get the latest versions of submodules as they are updated from time to time. After that execute:

$ make deplibs

This will build all available dependencies (regardless whether they are used or not). If you intend to build MicroPython with additional options (like cross-compiling), the same set of options should be passed to make deplibs. To actually enable/disable use of dependencies, edit ports/unix/mpconfigport.mk file, which has inline descriptions of the options. For example, to build SSL module (required for upip tool described above, and so enabled by default), MICROPY_PY_USSL should be set to 1.

For some ports, building required dependences is transparent, and happens automatically. But they still need to be fetched with the make submodules command.

The STM32 version

The "stm32" port requires an ARM compiler, arm-none-eabi-gcc, and associated bin-utils. For those using Arch Linux, you need arm-none-eabi-binutils, arm-none-eabi-gcc and arm-none-eabi-newlib packages. Otherwise, try here: https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-rm

To build:

$ cd ports/stm32
$ make submodules
$ make

You then need to get your board into DFU mode. On the pyboard, connect the 3V3 pin to the P1/DFU pin with a wire (on PYBv1.0 they are next to each other on the bottom left of the board, second row from the bottom).

Then to flash the code via USB DFU to your device:

$ make deploy

This will use the included tools/pydfu.py script. If flashing the firmware does not work it may be because you don't have the correct permissions, and need to use sudo make deploy. See the README.md file in the ports/stm32/ directory for further details.

Contributing

MicroPython is an open-source project and welcomes contributions. To be productive, please be sure to follow the Contributors' Guidelines and the Code Conventions. Note that MicroPython is licenced under the MIT license, and all contributions should follow this license.