circuitpython

Author	SHA1	Message	Date
Damien George	f2040bfc7e	py: Rework bytecode and .mpy file format to be mostly static data.

Background: .mpy files are precompiled .py files, built using mpy-cross, that contain compiled bytecode functions (and can also contain machine code). The benefit of using an .mpy file over a .py file is that they are faster to import and take less memory when importing. They are also smaller on disk.

But the real benefit of .mpy files comes when they are frozen into the firmware. This is done by loading the .mpy file during compilation of the firmware and turning it into a set of big C data structures (the job of mpy-tool.py), which are then compiled and downloaded into the ROM of a device. These C data structures can be executed in-place, ie directly from ROM. This makes importing even faster because there is very little to do, and also means such frozen modules take up much less RAM (because their bytecode stays in ROM).

The downside of frozen code is that it requires recompiling and reflashing the entire firmware. This can be a big barrier to entry, slows down development time, and makes it harder to do OTA updates of frozen code (because the whole firmware must be updated).

This commit attempts to solve this problem by providing a solution that sits between loading .mpy files into RAM and freezing them into the firmware. The .mpy file format has been reworked so that it consists of data and bytecode which is mostly static and ready to run in-place. If these new .mpy files are located in flash/ROM which is memory addressable, the .mpy file can be executed (mostly) in-place.

With this approach there is still a small amount of unpacking and linking of the .mpy file that needs to be done when it's imported, but it's still much better than loading an .mpy from disk into RAM (although not as good as freezing .mpy files into the firmware).

The main trick to make static .mpy files is to adjust the bytecode so any qstrs that it references now go through a lookup table to convert from local qstr number in the module to global qstr number in the firmware. That means the bytecode does not need linking/rewriting of qstrs when it's loaded. Instead only a small qstr table needs to be built (and put in RAM) at import time. This means the bytecode itself is static/constant and can be used directly if it's in addressable memory. Also the qstr string data in the .mpy file, and some constant object data, can be used directly. Note that the qstr table is global to the module (ie not per function).

In more detail, in the VM what used to be (schematically):

    qst = DECODE_QSTR_VALUE;

is now (schematically):

    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];

That allows the bytecode to be fixed at compile time and not need relinking/rewriting of the qstr values. Only qstr_table needs to be linked when the .mpy is loaded.

Incidentally, this helps to reduce the size of bytecode because what used to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices. If the module uses the same qstr more than two times then the bytecode is smaller than before.

The following changes are measured for this commit compared to the previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before

The qstr indirection in the bytecode has only a small impact on VM performance.

For stm32 on PYBv1.0 the performance change of this commit is:

diff of scores (higher is better)
N=100 M=100                 baseline -> this-commit      diff    diff% (error%)
bm_chaos.py                   371.07 ->     357.39 :   -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py                 78.72 ->      77.49 :    -1.23 = -1.563% (+/-0.01%)
bm_fft.py                    2591.73 ->    2539.28 :   -52.45 = -2.024% (+/-0.00%)
bm_float.py                  6034.93 ->    5908.30 :  -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py                   48.96 ->      47.93 :    -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py                4510.63 ->    4459.94 :   -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py                650.28 ->     644.96 :    -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py      564.77 ->     581.49 :   +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py      68.67 ->      67.16 :    -1.51 = -2.199% (+/-0.01%)
core_qstr.py                   64.16 ->      64.12 :    -0.04 = -0.062% (+/-0.00%)
core_yield_from.py            362.58 ->     354.50 :    -8.08 = -2.228% (+/-0.00%)
misc_aes.py                   429.69 ->     405.59 :   -24.10 = -5.609% (+/-0.01%)
misc_mandel.py               3485.13 ->    3416.51 :   -68.62 = -1.969% (+/-0.00%)
misc_pystone.py              2496.53 ->    2405.56 :   -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py              381.47 ->     374.01 :    -7.46 = -1.956% (+/-0.01%)
viper_call0.py                576.73 ->     572.49 :    -4.24 = -0.735% (+/-0.04%)
viper_call1a.py               550.37 ->     546.21 :    -4.16 = -0.756% (+/-0.09%)
viper_call1b.py               438.23 ->     435.68 :    -2.55 = -0.582% (+/-0.06%)
viper_call1c.py               442.84 ->     440.04 :    -2.80 = -0.632% (+/-0.08%)
viper_call2a.py               536.31 ->     532.35 :    -3.96 = -0.738% (+/-0.06%)
viper_call2b.py               382.34 ->     377.07 :    -5.27 = -1.378% (+/-0.03%)

And for unix on x64:

diff of scores (higher is better)
N=2000 M=2000               baseline -> this-commit      diff    diff% (error%)
bm_chaos.py                 13594.20 ->   13073.84 :  -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py                 60.63 ->      59.58 :    -1.05 = -1.732% (+/-3.01%)
bm_fft.py                  112009.15 ->  111603.32 :  -405.83 = -0.362% (+/-4.03%)
bm_float.py                246202.55 ->  247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py                  615.65 ->     617.21 :    +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py              215807.95 ->  215600.96 :  -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py               8246.74 ->    8422.82 :  +176.08 = +2.135% (+/-3.64%)
misc_aes.py                 16133.00 ->   16452.74 :  +319.74 = +1.982% (+/-1.50%)
misc_mandel.py             128146.69 ->  130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py             83811.49 ->   83124.85 :  -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py            21688.02 ->   21385.10 :  -302.92 = -1.397% (+/-3.20%)

The code size change is (firmware with a lot of frozen code benefits the most):

   bare-arm:   +396 +0.697%
minimal x86:  +1595 +0.979% [incl +32(data)]
   unix x64:  +2408 +0.470% [incl +800(data)]
unix nanbox:  +1396 +0.309% [incl -96(data)]
      stm32:  -1256 -0.318% PYBV10
     cc3200:   +288 +0.157%
    esp8266:   -260 -0.037% GENERIC
      esp32:   -216 -0.014% GENERIC[incl -1072(data)]
        nrf:   +116 +0.067% pca10040
        rp2:   -664 -0.135% PICO
       samd:   +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS

As part of this change the .mpy file format version is bumped to version 6. And mpy-tool.py has been improved to provide a good visualisation of the contents of .mpy files.

In summary: this commit changes the bytecode to use qstr indirection, and reworks the .mpy file format to be simpler and allow .mpy files to be executed in-place. Performance is not impacted too much. Eventually it will be possible to store such .mpy files in a linear, read-only, memory-mappable filesystem so they can be executed from flash/ROM. This will essentially be able to replace frozen code for most applications.

Signed-off-by: Damien George <damien@micropython.org>	2022-02-24 18:08:43 +11:00
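
The qstr indirection described in commit f2040bfc7e can be modelled in a few lines. The following is only a sketch in Python, not the VM's C code: the `Firmware` class, `link_module` and `decode_qstr` are hypothetical names invented for illustration, and bytecode is reduced to a plain sequence of one-byte local qstr indices. It shows that the bytecode stays constant while only the small per-module table is built at import time.

```python
# Hypothetical model of per-module qstr indirection. None of these names
# exist in MicroPython; they only illustrate the idea in the commit message.

class Firmware:
    """Global qstr pool: maps interned strings to global qstr numbers."""

    def __init__(self):
        self._ids = {}
        self._strings = []

    def intern(self, s):
        # Return the existing global number, or allocate a new one.
        if s not in self._ids:
            self._ids[s] = len(self._strings)
            self._strings.append(s)
        return self._ids[s]

    def lookup(self, qst):
        return self._strings[qst]


def link_module(firmware, module_qstrs):
    """Build the module-wide qstr_table (local index -> global qstr)."""
    return [firmware.intern(s) for s in module_qstrs]


def decode_qstr(bytecode, ip, qstr_table):
    """Schematically: idx = DECODE_QSTR_INDEX; qst = qstr_table[idx]."""
    idx = bytecode[ip]
    return qstr_table[idx]


fw = Firmware()
fw.intern("print")  # global qstr 0, already known to the firmware
fw.intern("len")    # global qstr 1

module_qstrs = ["len", "foo", "print"]  # local numbering stored in the file
bytecode = bytes([2, 0, 1, 2])          # static data: local indices only

table = link_module(fw, module_qstrs)   # the only per-import linking step
names = [fw.lookup(decode_qstr(bytecode, ip, table))
         for ip in range(len(bytecode))]
```

In this model `bytecode` is never rewritten, so it could equally live in read-only memory; only `table` has to be created in RAM when the module is imported.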
Damien George	e328a5d469	py/scope: Optimise scope_find_or_add_id to not need "added" arg. Taking the address of a local variable is mildly expensive, in code size and stack usage. So optimise scope_find_or_add_id() to not need to take a pointer to the "added" variable, and instead take the kind to use for newly added identifiers.	2018-10-28 00:38:18 +11:00
Damien George	9201f46cc8	py/compile: Fix case of eager implicit conversion of local to nonlocal. This ensures that implicit variables are only converted to implicit closed-over variables (nonlocals) at the very end of the function scope. If variables are closed-over when first used (read from, as was done prior to this commit) then this can be incorrect because the variable may be assigned to later on in the function which means they are just a plain local, not closed over. Fixes issue #4272.	2018-10-28 00:33:08 +11:00
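
The situation described in commit 9201f46cc8 can be illustrated with a small Python function. This is a sketch of the general case (a name read first and assigned later in the same function), not necessarily the reproducer from issue #4272; the function and variable names are invented for illustration. Because `x` is assigned somewhere in `inner`, it is a plain local of `inner` for the whole function body and must not be treated as a closed-over variable of `outer`.

```python
def outer():
    x = "outer value"

    def inner():
        try:
            seen = x              # x is read first ...
        except UnboundLocalError:
            seen = "unbound local"
        x = "inner value"         # ... but assigned later, so x is local here
        return seen

    return inner(), x


result = outer()
```

Deciding the kind of `x` only at the end of the function scope, as the commit does, gives the compiler the full picture: the later assignment is known before any closure is created.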
Damien George	d298013939	py/emit: Combine name and global into one func for load/store/delete. Reduces code size by:
   bare-arm:   -56
minimal x86:  -300
   unix x64:  -576
unix nanbox:  -300
      stm32:  -164
     cc3200:   -56
    esp8266:  -236
      esp32:   -76	2018-05-23 00:22:47 +10:00
Damien George	0a25fff956	py/emit: Combine fast and deref into one function for load/store/delete. Reduces code size by:
   bare-arm:   -16
minimal x86:  -208
   unix x64:  -408
unix nanbox:  -248
      stm32:   -12
     cc3200:   -24
    esp8266:   -96
      esp32:   -44	2018-05-23 00:22:20 +10:00
Alexander Steffen	55f33240f3	all: Use the name MicroPython consistently in comments. There were several different spellings of MicroPython present in comments, when there should be only one.	2017-07-31 18:35:40 +10:00
Damien George	0d10517a45	py/scope: Factor common code to find locals and close over them. Saves 50-100 bytes of code.	2016-09-30 13:53:00 +10:00
Damien George	3dea8c9e92	py/scope: Use lookup-table to determine a scope's simple name. Generates slightly smaller and more efficient code.	2016-09-30 12:34:05 +10:00
Damien George	dd5353a405	py: Add MICROPY_ENABLE_COMPILER and MICROPY_PY_BUILTINS_EVAL_EXEC opts. MICROPY_ENABLE_COMPILER can be used to enable/disable the entire compiler, which is useful when only loading of pre-compiled bytecode is supported. It is enabled by default. MICROPY_PY_BUILTINS_EVAL_EXEC controls support of eval and exec builtin functions. By default they are only included if MICROPY_ENABLE_COMPILER is enabled. Disabling both options saves about 40k of code size on 32-bit x86.	2015-12-18 12:35:44 +00:00
Damien George	65dc960e3b	unix-cpy: Remove unix-cpy. It's no longer needed.

unix-cpy was originally written to get semantic equivalent with CPython without writing functional tests. When writing the initial implementation of uPy it was a long way between lexer and functional tests, so the half-way test was to make sure that the bytecode was correct. The idea was that if the uPy bytecode matched CPython 1-1 then uPy would be proper Python if the bytecodes acted correctly. And having matching bytecode meant that it was less likely to miss some deep subtlety in the Python semantics that would require an architectural change later on.

But that is all history and it no longer makes sense to retain the ability to output CPython bytecode, because:

1. It outputs CPython 3.3 compatible bytecode. CPython's bytecode changes from version to version, and seems to have changed quite a bit in 3.5. There's no point in changing the bytecode output to match CPython anymore.

2. uPy and CPy do different optimisations to the bytecode which makes it harder to match.

3. The bytecode tests are not run. They were never part of Travis and are not run locally anymore.

4. The EMIT_CPYTHON option needs a lot of extra source code which adds heaps of noise, especially in compile.c.

5. Now that there is an extensive test suite (which tests functionality) there is no need to match the bytecode. Some very subtle behaviour is tested with the test suite and passing these tests is a much better way to stay Python-language compliant, rather than trying to match CPy bytecode.	2015-08-17 12:51:26 +01:00
Damien George	542bd6b4a1	py, compiler: Refactor load/store/delete_id logic to reduce code size. Saves around 230 bytes on Thumb2 and 750 bytes on x86.	2015-03-26 16:52:45 +00:00
Damien George	0abb5609b0	py: Remove unnecessary id_flags argument from emitter's load_fast. Saves 24 bytes in bare-arm.	2015-01-16 12:24:49 +00:00
Damien George	51dfcb4bb7	py: Move to guarded includes, everywhere in py/ core. Addresses issue #1022.	2015-01-01 20:32:09 +00:00
Damien George	7ff996c237	py: Convert [u]int to mp_[u]int_t in emit.h and associated .c files. Towards resolving issue #50.	2014-09-08 23:05:16 +01:00
Paul Sokolovsky	59c675a64c	py: Include mpconfig.h before all other includes. It defines types used by all other headers. Fixes #691.	2014-06-21 22:43:22 +03:00
Damien George	04b9147e15	Add license header to (almost) all files. Blanket wide to all .c and .h files. Some files originating from ST are difficult to deal with (license wise) so it was left out of those. Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.	2014-05-03 23:27:38 +01:00
Damien George	d395a0e4b1	Merge pull request #471 from errordeveloper/misc_fix/unistd py: the entire `<unistd.h>` shouldn't be needed	2014-04-13 13:22:36 +01:00
Damien George	df8127a17e	py: Remove unique_codes from emitglue.c. Replace with pointers. Attempt to address issue #386. unique_code_id's have been removed and replaced with a pointer to the "raw code" information. This pointer is stored in the actual byte code (aligned, so the GC can trace it), so that raw code (ie byte code, native code and inline assembler) is kept only for as long as it is needed. In memory it's now like a tree: the outer module's byte code points directly to its children's raw code. So when the outer code gets freed, if there are no remaining functions that need the raw code, then the children's code gets freed as well. This is pretty much like CPython does it, except that CPython stores indexes in the byte code rather than machine pointers. These indices index the per-function constant table in order to find the relevant code.	2014-04-13 11:04:33 +01:00
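
The CPython side of the comparison in commit df8127a17e can be observed directly. The sketch below runs on CPython only (it relies on `__code__` and `co_consts`, which are CPython introspection attributes) and says nothing about MicroPython's internal layout; the function names are invented for illustration. It shows that a nested function's code object sits in the enclosing function's constant table, which is what the bytecode indexes into.

```python
import types


def outer():
    def inner():
        return 42
    return inner


# Code objects of nested functions are stored among the parent's constants.
nested_code = [c for c in outer.__code__.co_consts
               if isinstance(c, types.CodeType)]
child_names = [c.co_name for c in nested_code]

# The function object created at run time refers to that same code object,
# so the child's code stays alive for as long as the parent's code does.
same_object = outer().__code__ is nested_code[0]
```

The commit takes the same tree shape but, per its message, stores a pointer to the child's raw code in the byte code itself instead of an index into a constant table.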
Ilya Dmitrichenko	5630b01920	py: the entire `<unistd.h>` shouldn't be needed	2014-04-12 16:45:35 +01:00
Damien George	2bf7c09222	py: Properly implement deletion of locals and derefs, and detect errors. Needed to reinstate 2 delete opcodes, to specifically check that a local is not deleted twice.	2014-04-09 15:26:46 +01:00
xbe	efe3422394	py: Clean up includes. Remove unnecessary includes. Add includes that improve portability.	2014-03-17 02:43:40 -07:00
Damien George	1dc76af7bf	py: Remove name of var arg from macros with var args.	2014-02-26 16:57:08 +00:00
Damien George	55baff4c9b	Revamp qstrs: they now include length and hash. Can now have null bytes in strings. Can define ROM qstrs per port using qstrdefsport.h	2014-01-21 21:40:13 +00:00
Damien	d99b05282d	Change object representation from 1 big union to individual structs. A big change. Micro Python objects are allocated as individual structs with the first element being a pointer to the type information (which is itself an object). This scheme follows CPython. Much more flexible, not necessarily slower, uses same heap memory, and can allocate objects statically. Also change name prefix, from py_ to mp_ (mp for Micro Python).	2013-12-21 18:17:45 +00:00
Damien	27fb45eb1c	Add local_num skeleton framework to deref/closure emit calls.	2013-10-20 15:07:49 +01:00
Damien	c025ebb2dc	Separate out mpy core and unix version.	2013-10-12 14:30:21 +01:00
Damien	4b03e77d4a	Factorise EMIT_COMMON calls, mostly into emit_pass1.	2013-10-05 14:17:09 +01:00
Damien	415eb6f850	Restructure emit so it goes through a method table.	2013-10-05 12:19:06 +01:00
Damien	429d71943d	Initial commit.	2013-10-04 19:53:11 +01:00

29 Commits