circuitpython

Commit Graph

Author	SHA1	Message	Date
Jim Mussared	b326edf68c	all: Remove MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE. This commit removes all parts of code associated with the existing MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the -mcache-lookup-bc option to mpy-cross. This feature originally provided a significant performance boost for Unix, but wasn't able to be enabled for MCU targets (due to frozen bytecode), and added significant extra complexity to generating and distributing .mpy files. The equivalent performance gain is now provided by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has been enabled on the unix port in the previous commit). It's hard to provide precise performance numbers, but tests have been run on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V, xtensa) and they all generally agree on the qualitative improvements seen by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE. For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined with MAP_LOOKUP_CACHE is: diff of scores (higher is better) N=2000 M=2000 bccache -> attrmapcache diff diff% (error%) bm_chaos.py 13742.56 -> 13905.67 : +163.11 = +1.187% (+/-3.75%) bm_fannkuch.py 60.13 -> 61.34 : +1.21 = +2.012% (+/-2.11%) bm_fft.py 113083.20 -> 114793.68 : +1710.48 = +1.513% (+/-1.57%) bm_float.py 256552.80 -> 243908.29 : -12644.51 = -4.929% (+/-1.90%) bm_hexiom.py 521.93 -> 625.41 : +103.48 = +19.826% (+/-0.40%) bm_nqueens.py 197544.25 -> 217713.12 : +20168.87 = +10.210% (+/-3.01%) bm_pidigits.py 8072.98 -> 8198.75 : +125.77 = +1.558% (+/-3.22%) misc_aes.py 17283.45 -> 16480.52 : -802.93 = -4.646% (+/-0.82%) misc_mandel.py 99083.99 -> 128939.84 : +29855.85 = +30.132% (+/-5.88%) misc_pystone.py 83860.10 -> 82592.56 : -1267.54 = -1.511% (+/-2.27%) misc_raytrace.py 21490.40 -> 22227.23 : +736.83 = +3.429% (+/-1.88%) This shows that the new optimisations are at least as good as the existing inline-bytecode-caching, and are sometimes much better (because the new ones apply caching to a wider variety of map lookups). The new optimisations can also benefit code generated by the native emitter, because they apply to the runtime rather than the generated code. The improvement for the native emitter when LOAD_ATTR_FAST_PATH and MAP_LOOKUP_CACHE are enabled is (same Linux environment as above): diff of scores (higher is better) N=2000 M=2000 native -> nat-attrmapcache diff diff% (error%) bm_chaos.py 14130.62 -> 15464.68 : +1334.06 = +9.441% (+/-7.11%) bm_fannkuch.py 74.96 -> 76.16 : +1.20 = +1.601% (+/-1.80%) bm_fft.py 166682.99 -> 168221.86 : +1538.87 = +0.923% (+/-4.20%) bm_float.py 233415.23 -> 265524.90 : +32109.67 = +13.756% (+/-2.57%) bm_hexiom.py 628.59 -> 734.17 : +105.58 = +16.796% (+/-1.39%) bm_nqueens.py 225418.44 -> 232926.45 : +7508.01 = +3.331% (+/-3.10%) bm_pidigits.py 6322.00 -> 6379.52 : +57.52 = +0.910% (+/-5.62%) misc_aes.py 20670.10 -> 27223.18 : +6553.08 = +31.703% (+/-1.56%) misc_mandel.py 138221.11 -> 152014.01 : +13792.90 = +9.979% (+/-2.46%) misc_pystone.py 85032.14 -> 105681.44 : +20649.30 = +24.284% (+/-2.25%) misc_raytrace.py 19800.01 -> 23350.73 : +3550.72 = +17.933% (+/-2.79%) In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options: - are simpler; - take less code size; - are faster (generally); - work with code generated by the native emitter; - can be used on embedded targets with a small and constant RAM overhead; - allow the same .mpy bytecode to run on all targets. See #7680 for further discussion. And see also #7653 for a discussion about simplifying mpy-cross options. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>	2021-09-16 16:04:03 +10:00
Damien George	2c1a6a237d	tools/mpy-tool.py: Support relocating ARMv6 arch. Signed-off-by: Damien George <damien@micropython.org>	2021-05-26 16:24:00 +10:00
Damien George	fe16e785fe	tools/mpy-tool.py: List frozen modules in MICROPY_FROZEN_LIST_ITEM. Signed-off-by: Damien George <damien@micropython.org>	2021-01-29 23:57:10 +11:00
Damien George	4f2fe34623	tools/mpy-tool.py: Fix merge of multiple mpy files to POP_TOP correctly. MP_BC_CALL_FUNCTION will leave the result on the Python stack, so that result must be discarded by MP_BC_POP_TOP. Signed-off-by: Damien George <damien@micropython.org>	2020-09-09 00:11:51 +10:00
Martin Milata	492cf34fd8	tools/mpy-tool.py: Fix offset of line number info. Signed-off-by: Martin Milata <martin@martinmilata.cz>	2020-08-21 16:17:07 +10:00
stijn	bcf01d1686	all: Fix implicit conversion from double to float. These are found when building with -Wfloat-conversion.	2020-04-18 22:42:24 +10:00
Damien George	69661f3343	all: Reformat C and Python source code with tools/codeformat.py. This is run with uncrustify 0.70.1, and black 19.10b0.	2020-02-28 10:33:03 +11:00
Damien George	fc97d6d1b5	tools/mpy-tool.py: Raise exception if trying to freeze relocatable mpy.	2019-12-12 20:15:28 +11:00
Damien George	27879844d2	tools/mpy-tool.py: Add ability to merge multiple .mpy files into one. Usage: mpy-tool.py -o merged.mpy --merge mod1.mpy mod2.mpy The constituent .mpy files are executed sequentially when the merged file is imported, and they all use the same global namespace.	2019-12-12 20:15:28 +11:00
Damien George	360d972c16	py/nativeglue: Add new header file with native function table typedef.	2019-12-12 20:15:28 +11:00
Damien George	7f24c29778	tools/mpy-tool.py: Support qstr linking when freezing Xtensa native mpy.	2019-11-28 13:11:51 +11:00
Damien George	36c9be6f60	tools/mpy-tool.py: Use "@progbits #" attribute for native xtensa code.	2019-11-04 15:31:42 +11:00
Damien George	23f0691fdd	py/persistentcode: Make .mpy more compact with qstr directly in prelude. Instead of encoding 4 zero bytes as placeholders for the simple_name and source_file qstrs, and storing the qstrs after the bytecode, store the qstrs at the location of these 4 bytes. This saves 4 bytes per bytecode function stored in a .mpy file (for example lcd160cr.mpy drops by 232 bytes, 4x 58 functions). And resulting code size is slightly reduced on ports that use this feature.	2019-10-15 16:56:27 +11:00
Damien George	9adedce42e	py: Add new Xtensa-Windowed arch for native emitter. Enabled via the configuration MICROPY_EMIT_XTENSAWIN.	2019-10-05 13:44:53 +10:00
Damien George	c8c0fd4ca3	py: Rework and compress second part of bytecode prelude. This patch compresses the second part of the bytecode prelude which contains the source file name, function name, source-line-number mapping and cell closure information. This part of the prelude now begins with a single variable-length unsigned integer which encodes 2 numbers, being the byte-size of the following 2 sections in the header: the "source info section" and the "closure section". After decoding this variable-length unsigned integer it's possible to skip over one or both of these sections very easily. This scheme saves about 2 bytes for most functions compared to the original format: one in the case that there are no closure cells, and one because padding was eliminated.	2019-10-01 12:26:22 +10:00
Damien George	b5ebfadbd6	py: Compress first part of bytecode prelude. The start of the bytecode prelude contains 6 numbers telling the amount of stack needed for the Python values and exceptions, and the signature of the function. Prior to this patch these numbers were all encoded one after the other (2x variable unsigned integers, then 4x bytes), but using so many bytes is unnecessary. An entropy analysis of around 150,000 bytecode functions from the CPython standard library showed that the optimal Shannon coding would need about 7.1 bits on average to encode these 6 numbers, compared to the existing 48 bits. This patch attempts to get close to this optimal value by packing the 6 numbers into a single, variable-length unsigned integer via bit-wise interleaving. The interleaving scheme is chosen to minimise the average number of bytes needed, and at the same time keep the scheme simple enough so it can be implemented without too much overhead in code size or speed. The scheme requires about 10.5 bits on average to store the 6 numbers. As a result most functions which originally took 6 bytes to encode these 6 numbers now need only 1 byte (in 80% of cases).	2019-10-01 12:26:22 +10:00
Damien George	5716c5cf65	py/persistentcode: Bump .mpy version to 5. The bytecode opcodes have changed (there are more, and they have been reordered).	2019-09-26 16:39:37 +10:00
Josh Lloyd	7d58a197cf	py: Rename MP_QSTR_NULL to MP_QSTRnull to avoid intern collisions. Fixes #5140.	2019-09-26 16:04:56 +10:00
Damien George	1f7202d122	py/bc: Replace big opcode format table with simple macro.	2019-09-26 15:27:11 +10:00
Damien George	5889cf58db	py/bc0: Order opcodes into groups based on their size and format.	2019-09-26 15:27:10 +10:00
Damien George	c69f58e6b9	tools/mpy-tool.py: Fix freezing of non-bytecode funcs with settrace. Only bytecode functions can be profiled at this stage. Native functions (eg inline assembler) may not even have a valid prelude. Fixes issue #5075.	2019-09-06 23:55:15 +10:00
Damien George	b29fae0c56	py/bc: Fix size calculation of UNWIND_JUMP opcode in mp_opcode_format. Prior to this patch mp_opcode_format would calculate the incorrect size of the MP_BC_UNWIND_JUMP opcode, missing the additional byte. But, because opcodes below 0x10 are unused and treated as bytes in the .mpy load/save and freezing code, this bug did not show any symptoms, since nested unwind jumps would rarely (if ever) reach a depth of 16 (so the extra byte of this opcode would be between 0x01 and 0x0f and be correctly loaded/saved/frozen simply as an undefined opcode). This patch fixes this bug by correctly accounting for the additional byte.	2019-09-02 13:30:16 +10:00
Damien George	4691b43c8a	tools/mpy-tool.py: Add initial support for frozen with settrace.	2019-08-30 16:49:13 +10:00
Jim Mussared	4ab5156c01	tools/mpy-tool.py: Force native func alignment to halfword/word on ARM. This is necessary for ARMV6 and V7. Without this change, calling a frozen native/viper function that is misaligned will crash.	2019-08-20 15:13:17 +10:00
Jun Wu	b152bbddd1	py: Define EMIT_MACHINE_CODE as EMIT_NATIVE || EMIT_INLINE_ASM. The combination MICROPY_EMIT_NATIVE || MICROPY_EMIT_INLINE_ASM is used in many places, so define a new macro for it.	2019-06-28 13:54:45 +10:00
Damien George	9d3031cc9d	tools/mpy-tool.py: Fix linking of qstr objects in native ARM Thumb code. Previously, when linking qstr objects in native code for ARM Thumb, the index into the machine code was being incremented by 4, not 8. It should be 8 to account for the size of the two machine instructions movw and movt. This patch makes sure the index into the machine code is incremented by the correct amount for all variations of qstr linking. See issue #4829.	2019-06-11 11:36:39 +10:00
Damien George	faf3d3e9e9	tools/mpy-tool.py: Fix linking qstrs in native code, and multiple files. Fixes errors in the tool when 1) linking qstrs in native ARM-M code; 2) freezing multiple files some of which use native code and some which don't. Fixes issue #4829.	2019-06-04 22:21:01 +10:00
Damien George	74ed06828f	tools/mpy-tool.py: Fix init of QStrWindow, and remove unused variable. The qstr window size is not log-2 encoded, it's just the actual number (but in mpy-tool.py this didn't lead to an error because the size is just used to truncate the window so it doesn't grow arbitrarily large in memory). Addresses issue #4635.	2019-04-08 15:24:24 +10:00
Damien George	643d2a0e86	tools/mpy-tool.py: Adjust use of super() to make it work with Python 2. Fixes the regression introduced in `ea3c80a514`	2019-04-08 11:21:18 +10:00
Damien George	9a5f92ea72	py/persistentcode: Bump .mpy version to 4.	2019-03-08 15:53:05 +11:00
Damien George	ea3c80a514	tools/mpy-tool.py: Add support for freezing native code. This adds support to freeze .mpy files that contain native code blocks.	2019-03-08 15:53:05 +11:00
Damien George	636ed0ff8d	py/emitglue: Remove union in mp_raw_code_t to combine bytecode & native.	2019-03-08 15:53:04 +11:00
Damien George	4f0931b21f	py/persistentcode: Define static qstr set to reduce size of mpy files. When encoded in the mpy file, if qstr <= QSTR_LAST_STATIC then store two bytes: 0, static_qstr_id. Otherwise encode the qstr as usual (either with string data or a reference into the qstr window). Reduces mpy file size by about 5%.	2019-03-05 16:32:05 +11:00
Damien George	992a6e1dea	py/persistentcode: Pack qstrs directly in bytecode to reduce mpy size. Instead of emitting two bytes in the bytecode for where the linked qstr should be written to, it is now replaced by the actual qstr data, or a reference into the qstr window. Reduces mpy file size by about 10%.	2019-03-05 16:27:34 +11:00
Damien George	5996eeb48f	py/persistentcode: Add a qstr window to save mpy files more efficiently. This is an implementation of a sliding qstr window used to reduce the number of qstrs stored in a .mpy file. The window size is configured to 32 entries which takes a fixed 64 bytes (16-bits each) on the C stack when loading/saving a .mpy file. It allows to remember the most recent 32 qstrs so they don't need to be stored again in the .mpy file. The qstr window uses a simple least-recently-used mechanism to discard the least recently used qstr when the window overflows (similar to dictionary compression). This scheme only needs a single pass to save/load the .mpy file. Reduces mpy file size by about 25% with a window size of 32.	2019-03-05 16:25:07 +11:00
Damien George	5a2599d962	py: Replace POP_BLOCK and POP_EXCEPT opcodes with POP_EXCEPT_JUMP. POP_BLOCK and POP_EXCEPT are now the same, and are always followed by a JUMP. So this optimisation reduces code size, and RAM usage of bytecode by two bytes for each try-except handler.	2019-03-05 16:09:58 +11:00
Dave Hylands	39eef27083	tools/mpy-tool.py: Fix build error when no qstrs present in frozen mpy. If you happen to only have a really simple frozen file that doesn't contain any new qstrs then the generated frozen_mpy.c file contains an empty enumeration which causes a C compile time error.	2018-12-15 14:36:08 +11:00
Damien George	814d580a15	tools/mpy-tool.py: Fix calc of opcode size for opcodes with map caching. Following an equivalent fix to py/bc.c. The reason the incorrect values for the opcode constants were not previously causing a bug is because they were never being used: these opcodes always have qstr arguments so the part of the code that was comparing them would never be reached. Thanks to @malinah for finding the problem and providing the initial patch.	2018-12-13 01:26:55 +11:00
Rich Barlow	6e5a40cf3c	tools/mpy-tool: Set sane initial dynamic qstr pool size with frozen mods. The first dynamic qstr pool is double the size of the 'alloc' field of the last const qstr pool. The built in const qstr pool (mp_qstr_const_pool) has a hardcoded alloc size of 10, meaning that the first dynamic pool is allocated space for 20 entries. The alloc size must be less than or equal to the actual number of qstrs in the pool (the 'len' field) to ensure that the first dynamically created qstr triggers the creation of a new pool. When modules are frozen a second const pool is created (generally mp_qstr_frozen_const_pool) and linked to the built in pool. However, this second const pool had its 'alloc' field set to the number of qstrs in the pool. When freezing a large quantity of modules this can result in thousands of qstrs being in the pool. This means that the first dynamically created qstr results in a massive allocation. This commit sets the alloc size of the frozen qstr pool to 10 or less (if the number of qstrs in the pool is less than 10). The result of this is that the allocation behaviour when a dynamic qstr is created is identical with and without frozen code. Note that there is the potential for a slight memory inefficiency if the frozen modules have fewer than 10 qstrs, as the first few dynamic allocations will have quite a large overhead, but the geometric growth soon deals with this.	2018-08-01 18:59:31 +10:00
Damien George	44fc92ea7c	tools/mpy-tool.py: Put frozen bignum digit data in ROM, not in RAM.	2018-07-09 13:43:34 +10:00
Damien George	929d10acf7	tools/mpy-tool.py: Support freezing of floats in obj representation D.	2018-07-09 12:22:40 +10:00
Damien George	9ba3de6ea1	tools/mpy-tool.py: Implement freezing of Ellipsis const object.	2017-11-15 12:46:08 +11:00
Damien George	933eab46fc	py/bc: Update opcode_format_table to match the bytecode.	2017-10-10 10:37:38 +11:00
Damien George	ff93fd4f50	py/persistentcode: Bump .mpy version number to version 3. The binary and unary ops have changed bytecode encoding.	2017-10-05 10:49:44 +11:00
stijn	e4ab404780	tools/mpy-tool.py: Fix missing argument in dump() function. This makes the -d command-line argument usable again. Pass empty string as parent name as listing starts from the root.	2017-08-16 10:38:19 +02:00
Damien George	b6a3289564	tools/mpy-tool.py: Don't generate const_table if it's empty.	2017-08-12 22:26:18 +10:00
Damien George	88c51c3592	tools/mpy-tool.py: Fix regression with freezing floats in obj repr C. Regression was introduced by `ec534609f6`	2017-05-16 18:53:02 +10:00
Damien George	ec534609f6	tools/mpy-tool.py: Use MP_ROM_xxx macros to support nanbox builds.	2017-05-13 10:08:13 +10:00
Paul Sokolovsky	473e85e2da	tools/mpy-tool: Make work if run from another directory. By making sure we don't add relative paths to sys.path.	2017-05-01 00:01:30 +03:00
Damien George	dd11af209d	py: Add LOAD_SUPER_METHOD bytecode to allow heap-free super meth calls. This patch allows the following code to run without allocating on the heap: super().foo(...) Before this patch such a call would allocate a super object on the heap and then load the foo method and call it right away. The super object is only needed to perform the lookup of the method and not needed after that. This patch makes an optimisation to allocate the super object on the C stack and discard it right after use. Changes in code size due to this patch are: bare-arm: +128 minimal: +232 unix x64: +416 unix nanbox: +364 stmhal: +184 esp8266: +340 cc3200: +128	2017-04-22 23:39:20 +10:00
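
The sliding qstr window described in commit 5996eeb48f above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code from py/persistentcode.c or tools/mpy-tool.py: the class name `QstrWindow`, the `encode`/`decode` helpers, and the `("new", ...)` / `("ref", ...)` token format are hypothetical. Only the window size of 32, the least-recently-used discard policy, and the single-pass property come from the commit message.

```python
class QstrWindow:
    """Hypothetical sliding window of recently seen qstrs, most recent first."""

    def __init__(self, size=32):
        self.size = size
        self.entries = []

    def find(self, qstr):
        # Return the window index of qstr, or None if it is not in the window.
        try:
            return self.entries.index(qstr)
        except ValueError:
            return None

    def push(self, qstr):
        # Insert a new qstr at the front; the least recently used entry
        # falls off the end once the window is full.
        self.entries = [qstr] + self.entries[: self.size - 1]

    def access(self, idx):
        # Move an existing entry to the front and return it.
        qstr = self.entries.pop(idx)
        self.entries.insert(0, qstr)
        return qstr


def encode(qstrs, size=32):
    # Single pass: emit the string itself the first time, a window index after.
    window = QstrWindow(size)
    tokens = []
    for q in qstrs:
        idx = window.find(q)
        if idx is None:
            tokens.append(("new", q))
            window.push(q)
        else:
            tokens.append(("ref", idx))
            window.access(idx)
    return tokens


def decode(tokens, size=32):
    # The loader replays the same window updates, so indices resolve identically.
    window = QstrWindow(size)
    qstrs = []
    for kind, value in tokens:
        if kind == "new":
            window.push(value)
            qstrs.append(value)
        else:
            qstrs.append(window.access(value))
    return qstrs
```

Because the writer and the loader apply identical window updates, a reference index is enough to recover a repeated qstr, which is why a single pass suffices for both saving and loading.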

63 Commits
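
The allocation rule from commit 6e5a40cf3c above can be checked with a little arithmetic. The sketch below is an illustration only: the function names are hypothetical and are not taken from tools/mpy-tool.py or the qstr implementation. The doubling rule and the cap of 10 are the ones stated in the commit message.

```python
BUILTIN_POOL_ALLOC = 10  # hardcoded alloc of mp_qstr_const_pool, per the commit message


def first_dynamic_pool_alloc(last_const_pool_alloc):
    # The first dynamic qstr pool is double the 'alloc' of the last const pool.
    return 2 * last_const_pool_alloc


def frozen_pool_alloc_before(num_qstrs):
    # Old behaviour: 'alloc' was set to the number of qstrs in the frozen pool.
    return num_qstrs


def frozen_pool_alloc_after(num_qstrs):
    # New behaviour: 'alloc' is 10, or less if the pool holds fewer than 10 qstrs.
    return min(num_qstrs, BUILTIN_POOL_ALLOC)


if __name__ == "__main__":
    for n in (4, 3000):
        before = first_dynamic_pool_alloc(frozen_pool_alloc_before(n))
        after = first_dynamic_pool_alloc(frozen_pool_alloc_after(n))
        print(n, "frozen qstrs: first dynamic pool", before, "->", after, "entries")
```

With 3000 frozen qstrs the first dynamic pool shrinks from 6000 entries to 20, matching the size obtained without any frozen code.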