This commit removes all parts of code associated with the existing
MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the
-mcache-lookup-bc option to mpy-cross.
This feature originally provided a significant performance boost for Unix,
but wasn't able to be enabled for MCU targets (due to frozen bytecode), and
added significant extra complexity to generating and distributing .mpy
files.
The equivalent performance gain is now provided by the combination of
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has
been enabled on the unix port in the previous commit).
It's hard to provide precise performance numbers, but tests have been run
on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V,
xtensa) and they all generally agree on the qualitative improvements seen
by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and
MICROPY_OPT_MAP_LOOKUP_CACHE.
For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the
change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined
with MAP_LOOKUP_CACHE is:
diff of scores (higher is better)
N=2000 M=2000 bccache -> attrmapcache diff diff% (error%)
bm_chaos.py 13742.56 -> 13905.67 : +163.11 = +1.187% (+/-3.75%)
bm_fannkuch.py 60.13 -> 61.34 : +1.21 = +2.012% (+/-2.11%)
bm_fft.py 113083.20 -> 114793.68 : +1710.48 = +1.513% (+/-1.57%)
bm_float.py 256552.80 -> 243908.29 : -12644.51 = -4.929% (+/-1.90%)
bm_hexiom.py 521.93 -> 625.41 : +103.48 = +19.826% (+/-0.40%)
bm_nqueens.py 197544.25 -> 217713.12 : +20168.87 = +10.210% (+/-3.01%)
bm_pidigits.py 8072.98 -> 8198.75 : +125.77 = +1.558% (+/-3.22%)
misc_aes.py 17283.45 -> 16480.52 : -802.93 = -4.646% (+/-0.82%)
misc_mandel.py 99083.99 -> 128939.84 : +29855.85 = +30.132% (+/-5.88%)
misc_pystone.py 83860.10 -> 82592.56 : -1267.54 = -1.511% (+/-2.27%)
misc_raytrace.py 21490.40 -> 22227.23 : +736.83 = +3.429% (+/-1.88%)
This shows that the new optimisations are at least as good as the existing
inline-bytecode-caching, and are sometimes much better (because the new
ones apply caching to a wider variety of map lookups).
The new optimisations can also benefit code generated by the native
emitter, because they apply to the runtime rather than the generated code.
The improvement for the native emitter when LOAD_ATTR_FAST_PATH and
MAP_LOOKUP_CACHE are enabled is (same Linux environment as above):
diff of scores (higher is better)
N=2000 M=2000 native -> nat-attrmapcache diff diff% (error%)
bm_chaos.py 14130.62 -> 15464.68 : +1334.06 = +9.441% (+/-7.11%)
bm_fannkuch.py 74.96 -> 76.16 : +1.20 = +1.601% (+/-1.80%)
bm_fft.py 166682.99 -> 168221.86 : +1538.87 = +0.923% (+/-4.20%)
bm_float.py 233415.23 -> 265524.90 : +32109.67 = +13.756% (+/-2.57%)
bm_hexiom.py 628.59 -> 734.17 : +105.58 = +16.796% (+/-1.39%)
bm_nqueens.py 225418.44 -> 232926.45 : +7508.01 = +3.331% (+/-3.10%)
bm_pidigits.py 6322.00 -> 6379.52 : +57.52 = +0.910% (+/-5.62%)
misc_aes.py 20670.10 -> 27223.18 : +6553.08 = +31.703% (+/-1.56%)
misc_mandel.py 138221.11 -> 152014.01 : +13792.90 = +9.979% (+/-2.46%)
misc_pystone.py 85032.14 -> 105681.44 : +20649.30 = +24.284% (+/-2.25%)
misc_raytrace.py 19800.01 -> 23350.73 : +3550.72 = +17.933% (+/-2.79%)
In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options:
- are simpler;
- take less code size;
- are faster (generally);
- work with code generated by the native emitter;
- can be used on embedded targets with a small and constant RAM overhead;
- allow the same .mpy bytecode to run on all targets.
See #7680 for further discussion. And see also #7653 for a discussion
about simplifying mpy-cross options.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Support freezing modules via manifest.py for consistency with the other
ports. In essence this comes down to calling makemanifest.py and adding
the resulting .c file to the build. Note the file with preprocessed qstrs
has been renamed to match what makemanifest.py expects and which is also
the name all other ports use.
Reserve sources.props for listing just the MicroPython core and extmod
files, similar to how py.mk lists port-independent source files. This
allows reusing the source list, for instance for building mpy-cross. The
sources for building the executable itself are listed in the corresponding
project file, similar to how the other ports specify the source files in
their Makefile.
Append to PyIncDirs, used to define include directories specific to
MicroPython, instead of just overwriting it so project files importing this
file can define additional directories. And allow defining the target
directory for the executable instead of hardcoding it to the windows
directory. Main reason for this change is that it will allow building
mpy-cross with msvc.
We want the .vcxproj to be just a container with the minimum content for
making it work as a project file for Visual Studio and MSBuild, whereas the
actual build options and actions get placed in separate reusable files.
This was roughly the case already except some compiler options were
overlooked; fix this here: we'll need those common options when adding a
project file for building mpy-cross.
These were probably added to detect more qstrs but as long as the
micropython executable itself doesn't use the same build options the qstrs
would be unused anyway. Furthermore these definitions are for internal use
and get enabled when corresponding MICROPY_EMIT_XXX are defined, in which
case the compiler would warn about symbol redefinitions since they'd be
defined both here and in the source.
During make, makemoduledefs.py parses the current builds c files for
MP_REGISTER_MODULE(module_name, obj_module, enabled_define)
These are used to generate a header with the required entries for
"mp_rom_map_elem_t mp_builtin_module_table[]" in py/objmodule.c
Add some more POSIX compatibility by adding a d_type field to the
dirent structure and defining corresponding macros so listdir_next
in the unix' port modos.c can use it, end result being uos.ilistdir
now reports the file type.
Use overrideable properties instead of hardcoding the use of the
default cl executable used by msvc toolsets. This allows using
arbitrary compiler commands for qstr header generation.
The CLToolExe and CLToolPath properties are used because they are,
even though absent from any official documentation, the de-facto
standard as used by the msvc toolsets themselves.
Printing debugging info by defining MICROPY_DEBUG_VERBOSE expects
a definition of the DEBUG_printf function which is readily available
in printf.c so include that file in the build. Before this patch
one would have to manually provide such definition which is tedious.
For the msvc port disable MICROPY_USE_INTERNAL_PRINTF though: the
linker provides no (easy) way to replace printf with the custom
version as defined in printf.c.
This is to keep the top-level directory clean, to make it clear what is
core and what is a port, and to allow the repository to grow with new ports
in a sustainable way.