From cf7d962cf38db296d1ac419fc4d5302b64c59644 Mon Sep 17 00:00:00 2001
From: Damien George
Date: Fri, 10 Jun 2022 14:36:22 +1000
Subject: [PATCH] docs/reference/mpyfiles: Update .mpy description to match latest format.

Signed-off-by: Damien George
---
 docs/reference/mpyfiles.rst | 62 +++++++++++++++++++++++++++----------
 1 file changed, 45 insertions(+), 17 deletions(-)

diff --git a/docs/reference/mpyfiles.rst b/docs/reference/mpyfiles.rst
index 70354c4e1f..fcb4996565 100644
--- a/docs/reference/mpyfiles.rst
+++ b/docs/reference/mpyfiles.rst
@@ -80,7 +80,8 @@ and .mpy version.
 =================== ============
 MicroPython release .mpy version
 =================== ============
-v1.12 and up        5
+v1.19 and up        6
+v1.12 - v1.18       5
 v1.11               4
 v1.9.3 - v1.10      3
 v1.9 - v1.9.2       2
@@ -93,6 +94,7 @@ MicroPython repository at which the .mpy version was changed.
 =================== ========================================
 .mpy version change Git commit
 =================== ========================================
+5 to 6              f2040bfc7ee033e48acef9f289790f3b4e6b74e5
 4 to 5              5716c5cf65e9b2cb46c2906f40302401bdd27517
 3 to 4              9a5f92ea72754c01cc03e5efcdfe94021120531e
 2 to 3              ff93fd4f50321c6190e1659b19e64fef3045a484
@@ -104,21 +106,31 @@ initial version 0   d8c834c95d506db979ec871417de90b7951edc30
 Binary encoding of .mpy files
 -----------------------------
 
-MicroPython .mpy files are a binary container format with code objects
-stored internally in a nested hierarchy. To keep files small while still
+MicroPython .mpy files are a binary container format with code objects (bytecode
+and native machine code) stored internally in a nested hierarchy. The code for
+the outer module is stored first, and then its children follow. Each child may
+have further children, for example in the case of a class having methods, or a
+function defining a lambda or comprehension. To keep files small while still
 providing a large range of possible values it uses the concept of a
 variably-encoded-unsigned-integer (vuint) in many places. Similar to utf-8
 encoding, this encoding stores 7 bits per byte with the 8th bit (MSB) set
 if one or more bytes follow. The bits of the unsigned integer are stored
 in the vuint in LSB form.
 
-The top-level of an .mpy file consists of two parts:
+The top-level of an .mpy file consists of three parts:
 
 * The header.
 
+* The global qstr and constant tables.
+
 * The raw-code for the outer scope of the module.
   This outer scope is executed when the .mpy file is imported.
 
+You can inspect the contents of a .mpy file by using ``mpy-tool.py``, for
+example (run from the root of the main MicroPython repository)::
+
+    $ ./tools/mpy-tool.py -xd myfile.mpy
+
 The header
 ~~~~~~~~~~
 
@@ -131,7 +143,26 @@ byte    value 0x4d (ASCII 'M')
 byte    .mpy version number
 byte    feature flags
 byte    number of bits in a small int
-vuint   size of qstr window
+======  ================================
+
+The global qstr and constant tables
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+An .mpy file contains a single qstr table, and a single constant object table.
+These are global to the .mpy file, they are referenced by all nested raw-code
+objects. The qstr table maps internal qstr number (internal to the .mpy file)
+to the resolved qstr number of the runtime that the .mpy file is imported into.
+This links the .mpy file with the rest of the system that it executes within.
+The constant object table is populated with references to all constant objects
+that the .mpy file needs.
+
+======  ================================
+size    field
+======  ================================
+vuint   number of qstrs
+vuint   number of constant objects
+...     qstr data
+...     encoded constant objects
 ======  ================================
 
 Raw code elements
@@ -143,24 +174,21 @@ contents are:
 ======  ================================
 size    field
 ======  ================================
-vuint   type and size
+vuint   type, size and whether there are sub-raw-code elements
 ...     code (bytecode or machine code)
-vuint   number of constant objects
-vuint   number of sub-raw-code elements
-...     constant objects
+vuint   number of sub-raw-code elements (only if non-zero)
 ...     sub-raw-code elements
 ======  ================================
 
 The first vuint in a raw-code element encodes the type of code stored in this
-element (the two least-significant bits), and the decompressed length of the code
-(the amount of RAM to allocate for it).
+element (the two least-significant bits), whether this raw-code has any
+children (the third least-significant bit), and the length of the code that
+follows (the amount of RAM to allocate for it).
 
-Following the vuint comes the code itself. In the case of bytecode it also contains
-compressed qstr values.
+Following the vuint comes the code itself. Unless the code type is viper code
+with relocations, this code is constant data and does not need to be modified.
 
-Following the code comes a vuint counting the number of constant objects, and
-another vuint counting the number of sub-raw-code elements.
-
-The constant objects are then stored next.
+If this raw-code has any children (as indicated by a bit in the first vuint),
+following the code comes a vuint counting the number of sub-raw-code elements.
 
 Finally any sub-raw-code elements are stored, recursively.
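
The layout described in this patch can be illustrated with a minimal Python sketch. It reads the four-byte header (the 'M' byte, the .mpy version, the feature flags, and the number of bits in a small int) and splits the first vuint of a raw-code element into its three fields. The names ``MpyHeader``, ``parse_header`` and ``split_raw_code_prefix`` are hypothetical and not part of MicroPython. The sketch assumes the length occupies every bit above the third, which is inferred from the description rather than stated outright. Decoding the vuint itself is left out, so the second function takes an integer that has already been decoded.

```python
import struct
from typing import NamedTuple, Tuple


class MpyHeader(NamedTuple):
    """Hypothetical container for the header fields listed in the patch."""
    version: int
    feature_flags: int
    small_int_bits: int


def parse_header(data: bytes) -> MpyHeader:
    """Read the four single-byte header fields from the start of `data`."""
    if len(data) < 4:
        raise ValueError("header needs at least 4 bytes")
    magic, version, flags, small_int_bits = struct.unpack_from("4B", data, 0)
    if magic != 0x4D:  # ASCII 'M'
        raise ValueError("first byte is not 0x4d ('M')")
    return MpyHeader(version, flags, small_int_bits)


def split_raw_code_prefix(value: int) -> Tuple[int, bool, int]:
    """Split the already-decoded first vuint of a raw-code element."""
    kind = value & 0b11                 # two least-significant bits: code type
    has_children = bool(value & 0b100)  # third least-significant bit
    length = value >> 3                 # assumption: all remaining bits are the length
    return kind, has_children, length


if __name__ == "__main__":
    # Hand-built example bytes, not taken from a real .mpy file.
    print(parse_header(bytes([0x4D, 6, 0, 31])))
    print(split_raw_code_prefix(0b101_1_00))
```

For a real file, ``./tools/mpy-tool.py -xd myfile.mpy`` (as shown in the patch) remains the authoritative way to inspect the contents; the sketch only mirrors the field order given in the tables.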