This commit implements automatic module weak links for all built-in
modules, by searching for "ufoo" in the built-in module list if "foo"
cannot be found. This means that all modules named "ufoo" are always
available as "foo". Also, a port can no longer add any other weak links,
which makes strict the definition of a weak link.
It saves some code size (about 100-200 bytes) on ports that previously had
lots of weak links.
Some changes from the previous behaviour:
- It doesn't intern the non-u module names (eg "foo" is not interned),
which saves code size, but will mean that "import foo" creates a new qstr
(namely "foo") in RAM (unless the importing module is frozen).
- help('modules') no longer lists non-u module names, only the u-variants;
this reduces duplication in the help listing.
Weak links are effectively the same as having a set of symbolic links on
the filesystem that is searched last. So an "import foo" will search
built-in modules first, then all paths in sys.path, then weak links last,
importing "ufoo" if it exists. Thus a file called "foo.py" somewhere in
sys.path will still have precedence over the weak link of "foo" to "ufoo".
See issues: #1740, #4449, #5229, #5241.
If MICROPY_PERSISTENT_CODE_LOAD or MICROPY_ENABLE_COMPILER are enabled then
code gets enabled that calls file reading functions which may be disabled
if no readers have been implemented.
To fix this, introduce a MICROPY_HAS_FILE_READER variable, which is
automatically set if MICROPY_READER_POSIX or MICROPY_READER_VFS is set but
can also be manually set if a custom reader is being implemented. Then
disable the file reading calls if this is not set.
The new option is MICROPY_ENABLE_EXTERNAL_IMPORT and is enabled by default
so that the default behaviour is the same as before. With it disabled
import is only supported for built-in modules, not for external files nor
frozen modules. This allows to support targets that have no filesystem of
any kind and that only have access to pre-supplied built-in modules
implemented natively.
This is a bit of a clumsy way of doing it but solves the issue of __init__
not running when a module is imported via its weak-link name. Ideally a
better solution would be found.
This patch simplifies the str creation API to favour the common case of
creating a str object that is not forced to be interned. To force
interning of a new str the new mp_obj_new_str_via_qstr function is added,
and should only be used if warranted.
Apart from simplifying the mp_obj_new_str function (and making it have the
same signature as mp_obj_new_bytes), this patch also reduces code size by a
bit (-16 bytes for bare-arm and roughly -40 bytes on the bare-metal archs).
Header files that are considered internal to the py core and should not
normally be included directly are:
py/nlr.h - internal nlr configuration and declarations
py/bc0.h - contains bytecode macro definitions
py/runtime0.h - contains basic runtime enums
Instead, the top-level header files to include are one of:
py/obj.h - includes runtime0.h and defines everything to use the
mp_obj_t type
py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h,
and defines everything to use the general runtime support functions
Additional, specific headers (eg py/objlist.h) can be included if needed.
The while-loop that calls chop_component will guarantee that level==-1 at
the end of the loop. Hence the code following it is unnecessary.
The check for p==this_name will catch imports that are beyond the
top-level, and also covers the case of new_mod_q==MP_QSTR_ (equivalent to
new_mod_l==0) so that check is removed.
There is also a new check at the start for level>=0 to guard against
__import__ being called with bad level values.
This patch changes mp_uint_t to size_t for the len argument of the
following public facing C functions:
mp_obj_tuple_get
mp_obj_list_get
mp_obj_get_array
These functions take a pointer to the len argument (to be filled in by the
function) and callers of these functions should update their code so the
type of len is changed to size_t. For ports that don't use nan-boxing
there should be no change in generate code because the size of the type
remains the same (word sized), and in a lot of cases there won't even be a
compiler warning if the type remains as mp_uint_t.
The reason for this change is to standardise on the use of size_t for
variables that count memory (or memory related) sizes/lengths. It helps
builds that use nan-boxing.
This patch refactors the error handling in the lexer, to simplify it (ie
reduce code size).
A long time ago, when the lexer/parser/compiler were first written, the
lexer and parser were designed so they didn't use exceptions (ie nlr) to
report errors but rather returned an error code. Over time that has
gradually changed, the parser in particular has more and more ways of
raising exceptions. Also, the lexer never really handled all errors without
raising, eg there were some memory errors which could raise an exception
(and in these rare cases one would get a fatal nlr-not-handled fault).
This patch accepts the fact that the lexer can raise exceptions in some
cases and allows it to raise exceptions to handle all its errors, which are
for the most part just out-of-memory errors during construction of the
lexer. This makes the lexer a bit simpler, and also the persistent code
stuff is simplified.
What this means for users of the lexer is that calls to it must be wrapped
in a nlr handler. But all uses of the lexer already have such an nlr
handler for the parser (and compiler) so that doesn't put any extra burden
on the callers.
The commit d9047d3c8a introduced a bug
whereby "from a.b import c" stopped working for frozen packages. This is
because the path was not properly truncated and became "a//b". Such a
path resolves correctly for a "real" filesystem, but not for a search in
the list of frozen modules.
Now frozen modules is treated just as a kind of VFS, and all operations
performed on it correspond to operations on normal filesystem. This allows
to support packages properly, and potentially also data files.
This change also have changes to rework frozen bytecode modules support to
use the same framework, but it's not finished (and actually may not work,
as older adhox handling of any type of frozen modules is removed).
The config variable MICROPY_MODULE_FROZEN is now made of two separate
parts: MICROPY_MODULE_FROZEN_STR and MICROPY_MODULE_FROZEN_MPY. This
allows to have none, either or both of frozen strings and frozen mpy
files (aka frozen bytecode).
MICROPY_ENABLE_COMPILER can be used to enable/disable the entire compiler,
which is useful when only loading of pre-compiled bytecode is supported.
It is enabled by default.
MICROPY_PY_BUILTINS_EVAL_EXEC controls support of eval and exec builtin
functions. By default they are only included if MICROPY_ENABLE_COMPILER
is enabled.
Disabling both options saves about 40k of code size on 32-bit x86.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.
This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
When "micropython -m pkg.mod" command was used, relative imports in pkg.mod
didn't work, because pkg.mod.__name__ was set to __main__, and the fact that
it's a package submodule was missed. This is an original workaround to this
issue. TODO: investigate and compare how CPython deals with this issue.
Relative imports are based of a package, so we're currently at a module
within a package, we should get to package first.
Also, factor out path travsering operation, but this broke testing for
boundary errors with relative imports. TODO: reintroduce them, together
with proper tests.