Benefits: won't crash baremetal targets, will provide Python source location
when an unimplemented feature is used (it will no longer provide the C source
location, but you can just grep for the error message).
there are special tweaks and paths to be considered. Just provide some
defaults, in case the values are undefined.
- py-version.sh does not need any bash specific features.
- Use libdl only on Linux for now. FreeBSD provides dl*() calls from libc.
Some small fixes:
- Combine 'x' and 'X' cases in str format code.
- Remove trailing spaces from some lines.
- Make exception messages consistently begin with lower case (then
needed to change those in objarray and objtuple so the same
constant string data could be used).
- Fix bug with exception message having %c instead of %%c.
Add keyword args to dict.update(), and ability to take a dictionary as
argument.
dict() class constructor can now use dict.update() directly.
This patch loses fast path for dict(other_dict), but is that really
needed? And anyway, this idiom will now re-hash the dictionary, so is
arguably more memory efficient.
Addresses issue #647.
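A small sketch of the call forms this is meant to cover, written against
standard Python dict semantics (the variable names are just for illustration):

```python
# Illustrative only: the call forms described above, using standard dict semantics.
d = {"a": 1}
d.update({"b": 2})            # a dictionary as the positional argument
d.update(c=3)                 # keyword arguments
d.update({"a": 10}, e=5)      # both at once; existing keys are overwritten

copy = dict(d)                # the constructor builds a fresh, independent dict
copy["z"] = 0                 # does not affect d
```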
This may seem a bit of a risky change, in that it may introduce crazy
bugs with respect to volatile variables in the VM loop. But, I think it
should be fine: code_state points to some external memory, so the
compiler should always read/write to that memory when accessing the
ip/sp variables (ie not put them in registers).
Anyway, it passes all tests and improves on all efficiency fronts: about
2-4% faster (64-bit unix), 16 bytes less stack space per call (64-bit
unix) and slightly less executable size (unix and stmhal).
The reason it's more efficient is save_ip and save_sp were volatile
variables, so were anyway stored on the stack (in memory, not regs).
Thus converting them to code_state->{ip, sp} doesn't cost an extra
memory dereference (except maybe to get code_state, but that can be put
in a register and then made more efficient for other uses of it).
Fixed stack underflow check. Use UINT_FMT/INT_FMT where necessary.
Specify maximum VM-stack byte size by multiple of machine word size, so
that on 64 bit machines it has same functionality as 32 bit.
This improves stack usage in callers to mp_execute_bytecode2, and is step
forward towards unifying execution interface for function and generators
(which is important because generators don't even support full forms
of arguments passing (keywords, etc.)).
Needed to pop the iterator object when breaking out of a for loop. Need
also to be careful to unwind exception handler before popping iterator.
Addresses issue #635.
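The kind of script that exercises this path is roughly the following: a break
from inside a try block within a for loop (the function name is made up, and
this is only a sketch of the Python-level behaviour, not of the VM change):

```python
def find(items, target):
    """Stop at the first match; the finally clause must still run on break."""
    log = []
    for x in items:
        try:
            if x == target:
                break          # leaves the loop from inside a try block
        finally:
            log.append(x)      # handler is unwound before the loop is left
    return x, log

result = find([1, 2, 3], 2)
```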
This helps the compiler do its optimisation, makes it clear which
variables are local per opcode and which global, and makes it consistent
when extra variables are needed in an opcode (in addition to old obj1,
obj2 pair, for example).
Could also make unum local, but that's for another time.
This completes non-automatic interning of strings in the parser, so that
doc strings don't take up RAM. It complicates the parser and compiler,
and bloats stmhal by about 300 bytes. It's complicated because now
there are 2 kinds of parse-nodes that can be strings: interned leaves
and non-interned structs.
io.FileIO is binary I/O, and actually optional. Default file type is
io.TextIOWrapper, which provides str results. CPython3 explicitly describes
io.TextIOWrapper as buffered I/O, but we don't have buffering support yet
anyway.
The scheme now is: for native types, we call the ->make_new() C-level method, which
should perform actions of __new__ and __init__ (note that this is not
compliant, but is efficient), but for user types, __new__ and __init__ are
called as expected.
Also, make sure we convert scalar attribute value to a bound-pair right in
mp_obj_class_lookup() method, which avoids converting it again and again in
its callers.
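For user types, the expected order can be checked with something like this
(a sketch in plain Python; the class and list names are illustrative):

```python
calls = []

class Point:
    def __new__(cls, x, y):
        calls.append("__new__")
        return super().__new__(cls)    # allocate the instance

    def __init__(self, x, y):
        calls.append("__init__")
        self.x = x
        self.y = y

p = Point(1, 2)    # __new__ runs first, then __init__ on the new instance
```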
__debug__ now resolves to True or False. Its value needs to be set by
mp_set_debug().
TODO: call mp_set_debug in unix/ port.
TODO: optimise away "if False:" statements in compiler.
Updated functions now do proper checking that n_kw==0, and are simpler
because they don't have to explicitly raise an exception. Down side is
that the error messages no longer include the function name, but that's
acceptable.
Saves order 300 text bytes on x64 and ARM.
This is not fully correct re: error handling, because we should check that
types are used consistently (only str's or only bytes), but magically
makes a lot of functions support bytes.
Two things are handled here: allow to compare native subtypes of tuple,
e.g. namedtuple (TODO: should compare type too, currently compared
duck-typedly by content). Secondly, allow user subclasses of tuples
(and its subtypes) to be compared as well. "Magic" I did previously in
objtype.c covers only one argument (lhs is many), so we're in trouble
when lhs is native type - there's no other option besides handling
rhs in special manner. Fortunately, this patch outlines approach with
fast path for native types.
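The comparisons in question look roughly like this at the Python level (a
sketch; Pair and MyTuple are made-up names, and the results shown are what
standard Python gives):

```python
from collections import namedtuple

Pair = namedtuple("Pair", ["a", "b"])   # native subtype of tuple

class MyTuple(tuple):                    # user subclass of tuple
    pass

native_lhs = Pair(1, 2) == (1, 2)            # subtype on the left
user_rhs = (1, 2) == MyTuple((1, 2))         # native type on the left, subclass on the right
user_both = MyTuple((1, 2)) == MyTuple((1, 2))
differs = (1, 2) == MyTuple((1, 3))
```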
This was hit when trying to make urlparse.py from stdlib run. Took
quite some time to debug.
TODO: Reconcile bound method creation process better, maybe callable is
too generic a type to bind at all?
Parser shouldn't raise exceptions, so needs to check when memory
allocation fails. This patch does that for the initial set up of the
parser state.
Also, we now put the parser object on the stack. It's small enough to
go there instead of on the heap.
This partially addresses issue #558.
"object" type in MicroPython currently doesn't implement any methods, and
hopefully, we'll try to stay like that for as long as possible. Even if we
have to add something eventually, look up from there might be handled in
adhoc manner, as last resort (that's not compliant with Python3 MRO, but
we're already non-compliant). Hence: 1) no need to spend time trying to
lookup anything in object; 2) no need to allocate subobject when explicitly
inheriting from object; 3) and having multiple bases inheriting from object
is not a case of incompatible multiple inheritance.
This patch simplifies the glue between native emitter and runtime,
and handles viper code like inline assembler: return values are
converted to Python objects.
Fixes issue #531.
You can now do:
    X = const(123)
    Y = const(456 + X)
and the compiler will replace X and Y with their values.
See discussion in issue #266 and issue #573.
In case of empty non-blocking read()/write(), both return None. read()
cannot return 0, as that means EOF, so returns another value, and then
write() just follows. This is still pretty unexpected, and typical
"if not len:" check would treat this as EOF. Well, non-blocking files
require special handling!
This also kind of makes it depending on POSIX, but well, anything else
should emulate POSIX anyway ;-).
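For comparison, CPython's unbuffered file objects show the same None vs EOF
distinction on a non-blocking pipe. This is a sketch that assumes a POSIX
system and uses only the standard os module, not the uPy implementation:

```python
import os

r, w = os.pipe()
os.set_blocking(r, False)          # make the read end non-blocking (POSIX)
f = open(r, "rb", buffering=0)     # unbuffered binary file object

empty = f.read(16)                 # nothing written yet: None, not b""
os.write(w, b"hi")
data = f.read(16)                  # data is available now
os.close(w)
eof = f.read(16)                   # writer closed: b"" means EOF

f.close()

# A naive "if not data:" check cannot tell `empty` (None) from `eof` (b"");
# test "is None" explicitly when the file is non-blocking.
```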
Need to have a policy as to how far we go adding keyword support to
built ins. It's nice to have, and gets better CPython compatibility,
but hurts the micro nature of uPy.
Addresses issue #577.
There are 2 locations in parser, and 1 in compiler, where memory
allocation is not precise. In the parser it's the rule stack and result
stack, in the compiler it's the array for the identifiers in the current
scope. All other mallocs are exact (ie they don't allocate more than is
needed).
This patch adds tuning options (MP_ALLOC_*) to mpconfig.h for these 3
inexact allocations.
The inexact allocations in the parser should actually be close to
logarithmic: you need an exponentially larger script (absent pathological
cases) to use up more room on the rule and result stacks. As such, the
default allocation policy for these is now to start with a modest sized
stack, but grow only in small increments.
For the identifier arrays in the compiler, these now start out quite
small (4 entries, since most functions don't have that many ids), and
grow incrementally by 6 (since if you have more ids than 4, you probably
have quite a few more, but it wouldn't be exponentially more).
Partially addresses issue #560.
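To get a feel for the identifier-array policy, here is a toy model of the
capacities it passes through. The function and parameter names are made up;
only the numbers 4 and 6 come from the description above:

```python
def grow_steps(n_ids, initial=4, increment=6):
    """Return the capacities an array passes through until it can hold n_ids."""
    caps = [initial]
    while caps[-1] < n_ids:
        caps.append(caps[-1] + increment)   # fixed-size step, not doubling
    return caps

small = grow_steps(3)     # fits in the initial allocation
large = grow_steps(20)    # needs three reallocations
```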
This will work if MICROPY_DEBUG_PRINTERS is defined, which is only for
unix/windows ports. This makes it convenient to use uPy normally, but
easily get bytecode dump on the spot if needed, without constant recompiles
back and forth.
TODO: Add more useful debug output, adjust verbosity level on which
specifically bytecode dump happens.
Blanket wide to all .c and .h files. Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
By default mingw outputs 3 digits instead of the standard 2 so all float
tests using printf fail. Using setenv at the start of the program fixes this.
To accommodate calling platform specific initialization a
MICROPY_MAIN_INIT_FUNC macro is used which is called in mp_init().
The original parsing would error out on any C declarations that are not typedefs
or extern variables. This limits what can go in mpconfig.h and mpconfigport.h,
as they are included in qstr.h. For instance even a function declaration would be
rejected and including system headers is a complete no-go.
That seems too limiting for a global config header, so makeqstrdata now
ignores everything that does not match a qstr definition.
alloca() is declared in alloca.h which also happens to be included by stdlib.h.
On mingw however it resides in malloc.h only.
So if we include alloca.h directly, and add an alloca.h for mingw in its port
directory we can get rid of the mingw-specific define to include malloc.h
and the other ports are happy as well.
Biggest part of this support is refactoring mp_obj_class_lookup() to return
standard "bound member" pair (mp_obj_t[2]). Actual support of inherited
native methods is 3 lines then. Some inherited features may be not supported
yet (e.g. native class methods, native properties, etc., etc.). There may
be opportunities for further optimization too.
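At the Python level, the feature amounts to something like the following
(a sketch with a made-up Stack class; append and pop are the inherited
native methods):

```python
class Stack(list):
    def push(self, item):
        self.append(item)      # inherited native method, looked up via the base type

s = Stack()
s.push(1)
s.push(2)
top = s.pop()                  # also inherited from list
remaining = len(s)
```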
This implements checking of base types, allocation and basic initialization,
and optimized support for special method lookups. Other features are not yet
supported.
Of course, keywords are turned into lexer tokens in the lexer, so will
never need to be interned (unless you do something like x="def").
As it is now, the following on pyboard makes no new qstrs:
    import pyb
    pyb.info()
New way uses slightly less ROM and RAM, should be slightly faster, and,
most importantly, allows to catch the error "non-keyword arg following
keyword arg".
Addresses issue #466.
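The error being caught is the one standard Python reports at compile time.
A quick way to see it (the helper name is illustrative):

```python
def compiles(src):
    """Return True if src compiles as an expression, False on SyntaxError."""
    try:
        compile(src, "<example>", "eval")
        return True
    except SyntaxError:
        return False

ok = compiles("f(1, a=2)")     # positional before keyword: fine
bad = compiles("f(a=2, 1)")    # non-keyword arg following keyword arg
```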
Also add some more debugging output to gc_dump_alloc_table().
Now that newly allocated heap is always zero'd, maybe we just make this
a policy for the uPy API to keep it simple (ie any new implementation of
memory allocation must zero all allocations). This follows the D
language philosophy.
Before this patch, a previously used memory block which had pointers in
it may still retain those pointers if the new user of that block does
not actually use the entire block. Eg, if I want 5 blocks worth of
heap, I actually get 8 (round up to nearest 4). Then I never use the
last 3, so they keep their old values, which may be pointers pointing to
the heap, hence preventing GC.
In rare (or maybe not that rare) cases, this leads to long, unintentional
"linked lists" within the GC'd heap, filling it up completely. It's
pretty rare, because you have to reuse exactly that memory which is part
of this "linked list", and reuse it in just the right way.
This should fix issue #522, and might have something to do with
issue #510.
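A toy model of the problem, not the real allocator: the heap is a Python
list, "ptr" stands for an old heap pointer, and the round-to-4 factor is
taken from the example above. All names are made up.

```python
ROUND_TO = 4   # assumption: allocations are rounded up to a multiple of 4 blocks

def rounded(n_blocks):
    return -(-n_blocks // ROUND_TO) * ROUND_TO

def alloc(heap, n_blocks, zero):
    """Hand out rounded(n_blocks) slots from the start of heap, optionally cleared."""
    size = rounded(n_blocks)
    if zero:
        for i in range(size):
            heap[i] = None
    return size

def use(heap, n_blocks):
    for i in range(n_blocks):
        heap[i] = 0            # the new user only writes what it asked for

old_heap = ["ptr"] * 8         # freed region that still holds old pointers
size = alloc(old_heap, 5, zero=False)
use(old_heap, 5)
stale = old_heap[5:size]       # leftover pointers a conservative GC would follow

new_heap = ["ptr"] * 8
size_z = alloc(new_heap, 5, zero=True)
use(new_heap, 5)
clean = new_heap[5:size_z]     # nothing left to trace
```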
3 emitter functions are needed only for emitcpy, and so we can #if them
out when compiling with emitcpy support.
Also remove unused SETUP_LOOP bytecode.
Closed over variables are now passed on the stack, instead of creating a
tuple and passing that. This way memory for the closed over variables
can be allocated within the closure object itself. See issue #510 for
background.
There were typos, various rounding errors trying to do concurrent counting
in bytes vs blocks, complex conditional paths, superfluous variables, etc.,
etc., all leading to obscure segfaults.
These are to assist in writing native C functions that take positional
and keyword arguments. mp_arg_check_num is for just checking the
number of arguments is correct. mp_arg_parse_all is for parsing
positional and keyword arguments with default values.
When querying an object that supports the buffer protocol, that object
must now return a typecode (as per binary.[ch]). This does not have to
be honoured by the caller, but can be useful for determining element
size.
Test usecase I used is print(time.time()) and print(time.time() - time.time()).
On Linux/Glibc they now give the same output as CPython 3.3. Specifically,
time.time() gives non-exponential output with 7 decimal digits, and subtraction
gives exponential output e-06/e-07.
On stmhal, computed gotos make the binary about 1k bigger, but makes it
run faster, and we have the room, so why not. All tests pass on
pyboard using computed gotos.
This follows pattern already used for objtuple, etc.: objfun.h's content
is not public - each and every piece of code should not have access to it.
It's not private either - with our architecture and implementation language
(C) it doesn't make sense to keep implementation of each object strictly
private and maintain cumbersome accessors. It's "local" - intended to be
used by a small set of "friend" (in C++ terms) objects.
Things get tricky when using the nlr code to catch exceptions. Need to
ensure that the variables (stack layout) in the exception handler are
the same as in the bit protected by the exception handler.
Prior to this patch there were a few bugs. 1) The constant
mp_const_MemoryError_obj was being preloaded to a specific location on
the stack at the start of the function. But this location on the stack
was being overwritten in the opcode loop (since it didn't think that
variable would ever be referenced again), and so when an exception
occurred, the variable holding the address of MemoryError was corrupt.
2) The FOR_ITER opcode detection in the exception handler used sp, which
may or may not contain the right value coming out of the main opcode
loop.
With this patch there is a clear separation of variables used in the
opcode loop and in the exception handler (should fix issue (2) above).
Furthermore, nlr_raise is no longer used in the opcode loop. Instead,
it jumps directly into the exception handler. This tells the C compiler
more about the possible code flow, and means that it should have the
same stack layout for the exception handler. This should fix issue (1)
above. Indeed, the generated (ARM) assembler has been checked explicitly,
and with 'goto exception_handler', the problem with &MemoryError is
fixed.
This may now fix problems with rge-sm, and probably many other subtle
bugs yet to show themselves. Incidentally, rge-sm now passes on
pyboard (with a reduced range of integration)!
Main lesson: nlr is tricky. Don't use nlr_push unless you know what you
are doing! Luckily, it's not used in many places. Using nlr_raise/jump
is fine.
The autogenerated header files have been moved about, and an extra
include dir has been added, which means you can give a custom
BUILD=newbuilddir option to make, and everything "just works".
Also tidied up the way the different Makefiles build their include-
directory flags.
That was easy - just avoid erroring out on seeing candidate dir for namespace
package. That's far from being complete though - namespace packages should
support importing portions of package from different sys.path entries, here
we require first matching entry to contain all namespace package's portions.
And yet, that's a way to put parts of the same Python package into multiple
installable package - something we really need for *Micro*Python.
The logic appears to be that (at least beginning of) sys.versions is the
version of reference Python language implemented, not version of particular
implementation.
Also, bump set versions at 3.4.0, based on @dpgeorge preference.
Attempt to address issue #386. unique_code_id's have been removed and
replaced with a pointer to the "raw code" information. This pointer is
stored in the actual byte code (aligned, so the GC can trace it), so
that raw code (ie byte code, native code and inline assembler) is kept
only for as long as it is needed. In memory it's now like a tree: the
outer module's byte code points directly to its children's raw code. So
when the outer code gets freed, if there are no remaining functions that
need the raw code, then the children's code gets freed as well.
This is pretty much like CPython does it, except that CPython stores
indexes in the byte code rather than machine pointers. These indices
index the per-function constant table in order to find the relevant
code.
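The CPython side of this comparison can be inspected directly: a nested
function's code object sits in the outer function's constant table. This is a
sketch using only standard introspection attributes:

```python
import types

def outer():
    def inner():
        return 1
    return inner

# Code objects of nested functions are stored as constants of the outer code.
nested = [c for c in outer.__code__.co_consts if isinstance(c, types.CodeType)]
nested_names = [c.co_name for c in nested]
```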
Improved the Thumb assembler back end. Added many more Thumb
instructions to the inline assembler. Improved parsing of assembler
instructions and arguments. Assembler functions can now be passed the
address of any object that supports the buffer protocol (to get the
address of the buffer). Added an example of how to sum numbers from
an array in assembler.
This is necessary to catch all cases where locals are referenced before
assignment. We still keep the _0, _1, _2 versions of LOAD_FAST to help
reduce the byte code size in RAM.
Addresses issue #457.
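A minimal script that hits this case. In CPython the exception is
UnboundLocalError, a subclass of NameError; which of the two another
implementation raises is not something this sketch assumes:

```python
def f(flag):
    if flag:
        value = 1
    return value           # local referenced before assignment when flag is false

try:
    f(False)
    caught = None
except NameError as exc:   # also catches UnboundLocalError
    caught = type(exc)
```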
I'm pretty sure these are never reached, since NOT_EQUAL is always
converted into EQUAL in mp_binary_op. No one should call
type.binary_op directly, they should always go through mp_binary_op
(or mp_obj_is_equal).
Per https://docs.python.org/3.3/reference/import.html , this is the way to
tell module from package: "Specifically, any module that contains a __path__
attribute is considered a package." And it for sure will be needed to
implement relative imports.
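The rule is easy to check in standard Python (json is a package in the
CPython stdlib, os is a plain module):

```python
import json
import os

json_is_package = hasattr(json, "__path__")   # package: has __path__
os_is_package = hasattr(os, "__path__")       # plain module: no __path__
```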
This simplifies the compiler a little, since now it can do 1 pass over
a function declaration, to determine default arguments. I would have
done this originally, but CPython 3.3 somehow had the default keyword
args compiled before the default position args (even though they appear
in the other order in the text of the script), and I thought it was
important to have the same order of execution when evaluating default
arguments. CPython 3.4 has changed the order to the more obvious one,
so we can also change.
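The order can be observed with defaults that have side effects. This sketch
assumes CPython 3.4 or later; note() is a made-up helper:

```python
order = []

def note(tag):
    order.append(tag)      # record when this default expression is evaluated
    return tag

def f(a=note("positional"), *, b=note("keyword-only")):
    return a, b

# Defaults are evaluated once, at definition time, in source order.
```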
It has (again) a fast path for ints, and a simplified "slow" path for
everything else.
Also simplify the way str indexing is done (now matches tuple and list).
A specific target can define either MP_ENDIANNESS_LITTLE or MP_ENDIANNESS_BIG
to 1. Default is MP_ENDIANNESS_LITTLE.
TODO: Autodetect based on compiler predefined macros?
Working towards trying to support compile-time constants (see discussion
in issue #227), this patch allows the compiler to look inside arbitrary
uPy objects at compile time. The objects to search are given by the
macro MICROPY_EXTRA_CONSTANTS (so they must be constant/ROM objects),
and the constant folding occurs on forms base.attr (both base and attr
must be id's).
It works, but it breaks strict CPython compatibility, since the lookup
will succeed even without importing the namespace.
Previously, a failed malloc/realloc would throw an exception, which was
not caught. I think it's better to keep the parser free from NLR
(exception throwing), hence this patch.
Only calcsize() and unpack() functions provided so far, for little-endian
byte order. Format strings don't support repetition spec (like "2b3i").
Unfortunately, dealing with all the various binary type sizes and alignments
will lead to quite bloated "binary" helper functions - if optimizing for
speed. Need to think if using dynamic parametrized algos makes more sense.
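For reference, this is what the two functions do in CPython's struct module
with little-endian formats. Repetition counts are spelled out by hand here,
since the note above says they are not supported yet:

```python
import struct

size = struct.calcsize("<bi")                 # 1 + 4 bytes, no padding with "<"
fields = struct.unpack("<HI", b"\x01\x00\x02\x00\x00\x00")

# "2b3i" written without a repetition spec:
expanded_size = struct.calcsize("<bbiii")
```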
With the implementation of proper string formatting, code to print a
small int was delegated to mpz_as_str_inpl (after first converting the
small int to an mpz using stack memory). But mpz_as_str_inpl allocates
heap memory to do the conversion, so small ints needed heap memory just
to be printed.
This fix has a separate function to print small ints, which does not
allocate heap, and allocates less stack.
String formatting, printf and pfenv are now large beasts, with some
semi-duplicated code.
These two are apparently the most concise and efficient way to convert
int to/from bytes in Python. The alternatives are struct and array modules,
but methods using them are more verbose in Python code and less efficient
in memory/cycles.
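Assuming "these two" are int.from_bytes() and int.to_bytes() (the text above
does not name them), the comparison with struct looks like this in standard
Python:

```python
import struct

raw = (1000).to_bytes(2, "little")          # int -> bytes
value = int.from_bytes(raw, "little")       # bytes -> int

# The struct equivalent needs a format string and tuple unpacking.
raw_struct = struct.pack("<H", 1000)
(value_struct,) = struct.unpack("<H", raw_struct)
```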
Full CPython compatibility with this requires actually parsing the
input so far collected, and if it fails parsing due to lack of tokens,
then continue collecting input. It's not worth doing it this way. Not
having compatibility at this level does not hurt the goals of Micro
Python.
stmhal relies on pfenv_* to implement its printf. Thus, it needs a
pfenv_print_int which prints a proper 32-bit integer. With latest
change to pfenv, this function became one that took mp_obj_t, and
extracted the integer value from that object.
To fix temporarily, pfenv_print_int has been renamed to
pfenv_print_mp_int (to indicate it takes a mp_obj_t for the int), and
pfenv_print_int has been added (which takes a normal C int). Currently,
pfenv_print_int proxies to pfenv_print_mp_int, but this means it loses
the MSB. Need to find a way to fix this, but the only way I can think
of will duplicate lots of code.
Two things: 1) set flags in copy properly; make mp_map_init() not be too
smart and do something with requested alloc size. Policy of using prime
numbers for alloc size is high-level policy which should be applied at
corresponding high levels. Low-level functions should just do what they're
asked to, because they don't have enough context to be smarter than that.
For example, munging with alloc size of course breaks dict copying (as
changing sizes requires rehashing).
Based on the discussion in #433. mp_load_attr() is a critical-path function,
so any extra check will slow down any script. As supporting a default value is
required only for the getattr() builtin, move the corresponding implementation
there (still as a separate function due to concerns of maintainability
of such almost-duplicated code instances).
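The Python-level behaviour that needs the default value is just this
(Config is a made-up class):

```python
class Config:
    port = 8080

cfg = Config()
port = getattr(cfg, "port", 0)               # attribute exists: default ignored
host = getattr(cfg, "host", "localhost")     # attribute missing: default returned

try:
    getattr(cfg, "host")                     # no default: normal attribute error
    raised = False
except AttributeError:
    raised = True
```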
This is to reduce ROM usage. stream_p is used in file and socket types
only (at the moment), so seems a good idea to make the protocol
functions a pointer instead of the actual structure.
It saves 308 bytes of ROM in the stmhal/ port, 928 in unix/.
Finishes addressing issue #424.
In the end this was a very neat refactor that now makes things a lot
more consistent across the py code base. It allowed some
simplifications in certain places, now that everything is a dict object.
Also converted builtins tables to dictionaries. This will be useful
when we need to turn builtins into a proper module.
When searching next time, such entry should be just skipped, not terminate
the search. It's known that marking technique is not efficient in the presence
of many removes, but namespace usage should not require many deletes, and
as for user dictionaries - well, open addressing map table with linear
rehashing and load factor of ~1 is not particularly efficient at all ;-).
TODO: May consider "shift other entries in cluster" approach as an
alternative.
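A toy illustration of the marking technique in Python, not the actual map
code: a fixed-size table with linear probing where removed entries become a
"deleted" marker that lookups skip. All names here are made up.

```python
_EMPTY = object()      # slot never used: terminates a search
_DELETED = object()    # slot whose entry was removed: search must skip it

class ProbeMap:
    """Fixed-size open-addressing table with linear probing."""

    def __init__(self, size=8):
        self.keys = [_EMPTY] * size
        self.values = [None] * size

    def _probe(self, key):
        start = hash(key) % len(self.keys)
        for i in range(len(self.keys)):
            yield (start + i) % len(self.keys)

    def set(self, key, value):
        free = None
        for pos in self._probe(key):
            slot = self.keys[pos]
            if slot is _EMPTY:
                if free is None:
                    free = pos
                break
            if slot is _DELETED:
                if free is None:
                    free = pos           # reusable, but keep looking for the key
            elif slot == key:
                self.values[pos] = value
                return
        if free is None:
            raise MemoryError("table full")
        self.keys[free] = key
        self.values[free] = value

    def get(self, key, default=None):
        for pos in self._probe(key):
            slot = self.keys[pos]
            if slot is _EMPTY:
                break                    # only a never-used slot ends the search
            if slot is not _DELETED and slot == key:
                return self.values[pos]
        return default

    def remove(self, key):
        for pos in self._probe(key):
            slot = self.keys[pos]
            if slot is _EMPTY:
                return False
            if slot is not _DELETED and slot == key:
                self.keys[pos] = _DELETED    # mark the slot, do not empty it
                self.values[pos] = None
                return True
        return False

m = ProbeMap()
for k, v in ((1, "a"), (9, "b"), (17, "c")):   # all three hash to slot 1
    m.set(k, v)
m.remove(9)                                     # leaves a marker in the cluster
```

With the marker in place, m.get(17) still walks past the removed slot and
finds its entry; had the slot been reset to empty, the search would stop
there.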
Very little has changed. In Python 3.4 they removed the opcode
STORE_LOCALS, but in Micro Python we only ever used this for CPython
compatibility, so it was a trivial thing to remove. It also allowed to
clean up some dead code (eg the 0xdeadbeef in class construction), and
now class builders use 1 less stack word.
Python 3.4.0 introduced the LOAD_CLASSDEREF opcode, which I have not
yet understood. Still, all tests (apart from bytecode test) still pass.
Bytecode tests need some more attention, but they are not that
important anymore.
This adds support for almost everything (the comma isn't currently
supported).
The "unspecified" type with floats also doesn't behave exactly like
python.
Tested under unix with float and double
Spot tested on stmhal
It's not completely satisfactory, because a failed call to __getattr__
should not raise an exception.
__setattr__ could be implemented, but it would slow down all stores to a
user created object. Need to implement some caching system.
Because it's runtime reflection feature, not required for many apps.
Rant time:
Python could really use better str() vs repr() distinction, for example,
repr(type) could be "<class 'foo'>" (as it is now), and str(type) just
"foo". But alas, getting straight name requires adhoc attribute.
Don't store final, failing value to the loop variable. This fix also
makes for .. range a bit more efficient, as it uses less store/load
pairs for the loop variable.
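The expected behaviour, as in standard Python:

```python
for i in range(3):
    pass
last = i                 # last value produced, not the failing value 3

j = "unset"
for j in range(0):
    pass
untouched = j            # empty range never stores to the loop variable
```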
There was a thinko that either send_value or throw_value is specified, but
there were cases with both. Note that send_value is pushed onto generator's
stack - but that's probably only good, because if we throw exception into
gen, it should not ever use send_value, and that will be just extra "assert".
In this case, the exception is just re-thrown - the idea is that object
doesn't handle this exception specially, so it will be propagated per Python
semantics.
.throw() propagates any exceptions, and .close() swallows them. Yielding
in response to .throw(GeneratorExit) is still fatal, and we need to
handle it for .throw() case separately (previously it was handled only
for .close() case).
Obscure corner cases due to test_pep380.py.
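The basic .throw() / .close() contrast, as standard Python behaves (a sketch
with made-up generator names; it does not try to cover the PEP 380 corner
cases):

```python
def worker(log):
    try:
        yield 1
        yield 2
    finally:
        log.append("cleanup")

def stubborn():
    try:
        yield 1
    except GeneratorExit:
        yield 2                      # yielding in response to GeneratorExit

log = []

g = worker(log)
next(g)
g.close()                            # GeneratorExit raised inside, swallowed by close()

g = worker(log)
next(g)
try:
    g.throw(ValueError("boom"))      # not handled inside, so it propagates out
    propagated = False
except ValueError:
    propagated = True

s = stubborn()
next(s)
try:
    s.close()
    fatal = False
except RuntimeError:                 # generator ignored GeneratorExit
    fatal = True
```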
Adding this bytecode allows to remove 4 others related to
function/method calls with * and ** support. Will also help with
bytecodes that make functions/closures with default positional and
keyword args.
One of the reasons for a separate "message" (besides still unfulfilled desire to
optimize memory usage) was apparent special handling of exceptions with
messages by CPython. Well, the message is still just an exception argument,
it is just printed specially. Implement that with PRINT_EXC printing format.
Pretty much everyone needs to include map.h, since it's such an integral
part of the Micro Python object implementation. Thus, the definitions
are now in obj.h instead. map.h is removed.
Mostly just a global search and replace. Except rt_is_true which
becomes mp_obj_is_true.
Still would like to tidy up some of the names, but this will do for now.
Required to reraise correct exceptions in except block, regardless if more
try blocks with active exceptions happen in the same except block.
P.S. This "automagic reraise" appears to be quite wasteful feature of Python
- we need to save pending exception just in case it *might* be reraised.
Instead, programmer could explicitly capture exception to a variable using
"except ... as var", and reraise that. So, consider disabling argless raise
support as an optimization.
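The case in question, as a plain Python sketch (the function name is
illustrative):

```python
def reraise_demo():
    try:
        raise ValueError("outer")
    except ValueError:
        try:
            raise KeyError("inner")
        except KeyError:
            pass             # inner exception handled and gone
        raise                # must re-raise the ValueError, not the KeyError

try:
    reraise_demo()
    reraised = None
except Exception as exc:
    reraised = type(exc).__name__
```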
The compiler allocates 7 entries on the stack for a with statement
(following CPython, but probably can be reduced). This is enough for
the method load and call in SETUP_WITH.
Partly (very partly!) addresses issue #386. Most importantly, at the
REPL command line, each invocation does not now lead to increased memory
usage (unless you define a function/lambda).
This redundant triple is one of the ugliest parts of Python, which they
chickened out to fix in Python3. We really should consider passing just
a single exception instance (without breaking Python-level APIs of course),
but until we do, let's follow CPython layout.
Rationale: setting up the stack (state for locals and exceptions) is
really part of the "code", it's the prelude of the function. For
example, native code adjusts the stack pointer on entry to the function.
Native code doesn't need to know n_state for any other reason. So
putting the state size in the bytecode prelude is sensible.
It reduced ROM usage on STM by about 30 bytes :) And makes it easier to
pass information about the bytecode between functions.
Originally, .methods was used for methods in a ROM class, and
locals_dict for methods in a user-created class. That distinction is
unnecessary, and we can use locals_dict for ROM classes now that we have
ROMable maps.
This removes an entry in the bloated mp_obj_type_t struct, saving a word
for each ROM object and each RAM object. ROM objects that have a
methods table (now a locals_dict) need an extra word in total (removed
the methods pointer (1 word), no longer need the sentinel (2 words), but
now need an mp_obj_dict_t wrapper (4 words)). But RAM objects save a
word because they never used the methods entry.
Overall the ROM usage is down by a few hundred bytes, and RAM usage is
down 1 word per user-defined type/class.
There is less code (no need to check 2 tables), and now consistent with
the way ROM modules have their tables initialised.
Efficiency is very close to equivalent.
This gets "value" of exceptions in the sense as it's defined for
StopIteration.value (i.e. args[0] or None).
TODO: This really should be inline function.
Return with value gets converted to StopIteration(value). Implementation
keeps optimizing against creating of possibly unneeded exception objects,
so there is considerable refactoring to implement these features.
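The Python-level behaviour being implemented, in a short sketch (helper names
are made up):

```python
def gen_with_result():
    yield 1
    return "done"            # becomes StopIteration("done")

def gen_plain():
    yield 1                  # falls off the end: StopIteration with no args

def final_value(g):
    """Exhaust g and return (value, args) of the terminating StopIteration."""
    try:
        while True:
            next(g)
    except StopIteration as exc:
        return exc.value, exc.args

with_result = final_value(gen_with_result())
plain = final_value(gen_plain())
```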