Also add some more debugging output to gc_dump_alloc_table().
Now that newly allocated heap is always zero'd, maybe we just make this
a policy for the uPy API to keep it simple (ie any new implementation of
memory allocation must zero all allocations). This follows the D
language philosophy.
Before this patch, a previously used memory block which had pointers in
it may still retain those pointers if the new user of that block does
not actually use the entire block. Eg, if I want 5 blocks worth of
heap, I actually get 8 (round up to nearest 4). Then I never use the
last 3, so they keep their old values, which may be pointers pointing to
the heap, hence preventing GC.
In rare (or maybe not that rare) cases, this leads to long, unintentional
"linked lists" within the GC'd heap, filling it up completely. It's
pretty rare, because you have to reuse exactly that memory which is part
of this "linked list", and reuse it in just the right way.
This should fix issue #522, and might have something to do with
issue #510.
3 emitter functions are needed only for emitcpy, and so we can #if them
out when compiling with emitcpy support.
Also remove unused SETUP_LOOP bytecode.
Closed over variables are now passed on the stack, instead of creating a
tuple and passing that. This way memory for the closed over variables
can be allocated within the closure object itself. See issue #510 for
background.
There were typos, various rounding errors trying to do concurrent counting
in bytes vs blocks, complex conditional paths, superfluous variables, etc.,
etc., all leading to obscure segfaults.
These are to assist in writing native C functions that take positional
and keyword arguments. mp_arg_check_num is for just checking the
number of arguments is correct. mp_arg_parse_all is for parsing
positional and keyword arguments with default values.
When querying an object that supports the buffer protocol, that object
must now return a typecode (as per binary.[ch]). This does not have to
be honoured by the caller, but can be useful for determining element
size.
Test usecase I used is print(time.time()) and print(time.time() - time.time()).
On Linux/Glibc they now give the same output as CPython 3.3. Specifically,
time.time() gives non-exponential output with 7 decimal digits, and subtraction
gives exponential output e-06/e-07.
On stmhal, computed gotos make the binary about 1k bigger, but makes it
run faster, and we have the room, so why not. All tests pass on
pyboard using computed gotos.
This follows pattern already used for objtuple, etc.: objfun.h's content
is not public - each and every piece of code should not have access to it.
It's not private either - with out architecture and implementation language
(C) it doesn't make sense to keep implementation of each object strictly
private and maintain cumbersome accessors. It's "local" - intended to be
used by a small set of "friend" (in C++ terms) objects.
Things get tricky when using the nlr code to catch exceptions. Need to
ensure that the variables (stack layout) in the exception handler are
the same as in the bit protected by the exception handler.
Prior to this patch there were a few bugs. 1) The constant
mp_const_MemoryError_obj was being preloaded to a specific location on
the stack at the start of the function. But this location on the stack
was being overwritten in the opcode loop (since it didn't think that
variable would ever be referenced again), and so when an exception
occurred, the variable holding the address of MemoryError was corrupt.
2) The FOR_ITER opcode detection in the exception handler used sp, which
may or may not contain the right value coming out of the main opcode
loop.
With this patch there is a clear separation of variables used in the
opcode loop and in the exception handler (should fix issue (2) above).
Furthermore, nlr_raise is no longer used in the opcode loop. Instead,
it jumps directly into the exception handler. This tells the C compiler
more about the possible code flow, and means that it should have the
same stack layout for the exception handler. This should fix issue (1)
above. Indeed, the generated (ARM) assembler has been checked explicitly,
and with 'goto exception_handler', the problem with &MemoryError is
fixed.
This may now fix problems with rge-sm, and probably many other subtle
bugs yet to show themselves. Incidentally, rge-sm now passes on
pyboard (with a reduced range of integration)!
Main lesson: nlr is tricky. Don't use nlr_push unless you know what you
are doing! Luckily, it's not used in many places. Using nlr_raise/jump
is fine.
The autogenerated header files have been moved about, and an extra
include dir has been added, which means you can give a custom
BUILD=newbuilddir option to make, and everything "just works"
Also tidied up the way the different Makefiles build their include-
directory flags
That was easy - just avoid erroring out on seeing candidate dir for namespace
package. That's far from being complete though - namespace packages should
support importing portions of package from different sys.path entries, here
we require first matching entry to contain all namespace package's portions.
And yet, that's a way to put parts of the same Python package into multiple
installable package - something we really need for *Micro*Python.
The logic appears to be that (at least beginning of) sys.versions is the
version of reference Python language implemented, not version of particular
implementation.
Also, bump set versions at 3.4.0, based on @dpgeorge preference.
Attempt to address issue #386. unique_code_id's have been removed and
replaced with a pointer to the "raw code" information. This pointer is
stored in the actual byte code (aligned, so the GC can trace it), so
that raw code (ie byte code, native code and inline assembler) is kept
only for as long as it is needed. In memory it's now like a tree: the
outer module's byte code points directly to its children's raw code. So
when the outer code gets freed, if there are no remaining functions that
need the raw code, then the children's code gets freed as well.
This is pretty much like CPython does it, except that CPython stores
indexes in the byte code rather than machine pointers. These indices
index the per-function constant table in order to find the relevant
code.
Improved the Thumb assembler back end. Added many more Thumb
instructions to the inline assembler. Improved parsing of assembler
instructions and arguments. Assembler functions can now be passed the
address of any object that supports the buffer protocol (to get the
address of the buffer). Added an example of how to sum numbers from
an array in assembler.
This is necessary to catch all cases where locals are referenced before
assignment. We still keep the _0, _1, _2 versions of LOAD_FAST to help
reduced the byte code size in RAM.
Addresses issue #457.
I'm pretty sure these are never reached, since NOT_EQUAL is always
converted into EQUAL in mp_binary_op. No one should call
type.binary_op directly, they should always go through mp_binary_op
(or mp_obj_is_equal).
Per https://docs.python.org/3.3/reference/import.html , this is the way to
tell module from package: "Specifically, any module that contains a __path__
attribute is considered a package." And it for sure will be needed to
implement relative imports.
This simplifies the compiler a little, since now it can do 1 pass over
a function declaration, to determine default arguments. I would have
done this originally, but CPython 3.3 somehow had the default keyword
args compiled before the default position args (even though they appear
in the other order in the text of the script), and I thought it was
important to have the same order of execution when evaluating default
arguments. CPython 3.4 has changed the order to the more obvious one,
so we can also change.
It has (again) a fast path for ints, and a simplified "slow" path for
everything else.
Also simplify the way str indexing is done (now matches tuple and list).
A specific target can define either MP_ENDIANNESS_LITTLE or MP_ENDIANNESS_BIG
to 1. Default is MP_ENDIANNESS_LITTLE.
TODO: Autodetect based on compiler predefined macros?
Working towards trying to support compile-time constants (see discussion
in issue #227), this patch allows the compiler to look inside arbitrary
uPy objects at compile time. The objects to search are given by the
macro MICROPY_EXTRA_CONSTANTS (so they must be constant/ROM objects),
and the constant folding occures on forms base.attr (both base and attr
must be id's).
It works, but it breaks strict CPython compatibility, since the lookup
will succeed even without importing the namespace.
Previously, a failed malloc/realloc would throw an exception, which was
not caught. I think it's better to keep the parser free from NLR
(exception throwing), hence this patch.
Only calcsize() and unpack() functions provided so far, for little-endian
byte order. Format strings don't support repition spec (like "2b3i").
Unfortunately, dealing with all the various binary type sizes and alignments
will lead to quite a bloated "binary" helper functions - if optimizing for
speed. Need to think if using dynamic parametrized algos makes more sense.
With the implementation of proper string formatting, code to print a
small int was delegated to mpz_as_str_inpl (after first converting the
small int to an mpz using stack memory). But mpz_as_str_inpl allocates
heap memory to do the conversion, so small ints needed heap memory just
to be printed.
This fix has a separate function to print small ints, which does not
allocate heap, and allocates less stack.
String formatting, printf and pfenv are now large beasts, with some
semi-duplicated code.
These two are apprerently the most concise and efficient way to convert
int to/from bytes in Python. The alternatives are struct and array modules,
but methods using them are more verbose in Python code and less efficient
in memory/cycles.
Full CPython compatibility with this requires actually parsing the
input so far collected, and if it fails parsing due to lack of tokens,
then continue collecting input. It's not worth doing it this way. Not
having compatibility at this level does not hurt the goals of Micro
Python.
stmhal relies on pfenv_* to implement its printf. Thus, it needs a
pfenv_print_int which prints a proper 32-bit integer. With latest
change to pfenv, this function became one that took mp_obj_t, and
extracted the integer value from that object.
To fix temporarily, pfenv_print_int has been renamed to
pfenv_print_mp_int (to indicate it takes a mp_obj_t for the int), and
pfenv_print_int has been added (which takes a normal C int). Currently,
pfenv_print_int proxies to pfenv_print_mp_int, but this means it looses
the MSB. Need to find a way to fix this, but the only way I can think
of will duplicate lots of code.
Two things: 1) set flags in copy properly; make mp_map_init() not be too
smart and do something with requested alloc size. Policy of using prime
numbers for alloc size is high-level policy which should be applied at
corresponding high levels. Low-level functions should just do what they're
asked to, because they don't have enough context to be smarter than that.
For example, munging with alloc size of course breaks dict copying (as
changing sizes requires rehashing).
Based on the discussion in #433. mp_load_attr() is critical-path function,
so any extra check will slowdown any script. As supporting default val
required only for getattr() builtin, move correspending implementation
there (still as a separate function due to concerns of maintainability
of such almost-duplicated code instances).
This is to reduce ROM usage. stream_p is used in file and socket types
only (at the moment), so seems a good idea to make the protocol
functions a pointer instead of the actual structure.
It saves 308 bytes of ROM in the stmhal/ port, 928 in unix/.
Finishes addressing issue #424.
In the end this was a very neat refactor that now makes things a lot
more consistent across the py code base. It allowed some
simplifications in certain places, now that everything is a dict object.
Also converted builtins tables to dictionaries. This will be useful
when we need to turn builtins into a proper module.
When searching next time, such entry should be just skipped, not terminate
the search. It's known that marking techique is not efficient at the presense
of many removes, but namespace usage should not require many deletes, and
as for user dictionaries - well, open addressing map table with linear
rehashing and load factor of ~1 is not particularly efficient at all ;-).
TODO: May consider "shift other entries in cluster" approach as an
alternative.
Very little has changed. In Python 3.4 they removed the opcode
STORE_LOCALS, but in Micro Python we only ever used this for CPython
compatibility, so it was a trivial thing to remove. It also allowed to
clean up some dead code (eg the 0xdeadbeef in class construction), and
now class builders use 1 less stack word.
Python 3.4.0 introduced the LOAD_CLASSDEREF opcode, which I have not
yet understood. Still, all tests (apart from bytecode test) still pass.
Bytecode tests needs some more attention, but they are not that
important anymore.
This adds support for almost everything (the comma isn't currently
supported).
The "unspecified" type with floats also doesn't behave exactly like
python.
Tested under unix with float and double
Spot tested on stmhal
It's not completely satisfactory, because a failed call to __getattr__
should not raise an exception.
__setattr__ could be implemented, but it would slow down all stores to a
user created object. Need to implement some caching system.
Because it's runtime reflection feature, not required for many apps.
Rant time:
Python could really use better str() vs repr() distinction, for example,
repr(type) could be "<class 'foo'>" (as it is now), and str(type) just
"foo". But alas, getting straight name requires adhoc attribute.
Don't store final, failing value to the loop variable. This fix also
makes for .. range a bit more efficient, as it uses less store/load
pairs for the loop variable.
There was thinkos that either send_value or throw_value is specified, but
there were cases with both. Note that send_value is pushed onto generator's
stack - but that's probably only good, because if we throw exception into
gen, it should not ever use send_value, and that will be just extra "assert".
In this case, the exception is just re-thrown - the ideas is that object
doesn't handle this exception specially, so it will propagated per Python
semantics.
.throw() propagates any exceptions, and .close() swallows them. Yielding
in reponse to .throw(GeneratorExit) is still fatal, and we need to
handle it for .throw() case separately (previously it was handled only
for .close() case).
Obscure corner cases due to test_pep380.py.
Adding this bytecode allows to remove 4 others related to
function/method calls with * and ** support. Will also help with
bytecodes that make functions/closures with default positional and
keyword args.
One of the reason for separate "message" (besides still unfulfilled desire to
optimize memory usage) was apparent special handling of exception with
messages by CPython. Well, the message is still just an exception argument,
it just printed specially. Implement that with PRINT_EXC printing format.