char can be signedness, and using signedness types is dangerous - it can
lead to negative offsets when doing table lookups. We apparently should just
ban char usage.
This will allow roughly the same behavior as Python3 for non-ASCII strings,
for example, print("<phrase in non-Latin script>".split()) will print list
of words, not weird hex dump (like Python2 behaves). (Of course, that it
will print list of words, if there're "words" in that phrase at all, separated
by ASCII-compatible whitespace; that surely won't apply to every human
language in existence).
Functionality we provide in builtin io module is fairly minimal. Some
code, including CPython stdlib, depends on more functionality. So, there's
a choice to either implement it in C, or move it _io, and let implement other
functionality in Python. 2nd choice is pursued. This setup matches CPython
too (_io is builtin, io is Python-level).
Benefits: won't crash baremetal targets, will provide Python source location
when not implemented feature used (it will no longer provide C source
location, but just grep for error message).
there are special tweaks and paths to be considered. Just provide some
defaults, in case the values are undefined.
- py-version.sh does not need any bash specific features.
- Use libdl only on Linux for now. FreeBSD provides dl*() calls from libc.
Some small fixed:
- Combine 'x' and 'X' cases in str format code.
- Remove trailing spaces from some lines.
- Make exception messages consistently begin with lower case (then
needed to change those in objarray and objtuple so the same
constant string data could be used).
- Fix bug with exception message having %c instead of %%c.
Add keyword args to dict.update(), and ability to take a dictionary as
argument.
dict() class constructor can now use dict.update() directly.
This patch loses fast path for dict(other_dict), but is that really
needed? Any anyway, this idiom will now re-hash the dictionary, so is
arguably more memory efficient.
Addresses issue #647.
This may seem a bit of a risky change, in that it may introduce crazy
bugs with respect to volatile variables in the VM loop. But, I think it
should be fine: code_state points to some external memory, so the
compiler should always read/write to that memory when accessing the
ip/sp variables (ie not put them in registers).
Anyway, it passes all tests and improves on all efficiency fronts: about
2-4% faster (64-bit unix), 16 bytes less stack space per call (64-bit
unix) and slightly less executable size (unix and stmhal).
The reason it's more efficient is save_ip and save_sp were volatile
variables, so were anyway stored on the stack (in memory, not regs).
Thus converting them to code_state->{ip, sp} doesn't cost an extra
memory dereference (except maybe to get code_state, but that can be put
in a register and then made more efficient for other uses of it).
Conflicts:
py/vm.c
Fixed stack underflow check. Use UINT_FMT/INT_FMT where necessary.
Specify maximum VM-stack byte size by multiple of machine word size, so
that on 64 bit machines it has same functionality as 32 bit.
This improves stack usage in callers to mp_execute_bytecode2, and is step
forward towards unifying execution interface for function and generators
(which is important because generators don't even support full forms
of arguments passing (keywords, etc.)).
Needed to pop the iterator object when breaking out of a for loop. Need
also to be careful to unwind exception handler before popping iterator.
Addresses issue #635.
This helps the compiler do its optimisation, makes it clear which
variables are local per opcode and which global, and makes it consistent
when extra variables are needed in an opcode (in addition to old obj1,
obj2 pair, for example).
Could also make unum local, but that's for another time.
This completes non-automatic interning of strings in the parser, so that
doc strings don't take up RAM. It complicates the parser and compiler,
and bloats stmhal by about 300 bytes. It's complicated because now
there are 2 kinds of parse-nodes that can be strings: interned leaves
and non-interned structs.
io.FileIO is binary I/O, ans actually optional. Default file type is
io.TextIOWrapper, which provides str results. CPython3 explicitly describes
io.TextIOWrapper as buffered I/O, but we don't have buffering support yet
anyway.
Now schedule is: for native types, we call ->make_new() C-level method, which
should perform actions of __new__ and __init__ (note that this is not
compliant, but is efficient), but for user types, __new__ and __init__ are
called as expected.
Also, make sure we convert scalar attribute value to a bound-pair tight in
mp_obj_class_lookup() method, which avoids converting it again and again in
its callers.
__debug__ now resolves to True or False. Its value needs to be set by
mp_set_debug().
TODO: call mp_set_debug in unix/ port.
TODO: optimise away "if False:" statements in compiler.
Updated functions now do proper checking that n_kw==0, and are simpler
because they don't have to explicitly raise an exception. Down side is
that the error messages no longer include the function name, but that's
acceptable.
Saves order 300 text bytes on x64 and ARM.
This is not fully correct re: error handling, because we should check that
that types are used consistently (only str's or only bytes), but magically
makes lot of functions support bytes.
Two things are handled here: allow to compare native subtypes of tuple,
e.g. namedtuple (TODO: should compare type too, currently compared
duck-typedly by content). Secondly, allow user sunclasses of tuples
(and its subtypes) be compared either. "Magic" I did previously in
objtype.c covers only one argument (lhs is many), so we're in trouble
when lhs is native type - there's no other option besides handling
rhs in special manner. Fortunately, this patch outlines approach with
fast path for native types.
This was hit when trying to make urlparse.py from stdlib run. Took
quite some time to debug.
TODO: Reconsile bound method creation process better, maybe callable is
to generic type to bind at all?
Parser shouldn't raise exceptions, so needs to check when memory
allocation fails. This patch does that for the initial set up of the
parser state.
Also, we now put the parser object on the stack. It's small enough to
go there instead of on the heap.
This partially addresses issue #558.
"object" type in MicroPython currently doesn't implement any methods, and
hopefully, we'll try to stay like that for as long as possible. Even if we
have to add something eventually, look up from there might be handled in
adhoc manner, as last resort (that's not compliant with Python3 MRO, but
we're already non-compliant). Hence: 1) no need to spend type trying to
lookup anything in object; 2) no need to allocate subobject when explicitly
inheriting from object; 3) and having multiple bases inheriting from object
is not a case of incompatible multiple inheritance.
This patch simplifies the glue between native emitter and runtime,
and handles viper code like inline assember: return values are
converted to Python objects.
Fixes issue #531.
You can now do:
X = const(123)
Y = const(456 + X)
and the compiler will replace X and Y with their values.
See discussion in issue #266 and issue #573.
In case of empty non-blocking read()/write(), both return None. read()
cannot return 0, as that means EOF, so returns another value, and then
write() just follows. This is still pretty unexpected, and typical
"if not len:" check would treat this as EOF. Well, non-blocking files
require special handling!
This also kind of makes it depending on POSIX, but well, anything else
should emulate POSIX anyway ;-).
Need to have a policy as to how far we go adding keyword support to
built ins. It's nice to have, and gets better CPython compatibility,
but hurts the micro nature of uPy.
Addresses issue #577.
There are 2 locations in parser, and 1 in compiler, where memory
allocation is not precise. In the parser it's the rule stack and result
stack, in the compiler it's the array for the identifiers in the current
scope. All other mallocs are exact (ie they don't allocate more than is
needed).
This patch adds tuning options (MP_ALLOC_*) to mpconfig.h for these 3
inexact allocations.
The inexact allocations in the parser should actually be close to
logarithmic: you need an exponentially larger script (absent pathological
cases) to use up more room on the rule and result stacks. As such, the
default allocation policy for these is now to start with a modest sized
stack, but grow only in small increments.
For the identifier arrays in the compiler, these now start out quite
small (4 entries, since most functions don't have that many ids), and
grow incrementally by 6 (since if you have more ids than 4, you probably
have quite a few more, but it wouldn't be exponentially more).
Partially addresses issue #560.
This will work if MICROPY_DEBUG_PRINTERS is defined, which is only for
unix/windows ports. This makes it convenient to user uPy normally, but
easily get bytecode dump on the spot if needed, without constant recompiles
back and forth.
TODO: Add more useful debug output, adjust verbosity level on which
specifically bytecode dump happens.
Blanket wide to all .c and .h files. Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.