Commit Graph

3222 Commits

Author SHA1 Message Date
Damien George
af0932a779 py/modio: Add uio.IOBase class to allow to define user streams.
A user class derived from IOBase and implementing readinto/write/ioctl can
now be used anywhere a native stream object is accepted.

The mapping from C to Python is:

    stream_p->read  --> readinto(buf)
    stream_p->write --> write(buf)
    stream_p->ioctl --> ioctl(request, arg)

Among other things it allows the user to:

- create an object which can be passed as the file argument to print:
  print(..., file=myobj), and then print will pass all the data to the
  object via the objects write method (same as CPython)
- pass a user object to uio.BufferedWriter to buffer the writes (same as
  CPython)
- use select.select on a user object
- register user objects with select.poll, in particular so user objects can
  be used with uasyncio
- create user files that can be returned from user filesystems, and import
  can import scripts from these user files

For example:

    class MyOut(io.IOBase):
        def write(self, buf):
            print('write', repr(buf))
            return len(buf)

    print('hello', file=MyOut())

The feature is enabled via MICROPY_PY_IO_IOBASE which is disabled by
default.
2018-06-12 12:29:26 +10:00
Damien George
6a445b60fa py/lexer: Add support for underscores in numeric literals.
This is a very convenient feature introduced in Python 3.6 by PEP 515.
2018-06-12 12:17:43 +10:00
Damien George
522ea80f06 py/gc: Add gc_sweep_all() function to run all remaining finalisers.
This patch adds the gc_sweep_all() function which does a garbage collection
without tracing any root pointers, so frees all the memory, and most
importantly runs any remaining finalisers.

This helps primarily for soft reset: it will close any open files, any open
sockets, and help to get the system back to a clean state upon soft reset.
2018-06-12 11:55:29 +10:00
Damien George
36c1052183 py/objtype: Optimise instance get/set/del by skipping special accessors.
This patch is a code optimisation, trading text bytes for speed.  On
pyboard it's an increase of 0.06% in code size for a gain (in pystone
performance) of roughly 6.5%.

The patch optimises load/store/delete of attributes in user defined classes
by not looking up special accessors (@property, __get__, __delete__,
__set__, __setattr__ and __getattr_) if they are guaranteed not to exist in
the class.

Currently, if you do my_obj.foo() then the runtime has to do a few checks
to see if foo is a property or has __get__, and if so delegate the call.
And for stores things like my_obj.foo = 1 has to first check if foo is a
property or has __set__ defined on it.

Doing all those checks each and every time the attribute is accessed has a
performance penalty.  This patch eliminates all those checks for cases when
it's guaranteed that the checks will always fail, ie no attributes are
properties nor have any special accessor methods defined on them.

To make this guarantee it checks all attributes of a user-defined class
when it is first created.  If any of the attributes of the user class are
properties or have special accessors, or any of the base classes of the
user class have them, then it sets a flag in the class to indicate that
special accessors must be checked for.  Then in the load/store/delete code
it checks this flag to see if it can take the shortcut and optimise the
lookup.

It's an optimisation that's pretty widely applicable because it improves
lookup performance for all methods of user defined classes, and stores of
attributes, at least for those that don't have special accessors.  And, it
allows to enable descriptors with minimal additional runtime overhead if
they are not used for a particular user class.

There is one restriction on dynamic class creation that has been introduced
by this patch: a user-defined class cannot go from zero special accessors
to one special accessor (or more) after that class has been subclassed.  If
the script attempts this an AttributeError is raised (see addition to
tests/misc/non_compliant.py for an example of this case).

The cost in code space bytes for the optimisation in this patch is:

   unix x64:  +528
unix nanbox:  +508
      stm32:  +192
     cc3200:  +200
    esp8266:  +332
      esp32:  +244

Performance tests that were done:

- on unix x86-64, pystone improved by about 5%
- on pyboard, pystone improved by about 6.5%, from 1683 up to 1794
- on pyboard, bm_chaos (from CPython benchmark suite) improved by about 5%
- on esp32, pystone improved by about 30% (but there are caching effects)
- on esp32, bm_chaos improved by about 11%
2018-06-08 12:12:08 +10:00
Damien George
bace1a16d0 py/objtype: Don't expose mp_obj_instance_attr().
mp_obj_is_instance_type() can be used instead to check for instance types.
2018-06-08 11:48:25 +10:00
Damien George
db5d8c97f1 py/obj.h: Introduce a "flags" entry in mp_obj_type_t. 2018-06-08 11:48:25 +10:00
Damien George
a8b9e71ac1 py/mpconfig.h: Add default MICROPY_VFS_FAT config value.
At least to document it's existence.
2018-06-06 14:33:42 +10:00
Damien George
a93144cb65 py/reader: Allow MICROPY_VFS_POSIX to work with MICROPY_READER_POSIX. 2018-06-06 14:28:23 +10:00
Damien George
8d82b0edbd extmod: Add VfsPosix filesystem component.
This VFS component allows to mount a host POSIX filesystem within the uPy
VFS sub-system.  All traditional POSIX file access then goes through the
VFS, allowing to sandbox a uPy process to a certain sub-dir of the host
system, as well as mount other filesystem types alongside the host
filesystem.
2018-06-06 14:28:23 +10:00
Damien George
1427f8f593 py/stream: Move definition of mp_stream_p_t from obj.h to stream.h.
Since a long time now, mp_obj_type_t no longer refers explicitly to
mp_stream_p_t but rather to an abstract "const void *protocol".  So there's
no longer any need to define mp_stream_p_t in obj.h and it can go with all
its associated definitions in stream.h.  Pretty much all users of this type
will already include the stream header.
2018-06-04 16:53:17 +10:00
Jeff Epler
c60589c02b py/objtype: Fix assertion failures in super_attr by checking type.
Fixes assertion failures and segmentation faults when making calls like:

    super(1, 1).x
2018-05-30 11:14:07 +10:00
Jeff Epler
05b13fd292 py/objtype: Fix assertion failures in mp_obj_new_type by checking types.
Fixes assertion failures when the arguments to type() were not of valid
types, e.g., when making calls like:

    type("", (), 3)
    type("", 3, {})
2018-05-30 11:11:24 +10:00
Damien George
dfeaea1441 py/objtype: Remove TODO comment about needing to check for property.
Instance members are always treated as values, even if they are properties.
A test is added to show this is the case.
2018-05-25 10:59:40 +10:00
Damien George
18e6358480 py/emit: Combine setup with/except/finally into one emit function.
This patch reduces code size by:

   bare-arm:   -16
minimal x86:  -156
   unix x64:  -288
unix nanbox:  -184
      stm32:   -48
     cc3200:   -16
    esp8266:   -96
      esp32:   -16

The last 10 patches combined reduce code size by:

   bare-arm:  -164
minimal x86: -1260
   unix x64: -3416
unix nanbox: -1616
      stm32:  -676
     cc3200:  -232
    esp8266: -1144
      esp32:  -268
2018-05-23 00:35:16 +10:00
Damien George
436e0d4c54 py/emit: Merge build set/slice into existing build emit function.
Reduces code size by:

   bare-arm:    +0
minimal x86:    +0
   unix x64:  -368
unix nanbox:  -248
      stm32:  -128
     cc3200:   -48
    esp8266:  -184
      esp32:   -40
2018-05-23 00:23:36 +10:00
Damien George
d97906ca9a py/emit: Combine import from/name/star into one emit function.
Change in code size is:

   bare-arm:    +4
minimal x86:   -88
   unix x64:  -456
unix nanbox:   -88
      stm32:   -44
     cc3200:    +0
    esp8266:  -104
      esp32:    +8
2018-05-23 00:23:08 +10:00
Damien George
8a513da5a5 py/emit: Combine break_loop and continue_loop into one emit function.
Reduces code size by:

   bare-arm:    +0
minimal x86:    +0
   unix x64:   -80
unix nanbox:    +0
      stm32:   -12
     cc3200:    +0
    esp8266:   -28
      esp32:    +0
2018-05-23 00:23:04 +10:00
Damien George
6211d979ee py/emit: Combine load/store/delete attr into one emit function.
Reduces code size by:

   bare-arm:   -20
minimal x86:  -140
   unix x64:  -408
unix nanbox:  -140
      stm32:   -68
     cc3200:   -16
    esp8266:   -80
      esp32:   -32
2018-05-23 00:22:59 +10:00
Damien George
a4941a8ba4 py/emit: Combine load/store/delete subscr into one emit function.
Reduces code size by:

   bare-arm:    -8
minimal x86:  -104
   unix x64:  -312
unix nanbox:  -120
      stm32:   -60
     cc3200:   -16
    esp8266:   -92
      esp32:   -24
2018-05-23 00:22:55 +10:00
Damien George
d298013939 py/emit: Combine name and global into one func for load/store/delete.
Reduces code size by:

   bare-arm:   -56
minimal x86:  -300
   unix x64:  -576
unix nanbox:  -300
      stm32:  -164
     cc3200:   -56
    esp8266:  -236
      esp32:   -76
2018-05-23 00:22:47 +10:00
Damien George
26b5754092 py/emit: Combine build tuple/list/map emit funcs into one.
Reduces code size by:

   bare-arm:   -24
minimal x86:  -192
   unix x64:  -288
unix nanbox:  -184
      stm32:   -72
     cc3200:   -16
    esp8266:  -148
      esp32:   -32
2018-05-23 00:22:44 +10:00
Damien George
e686c94052 py/emit: Combine yield value and yield-from emit funcs into one.
Reduces code size by:

   bare-arm:   -24
minimal x86:   -72
   unix x64:  -200
unix nanbox:   -72
      stm32:   -52
     cc3200:   -32
    esp8266:   -84
      esp32:   -24
2018-05-23 00:22:35 +10:00
Damien George
0a25fff956 py/emit: Combine fast and deref into one function for load/store/delete.
Reduces code size by:

   bare-arm:   -16
minimal x86:  -208
   unix x64:  -408
unix nanbox:  -248
      stm32:   -12
     cc3200:   -24
    esp8266:   -96
      esp32:   -44
2018-05-23 00:22:20 +10:00
Damien George
400273a799 py/objgenerator: Protect against reentering a generator.
Generators that are already executing cannot be reexecuted.  This patch
puts in a check for such a case.

Thanks to @jepler for finding the bug.
2018-05-22 16:54:03 +10:00
Damien George
771cb359af py/objgenerator: Save state in old_globals instead of local variable.
The code_state.old_globals variable is there to save the globals state so
should be used for this purpose, to avoid the need for additional local
variables on the C stack.
2018-05-22 16:39:19 +10:00
Jan Klusacek
b318ebf101 py/modbuiltins: Add support for rounding integers.
As per CPython semantics.  This feature is controlled by
MICROPY_PY_BUILTINS_ROUND_INT which is disabled by default.
2018-05-22 14:18:16 +10:00
Damien George
f2ec792554 py/parsenum: Adjust braces so they are balanced. 2018-05-22 13:20:00 +10:00
Damien George
6bd78741c1 py/gc: When GC threshold is hit don't unnecessarily collect twice.
Without this, if GC threshold is hit and there is not enough memory left to
satisfy the request, gc_collect() will run a second time and the search for
memory will happen again and will fail again.

Thanks to @adritium for pointing out this issue, see #3786.
2018-05-21 13:36:21 +10:00
Jeff Epler
95e43efc99 py/objfloat: Fix undefined integer behavior hashing negative zero.
Under ubsan, when evaluating hash(-0.) the following diagnostic occurs:

    ../../py/objfloat.c:102:15: runtime error: negation of
    -9223372036854775808 cannot be represented in type 'mp_int_t' (aka
    'long'); cast to an unsigned type to negate this value to itself

So do just that, to tell the compiler that we want to perform this
operation using modulo arithmetic rules.
2018-05-21 12:49:56 +10:00
Jeff Epler
c4dafcef4f py/mpz: Avoid undefined behavior at integer overflow in mpz_hash.
Before this, ubsan would detect a problem when executing
hash(006699999999999999999999999999999999999999999999999999999999999999999999)

    ../../py/mpz.c:1539:20: runtime error: left shift of 1067371580458 by
    32 places cannot be represented in type 'mp_int_t' (aka 'long')

When the overflow does occur it now happens as defined by the rules of
unsigned arithmetic.
2018-05-21 12:48:26 +10:00
Jeff Epler
60eb5305f6 py/objfloat: Fix undefined shifting behavior in high-quality float hash.
When computing e.g. hash(0.4e3) with ubsan enabled, a diagnostic like the
following would occur:

    ../../py/objfloat.c:91:30: runtime error: shift exponent 44 is too
    large for 32-bit type 'int'

By casting constant "1" to the right type the intended value is preserved.
2018-05-21 12:42:22 +10:00
Jeff Epler
4f71a2a75a py/parsenum: Avoid undefined behavior parsing floats with large exponents.
Fuzz testing combined with the undefined behavior sanitizer found that
parsing unreasonable float literals like 1e+9999999999999 resulted in
undefined behavior due to overflow in signed integer arithmetic, and a
wrong result being returned.
2018-05-21 12:37:57 +10:00
Damien George
5efc575067 py/parsenum: Use int instead of mp_int_t for parsing float exponent.
There is no need to use the mp_int_t type which may be 64-bits wide, there
is enough bit-width in a normal int to parse reasonable exponents.  Using
int helps to reduce code size for 64-bit ports, especially nan-boxing
builds.  (Similarly for the "dig" variable which is now an unsigned int.)
2018-05-21 12:27:38 +10:00
Jeff Epler
bc6c0b28bf py/emitbc: Avoid undefined behavior calling memset() with NULL 1st arg.
Calling memset(NULL, value, 0) is not standards compliant so we must add an
explicit check that emit->label_offsets is indeed not NULL before calling
memset (this pointer will be NULL on the first pass of the parse tree and
it's more logical / safer to check this pointer rather than check that the
pass is not the first one).

Code sanitizers will warn if NULL is passed as the first value to memset,
and compilers may optimise the code based on the knowledge that any pointer
passed to memset is guaranteed not to be NULL.
2018-05-21 12:04:20 +10:00
Damien George
828ce16dc8 py/compile: Change comment about ITER_BUF_NSLOTS to a static assertion. 2018-05-18 23:31:00 +10:00
Damien George
43d08d6dd6 py/misc.h: Add MP_STATIC_ASSERT macro to do static assertions. 2018-05-18 23:31:00 +10:00
Li Weiwei
3e6ab82179 py/repl: Fix handling of unmatched brackets and unfinished quotes.
Before this patch:

    >>> print(')
    ... ')
    Traceback (most recent call last):
      File "<stdin>", line 1
    SyntaxError: invalid syntax

After this patch:

    >>> print(')
    Traceback (most recent call last):
      File "<stdin>", line 1
    SyntaxError: invalid syntax

This matches CPython and prevents getting stuck in REPL continuation when a
1-quote is unmatched.
2018-05-18 15:23:02 +10:00
Damien George
869024dd6e py/vm: Improve performance of opcode dispatch when using switch stmt.
Before this patch, when using the switch statement for dispatch in the VM
(not computed goto) a pending exception check was done after each opcode.
This is not necessary and this patch makes the pending exception check only
happen when explicitly requested by certain opcodes, like jump.  This
improves performance of the VM by about 2.5% when using the switch.
2018-05-18 11:47:03 +10:00
Damien George
46ce395130 py/vm: Use enum names instead of magic numbers in multi-opcode dispatch. 2018-05-18 11:44:26 +10:00
Tom Collins
a883fe12d9 py/objfun: Fix variable name in DECODE_CODESTATE_SIZE() macro.
This patch fixes the macro so you can pass any name in, and the macro will
make more sense if you're reading it on its own.  It worked previously
because n_state is always passed in as n_state_out_var.
2018-05-17 11:20:06 +10:00
Damien George
1b7487e519 py/vm: Adjust #if logic for gil_divisor so braces are balanced.
Having balanced braces { and } makes it easier to navigate the function.
2018-05-16 12:33:39 +10:00
Damien George
c97607db5c py/nlrx86: Use naked attribute on nlr_push for gcc 8.0 and higher.
gcc 8.0 supports the naked attribute for x86 systems so it can now be used
here.  And in fact it is necessary to use this for nlr_push because gcc 8.0
no longer generates a prelude for this function (even without the naked
attribute).
2018-05-15 11:17:28 +10:00
Damien George
749b16174b py/mpstate.h: Adjust start of root pointer section to exclude non-ptrs.
This patch moves the start of the root pointer section in mp_state_ctx_t
so that it skips entries that are not pointers and don't need scanning.

Previously, the start of the root pointer section was at the very beginning
of the mp_state_ctx_t struct (which is the beginning of mp_state_thread_t).
This was the original assembler version of the NLR code was hard-coded to
have the nlr_top pointer at the start of this state structure.  But now
that the NLR code is partially written in C there is no longer this
restriction on the location of nlr_top (and a comment to this effect has
been removed in this patch).

So now the root pointer section starts part way through the
mp_state_thread_t structure, after the entries which are not root pointers.

This patch also moves the non-pointer entries for MICROPY_ENABLE_SCHEDULER
outside the root pointer section.

Moving non-pointer entries out of the root pointer section helps to make
the GC more precise and should help to prevent some cases of collectable
garbage being kept.

This patch also has a measurable improvement in performance of the
pystone.py benchmark: on unix x86-64 and stm32 there was an improvement of
roughly 0.6% (tested with both gcc 7.3 and gcc 8.1).
2018-05-13 22:53:28 +10:00
Damien George
9630376dbc py/mpconfig.h: Be stricter when autodetecting machine endianness.
This patch changes 2 things in the endianness detection:

1. Don't assume that __BYTE_ORDER__ not being __ORDER_LITTLE_ENDIAN__ means
   that the machine is big endian, so add an explicit check that this macro
   is indeed __ORDER_BIG_ENDIAN__ (same with __BYTE_ORDER, __LITTLE_ENDIAN
   and __BIG_ENDIAN).  A machine could have PDP endianness.

2. Remove the checks which base their autodetection decision on whether any
   little or big endian macros are defined (eg __LITTLE_ENDIAN__ or
   __BIG_ENDIAN__).  Just because a system defines these does not mean it
   has that endianness.

See issue #3760.
2018-05-11 21:51:34 +10:00
Damien George
6046e68fe1 py/repl: Initialise q_last variable to prevent compiler warnings.
Some older compilers cannot deduce that q_last is always written to before
being read.
2018-05-11 13:48:47 +10:00
Damien George
095d397017 py/objdeque: Fix sign extension bug when computing len of deque object.
For cases where size_t is smaller than mp_int_t (eg nan-boxing builds) the
difference between two size_t's is not sign extended into mp_int_t and so
the result is never negative.  This patch fixes this bug by using ssize_t
for the type of the result.
2018-05-11 13:44:50 +10:00
Damien George
3678a6bdc6 py/modbuiltins: Make built-in dir support the __dir__ special method.
If MICROPY_PY_ALL_SPECIAL_METHODS is enabled then dir() will now delegate
to the special method __dir__ if the object it is listing has this method.
2018-05-10 23:14:23 +10:00
Damien George
29d28c2574 py/modbuiltins: In built-in dir make use of mp_load_method_protected.
This gives dir() better behaviour when listing the attributes of a user
type that defines __getattr__: it will now not list those attributes for
which __getattr__ raises AttributeError (meaning the attribute is not
supported by the object).
2018-05-10 23:07:19 +10:00
Damien George
7241d90272 py/repl: Use mp_load_method_protected to prevent leaking of exceptions.
This patch fixes the possibility of a crash of the REPL when tab-completing
an object which raises an exception when its attributes are accessed.

See issue #3729.
2018-05-10 23:05:43 +10:00
Damien George
529860643b py/modbuiltins: Make built-in hasattr work properly for user types.
It now allows __getattr__ in a user type to raise AttributeError when the
attribute does not exist.
2018-05-10 23:03:30 +10:00