All uses of this are either tiny strings or not-known-to-be-safe.
Update comments for mp_obj_new_str_copy and mp_obj_new_str_of_type.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Since f7f56d4285 consolidated all uses of
these to a single locals dict, they no longer need to be made public.
Signed-off-by: Damien George <damien@micropython.org>
These were added in Python 3.5.
Enabled via MICROPY_PY_BUILTINS_BYTES_HEX, and enabled by default for all
ports that currently have ubinascii.
Rework ubinascii to use the implementation of these methods.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This commit adds the bytes methods to bytearray, matching CPython. The
existing implementations of these methods for str/bytes are reused for
bytearray with minor updates to match CPython return types.
For details on the CPython behaviour see
https://docs.python.org/3/library/stdtypes.html#bytes-and-bytearray-operations
The work to merge locals tables for str/bytes/bytearray/array was done by
@jimmo. Because of this merging of locals the change in code size for this
commit is mostly negative:
bare-arm: +0 +0.000%
minimal x86: +29 +0.018%
unix x64: -792 -0.128% standard[incl -448(data)]
unix nanbox: -436 -0.078% nanbox[incl -448(data)]
stm32: -40 -0.010% PYBV10
cc3200: -32 -0.017%
esp8266: -28 -0.004% GENERIC
esp32: -72 -0.005% GENERIC[incl -200(data)]
mimxrt: -40 -0.011% TEENSY40
renesas-ra: -40 -0.006% RA6M2_EK
nrf: -16 -0.009% pca10040
rp2: -64 -0.013% PICO
samd: +148 +0.105% ADAFRUIT_ITSYBITSY_M4_EXPRESS
The hash is either 8 or 16 bits (depending on MICROPY_QSTR_BYTES_IN_HASH)
so will fit in a size_t.
This saves 268 bytes on the unix nanbox build. Non-nanbox configurations
are unchanged because mp_uint_t is the same size as size_t.
Signed-off-by: Damien George <damien@micropython.org>
Reuse the implementation for bytes since it works the same way regardless
of the underlying type. This method gets added for CPython compatibility
of bytearray, but to keep the code simple and small array.array now also
has a working decode method, which is non-standard but doesn't hurt.
These macros could in principle be (inline) functions so it makes sense to
have them lower case, to match the other C API functions.
The remaining macros that are upper case are:
- MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR
- MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE
- MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE
- MP_OBJ_FUN_MAKE_SIG
- MP_DECLARE_CONST_xxx
- MP_DEFINE_CONST_xxx
These must remain macros because they are used when defining const data (at
least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have
MP_OBJ_SMALL_INT_VALUE also a macro).
For those macros that have been made lower case, compatibility macros are
provided for the old names so that users do not need to change their code
immediately.
The function mp_obj_new_str_of_type is a general str object constructor
used in many places in the code to create either a str or bytes object.
When creating a str it should first check if the string data already exists
as an interned qstr, and if so then return the qstr object. This patch
makes the function have such behaviour, which helps to reduce heap usage by
reusing existing interned data where possible.
The old behaviour of mp_obj_new_str_of_type (which didn't check for
existing interned data) is made available through the function
mp_obj_new_str_copy, but should only be used in very special cases.
One consequence of this patch is that the following expression is now True:
'abc' is ' abc '.split()[0]
The unary-op/binary-op enums are already defined, and there are no
arithmetic tricks used with these types, so it makes sense to use the
correct enum type for arguments that take these values. It also reduces
code size quite a bit for nan-boxing builds.
The code conventions suggest using header guards, but do not define how
those should look like and instead point to existing files. However, not
all existing files follow the same scheme, sometimes omitting header guards
altogether, sometimes using non-standard names, making it easy to
accidentally pick a "wrong" example.
This commit ensures that all header files of the MicroPython project (that
were not simply copied from somewhere else) follow the same pattern, that
was already present in the majority of files, especially in the py folder.
The rules are as follows.
Naming convention:
* start with the words MICROPY_INCLUDED
* contain the full path to the file
* replace special characters with _
In addition, there are no empty lines before #ifndef, between #ifndef and
one empty line before #endif. #endif is followed by a comment containing
the name of the guard macro.
py/grammar.h cannot use header guards by design, since it has to be
included multiple times in a single C file. Several other files also do not
need header guards as they are only used internally and guaranteed to be
included only once:
* MICROPY_MPHALPORT_H
* mpconfigboard.h
* mpconfigport.h
* mpthreadport.h
* pin_defs_*.h
* qstrdefs*.h
In order to have more fine-grained control over how builtin functions are
constructed, the MP_DECLARE_CONST_FUN_OBJ macros are made more specific,
with suffix of _0, _1, _2, _3, _VAR, _VAR_BETEEN or _KW. These names now
match the MP_DEFINE_CONST_FUN_OBJ macros.
Disabled by default, enabled in unix port. Need for this method easily
pops up when working with text UI/reporting, and coding workalike
manually again and again counter-productive.
The first argument to the type.make_new method is naturally a uPy type,
and all uses of this argument cast it directly to a pointer to a type
structure. So it makes sense to just have it a pointer to a type from
the very beginning (and a const pointer at that). This patch makes
such a change, and removes all unnecessary casting to/from mp_obj_t.
This patch changes the type signature of .make_new and .call object method
slots to use size_t for n_args and n_kw (was mp_uint_t. Makes code more
efficient when mp_uint_t is larger than a machine word. Doesn't affect
ports when size_t and mp_uint_t have the same size.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.
This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
This saves around 1000 bytes (Thumb2 arch) because in repr "C" it is
costly to check and extract a qstr. So making such check/extract a
function instead of a macro saves lots of code space.
Previous to this patch the printing mechanism was a bit of a tangled
mess. This patch attempts to consolidate printing into one interface.
All (non-debug) printing now uses the mp_print* family of functions,
mainly mp_printf. All these functions take an mp_print_t structure as
their first argument, and this structure defines the printing backend
through the "print_strn" function of said structure.
Printing from the uPy core can reach the platform-defined print code via
two paths: either through mp_sys_stdout_obj (defined pert port) in
conjunction with mp_stream_write; or through the mp_plat_print structure
which uses the MP_PLAT_PRINT_STRN macro to define how string are printed
on the platform. The former is only used when MICROPY_PY_IO is defined.
With this new scheme printing is generally more efficient (less layers
to go through, less arguments to pass), and, given an mp_print_t*
structure, one can call mp_print_str for efficiency instead of
mp_printf("%s", ...). Code size is also reduced by around 200 bytes on
Thumb2 archs.
splitlines() occurs ~179 times in CPython3 standard library, so was
deemed worthy to implement. The method has subtle semantic differences
from just .split("\n"). It is also defined as working for any end-of-line
combination, but this is currently not implemented - it works only with
LF line-endings (which should be OK for text strings on any platforms,
but not OK for bytes).
There was really weird warning (promoted to error) when building Windows
port. Exact cause is still unknown, but it uncovered another issue:
8-bit and unicode str_make_new implementations should be mutually exclusive,
and not built at the same time. What we had is that bytes_decode() pulled
8-bit str_make_new() even for unicode build.