Commit Graph

305 Commits

Author SHA1 Message Date
pohmelie
e3a29de1dc py/objstr: For str.format, add nested/computed fields support.
Eg: '{:{}}'.format(123, '>20')

@pohmelie was the original author of this patch, but @dpgeorge made
significant changes to reduce code size and improve efficiency.
2016-02-02 16:25:24 +00:00
Damien George
22d85ec5be py: Use new code pattern for parsing kw args with mp_arg_parse_all.
Makes code easier to read and more maintainable.
2016-01-13 15:47:56 +00:00
Damien George
5b3f0b7f39 py: Change first arg of type.make_new from mp_obj_t to mp_obj_type_t*.
The first argument to the type.make_new method is naturally a uPy type,
and all uses of this argument cast it directly to a pointer to a type
structure.  So it makes sense to just have it a pointer to a type from
the very beginning (and a const pointer at that).  This patch makes
such a change, and removes all unnecessary casting to/from mp_obj_t.
2016-01-11 00:49:27 +00:00
Damien George
4b72b3a133 py: Change type signature of builtin funs that take variable or kw args.
With this patch the n_args parameter is changed type from mp_uint_t to
size_t.
2016-01-11 00:49:27 +00:00
Damien George
a0c97814df py: Change type of .make_new and .call args: mp_uint_t becomes size_t.
This patch changes the type signature of .make_new and .call object method
slots to use size_t for n_args and n_kw (was mp_uint_t.  Makes code more
efficient when mp_uint_t is larger than a machine word.  Doesn't affect
ports when size_t and mp_uint_t have the same size.
2016-01-11 00:48:41 +00:00
Damien George
d4df8f4925 py/objstr: In str.format, handle case of no format spec for string arg.
Handles, eg, "{:>20}".format("foo"), where there is no explicit spec for
the type of the argument.
2016-01-04 13:13:39 +00:00
Damien George
8212d97317 py: Use polymorphic iterator type where possible to reduce code size.
Only types whose iterator instances still fit in 4 machine words have
been changed to use the polymorphic iterator.

Reduces Thumb2 arch code size by 264 bytes.
2016-01-03 16:27:55 +00:00
Paul Sokolovsky
d50f649cf8 py/objstr: Applying % (format) operator to bytes should return bytes, not str. 2015-12-20 16:52:11 +02:00
Paul Sokolovsky
ef63ab5724 py/objstr: Make sure that b"%s" % b"foo" uses undecorated bytes value.
I.e. the expected result for above is b"foo", whereas previously we got
b"b'foo'".
2015-12-20 16:51:59 +02:00
Damien George
999cedb90f py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.

This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
2015-11-29 14:25:35 +00:00
Damien George
cbf7674025 py: Add MP_ROM_* macros and mp_rom_* types and use them. 2015-11-29 14:25:04 +00:00
Damien George
c3f64d9799 py: Change qstr_* functions to use size_t as the type for str len arg. 2015-11-29 14:25:04 +00:00
Damien George
04353cc85e py: With obj repr "C", change raw str accessor from macro to function.
This saves around 1000 bytes (Thumb2 arch) because in repr "C" it is
costly to check and extract a qstr.  So making such check/extract a
function instead of a macro saves lots of code space.
2015-10-20 12:38:54 +01:00
Damien George
aaef1851a7 py: Add mp_obj_is_float function (macro) and use it where appropriate. 2015-10-20 12:35:17 +01:00
Paul Sokolovsky
1b586f3a73 py: Rename MP_BOOL() to mp_obj_new_bool() for consistency in naming. 2015-10-11 15:18:15 +03:00
Damien George
3a2171e406 py: Eliminate some cases which trigger unused parameter warnings. 2015-09-04 16:53:46 +01:00
Damien George
42cec5c893 py/objstr: Check for keyword args before checking for no posn args.
Otherwise something like bytes(abc=123) will succeed.
2015-09-04 16:51:55 +01:00
Damien George
55b11e6d38 py/objstr: For str.endswith(s, start) raise NotImpl instead of assert. 2015-09-04 16:49:56 +01:00
Damien George
821b7f22fe py: Use mp_not_implemented consistently for not implemented features. 2015-09-03 23:14:06 +01:00
Damien George
e2aa117798 py/objstr: Simplify printing of bytes objects when unicode enabled. 2015-09-03 23:03:57 +01:00
Damien George
516982242d py: Inline single use of mp_obj_str_get_len in mp_obj_len_maybe.
Gets rid of redundant double check for string type.

Also remove obsolete declaration of mp_obj_str_get_hash.
2015-09-03 23:01:07 +01:00
Damien George
22602cc37b py/objstr: Make str.rsplit(None,n) raise NotImpl instead of assert(0). 2015-09-01 15:35:31 +01:00
Damien George
000730ecaa py/objstr: Simplify error handling for bad conversion specifier. 2015-08-30 12:43:21 +01:00
Damien George
b648e98ad0 py/objstr: Fix error reporting for unexpected end of modulo format str. 2015-08-29 23:13:51 +01:00
Damien George
7ef75f9f75 py/objstr: Fix error type for badly formatted format specifier.
Was KeyError, should be ValueError.
2015-08-29 23:13:51 +01:00
Damien George
51b9a0d0c4 py/objstr: Make string formatting 8-bit clean. 2015-08-29 23:13:51 +01:00
Dave Hylands
9f76dcd682 py: Prevent many extra vstr allocations.
I checked the entire codebase, and every place that vstr_init_len
was called, there was a call to mp_obj_new_str_from_vstr after it.

mp_obj_new_str_from_vstr always tries to reallocate a new buffer
1 byte larger than the original to store the terminating null
character.

In many cases, if we allocated the initial buffer to be 1 byte
longer, we can prevent this extra allocation, and just reuse
the originally allocated buffer.

Asking to read 256 bytes and only getting 100 will still cause
the extra allocation, but if you ask to read 256 and get 256
then the extra allocation will be optimized away.

Yes - the reallocation is optimized in the heap to try and reuse
the buffer if it can, but it takes quite a few cycles to figure
this out.

Note by Damien: vstr_init_len should now be considered as a
string-init convenience function and used only when creating
null-terminated objects.
2015-07-06 17:29:27 +01:00
Paul Sokolovsky
f44cc517a2 objstr: Add note that replace() is nicely optimized.
Doesn't allocate memory and returns original string if no replacements are
to be made.
2015-06-26 17:35:12 +03:00
Damien George
79474c6b16 py: Remove unnecessary extra handling of padding of nan/inf.
C's printf will pad nan/inf differently to CPython.  Our implementation
originally conformed to C, now it conforms to CPython's way.

Tests for this are also added in this patch.
2015-05-28 14:22:12 +00:00
Damien George
44e7cbf019 py: Clean up declarations of str type/funcs that are also in unicode.
Background: trying to make an amalgamation of all the code gave some
errors with redefined types and inconsistent use of static.
2015-05-17 16:44:24 +01:00
Damien George
c2a4e4effc py: Convert hash API to use MP_UNARY_OP_HASH instead of ad-hoc function.
Hashing is now done using mp_unary_op function with MP_UNARY_OP_HASH as
the operator argument.  Hashing for int, str and bytes still go via
fast-path in mp_unary_op since they are the most common objects which
need to be hashed.

This lead to quite a bit of code cleanup, and should be more efficient
if anything.  It saves 176 bytes code space on Thumb2, and 360 bytes on
x86.

The only loss is that the error message "unhashable type" is now the
more generic "unsupported type for __hash__".
2015-05-12 22:46:02 +01:00
Damien George
ede0f3ab3d py: Add optional code to check bytes constructor values are in range.
Compiled in only if MICROPY_CPYTHON_COMPAT is set.

Addresses issue #1093.
2015-04-23 15:28:18 +01:00
Damien George
7f9d1d6ab9 py: Overhaul and simplify printf/pfenv mechanism.
Previous to this patch the printing mechanism was a bit of a tangled
mess.  This patch attempts to consolidate printing into one interface.

All (non-debug) printing now uses the mp_print* family of functions,
mainly mp_printf.  All these functions take an mp_print_t structure as
their first argument, and this structure defines the printing backend
through the "print_strn" function of said structure.

Printing from the uPy core can reach the platform-defined print code via
two paths: either through mp_sys_stdout_obj (defined pert port) in
conjunction with mp_stream_write; or through the mp_plat_print structure
which uses the MP_PLAT_PRINT_STRN macro to define how string are printed
on the platform.  The former is only used when MICROPY_PY_IO is defined.

With this new scheme printing is generally more efficient (less layers
to go through, less arguments to pass), and, given an mp_print_t*
structure, one can call mp_print_str for efficiency instead of
mp_printf("%s", ...).  Code size is also reduced by around 200 bytes on
Thumb2 archs.
2015-04-16 14:30:16 +00:00
Paul Sokolovsky
8b7faa31e1 objstr: split(None): Fix whitespace properly. 2015-04-12 00:17:57 +03:00
Damien George
2801e6fad8 py: Some trivial cosmetic changes, for code style consistency. 2015-04-04 15:53:11 +01:00
Paul Sokolovsky
7f59b4b2ca objstr: Fix bugs introduced by inability to have shadow variables.
Warnings lead to programming errors - as expected.
2015-04-04 01:55:40 +03:00
Paul Sokolovsky
acf6aec71c objstr: Avoid variable shadowing. 2015-04-04 01:24:59 +03:00
Paul Sokolovsky
ac2f7a7f6a objstr: Add .splitlines() method.
splitlines() occurs ~179 times in CPython3 standard library, so was
deemed worthy to implement. The method has subtle semantic differences
from just .split("\n"). It is also defined as working for any end-of-line
combination, but this is currently not implemented - it works only with
LF line-endings (which should be OK for text strings on any platforms,
but not OK for bytes).
2015-04-04 00:09:48 +03:00
Paul Sokolovsky
8705171233 objstr: Expose mp_obj_str_split() for reuse in other modules. 2015-03-23 22:43:37 +02:00
Damien George
fa1edff006 py: Remove unnecessary and unused sgn argument from pfenv_print_mp_int. 2015-03-14 22:32:40 +00:00
Paul Sokolovsky
194117a066 objstr: Fix bytes creation from array of long ints. 2015-02-09 12:11:49 +08:00
Damien George
827b0f747b py: Change vstr_null_terminate -> vstr_null_terminated_str, returns str. 2015-01-29 13:57:23 +00:00
Damien George
0d3cb6726d py: Change vstr so that it doesn't null terminate buffer by default.
This cleans up vstr so that it's a pure "variable buffer", and the user
can decide whether they need to add a terminating null byte.  In most
places where vstr is used, the vstr did not need to be null terminated
and so this patch saves code size, a tiny bit of RAM, and makes vstr
usage more efficient.  When null termination is needed it must be
done explicitly using vstr_null_terminate.
2015-01-28 23:43:01 +00:00
Paul Sokolovsky
bbd9251bac py: bytes(): Make sure we add values as bytes, not as chars. 2015-01-28 22:29:07 +02:00
Damien George
98e3a64694 py: Remove duplicated mp_obj_str_make_new function from objstrunicode.c. 2015-01-28 14:14:57 +00:00
Paul Sokolovsky
344e15b1ae objstr: Remove code duplication and unbreak Windows build.
There was really weird warning (promoted to error) when building Windows
port. Exact cause is still unknown, but it uncovered another issue:
8-bit and unicode str_make_new implementations should be mutually exclusive,
and not built at the same time. What we had is that bytes_decode() pulled
8-bit str_make_new() even for unicode build.
2015-01-23 02:15:56 +02:00
Paul Sokolovsky
6113eb2f33 objstr*: Use separate names for locals_dict of 8-bit and unicode str's.
To somewhat unbreak -DSTATIC="" compile.
2015-01-23 02:05:58 +02:00
Damien George
77089bebd4 py: Add comments for vstr_init and mp_obj_new_str. 2015-01-21 23:18:02 +00:00
Damien George
05005f679e py: Remove mp_obj_str_builder and use vstr instead.
With this patch str/bytes construction is streamlined.  Always use a
vstr to build a str/bytes object.  If the size is known beforehand then
use vstr_init_len to allocate only required memory.  Otherwise use
vstr_init and the vstr will grow as needed.  Then use
mp_obj_new_str_from_vstr to create a str/bytes object using the vstr
memory.

Saves code ROM: 68 bytes on stmhal, 108 bytes on bare-arm, and 336 bytes
on unix x64.
2015-01-21 23:18:02 +00:00
Damien George
0b9ee86133 py: Add mp_obj_new_str_from_vstr, and use it where relevant.
This patch allows to reuse vstr memory when creating str/bytes object.
This improves memory usage.

Also saves code ROM: 128 bytes on stmhal, 92 bytes on bare-arm, and 88
bytes on unix x64.
2015-01-21 23:17:27 +00:00
Damien George
ff8dd3f486 py, unix: Allow to compile with -Wunused-parameter.
See issue #699.
2015-01-20 12:47:20 +00:00
Damien George
50912e7f5d py, unix, stmhal: Allow to compile with -Wshadow.
See issue #699.
2015-01-20 11:55:10 +00:00
Damien George
963a5a3e82 py, unix: Allow to compile with -Wsign-compare.
See issue #699.
2015-01-16 17:47:07 +00:00
Damien George
0178aa9a11 py, unix: Allow to compile with -Wdouble-promotion.
Ref issue #699.
2015-01-12 21:56:35 +00:00
Damien George
e233a55a29 py: Remove unnecessary BINARY_OP_EQUAL code that just checks pointers.
Previous patch c38dc3ccc7 allowed any
object to be compared with any other, using pointer comparison for a
fallback.  As such, existing code which checked for this case is no
longer needed.
2015-01-11 21:07:15 +00:00
Paul Sokolovsky
ff8e35b42e objstr: Common subexpression elimination for vstr_str(field_name). 2015-01-04 13:23:44 +02:00
Paul Sokolovsky
c114496641 objstr: Implement kwargs support for str.format(). 2015-01-04 00:26:31 +02:00
Damien George
51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Paul Sokolovsky
2c75665445 objstr: Fix %d-formatting of floats. 2014-12-31 02:21:19 +02:00
Damien George
c55a4d82cf py: Make bytes objs work with more str methods; add tests. 2014-12-24 20:28:30 +00:00
Damien George
81836c28b3 py: Use str_to_int function in more places to reduce code size. 2014-12-21 21:07:03 +00:00
Damien George
b4fe6e28eb py: Fix function type: () -> (void). 2014-12-10 18:05:42 +00:00
Damien George
32ef3a3517 py: Allow bytes/bytearray/array to be init'd by buffer protocol objects.
Behaviour of array initialisation is subtly different for bytes,
bytearray and array.array when argument has buffer protocol.  This patch
gets us CPython conformant (except we allow initialisation of
array.array by buffer with length not a multiple of typecode).
2014-12-04 15:46:14 +00:00
Damien George
6f5eb84c19 py: #if guard str_make_new when not needed. 2014-11-27 16:55:47 +00:00
Damien George
1e9a92f84f py: Use shorter, static error msgs when ERROR_REPORTING_TERSE enabled.
Going from MICROPY_ERROR_REPORTING_NORMAL to
MICROPY_ERROR_REPORTING_TERSE now saves 2020 bytes ROM for ARM Thumb2,
and 2200 bytes ROM for 32-bit x86.

This is about a 2.5% code size reduction for bare-arm.
2014-11-06 17:36:16 +00:00
Damien George
be8e99c7d4 py: Allow bytes object as argument to some str methods.
This turns failing assertions to type exceptions for things like
b"123".find(...).  We still don't support operations like this on bytes
objects (unlike CPython), but at least it no longer crashes.
2014-11-05 16:45:54 +00:00
Damien George
a65c03c6c0 py: Allow +, in, and compare ops between bytes and bytearray/array.
Eg b"123" + bytearray(2) now works.  This patch actually decreases code
size while adding functionality: 32-bit unix down by 128 bytes, stmhal
down by 84 bytes.
2014-11-05 16:30:34 +00:00
Paul Sokolovsky
e62a0fe367 objstr: Allow to convert any buffer proto object to str.
Original motivation is to support converting bytearrays, but easier to just
support buffer protocol at all.
2014-10-31 00:03:53 +02:00
Paul Sokolovsky
31619cc589 py: mp_obj_str_get_str(): Work with bytes too.
It should be fair to say that almost in all cases where some API call
expects string, it should be also possible to pass byte string. For example,
it should be open/delete/rename file with name as bytestring. Note that
similar change was done quite a long ago to mp_obj_str_get_data().
2014-10-31 00:00:39 +02:00
Damien George
3aa09f5784 py: Use MP_OBJ_NULL instead of NULL in a few places. 2014-10-23 12:06:53 +01:00
Damien George
20f59e182e py: Make mp_const_empty_bytes globally available. 2014-10-21 21:02:56 +01:00
Damien George
39dc145478 py: Change [u]int to mp_[u]int_t in qstr.[ch], and some other places.
This should pretty much resolve issue #50.
2014-10-03 19:52:22 +01:00
Damien George
cde0ca21bf py: Simplify JSON str printing (while still conforming to JSON spec).
The JSON specs are relatively flexible and allow us to use one function
to print strings, be they ascii, bytes or utf-8 encoded.
2014-09-25 17:35:56 +01:00
Damien George
612045f53f py: Add native json printing using existing print framework.
Also add start of ujson module with dumps implemented.  Enabled in unix
and stmhal ports.  Test passes on both.
2014-09-17 22:56:34 +01:00
Damien George
4abff7500f py: Change uint to mp_uint_t in runtime.h, stackctrl.h, binary.h.
Part of code cleanup, working towards resolving issue #50.
2014-08-30 14:59:21 +01:00
Damien George
4d91723587 py: Remove use of int type in obj.h.
Part of code cleanup, working towards resolving issue #50.
2014-08-30 14:28:06 +01:00
Damien George
d182b98a37 py: Change all uint to mp_uint_t in obj.h.
Part of code cleanup, working towards resolving issue #50.
2014-08-30 14:19:41 +01:00
Damien George
9c4cbe2ac0 py: Make tuple and list use mp_int_t/mp_uint_t.
Part of code cleanup, to resolve issue #50.
2014-08-30 14:04:14 +01:00
Damien George
ecc88e949c Change some parts of the core API to use mp_uint_t instead of uint/int.
Addressing issue #50, still some way to go yet.
2014-08-30 00:35:11 +01:00
Damien George
17ae2395c2 py: Use memmove instead of memcpy when appropriate.
Found this bug by running unix/ tests with DEBUG=1 enabled when
compiling.
2014-08-29 21:07:54 +01:00
Damien George
a75b02ea9b py: Improve efficiency of MP_OBJ_IS_STR_OR_BYTES.
Saves ROM (16 on stmhal, 240 on 64-bit unix) and should be quicker since
there is 1 less branch.
2014-08-27 09:20:30 +01:00
Dave Hylands
b7f7c655ed Make int(b'123') work properly. 2014-08-26 19:15:04 -07:00
Damien George
9b7a8ee8f1 py: Fix mult by negative number of tuple, list, str, bytes.
Multiplication of a tuple, list, str or bytes now yields an empty
sequence (instead of crashing).  Addresses issue #799

Also added ability to mult bytes on LHS by integer.
2014-08-13 13:22:24 +01:00
Damien George
2eb1f604ee py, objstr: Optimise bytes subscr when unicode is enabled.
Saves code bytes and makes it faster, so why not?
2014-08-11 23:24:29 +01:00
Paul Sokolovsky
9749b2fb0d objstr: Make sure that bytes are indexed as bytes, not as unicode.
Fixes #795.
2014-08-11 22:38:00 +03:00
Paul Sokolovsky
0c5498540b objstr: split(): check arg type consistency (str vs bytes).
Similar to other methods and following CPython3 strictness.
2014-08-10 23:21:16 +03:00
Damien George
bb4c6f35c6 py: Make MP_OBJ_NEW_SMALL_INT cast arg to mp_int_t itself.
Addresses issue #724.
2014-07-31 10:49:14 +01:00
Damien George
5f27a7e811 py: Add mp_obj_str_builder_end_with_len.
This allows to create str's with a smaller length than initially asked
for.
2014-07-31 10:29:56 +01:00
Damien George
40f3c02682 Rename machine_(u)int_t to mp_(u)int_t.
See discussion in issue #50.
2014-07-03 13:25:24 +01:00
Paul Sokolovsky
9e215fa4c2 py: Make unichar_charlen() accept/return machine_uint_t. 2014-06-28 23:15:29 +03:00
Damien George
e04a44e2f6 py: Small comments, name changes, use of machine_int_t. 2014-06-28 10:27:23 +01:00
Paul Sokolovsky
ea2c936c7e objstrunicode: Refactor str_index_to_ptr() following objstr. 2014-06-27 00:04:20 +03:00
Paul Sokolovsky
26fda6dc8e objstr: 64-bit issues. 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
5048df0b7c objstr: find(), rfind(), index(): Make return value be unicode-aware. 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
cdc020da4b objstrunicode: Re-add buffer protocol back for now, required for io.StringIO. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
d215ee1dc1 py: Make MICROPY_PY_BUILTINS_STR_UNICODE=1 buildable. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
9731912ccb py: Prune unneeded code from objstrunicode, reuse code in objstr. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
e3cfc0d33d objstr: Refactor to work with char pointers instead of indexes.
In preparation for unicode support.
2014-06-14 06:30:30 +03:00
Paul Sokolovsky
2ec38a17d4 objstr: Be 8-bit clean even for repr().
This will allow roughly the same behavior as Python3 for non-ASCII strings,
for example, print("<phrase in non-Latin script>".split()) will print list
of words, not weird hex dump (like Python2 behaves). (Of course, that it
will print list of words, if there're "words" in that phrase at all, separated
by ASCII-compatible whitespace; that surely won't apply to every human
language in existence).
2014-06-14 01:21:13 +03:00
Paul Sokolovsky
b4efac14cd py: Make sure getattr() works with non-interned strings (by interning them). 2014-06-08 01:15:06 +03:00