Commit Graph

2458 Commits

Author SHA1 Message Date
Paul Sokolovsky
f5f6c3b792 streams: Reading by char count from unicode text streams is not implemented. 2014-06-27 00:04:20 +03:00
Paul Sokolovsky
ce81312d8a misc: Add count_lead_ones() function, useful for UTF-8 handling. 2014-06-27 00:04:20 +03:00
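
Note on the entry above: a minimal sketch of what a lead-ones counter can look like, assuming it takes a single byte and returns how many consecutive 1 bits sit at its top. The signature and body are illustrative assumptions, not the code from this commit. For a UTF-8 lead byte the result is 0 for ASCII, 2 to 4 for the sequence length, and 1 for a continuation byte.

```c
/* Illustrative sketch only: the signature and body are assumptions,
   not the function added in commit ce81312d8a. */
static unsigned count_lead_ones(unsigned char val) {
    unsigned n = 0;
    /* Walk a one-bit mask from the top bit down until a 0 bit is found. */
    for (unsigned char mask = 0x80; mask != 0 && (val & mask); mask >>= 1) {
        n++;
    }
    return n;
}
```

With this sketch, the byte 0xE2 (the first byte of a three-byte sequence) gives 3 and the ASCII byte 0x41 gives 0.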
Paul Sokolovsky
63143c94ce tests: Test for explicit start/end args to str methods for unicode. 2014-06-27 00:04:20 +03:00
Paul Sokolovsky
ea2c936c7e objstrunicode: Refactor str_index_to_ptr() following objstr. 2014-06-27 00:04:20 +03:00
Paul Sokolovsky
26fda6dc8e objstr: 64-bit issues. 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
00c904b47a objstrunicode: Signedness issues. 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
1044c3dfe6 unicode: Make get_char()/next_char()/charlen() be 8-bit compatible.
Based on config define.
2014-06-27 00:04:19 +03:00
Paul Sokolovsky
b1949e4c09 tests: Add tests for unicode find()/rfind()/index(). 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
5048df0b7c objstr: find(), rfind(), index(): Make return value be unicode-aware. 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
46d31e9ca9 unicode: Add utf8_ptr_to_index().
Useful when we have a pointer to a char inside a string, but need to return the
char index (e.g. for str.find()).
2014-06-27 00:04:19 +03:00
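
Note on the entry above: one way to turn a pointer inside a UTF-8 buffer into a character index is to count the bytes before it that are not continuation bytes. The sketch below assumes valid UTF-8 input; the names ptr_to_char_index and is_utf8_cont are hypothetical and are not taken from the commit.

```c
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int is_utf8_cont(unsigned char c) {
    return (c & 0xC0) == 0x80;
}

/* Hypothetical helper, not the commit's code: count the characters that
   start in [start, ptr). Every byte that is not a continuation byte
   begins a new character. Assumes the buffer holds valid UTF-8. */
static size_t ptr_to_char_index(const unsigned char *start,
                                const unsigned char *ptr) {
    size_t index = 0;
    for (const unsigned char *p = start; p < ptr; p++) {
        if (!is_utf8_cont(*p)) {
            index++;
        }
    }
    return index;
}
```

The scan is linear in the byte offset, which is the usual cost of storing strings as UTF-8 without a separate character-length field.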
Paul Sokolovsky
ded0fc77f7 py: Add dedicated unicode header. 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
17994d1bd3 tests: Add test for unicode string iteration. 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
79b7fe2ee5 objstrunicode: Implement iterator. 2014-06-27 00:04:19 +03:00
Paul Sokolovsky
cdc020da4b objstrunicode: Re-add buffer protocol back for now, required for io.StringIO. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
e7f2b4c875 objstrunicode: Revamp len() handling for unicode, and optimize bool(). 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
86d3898e70 objstrunicode: Get rid of bytes checking, it's separate type. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
d215ee1dc1 py: Make MICROPY_PY_BUILTINS_STR_UNICODE=1 buildable. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
9731912ccb py: Prune unneeded code from objstrunicode, reuse code in objstr. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
165eb69b86 vstr: Restore bytestr compatibility. 2014-06-27 00:04:18 +03:00
Paul Sokolovsky
42a52516fe builtin: Restore bytestr compatibility. 2014-06-27 00:04:18 +03:00
Chris Angelico
2ba2299d28 lexer, vstr: Add unicode support. 2014-06-27 00:04:18 +03:00
Chris Angelico
1e3781bc35 tests: Add unicode test. 2014-06-27 00:04:17 +03:00
Chris Angelico
9a1a4beb56 builtin: ord, chr: Unicode support. 2014-06-27 00:04:17 +03:00
Chris Angelico
64b468d873 objstrunicode: Basic implementation of unicode handling.
Squashed commit of the following:

commit 99dc21b67a895dc10d3c846bc158d27c839cee48
Author: Chris Angelico <rosuav@gmail.com>
Date:   Thu Jun 12 02:18:54 2014 +1000

    Optimize as per TODO (thanks Damien!)

commit 5bf0153ecad8348443058d449d74504fc458fe51
Author: Chris Angelico <rosuav@gmail.com>
Date:   Tue Jun 10 08:42:06 2014 +1000

    Test a default (= UTF-8) encode and decode

commit c962057ac340832c4fde60896f656a3fe3ad78a9
Merge: e2c9782 195de32
Author: Chris Angelico <rosuav@gmail.com>
Date:   Tue Jun 10 05:23:03 2014 +1000

    Merge branch 'master' into unicode, resolving conflict on py/obj.h

commit e2c9782a65eb57f481d441d40161de427e1940ba
Author: Chris Angelico <rosuav@gmail.com>
Date:   Tue Jun 10 05:05:57 2014 +1000

    More whitespace fixups

commit 086a2a0f57afbc1f731697fd5d3a0cbbb80e5418
Author: Chris Angelico <rosuav@gmail.com>
Date:   Tue Jun 10 05:04:20 2014 +1000

    Properly implement string slicing

commit 0d339a143e2b6442366145e7f3d64aada293eaa0
Author: Chris Angelico <rosuav@gmail.com>
Date:   Tue Jun 10 02:24:11 2014 +1000

    Support slicing in str_index_to_ptr, and fix a bounds error

commit 24371c7267d360e77cf5eabc2e8ce9a73d2ee0da
Author: Chris Angelico <rosuav@gmail.com>
Date:   Tue Jun 10 02:10:22 2014 +1000

    Break out index-to-pointer calculation into a function

commit 616c24ac014c3ca56008428c506034dd1bfff7a8
Author: Chris Angelico <rosuav@gmail.com>
Date:   Tue Jun 10 02:03:11 2014 +1000

    Add tests of string slicing, which currently fail

commit a24d19f676fe8cc21dad512d91b826892e162a5b
Author: Chris Angelico <rosuav@gmail.com>
Date:   Tue Jun 10 01:56:53 2014 +1000

    Change string indexing to not precalculate the charlen, and add test for neg indexing

commit 0bcc7ab89eafb2ae53195e94c9bea42a4e886b64
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sun Jun 8 22:09:17 2014 +1000

    Clean up constant qstr declarations now that charlen isn't needed

commit 5473e1a1dba2124b7b0c207f2964293cfbe80167
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sun Jun 8 07:18:42 2014 +1000

    Remove the charlen field from strings, calculating it when required

commit 5c1658ec71aefbdc88c261ce2e57dc7670cdc6ef
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sun Jun 8 07:11:27 2014 +1000

    Get rid of mp_obj_str_get_data_len() which was used in only one place

commit a019ba968b4e8daf7f3674f63c5cc400e304c509
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sun Jun 8 06:58:26 2014 +1000

    Add a unichar_charlen() function to calculate length-in-characters from length-in-bytes

commit 44b0d5cff846ba487c526ed95be1b3d1cd3d762a
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sun Jun 8 06:32:44 2014 +1000

    Use utf8_get/next_char in building up a string's repr

commit 30d1bad33f7af90f1971987c39864c8fcf3f5c21
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sun Jun 8 06:10:45 2014 +1000

    Make utf8_get_char() and utf8_next_char() actually do what their names say

commit bc990dad9afb8ec112f5e7f7f79d5ab415da0e72
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sun Jun 8 02:10:59 2014 +1000

    Revert "Add PEP 393-flags to strings and stub usage."

    This reverts commit c239f509521d1a0f9563bf9c5de0c4fb9a6a33ba.

commit f9bebb28ad52467f2f2d7a752bb033296b6c2f9b
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 15:41:48 2014 +1000

    Whitespace fixes

commit 279de0c8eb3cb186914799ccc5ee94ea97f56de4
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 15:28:35 2014 +1000

    Formatting/layout improvements - introduce macros for UTF-8 byte detection, add braces. No functional changes.

commit f1911f53d56da809c97b07245f5728a419e8fb30
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 11:56:02 2014 +1000

    Make chr() Unicode-aware

commit f51ad737b48ac04c161197a4012821d50885c4c7
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 11:44:07 2014 +1000

    Make a string's repr Unicode-aware

commit 01bd68684611585d437982dccdf05b33cbedc630
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 11:33:43 2014 +1000

    Expand the Unicode tests

commit 7bc91904f899f8012089fc14a06495680a51e590
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 11:27:30 2014 +1000

    Record byte lengths for byte strings

commit bb132120717cf176dcfb26f87fa309378f76ab5f
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 11:25:06 2014 +1000

    Make ord() Unicode-aware

commit 03f0cbe9051b62192be97b59f84f63f9216668bf
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 10:24:35 2014 +1000

    Retain characters as UTF-8 encoded Unicode

commit e924659b85c001916a5ff7f4d1d8b3ebe2bf0c2f
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 08:37:27 2014 +1000

    Add support for \u and \U escapes, but not \N (with explanatory comment)

commit 231031ac5f0346e4ffcf9c4abec2bd33f566232c
Author: Chris Angelico <rosuav@gmail.com>
Date:   Sat Jun 7 05:09:35 2014 +1000

    Add character length to qstr

commit 6df1b946fb17d8d5df3d91b21cde627c3d4556a8
Author: Chris Angelico <rosuav@gmail.com>
Date:   Fri Jun 6 13:48:36 2014 +1000

    Add test of UTF-8 encoded source file resulting in properly formed string

commit 16429b81a8483cf25865ed11afd81a7d9c253c26
Author: Chris Angelico <rosuav@gmail.com>
Date:   Fri Jun 6 13:44:15 2014 +1000

    Make len(s) return character length (even though creation's still buggy)

commit cd2cf6663cc47831dbc97819ad5c50ad33f939d3
Author: Chris Angelico <rosuav@gmail.com>
Date:   Fri Jun 6 13:15:36 2014 +1000

    HACK - When indexing a qstr, count its charlen. Stupidly inefficient but POC.

    All tests pass now, though string creation is still buggy.

commit 47c234584d3358dfa6b4003d5e7264105d17b8f7
Author: Chris Angelico <rosuav@gmail.com>
Date:   Fri Jun 6 13:15:32 2014 +1000

    objstr: Record character length separately from byte length

    CAUTION: Buggy, may crash stuff - qstr needs equivalent functionality too

commit b0f41c72af27d3b361027146025877b3d7e8785c
Author: Chris Angelico <rosuav@gmail.com>
Date:   Fri Jun 6 05:37:36 2014 +1000

    Beginnings of UTF-8 support - construct strings from that many UTF-8-encoded chars, and subscript bytes the same way

commit 89452be641674601e9bfce86dc71c17c3140a6cf
Author: Chris Angelico <rosuav@gmail.com>
Date:   Fri Jun 6 05:28:47 2014 +1000

    Update comments - now aiming for UTF-8 rather than PEP 393 strings

commit c239f509521d1a0f9563bf9c5de0c4fb9a6a33ba
Author: Chris Angelico <rosuav@gmail.com>
Date:   Wed Jun 4 05:28:12 2014 +1000

    Add PEP 393-flags to strings and stub usage.

    The test suite all passes, but nothing has actually been changed.
2014-06-27 00:04:17 +03:00
Paul Sokolovsky
83865347db objstrunicode: Complete copy of objstr, to be patched for unicode support. 2014-06-27 00:04:17 +03:00
Chris Angelico
c88987c1af py: Implement basic unicode functions. 2014-06-27 00:04:17 +03:00
Paul Sokolovsky
12bc13eeb8 mpconfig.h: Add MICROPY_PY_BUILTINS_STR_UNICODE. 2014-06-27 00:04:17 +03:00
Paul Sokolovsky
16ac4962ae tests: Add test for catching infinite function recursion.
Put into misc/ to not complicate life for builds with check disabled.
2014-06-27 00:03:56 +03:00
Paul Sokolovsky
7a8ab5a730 stmhal: Use stackctrl framework. 2014-06-27 00:03:55 +03:00
Paul Sokolovsky
23668698cb py: Add portable framework to query/check C stack usage.
Such a mechanism is important for stable Python functioning, because Python
function calls are handled on the C stack. The idea is to sprinkle
STACK_CHECK() calls in places where there can be C recursion.

TODO: Add more STACK_CHECK()'s.
2014-06-27 00:03:55 +03:00
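
Note on the entry above: a rough sketch of the general idea, assuming stack usage is estimated from the distance between a local variable recorded at startup and a local variable at the current call depth. All names here (demo_stack_init, demo_stack_usage, demo_stack_check, demo_recurse) and the 16 KiB limit are hypothetical and do not reproduce the framework added in this commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: address recorded near the base of the stack,
   and an arbitrary limit chosen for this demo. */
static uintptr_t demo_stack_top;
static const size_t demo_stack_limit = 16 * 1024;

/* Record the address of a local as the reference point. */
static void demo_stack_init(void) {
    volatile int marker = 0;
    demo_stack_top = (uintptr_t)&marker;
}

/* Estimate usage as the distance from the reference point,
   whichever direction the stack grows. */
static size_t demo_stack_usage(void) {
    volatile int marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    return here < demo_stack_top ? demo_stack_top - here
                                 : here - demo_stack_top;
}

/* Returns nonzero while usage stays under the limit. */
static int demo_stack_check(void) {
    return demo_stack_usage() < demo_stack_limit;
}

/* Recursive function guarded by the check: it stops and reports its
   depth instead of overflowing the C stack. */
static int demo_recurse(int depth) {
    volatile char pad[256];           /* make each frame use some stack */
    pad[0] = (char)depth;
    if (!demo_stack_check()) {
        return depth;
    }
    int result = demo_recurse(depth + 1);
    pad[0] = 0;                       /* work after the call: no tail call */
    return result;
}
```

Calling demo_stack_init() early in main() and then demo_recurse(0) returns the depth at which the check tripped. An interpreter would raise an exception at that point rather than return a number.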
Paul Sokolovsky
91b576d147 Merge pull request #719 from dhylands/pin_fix
Use mp_const_none to initialize mapper and map_dict (fix #701)
2014-06-26 22:49:44 +03:00
Dave Hylands
f170735b73 Use mp_const_none to initialize mapper and map_dict 2014-06-25 16:01:19 -07:00
Paul Sokolovsky
f3de62e6c2 binary: machine_uint_t vs uint dichotomy starts doing real damage. 2014-06-26 00:41:08 +03:00
Paul Sokolovsky
8e01291c18 travis: Use unified diffs for failed tests. 2014-06-26 00:05:53 +03:00
Paul Sokolovsky
7a2f166949 modstruct: Fix alignment handling issues.
Also, factor out mp_binary_get_int() function.
2014-06-25 23:34:44 +03:00
Paul Sokolovsky
5aa740c3e2 modgc: Add mem_free()/mem_alloc() methods.
Return free/allocated memory on GC heap.
2014-06-25 14:28:11 +03:00
Damien George
e973acde81 Merge branch 'master' of github.com:micropython/micropython 2014-06-25 04:10:34 +01:00
Paul Sokolovsky
939c2e7f44 Merge pull request #690 from stinos/msvc-gc
msvc: Enable GC
2014-06-24 21:34:51 +03:00
Paul Sokolovsky
3c9b24bebe modsocket: Fix uClibc detection. 2014-06-24 21:20:38 +03:00
Paul Sokolovsky
141df2d350 unix: Dump default heap size in usage message. 2014-06-24 16:58:00 +03:00
Damien George
780e54cdc3 py: Implement delete_attr in native emitter. 2014-06-22 18:35:04 +01:00
Paul Sokolovsky
cd590cbfaa unix: Don't error out on #warning directive. 2014-06-22 19:20:55 +03:00
Paul Sokolovsky
ff5932a8d8 modsocket: Workaround uClibc issue with numeric port for getaddrinfo().
It sucks to work around this on the uPy side, but upgrading non-upgradable
embedded systems sucks even more.
2014-06-22 19:20:55 +03:00
Paul Sokolovsky
949a49c9da modsocket: Add call to freeaddrinfo(). 2014-06-22 19:11:34 +03:00
Paul Sokolovsky
69d0a1c540 unix: uClibc doesn't like NULL as a buffer arg to realpath().
So, allocate one explicitly.
2014-06-22 19:08:32 +03:00
stijn
de5ce6d461 gc: Use simple cast instead of union to silence compiler 2014-06-22 11:32:32 +02:00
stijn
8abcf666cb windows: Enable GC and implement bss start and end symbols
The pointers to the bss section are acquired in init.c()
by inspecting the PE header. Works for msvc and mingw.
2014-06-22 11:31:16 +02:00
Paul Sokolovsky
a96cc824bd py: Support arm and thumb ARM ISAs, in addition to thumb2.
These changes were tested with QEMU, and by a few people on real hardware.
2014-06-22 01:40:45 +03:00
Paul Sokolovsky
59c675a64c py: Include mpconfig.h before all other includes.
It defines types used by all other headers.

Fixes #691.
2014-06-21 22:43:22 +03:00
Paul Sokolovsky
4c4b9d15ab mkrules.mk: Pass $(COPT) to link stage.
In the general case, optimization options should be passed to all stages of
the build process.
2014-06-20 23:49:30 +03:00