Commit Graph

3925 Commits

Author SHA1 Message Date
Christian Walther bde1c4166d Revert "Prevent exceptions from accumulating in REPL"
This reverts commit 0cd951fb73.

It is not a correct solution because it prevents printing the same exception twice.
2020-11-28 23:10:17 +01:00
Christian Walther 2ba9805f84 Use movable allocation system for terminal tilegrid.
Moving memory is now done by the infrastructure and neither necessary nor correct here anymore.
2020-11-28 17:54:34 +01:00
Christian Walther c7404a3ff8 Add movable allocation system.
This allows calls to `allocate_memory()` while the VM is running, it will then allocate from the GC heap (unless there is a suitable hole among the supervisor allocations), and when the VM exits and the GC heap is freed, the allocation will be moved to the bottom of the former GC heap and transformed into a proper supervisor allocation. Existing movable allocations will also be moved to defragment the supervisor heap and ensure that the next VM run gets as much memory as possible for the GC heap.

By itself this breaks terminalio because it violates the assumption that supervisor_display_move_memory() still has access to an undisturbed heap to copy the tilegrid from. It will work in many cases, but if you're unlucky you will get garbled terminal contents after exiting from the vm run that created the display. This will be fixed in the following commit, which is separate to simplify review.
2020-11-28 17:50:23 +01:00
Dan Halbert 28d9e9186e Disable complex arithmetic on SAMD21 builds to make space 2020-11-28 10:12:46 -05:00
Dan Halbert ef0830bfe2 merge from upstream + wip 2020-11-25 17:52:06 -05:00
Dan Halbert 9dbea36eac changed alarm.time API 2020-11-25 15:09:27 -05:00
Jeff Epler e778fc1f87
Merge pull request #3741 from hathach/fix-cdc-connection-race
update tinyusb to fix cdc connection race
2020-11-24 19:11:41 -06:00
Jeff Epler c451b22255 Disable 3-arg pow() function on m0 boards
`pow(a, b, c)` can compute `(a ** b) % c` efficiently (in time and memory).
This can be useful for extremely specific applications, like implementing
the RSA cryptosystem.  For typical uses of CircuitPython, this is not an
important feature.  A survey of the bundle and learn system didn't find
any uses.

Disable it on M0 builds so that we can fit in needed upgrades to the USB
stack.
2020-11-24 16:54:33 -06:00
Dan Halbert 7a45afc549 working, but need to avoid deep sleeping too fast before USB ready 2020-11-23 22:44:53 -05:00
Scott Shawcroft f8dcb25170
Merge pull request #3694 from jepler/update-ulab2
ulab: Update to release tag 1.1.0
2020-11-23 15:17:46 -08:00
Jeff Epler 9d8be648ee ulab: Update to release tag 1.1.0
Disable certain classes of diagnostic when building ulab.  We should
submit patches upstream to (A) fix these errors and (B) upgrade their
CI so that the problems are caught before we want to integrate with
CircuitPython, but not right now.
2020-11-23 10:23:50 -06:00
Dan Halbert 75559f35cc wip: ResetReason to microcontroller.cpu 2020-11-21 23:29:52 -05:00
Dan Halbert e4c66990e2 compiles 2020-11-20 23:33:39 -05:00
Jeff Epler 982bce7259 py.mk: allow translation to be overriden in GNUmakefile
I like to use local makefile overrides, in the file GNUmakefile
(or, on case-sensitive systems, makefile) to set compilation choices.
However, writing
    TRANSLATION := de_DE
    include Makefile
did not work, because py.mk would override the TRANSLATION := specified
in an earlier part of the makefiles (but not from the commandline).

By using ?= instead of := the local makefile override works, but when
TRANSLATION is not specified it continues to work as before.
2020-11-19 16:23:35 -06:00
Jeff Epler b2b8520880 Always use preprocessor for MICROPY_ERROR_REPORTING
This ensures that only the translate("") alternative that will be used
is seen after preprocessing.  Improves the quality of the Huffman encoding
and reduces binary size slightly.

Also makes one "enhanced" error message only occur when ERROR_REPORTING_DETAILED:
Instead of the word-for-word python3 error message
"Type object has no attribute '%q'", the message will be
"'type' object has no attribute '%q'".  Also reduces binary size.
(that's rolled into this commit as it was right next to a change to
use the preprocessor for MICROPY_ERROR_REPORTING)

Note that the odd semicolon after "value_error:" in parsenum.c is necessary
due to a detail of the C grammar, in which a declaration cannot follow
a label directly.
2020-11-19 16:18:52 -06:00
Jeff Epler c06fc8e02d Introduce, use mp_raise_arg1
This raises an exception with a given object value.  Saves a bit of
code size.
2020-11-19 16:15:06 -06:00
Jeff Epler d5f6748d1b Use mp_raise instead of nlr_raise(new_exception) where possible
This saves a bit of code space
2020-11-19 16:13:01 -06:00
Jeff Epler 0556f9f851 Revert "samd21: Enable terse error reporting on resource constrained chip family"
This reverts commit 9a642fc049.
2020-11-19 15:12:56 -06:00
Dan Halbert 649c930536 wip 2020-11-19 15:43:39 -05:00
Dan Halbert 5bb3c321e9 merge from main 2020-11-19 00:29:14 -05:00
Dan Halbert 682054a216 WIP: redo API; not compiled yet 2020-11-19 00:23:27 -05:00
Jeff Epler 9a642fc049 samd21: Enable terse error reporting on resource constrained chip family
This reclaims over 1kB of flash space by simplifying certain exception
messages.  e.g., it will no longer display the requested/actual length
when a fixed list/tuple of N items is needed:

        if (MICROPY_ERROR_REPORTING == MICROPY_ERROR_REPORTING_TERSE) {
            mp_raise_ValueError(translate("tuple/list has wrong length"));
        } else {
            mp_raise_ValueError_varg(translate("requested length %d but object has length %d"),
                (int)len, (int)seq_len);

Other chip families including samd51 keep their current error reporting
capabilities.
2020-11-18 20:37:36 -06:00
Dan Halbert ffff02c053 Merge remote-tracking branch 'adafruit/main' into sleep 2020-11-16 12:06:11 -05:00
Dan Halbert bb77f1d130 wip: initial code changes, starting from @tannewt's sleepio branch 2020-11-16 11:56:20 -05:00
root 0cd951fb73 Prevent exceptions from accumulating in REPL 2020-11-16 10:36:05 -06:00
Scott Shawcroft bda3267432
Save flash space
* No weak link for modules. It only impacts _os and _time and is
  already disabled for non-full builds.
* Turn off PA00 and PA01 because they are the crystal on the Metro
  M0 Express.
* Change ejected default to false to move it to BSS. It is set on
  USB connection anyway.
* Set sinc_filter to const. Doesn't help flash but keeps it out of
  RAM.
2020-11-13 18:57:52 -08:00
gamblor21 4c93db3595 Renamed to adafruit_bus_device 2020-11-03 18:35:20 -06:00
Dan Halbert 72b829dff0 add binascii to most builds 2020-11-01 14:52:03 -05:00
gamblor21 78477a374a Initial SPI commit 2020-10-31 12:17:29 -05:00
microDev 930cf14dce
Add check for invalid io, function to disable all alarms 2020-10-27 16:17:26 -07:00
microDev 59df1a11ad
Add alarm_touch module 2020-10-27 16:16:52 -07:00
microDev da449723df
Fix build error 2020-10-27 16:16:15 -07:00
microDev 4d8ffdca8d
restructure alarm modules 2020-10-27 16:15:09 -07:00
microDev e5ff55b15c
Renamed alarm modules 2020-10-27 16:13:25 -07:00
microDev e310b871c8
Get io wake working 2020-10-27 16:13:25 -07:00
microDev 90b9ec6f2c
Initial Sleep Support 2020-10-27 16:13:22 -07:00
gamblor21 b637d3911e Initial commit 2020-10-24 20:48:35 -05:00
Christian Walther 1eab0692b5 Fix missing `nproc` on macOS.
396979a breaks building on macOS: `nproc` is a Linux thing, use a cross-platform alternative.
2020-10-20 16:39:32 +02:00
Dan Halbert 82b49afe43 enable CIRCUITPY_BLEIO_HCI on non-nRF boards where it will fit 2020-10-15 11:27:21 -04:00
Dan Halbert f1e8f2b404
Merge pull request #3554 from gamblor21/move_ordereddict
Moved ORDEREDDICT define to central location
2020-10-14 22:39:04 -04:00
gamblor21 f6f89565d8 Remove ordered dict from SAMD21 2020-10-14 20:18:49 -05:00
gamblor21 e6d0b207ec Removed MICROPY_PY_COLLECTIONS_NAMEDTUPLE__ASDICT from unix coverage 2020-10-14 14:06:34 -05:00
gamblor21 4270061db4 Moved ORDEREDDICT define to central location 2020-10-13 18:52:27 -05:00
Scott Shawcroft 9de96786ad
Merge pull request #3538 from jepler/parallel-qstrlast
build: parallelize the qstr build steps
2020-10-12 15:52:43 -07:00
Scott Shawcroft 1eb1434fc9
Merge pull request #3537 from jepler/update-protomatter-2
rgbmatrix: update protomatter to 1.0.5 tag
2020-10-12 15:45:51 -07:00
Scott Shawcroft 179e13f103
Merge pull request #3539 from jepler/lto-type-mismatch
remove warning-disable flag that seems unneeded now
2020-10-12 15:44:25 -07:00
Kenny 94beeabc51 remove unnecessary board configuration and address feedback 2020-10-11 22:42:59 -07:00
Jeff Epler 479552ce56 build: Make genlast write the "split" files
This gets a further speedup of about 2s (12s -> 9.5s elapsed build time)
for stm32f405_feather

For what are probably historical reasons, the qstr process involves
preprocessing a large number of source files into a single "qstr.i.last"
file, then reading this and splitting it into one "qstr" file for each
original source ("*.c") file.

By eliminating the step of writing qstr.i.last as well as making the
regular-expression-matching part be parallelized, build speed is further
improved.

Because the step to build QSTR_DEFS_COLLECTED does not access
qstr.i.last, the path is replaced with "-" in the Makefile.
2020-10-11 21:18:03 -05:00
Jeff Epler 607e4a905a build: parallelize the creation of qstr.i.last
Rather than simply invoking gcc in preprocessor mode with a list of files, use
a Python script with the (python3) ThreadPoolExecutor to invoke the
preprocessor in parallel.

The amount of concurrency is the number of system CPUs, not the makefile "-j"
parallelism setting, because there is no simple and correct way for a Python
program to correctly work together with make's idea of parallelism.

This reduces the build time of stm32f405 feather (a non-LTO build) from 16s to
12s on my 16-thread Ryzen machine.
2020-10-11 20:19:59 -05:00
Jeff Epler c139eccc92 remove warning that seems unneeded now 2020-10-11 16:23:02 -05:00
Kenny 98aa4b7943 update async tests with less upython workaround and more cpython compatibility 2020-10-10 23:39:32 -07:00
Kenny 5d96afc5c2 i do not know if this is needed but this is not the vm i use anymore 2020-10-10 15:45:08 -07:00
Kenny bf849ff674 async def syntax rigor and __await__ magic method
Some examples of improved compliance with CPython that currently
have divergent behavior in CircuitPython are listed below:

* yield from is not allowed in async methods
```
>>> async def f():
...     yield from 'abc'
...
Traceback (most recent call last):
  File "<stdin>", line 2, in f
SyntaxError: 'yield from' inside async function
```

* await only works on awaitable expressions
```
>>> async def f():
...     await 'not awaitable'
...
>>> f().send(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
AttributeError: 'str' object has no attribute '__await__'
```

* only __await__()able expressions are awaitable
Okay this one actually does not work in circuitpython at all today.
This is how CPython works though and pretending __await__ does not
exist will only bite users who write both.
```
>>> class c:
...     pass
...
>>> def f(self):
...     yield
...     yield
...     return 'f to pay respects'
...
>>> c.__await__ = f  # could just as easily have put it on the class but this shows how it's wired
>>> async def g():
...     awaitable_thing = c()
...     partial = await awaitable_thing
...     return 'press ' + partial
...
>>> q = g()
>>> q.send(None)
>>> q.send(None)
>>> q.send(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration: press f to pay respects
```
2020-10-10 15:45:08 -07:00
warriorofwire 5cadf525bd fix missing cflag defeating the board gating 2020-10-10 15:45:08 -07:00
warriorofwire d94d2d2975 Add async/await syntax to FULL_BUILD
This adds the `async def` and `await` verbs to valid CircuitPython syntax using the Micropython implementation.

Consider:
```
>>> class Awaitable:
...     def __iter__(self):
...         for i in range(3):
...             print('awaiting', i)
...             yield
...         return 42
...
>>> async def wait_for_it():
...     a = Awaitable()
...     result = await a
...     return result
...
>>> task = wait_for_it()
>>> next(task)
awaiting 0
>>> next(task)
awaiting 1
>>> next(task)
awaiting 2
>>> next(task)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  StopIteration: 42
>>>
```

and more excitingly:
```
>>> async def it_awaits_a_subtask():
...     value = await wait_for_it()
...     print('twice as good', value * 2)
...
>>> task = it_awaits_a_subtask()
>>> next(task)
awaiting 0
>>> next(task)
awaiting 1
>>> next(task)
awaiting 2
>>> next(task)
twice as good 84
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  StopIteration:
```

Note that this is just syntax plumbing, not an all-encompassing implementation of an asynchronous task scheduler or asynchronous hardware apis.
  uasyncio might be a good module to bring in, or something else - but the standard Python syntax does not _strictly require_ deeper hardware
  support.
Micropython implements the await verb via the __iter__ function rather than __await__.  It's okay.

The syntax being present will enable users to write clean and expressive multi-step state machines that are written serially and interleaved
  according to the rules provided by those users.

Given that this does not include an all-encompassing C scheduler, this is expected to be an advanced functionality until the community settles
  on the future of deep hardware support for async/await in CircuitPython.  Users will implement yield-based schedulers and tasks wrapping
  synchronous hardware APIs with polling to avoid blocking, while their application business logic gets simple `await` statements.
2020-10-10 15:38:40 -07:00
Jeff Epler 5e38bb98cb rgbmatrix: update protomatter to 1.0.5 tag
this is compile-tested on
 stm32f405 feather
 matrixportal
 nrf52840 feather

but not actually tested-tested.
2020-10-10 14:30:37 -05:00
Jeff Epler a4cc3ad6cb canio: RemoteTransmissionRequest: Split implementation, keep one structure
This already begins obscuring things, because now there are two sets of
shared-module functions for manipulating the same structure, e.g.,
common_hal_canio_remote_transmission_request_get_id and
common_hal_canio_message_get_id
2020-09-28 17:22:00 -05:00
Scott Shawcroft a8558a48ed
Merge pull request #3456 from jepler/qstr-and-or-demagic
makeqstrdefs: don't make _and_, _or_ poisoned substrings for QSTRs
2020-09-23 12:24:02 -07:00
Jeff Epler 28e80e47d7 makeqstrdefs: don't make _and_, _or_ poisoned substrings for QSTRs
New contributor @mdroberts1243 encountered an interesting problem in
which the argument they had named "column_underscore_and_page_addressing"
simply couldn't be used; I discovered that internally this had been
transformed into "column_underscore∧page_addressing", because QSTR
makes _ENTITY_ stand for the same thing as &ENTITY; does in HTML.

This might be nice for some things, but we don't want it here!
I was unable to find a sensible way to "escape" and prevent this entity
coding, so instead I ripped out support for the _and_ and _or_ escapes.
2020-09-22 17:39:44 -05:00
Jeff Epler 4869dbdc67 canio: rename from _canio
This reflects our belief that the API is stable enough to avoid incompatible changes during 6.x.
2020-09-21 16:44:26 -05:00
Jeff Epler a2e1867f69 _canio: Minimal implementation for SAM E5x MCUs
Tested & working:

 * Send standard packets
 * Receive standard packets (1 FIFO, no filter)

Interoperation between SAM E54 Xplained running this tree and
MicroPython running on STM32F405 Feather with an external
transceiver was also tested.

Many other aspects of a full implementation are not yet present,
such as error detection and recovery.
2020-09-21 16:44:26 -05:00
Jeff Epler e7a213a114 py: Add enum helper code
This makes it much easier to implement enums, and the printing code is
shared.  We might want to convert other enums to this in the future.
2020-09-21 16:44:26 -05:00
Jeff Epler 0318eb359f makeqstrdata: Work around python3.6 compatibility problem
Discord user Folknology encountered a problem building with Python 3.6.9,
`TypeError: ord() expected a character, but string of length 0 found`.

I was able to reproduce the problem using Python3.5*, and discovered that
the meaning of the regular expression `"|."` had changed in 3.7.  Before,
```
>>> [m.group(0) for m in re.finditer("|.", "hello")]
['', '', '', '', '', '']
```
After:
```
>>> [m.group(0) for m in re.finditer("|.", "hello")]
['', 'h', '', 'e', '', 'l', '', 'l', '', 'o', '']
```
Check if `words` is empty and if so use `"."` as the regular expression
instead.  This gives the same result on both versions:
```
['h', 'e', 'l', 'l', 'o']
```
and fixes the generation of the huffman dictionary.

Folknology verified that this fix worked for them.

 * I could easily install 3.5 but not 3.6.  3.5 reproduced the same problem
2020-09-21 10:03:07 -05:00
Jeff Epler bfbbbd6c5c makeqstrdata: Work with older Python
This construct (which I added without sufficient testing,
apparently) is only supported in Python 3.7 and newer.  Make it
optional so that this script works on other Python versions.  This
means that if you have a system with non-UTF-8 encoding you will
need to use Python 3.7.

In particular, this affects a problem building circuitpython in
github's ubuntu-18.04 virtual environment when Python 3.7 is not
explicitly installed.  cookie-cuttered libraries call for Python
3.6:
```
    - name: Set up Python 3.6
      uses: actions/setup-python@v1
      with:
        python-version: 3.6
```
Since CircuitPython's own build calls for 3.8, this problem was not
detected.

This problem was also encountered by discord user mdroberts1243.

The failure I encountered was here:
https://github.com/jepler/Jepler_CircuitPython_udecimal/runs/1138045020?check_suite_focus=true
.. while my step of "clone and build circuitpython unix port" is
unusual, I think the same problem would have affected "build assets"
if that step had been reached.
2020-09-19 10:16:13 -05:00
Scott Shawcroft 750bc1e04a
Merge pull request #3398 from jepler/better-dictionary-compression
compression: Implement @ciscorn's dictionary approach
2020-09-16 11:10:22 -07:00
Jeff Epler a8e98cda83 makeqstrdata: comment my understanding of @ciscorn's code 2020-09-16 08:28:15 -05:00
Kamil Tomaszewski c2fc592c2c camera: Change API 2020-09-14 13:11:15 +02:00
Kamil Tomaszewski 064c597b60 camera: Implement new library for camera 2020-09-14 13:11:15 +02:00
Jeff Epler 90f7340bfc move implicit-fallthrough warning enable to defns.mk 2020-09-13 13:13:09 -05:00
Taku Fukada d18d79ac47 Small improvements to the dictionary compression 2020-09-14 01:50:01 +09:00
Jeff Epler 15964a4750 makeqstrdata: Avoid encoding problems
Most users and the CI system are running in configurations where Python
configures stdout and stderr in UTF-8 mode.  However, Windows is different,
setting values like CP1252.  This led to a build failure on Windows, because
makeqstrdata printed Unicode strings to its stdout, expecting them to be
encoded as UTF-8.

This script is writing (stdout) to a compiler input file and potentially
printing messages (stderr) to a log or console.  Explicitly configure stdout to
use utf-8 to get consistent behavior on all platforms, and configure stderr so
that if any log/diagnostic messages are printed that cannot be displayed
correctly, they are still displayed instead of creating an error while trying
to print the diagnostic information.

I considered setting the encodings both to ascii, but this would just be
occasionally inconvenient to developers like me who want to show diagnostic
info on stderr and in comments while working with the compression code.

Closes: #3408
2020-09-12 19:43:08 -05:00
Jeff Epler 12d826d941 Add FALLTHROUGH comments as needed
I investigated these cases and confirmed that the fallthrough behavior
was intentional.
2020-09-12 15:11:29 -05:00
Jeff Epler 54d97251fe modstruct: Improve compliance with python3
While checking whether we can enable -Wimplicit-fallthrough, I encountered
a diagnostic in mp_binary_set_val_array_from_int which led to discovering
the following bug:
```
>>> struct.pack("xb", 3)
b'\x03\x03'
```
That is, the next value (3) was used as the value of a padding byte, while
standard Python always fills "x" bytes with zeros.  I initially thought
this had to do with the unintentional fallthrough, but it doesn't.
Instead, this code would relate to an array.array with a typecode of
padding ('x'), which is ALSO not desktop Python compliant:
```
>>> array.array('x', (1, 2, 3))
array('x', [1, 0, 0])
```
Possibly this is dead code that used to be shared between struct-setting
and array-setting, but it no longer is.

I also discovered that the argument list length for struct.pack
and struct.pack_into were not checked, and that the length of binary data
passed to array.array was not checked to be a multiple of the element
size.

I have corrected all of these to conform more closely to standard Python
and revised some tests where necessary.  Some tests for micropython-specific
behavior that does not conform to standard Python and is not present
in CircuitPython was deleted outright.
2020-09-12 14:07:23 -05:00
Jeff Epler 40ab5c6b21 compression: Implement ciscorn's dictionary approach
Massive savings.  Thanks so much @ciscorn for providing the initial
code for choosing the dictionary.

This adds a bit of time to the build, both to find the dictionary
but also because (for reasons I don't fully understand), the binary
search in the compress() function no longer worked and had to be
replaced with a linear search.

I think this is because the intended invariant is that for codebook
entries that encode to the same number of bits, the entries are ordered
in ascending value.  However, I mis-placed the transition from "words"
to "byte/char values" so the codebook entries for words are in word-order
rather than their code order.

Because this price is only paid at build time, I didn't care to determine
exactly where the correct fix was.

I also commented out a line to produce the "estimated total memory size"
-- at least on the unix build with TRANSLATION=ja, this led to a build
time KeyError trying to compute the codebook size for all the strings.
I think this occurs because some single unicode code point ('ァ') is
no longer present as itself in the compressed strings, due to always
being replaced by a word.

As promised, this seems to save hundreds of bytes in the German translation
on the trinket m0.

Testing performed:
 - built trinket_m0 in several languages
 - built and ran unix port in several languages (en, de_DE, ja) and ran
   simple error-producing codes like ./micropython -c '1/0'
2020-09-12 10:10:45 -05:00
Scott Shawcroft 1ba28b3edc
Merge pull request #3370 from jepler/compression-bigrams
add bigram compression to makeqstrdata (save ~100 bytes on trinket m0 de_DE)
2020-09-10 11:44:56 -07:00
Scott Shawcroft 683462c1b1
Merge pull request #3326 from tannewt/native_wifi
Add native wifi API with ESP32S2 support
2020-09-10 11:20:44 -07:00
Jeff Epler bdb07adfcc translations: Make decompression clearer
Now this gets filled in with values e.g., 128 (0x80) and 159 (0x9f).
2020-09-08 19:07:53 -05:00
Jeff Epler 73858ea682 circuitpy_mpconfig: enable 3-arg pow() with CIRCUITPY_FULL_BUILD
This is needed for a port of python3's decimal.py module.
2020-09-06 10:07:57 -05:00
Jeff Epler 20c2dd0c08 core: add int.bit_length() when MICROPY_CYPTHON_COMPAT is enabled
This method of integer objects is needed for a port of python3's
decimal.py module.

MICROPY_CPYTHON_COMPAT is enabled by CIRCUITPY_FULL_BUILD.
2020-09-06 09:53:16 -05:00
Scott Shawcroft 96cf60fbbd
Merge remote-tracking branch 'adafruit/main' into native_wifi 2020-09-03 16:34:56 -07:00
Scott Shawcroft 0b94638aeb
Changes based on Dan's feedback 2020-09-03 16:32:12 -07:00
Jeff Epler cbfd38d1ce Rename functions to encode_ngrams / decode_ngrams 2020-09-02 19:09:23 -05:00
Jeff Epler c34cb82ecb makeqstrdata: correct range of low code points to 0x80..0x9f inclusive
The previous range was unintentionally big and overlaps some characters
we'd like to use (and also 0xa0, which we don't intentionally use)
2020-09-02 15:52:02 -05:00
Jeff Epler 07740d19f3 add bigram compression to makeqstrdata
Compress common unicode bigrams by making code points in the range
0x80 - 0xbf (inclusive) represent them.  Then, they can be greedily
encoded and the substituted code points handled by the existing Huffman
compression.  Normally code points in the range 0x80-0xbf are not used
in Unicode, so we stake our own claim.  Using the more arguably correct
"Private Use Area" (PUA) would mean that for scripts that only use
code points under 256 we would use more memory for the "values" table.

bigram means "two letters", and is also sometimes called a "digram".
It's nothing to do with "big RAM".  For our purposes, a bigram represents
two successive unicode code points, so for instance in our build on
trinket m0 for english the most frequent are:
['t ', 'e ', 'in', 'd ', ...].

The bigrams are selected based on frequency in the corpus, but the
selection is not necessarily optimal, for these reasons I can think of:
 * Suppose the corpus was just "tea" repeated 100 times.  The
   top bigrams would be "te", and "ea".  However,
   overlap, "te" could never be used.  Thus, some bigrams might actually
   waste space
    * I _assume_ this has to be why e.g., bigram 0x86 "s " is more
      frequent than bigram 0x85 " a" in English for Trinket M0, because
      sequences like "can't add" would get the "t " digram and then
      be unable to use the " a" digram.

 * And generally, if a bigram is frequent then so are its constituents.
   Say that "i" and "n" both encode to just 5 or 6 bits, then the huffman
   code for "in" had better compress to 10 or fewer bits or it's a net
   loss!
    * I checked though!  "i" is 5 bits, "n" is 6 bits (lucky guess)
      but the bigram 0x83 also just 6 bits, so this one is a win of
      5 bits for every "it" minus overhead.  Yay, this round goes to team
      compression.
    * On the other hand, the least frequent bigram 0x9d " n" is 10 bits
      long and its constituent code points are 4+6 bits so there's no
      savings, but there is the cost of the table entry.
    * and somehow 0x9f 'an' is never used at all!

With or without accounting for overlaps, there is some optimum number
of bigrams.  Adding one more bigram uses at least 2 bytes (for the
entry in the bigram table; 4 bytes if code points >255 are in the
source text) and also needs a slot in the Huffman dictionary, so
adding bigrams beyond the optimim number makes compression worse again.

If it's an improvement, the fact that it's not guaranteed optimal
doesn't seem to matter too much.  It just leaves a little more fruit
for the next sweep to pick up.  Perhaps try adding the most frequent
bigram not yet present, until it doesn't improve compression overall.

Right now, de_DE is again the "fullest" build on trinket_m0.  (It's
reclaimed that spot from the ja translation somehow)  This change saves
104 bytes there, increasing free space about 6.8%.  In the larger
(but not critically full) pyportal build it saves 324 bytes.

The specific number of bigrams used (32) was chosen as it is the max
number that fit within the 0x80..0xbf range.  Larger tables would
require the use of 16 bit code points in the de_DE build, losing savings
overall.

(Side note: The most frequent letters in English have been said
to be: ETA OIN SHRDLU; but we have UAC EIL MOPRST in our corpus)
2020-09-01 17:12:22 -05:00
Scott Shawcroft f0e60da51f
Merge pull request #3310 from dhalbert/ble_hci
_bleio HCI implementation
2020-09-01 11:28:05 -07:00
Dan Halbert 6dbd369272 merge from upstream 2020-08-30 14:39:03 -04:00
Dan Halbert b27d511251 address review; use constructor for HCI Adapter 2020-08-30 14:06:48 -04:00
Jeff Epler 455226ffde builtinimport: Fix a crash with 'import ulab.linalg' on unix port only
A crash like the following occurs in the unix port:
```
Program received signal SIGSEGV, Segmentation fault.
0x00005555555a2d7a in mp_obj_module_set_globals (self_in=0x55555562c860 <ulab_user_cmodule>, globals=0x55555562c840 <mp_module_ulab_globals>) at ../../py/objmodule.c:145
145	    self->globals = globals;
(gdb) up
#1  0x00005555555b2781 in mp_builtin___import__ (n_args=5, args=0x7fffffffdbb0) at ../../py/builtinimport.c:496
496	                mp_obj_module_set_globals(outer_module_obj,
(gdb)
#2  0x00005555555940c9 in mp_import_name (name=824, fromlist=0x555555621f10 <mp_const_none_obj>, level=0x1) at ../../py/runtime.c:1392
1392	    return mp_builtin___import__(5, args);
```

I don't understand how it doesn't happen on the embedded ports, because
the module object should reside in ROM and the assignment of self->globals
should trigger a Hard Fault.

By checking VERIFY_PTR, we know that the pointed-to data is on the heap
so we can do things like mutate it.
2020-08-30 11:09:49 -05:00
Scott Shawcroft 767ca5c3dc
Merge remote-tracking branch 'adafruit/main' into native_wifi 2020-08-27 11:42:31 -07:00
Jeff Epler 2e0a109331
Merge pull request #3318 from jepler/interrupt-serial-rx
supervisor: check for interrupt during rx_chr
2020-08-25 21:01:33 -05:00
Scott Shawcroft 8b71e26abd
Merge remote-tracking branch 'adafruit/main' into native_wifi 2020-08-25 16:39:23 -07:00
Jeff Epler c0753c1afb mp_obj_print_helper: Handle a ctrl-c that comes in during printing
In #2689, hitting ctrl-c during the printing of an object with a lot of sub-objects could cause the screen to stop updating (without showing a KeyboardInterrupt).  This makes the printing of such objects acutally interruptable, and also correctly handles the KeyboardInterrupt:

```
>>> l = ["a" * 100] * 200
>>> l
['aaaaaaaaaaaaaaaaaaaaaa...aaaaaaaaaaa', Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt:
>>>
```
2020-08-25 11:47:50 -05:00
Scott Shawcroft 701e80a025
Make socket reads interruptable 2020-08-21 11:00:02 -07:00
Dan Halbert 0e30dd8bcc merge from upstream; working; includes debug_out code for debugging via Saleae for posterity 2020-08-20 20:29:57 -04:00
Scott Shawcroft eb8b42aff1
Add basic error handling 2020-08-19 14:23:28 -07:00
Scott Shawcroft 1034cc1217
Add espidf module. 2020-08-19 14:23:28 -07:00
Scott Shawcroft 430530c74b
SSL works until it runs out of memory 2020-08-19 14:23:28 -07:00
Scott Shawcroft c9ece21c28
SocketPool stubbed out 2020-08-19 14:22:13 -07:00
Scott Shawcroft 3860991111
Ping work and start to add socketpool 2020-08-19 14:22:13 -07:00
Scott Shawcroft c53a72d3f5
Fix ipaddress import and parse ipv4 strings 2020-08-19 14:22:13 -07:00