This is a modest code savings, but more importantly it reduces
boilerplate in bitmap-modifying routines.
Callers need only ensure they call displayio_bitmap_set_dirty_area in
advance of the bitmap modifications they perform.
(note that this assumes that no bitmap operation can enter background
tasks. If an operation COULD enter background tasks, it MUST re-dirty
the area it touches when it exits, simply by a fresh call to
set_dirty_area with the same area as before)
.. simplifying code in the process. For instance, now fill_region
uses area routines to order and constrain its coordinates.
Happily, this change also frees a modest amount of code space.
.. and simplify the implmentation of displayio_area_union
This _slightly_ changes the behavior of displayio_area_union:
Formerly, if one of the areas was empty, its coordinates were still
used in the min/max calculations.
Now, if one of the areas is empty, the result gets the other area's coords
In particular, taking the union of the empty area with coords (0,0,0,0)
with the non-empty area (x1,y1,x2,y2) would give the area (0,0,x2,y2)
before, and (x1,y1,x2,y2) after the change.
When reading uncompressed bitmap data directly, readinto can work
much more quickly than a Python-coded loop.
On a Raspberry Pi Pico, I benchmarked a modified version of
adafruit_bitmap_font's pcf reader which uses readinto instead of
the existing code. My test font was a 72-point file created from Arial.
This decreased the time to load all the ASCII glyphs from 4.9 seconds to
just 0.44 seconds.
While this attempts to support many pixel configurations (1/2/4/8/16/24/32
bpp; swapped words and pixels) only the single combination used by
PCF fonts was tested.
This is a first go at it, done by naive replacing of all array
operations with corresponding operations on the list. Note that
there is a lot of unnecessary type conversions, here. Also, list_pop
has been copied, because it's decalerd STATIC in py/objlist.h
Since we want to expose the list of group's children to the user,
we should only have the original objects in it, without any other
additional data, and compute the native object as needed.
* Comment on the reason for scaling by 256
* Divide by 256 instead of shifting
* fix a cast; eliminate an unneeded roundf() to get a few bytes code back
On the Pico, this increases the "fill rate" of
pixels[:] = newvalues
considerably. On a strip of 240 RGB LEDs, auto_write=False, the timings
are:
|| Brightness || Before || After || Improvement ||
|| 1.0 || 117 kpix/s || 307 kpix/s || 2.62x ||
|| 0.07 || 117 kpix/s || 273 kpix/s || 2.33x ||
It's worth noting that even the "before" rate is fast compared to the
time to transmit a single neopixel, but any time we can gain back
in the whole pipeline will let marginal animations work a little better.
To set all the pixels in this way and then show() gives a pleasant bump
to the framerate, from about 108Hz to 124Hz (1.15x)
The main source of speed-up is using integer math instead of floating
point math for the calculation of the post-scaled pixel values. A slight
secondary gain is achieved by avoiding the scaling altogether when
the scale factor is 1.0.
Because the math is not exactly the same, some scaled pixel values may
change by +- 1 RGBW "step". In practice, this is unlikely to matter.
The gains are bigger on the Pico and other M0 microcontrollers than M4
microcontrollers with floating point math in the hardware.
Happily, flash size is also improved a bit on the Pico build I did,
going from
> 542552 bytes used, 506024 bytes free in flash firmware space out of 1048576 bytes (1024.0kB).
to
> 542376 bytes used, 506200 bytes free in flash firmware space out of 1048576 bytes (1024.0kB).
Also found a race condition between timer_disable and redraw, which
would happen if I debugger-paused inside common_hal_rgbmatrix_timer_disable
or put a delay or print inside it. That's what pausing inside reconstruct
fixes.
So that the "right timer" can be chosen, `timer_allocate` now gets the `self`
pointer. It's guaranteed at this point that the pin information is accurate,
so you can e.g., find a PWM unit related to the pins themselves.
This required touching each port to add the parameter even though it's
unused everywhere but raspberrypi.
* Introduce explicit serpentine: bool argument instead of using negative
numbers (thanks, ghost of @tannewt sitting on one shoulder)
* Fix several calculations of height
Testing performed (matrixportal):
* set up a serpentine 64x64 virtual display with 2 64x32 tiles
* tried all 4 rotations
* looked at output of REPL
Changed calls: PointSize(), LineWidth(), VertexTranslateX() and VertexTranslateY()
Units for all the above are now pixels, not fixed-point integers. This matches OpenGL.
Docstrings updated accordingly
The RP2040 is new microcontroller from Raspberry Pi that features
two Cortex M0s and eight PIO state machines that are good for
crunching lots of data. It has 264k RAM and a built in UF2
bootloader too.
Datasheet: https://pico.raspberrypi.org/files/rp2040_datasheet.pdf
Microsoft documentation says:
> If biCompression equals BI_RGB and the bitmap uses 8 bpp or less, the bitmap has a color table immediatelly following the BITMAPINFOHEADER structure. The color table consists of an array of RGBQUAD values. The size of the array is given by the biClrUsed member. If biClrUsed is zero, the array contains the maximum number of colors for the given bitdepth; that is, 2^biBitCount colors.
Formerly, we treated 0 colors as "no image palette" during construction,
but then during common_hal_displayio_ondiskbitmap_get_pixel indexed into
the palette anyway. This could have unpredictable results. On a pygamer,
it gave an image that was blue and black. On magtag, it gave a crash.
The transparent_color field was never initialized. I _think_ this means
its value was always set to 0, or the blackest of blacks. Instead,
initialize it to the sentinel value, newly given the name
NO_TRANSPARENT_COLOR.
This exposed a second problem: The test for whether there was an existing
transparent color was wrong (backwards). I am guessing that this was not
found due to the first bug; since the converter had a transparent color,
the correct test would have led to always getting the error "Only one
color can be transparent at a time".
Closes#3723
Hybrid allocation is now part of the infrastructure. Moving memory contents would not be necessary because displayio can recreate them, but does not hurt.
Hybrid allocation is now part of the infrastructure. Moving memory contents would not be necessary because displayio can recreate them, but does not hurt.
This allows calls to `allocate_memory()` while the VM is running, it will then allocate from the GC heap (unless there is a suitable hole among the supervisor allocations), and when the VM exits and the GC heap is freed, the allocation will be moved to the bottom of the former GC heap and transformed into a proper supervisor allocation. Existing movable allocations will also be moved to defragment the supervisor heap and ensure that the next VM run gets as much memory as possible for the GC heap.
By itself this breaks terminalio because it violates the assumption that supervisor_display_move_memory() still has access to an undisturbed heap to copy the tilegrid from. It will work in many cases, but if you're unlucky you will get garbled terminal contents after exiting from the vm run that created the display. This will be fixed in the following commit, which is separate to simplify review.
* Initialize the EPaper display on the MagTag at start.
* Tweak the display send to take a const buffer.
* Correct Luma math
* Multiply the blue component, not add.
* Add all of the components together before dividing. This
reduces the impact of truncated division.
After calling board.SPI().deinit(), calling board.SPI() again would return the unusable deinited object and there was no way of getting it back into an initialized state until the end of the session.
Fixes#3581.
Pins were marked as never_reset by common_hal_displayio_fourwire_construct() and common_hal_sharpdisplay_framebuffer_construct(), but these marks were never removed, so at the end of a session after displayio.release_displays(), {spi|i2c}_singleton would be set to NULL but the pins would not be reset. In the next session, board.SPI() and board.I2C() were unable to reconstruct the object because the pins were still in use.
For symmetry with creation of the singleton, add deinitialization before setting it to NULL in reset_board_busses(). This makes the pins resettable, so that reset_port(), moved behind it, then resets them.
At the end of a session that called displayio.release_displays() (and did not initialize a new display), a board.I2C() bus that was previously used by a display would wrongly be considered still in use. While I can’t think of any unrecoverable problem this would cause in the next session, it violates the assumption that a soft reboot resets everything not needed by displays, potentially leading to confusion.
By itself, this change does not fix the problem yet - rather, it introduces the same issue as in #3581 for SPI. This needs to be solved in the same way for I2C and SPI.
This is enabled by #3482
I was unable to determine why previously I had added sizeof(void*)
to the GC heap allocation, so I removed that code as a mistake.
@cwalther determined that for boards with 2 displays (monster m4sk),
start_terminal would be called for each one, leaking supervisor heap
entries.
Determine, by comparing addresses, whether the display being acted on
is the first display (number zero) and do (or do not) call start_terminal.
stop_terminal can safely be called multiple times, so there's no need
to guard against calling it more than once.
Slight behavioral change: The terminal size would follow the displays[0]
size, not the displays[1] size
If the display is paused, `_PM_swapbuffer_maybe` will never return.
So, when brightness is 0, refresh does nothing. This makes it necessary
to update the display when unpausing.
Closes: #3524
A call to supervisor_start_terminal remained in
common_hal_displayio_display_construct and was copied to other display
_construct functions, even though it was also being done in
displayio_display_core_construct when that was factored out.
Originally, this was harmless, except it created an extra allocation.
When investigating #3482, I found that this bug became harmful,
especially for displays that were created in Python code, because it
caused a supervisor allocation to leak.
I believe that it is safe to merge #3482 after this PR is merged.
An RGBMatrix has no bus and no bus_free method. It is always possible
to refresh the display.
This was not a problem before, but the fix I suggested (#3449) added
a call to core_bus_free when a FramebufferDisplay was being refreshed.
This was not caught during testing.
This is a band-aid fix and it brings to light a second problem in which
a SharpDisplay + FrameBuffer will not have a 'bus' object, and yet does
operate using a shared SPI bus. This kind of display will need a
"bus-free" like function to be added, or it can have problems like
#3309.
It was incorrect to NULL out the pointer to our heap allocated buffer in
`reset`, because subsequent to framebuffer_reset, but while
the heap was still active, we could call `get_bufinfo` again,
leading to a fresh allocation on the heap that is about to be destroyed.
Typical stack trace:
```
#1 0x0006c368 in sharpdisplay_framebuffer_get_bufinfo
#2 0x0006ad6e in _refresh_display
#3 0x0006b168 in framebufferio_framebufferdisplay_background
#4 0x00069d22 in displayio_background
#5 0x00045496 in supervisor_background_tasks
#6 0x000446e8 in background_callback_run_all
#7 0x00045546 in supervisor_run_background_tasks_if_tick
#8 0x0005b042 in common_hal_neopixel_write
#9 0x00044c4c in clear_temp_status
#10 0x000497de in spi_flash_flush_keep_cache
#11 0x00049a66 in supervisor_external_flash_flush
#12 0x00044b22 in supervisor_flash_flush
#13 0x0004490e in filesystem_flush
#14 0x00043e18 in cleanup_after_vm
#15 0x0004414c in run_repl
#16 0x000441ce in main
```
When this happened -- which was inconsistent -- the display would keep
some heap allocation across reset which is exactly what we need to avoid.
NULLing the pointer in reconstruct follows what RGBMatrix does, and that
code is a bit more battle-tested anyway.
If I had a motivation for structuring the SharpMemory code differently,
I can no longer recall it.
Testing performed: Ran my complicated calculator program over multiple
iterations without observing signs of heap corruption.
Closes: #3473
This already begins obscuring things, because now there are two sets of
shared-module functions for manipulating the same structure, e.g.,
common_hal_canio_remote_transmission_request_get_id and
common_hal_canio_message_get_id
Tested & working:
* Send standard packets
* Receive standard packets (1 FIFO, no filter)
Interoperation between SAM E54 Xplained running this tree and
MicroPython running on STM32F405 Feather with an external
transceiver was also tested.
Many other aspects of a full implementation are not yet present,
such as error detection and recovery.
The expectations of displayio.Display and frambufferio.FramebufferDisplay
are different when it comes to rotation.
In displayio.Display, if you call core_construct with a WxH = 64x32
and rotation=90, you get something that is 32 pixels wide and 64 pixels
tall in the LCD's coordinate system.
This is fine, as the existing definitions were written to work like this.
With framebuffer displays, however, the underlying framebuffer (such as
RGBMatrix) says "I am WxH pixels wide in my coordinate system" and the
constructor is given a rotation; when the rotation indicates a transpose
that means "exchange rows and columns, so that to the Groups displayed
on it, there is an effectively HxW pixel region for use".
Happily, we already have a set_rotation method. Thus (modulo the time
spent debugging things anyway:) the fix is simple: Always request no
rotation from core_construct, then immediately fix up the rotation
to match what was requested.
Testing performed: 32x16 RGBMatrix on Metro M4 Express (but using
the Airlift firmware, as this is the configuration the error was reported
on):
* initially construct display at 0, 90, 180, 270 degrees
* later change angle to 0, 90, 180, 270 degrees
* no garbled display
* no safe mode crashes
e.g., allocating a 192x32x6bpp matrix would be enough to trigger this
reliably on a Metro M4 Express using the "memory hogging" layout.
Allocating 64x32x6bpp could trigger it, but somewhat unreliably.
There are several things going on here:
* we make the failing call with interrupts off
* we were throwing an exception with interrupts off
* protomatter failed badly in _PM_free when it was partially-initialized
Incorporate the fix from protomatter, switch to a non-throwing malloc
variant, and ensure that interrupts get turned back on.
This decreases the quality of the MemoryError (it cannot report the size
of the failed allocation) but allows CircuitPython to survive, rather
than faulting.
this flips the bottom-right style to top-left which is at least
kind of normal. A 2x2 square at (0,0) would be defined like
(0,0), (3,0), (3,3), (0,3)
Which seems kind of surprising but at least less bonkers than
that square being defined at (1,1), which is the current behavior.
The getter for vectorio.Polygon#points was not updated with the data type change of the stored points list.
This moves the implementation to shared_module and updates the data type to reflect the actual state.
Before this, the mp3 file would be read into the in-memory buffer
only when new samples were actually needed. This meant that the time
to read mp3 content always counted against the ~22ms audio buffer length.
Now, when there's at least 1 full disk block of free space in the input
buffer, we can request that the buffer be filled _after_ returning from
audiomp3_mp3file_get_buffer and actually filling the DMA pointers. In
this way, the time taken for reading MP3 data from flash/SD is less
likely to cause an underrun of audio DMA.
The existing calls to fill the inbuf remain, but in most cases during
streaming these become no-ops because the buffer will be over half full.
Testing performed: That a card is successfully mounted on Pygamer with
the built in SD card slot
This module is enabled for most FULL_BUILD boards, but is disabled for
samd21 ("M0"), litex, and pca10100 for various reasons.
I noticed that this code was referring to samd-specific functionality,
and isn't enabled except in one samd board (pewpew10). Move it.
There is incomplte support for _pew in mimxrt10xx which then caused build
errors; adding a #if guard to check for _pew being enabled fixes it.
The _pew module is not likely to be important on mimxrt but I'll leave the
choice to remove it to someone else.
There were two main problems
- word_buffer was being filled as though with unsigned samples,
but during mixing all samples are kept in signed mode
- If the first buffer was stopped, the voices_active flag got set
anyway, even though the output buffer wasn't initialized yet,
so the samples were mixed with indeterminate data
We also cover the case where no buffer was playing, and ensure
the output buffer is filled.
This now works much better. Tested on neotrellis m4 playing back
4 mp3 streams at a time in signed-16, 22050Hz
When handling negative steps, start and stop need to be mp_int_t so they
can be checked against a potential negative value during the for loop
used to set the slice values.
This change takes polygon from 126k pixels per second fill to 240k pps fill
on a reference 5 point star 50x66px polygon, updating both location and shape
at 10hz. Tested on an m4 express feather.
As a curiosity, the flat-out fill rate of a shape whose get_pixel is `return 0;`
fills just shy of 375k pixels per second.
vectorio builds on m4 express feather
Concrete shapes are composed into a VectorShape which is put into a displayio Group for display.
VectorShape provides transpose and x/y positioning for shape implementations.
Included Shapes:
* Circle
- A radius; Circle is positioned at its axis in the VectorShape.
- You can freely modify the radius to grow and shrink the circle in-place.
* Polygon
- An ordered list of points.
- Beteween each successive point an edge is inferred. A final edge closing the shape is inferred between the last
point and the first point.
- You can modify the points in a Polygon. The points' coordinate system is relative to (0, 0) so if you'd like a
top-center justified 10x20 rectangle you can do points [(-5, 0), (5, 0), (5, 20), (0, 20)] and your VectorShape
x and y properties will position the rectangle relative to its top center point
* Rectangle
A width and a height.
This adds initial support for an AES module named aesio. This
implementation supports only a subset of AES modes, namely
ECB, CBC, and CTR modes.
Example usage:
```
>>> import aesio
>>>
>>> key = b'Sixteen byte key'
>>> cipher = aesio.AES(key, aesio.MODE_ECB)
>>> output = bytearray(16)
>>> cipher.encrypt_into(b'Circuit Python!!', output)
>>> output
bytearray(b'E\x14\x85\x18\x9a\x9c\r\x95>\xa7kV\xa2`\x8b\n')
>>>
```
This key is 16-bytes, so it uses AES128. If your key is 24- or 32-
bytes long, it will switch to AES192 or AES256 respectively.
This has been tested with many of the official NIST test vectors,
such as those used in `pycryptodome` at
39626a5b01/lib/Crypto/SelfTest/Cipher/test_vectors/AES
CTR has not been tested as NIST does not provide test vectors for it.
Signed-off-by: Sean Cross <sean@xobs.io>
This gets all the purely internal references. Some uses of
protomatter/Protomatter/PROTOMATTER remain, as they are references
to symbols in the Protomatter C library itself.
I originally believed that there would be a wrapper library around it,
like with _pixelbuf; but this proves not to be the case, as there's
too little for the library to do.
- bump supervisor alloc count by 4 (we actually use 5)
- move reconstruct to after gc heap is reset
- destroy protomatter object entirely if not used by a FramebufferDisplay
- ensure previous supervisor allocations are released
- zero out pointers so GC can collect them
It was fixed as 0/0 even though it used to get it from the current
SPI state. This makes it more explicit with kwargs.
Thanks to magpie_lark and kmatocha on the Adafruit Support forum
for finding the issue: https://forums.adafruit.com/viewtopic.php?f=60&t=162515
This removes downscaling (halving-add) when multiple voices are
being mixed. To avoid clipping, and get similar behavior to before,
set the "level" of each voice to (1/voice_count).
Slow paths that were applicable to only M0 chips were removed.
As a side effect, the internal volume representation is now 0 ..
0x8000 (inclusive), which additionally makes a level of exactly 0.5
representable.
Testing performed, on PyGamer: For all 4 data cases, for stereo and
mono, for 1 and 2 voices, play pure sign waves represented as
RawSamples and view the result on a scope and through headphones.
Also, scope the amount of time spent in background tasks.
Code size: growth of +272 bytes
Performance (time in background task when mixing 2 stereo 16-bit voices):
76us per down from 135us (once per ~2.9ms long term average)
(Decrease from 4.7% to 2.4% of all CPU time)
After adding the ability to change files in an existing MP3File object,
it became apparent that at the beginning of a track some part of an
existing buffer was playing first.
I noticed that in get_buffer, the just-populated buffer wasn't being
returned, but the other one was. But still after fixing this, I heard
wrong audio at the beginning of a track, so I took the heavy duty approach
and zeroed the buffers out. That means there's a remaining bug to chase,
which is merely hidden by the memset()s.
This enables jeplayer to allocate just one MP3File at startup, rather
than have to make repeated large allocations while the application is
running.
The buffers have to be allocated their theoretical maximum, but that
doesn't matter much as all the real-life MP3 files I checked needed
that much allocation anyway.
Apparently sometimes, a proper "frame info" block is not found after
a "sync word". Keep looking for one as needed, instead of giving up
after one try.
This was one reason that the "bartlebeats" mp3s would not play.