This adds support for the x format code in struct.pack and struct.unpack.
The primary use case for this is ignoring bytes while unpacking. When
interfacing with existing systems, it may often happen that you either have
fields in a struct that aren't properly specified or you simply don't care
about them. Being able to easily skip them is useful.
Signed-off-by: Daniël van de Giessen <daniel@dvdgiessen.nl>
While checking whether we can enable -Wimplicit-fallthrough, I encountered
a diagnostic in mp_binary_set_val_array_from_int which led to discovering
the following bug:
```
>>> struct.pack("xb", 3)
b'\x03\x03'
```
That is, the next value (3) was used as the value of a padding byte, while
standard Python always fills "x" bytes with zeros. I initially thought
this had to do with the unintentional fallthrough, but it doesn't.
Instead, this code would relate to an array.array with a typecode of
padding ('x'), which is ALSO not desktop Python compliant:
```
>>> array.array('x', (1, 2, 3))
array('x', [1, 0, 0])
```
Possibly this is dead code that used to be shared between struct-setting
and array-setting, but it no longer is.
I also discovered that the argument list length for struct.pack
and struct.pack_into were not checked, and that the length of binary data
passed to array.array was not checked to be a multiple of the element
size.
I have corrected all of these to conform more closely to standard Python
and revised some tests where necessary. Some tests for micropython-specific
behavior that does not conform to standard Python and is not present
in CircuitPython was deleted outright.
Prior to this patch, the size of the buffer given to pack_into() was checked
for being too small by using the count of the arguments, not their actual
size. For example, a format spec of '4I' would only check that there was 4
bytes available, not 16; and 'I' would check for 1 byte, not 4.
The pack() function is ok because its buffer is created to be exactly the
correct size.
The fix in this patch calculates the total size of the format spec at the
start of pack_into() and verifies that the buffer is large enough. This
adds some computational overhead, to iterate through the whole format spec.
The alternative is to check during the packing, but that requires extra
code to handle alignment, and the check is anyway not needed for pack().
So to maintain minimal code size the check is done using struct_calcsize.
Prior to this patch, the size of the buffer given to unpack/unpack_from was
checked for being too small by using the count of the arguments, not their
actual size. For example, a format spec of '4I' would only check that
there was 4 bytes available, not 16; and 'I' would check for 1 byte, not 4.
This bug is fixed in this patch by calculating the total size of the format
spec at the start of the unpacking function. This function anyway needs to
calculate the number of items at the start, so calculating the total size
can be done at the same time.
Tests which don't work with small ints are suffixed with _intbig.py. Some
of these may still work with long long ints and need to be reclassified
later.
mp_obj_int_get_truncated is used as a "fast path" int accessor that
doesn't check for overflow and returns the int truncated to the machine
word size, ie mp_int_t.
Use mp_obj_int_get_truncated to fix struct.pack when packing maximum word
sized values.
Addresses issues #779 and #998.
Only calcsize() and unpack() functions provided so far, for little-endian
byte order. Format strings don't support repition spec (like "2b3i").
Unfortunately, dealing with all the various binary type sizes and alignments
will lead to quite a bloated "binary" helper functions - if optimizing for
speed. Need to think if using dynamic parametrized algos makes more sense.