makeqstrdata: permit longer "compressed" outputs
This routine can expand some inputs, and in fact it does for certain strings in the proposed Korean translation of CircuitPython (#1858). I did not determine the maximum possible expansion -- it's probably modest, something like len()/7+2 bytes -- so I simply over-allocated enc[] and then checked that every string in the proposed ko.po now works. The worst expansion actually observed is a string that grows from 65 UTF-8-encoded bytes to 68 compressed bytes (+4.6%). Only a few of the strings are reported as not compressed.
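As a rough illustration of why a bit-packed variable-length code can grow its input: any symbol whose code is longer than 8 bits costs more than the byte it replaces. The sketch below is not part of makeqstrdata; `packed_size` and the code lengths fed to it are hypothetical, chosen only to show the arithmetic.

```python
def packed_size(code_lengths):
    """Bytes needed to store codes of the given bit lengths, packed back to back.

    `code_lengths` is a hypothetical list with one entry per input symbol;
    the real lengths come from the encoding table built by the script.
    """
    total_bits = sum(code_lengths)
    return (total_bits + 7) // 8  # round up to a whole number of bytes


# Eight common symbols with short (3-bit) codes shrink from 8 bytes to 3.
common = packed_size([3] * 8)

# Eight rare symbols with long (9-bit) codes grow from 8 bytes to 9.
rare = packed_size([9] * 8)
```

A string made up mostly of symbols that are rare across the whole corpus can therefore come out longer than it went in, which is the case an output buffer sized to exactly len(decompressed) cannot hold.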
parent 95d2694bc3
commit c4f3a02b3b
@@ -180,7 +180,7 @@ def compress(encoding_table, decompressed):
     if not isinstance(decompressed, bytes):
         raise TypeError()
     values, lengths = encoding_table
-    enc = bytearray(len(decompressed))
+    enc = bytearray(len(decompressed) * 2)
     #print(decompressed)
     #print(lengths)
     current_bit = 7
@@ -227,6 +227,8 @@ def compress(encoding_table, decompressed):
                 current_bit -= 1
     if current_bit != 7:
         current_byte += 1
+    if current_byte > len(decompressed):
+        print("Note: compression increased length", repr(decompressed.decode('utf-8')), len(decompressed), current_byte, file=sys.stderr)
     return enc[:current_byte]
 
 def qstr_escape(qst):
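The effect of the change can be sketched in isolation. This is a simplified stand-in, not the actual compress(): `pack_codes` and its `(value, nbits)` input are hypothetical, replacing the table lookup the real routine performs, and the bit order is assumed to be most-significant-bit first, matching the `current_bit = 7` start seen in the diff.

```python
import sys


def pack_codes(codes, input_len):
    """Pack (value, nbits) codes MSB-first into an over-allocated buffer.

    `codes` stands in for the per-symbol codes and `input_len` for
    len(decompressed); both names are assumptions made for this sketch.
    """
    enc = bytearray(input_len * 2)  # headroom in case the output expands
    current_byte = 0
    current_bit = 7
    for value, nbits in codes:
        for i in range(nbits - 1, -1, -1):
            if value & (1 << i):
                enc[current_byte] |= 1 << current_bit
            if current_bit == 0:
                current_bit = 7
                current_byte += 1
            else:
                current_bit -= 1
    if current_bit != 7:
        current_byte += 1  # count the partially filled last byte
    if current_byte > input_len:
        print("Note: compression increased length",
              input_len, current_byte, file=sys.stderr)
    return bytes(enc[:current_byte])
```

With a buffer of exactly `input_len` bytes, the second example below would raise IndexError on the ninth bit. The factor of two is headroom rather than a proven bound: an input whose codes average more than 16 bits per byte would still overflow it.

```python
pack_codes([(0b1, 1), (0b10, 2)], 1)   # three bits fit in one byte
pack_codes([(0x1FF, 9)], 1)            # nine bits spill into a second byte
```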