circuitpython/py
Jeff Epler 08ed09acc6 makeqstrdata: don't print "compression increased length" messages
This check as implemented is misleading, because it compares the
compressed size in bytes (including the length indication) with the source
string length in Unicode code points.  For English this is approximately
fair, but for Japanese this is quite unfair and produces an excess of
"increased length" messages.

This message might have existed for one of two reasons:
 * to alert to an improperly functioning Huffman compression
 * to call attention to a need for a "string is stored uncompressed" case
We know by now that the Huffman compression is functioning as designed and
effective in general.

Just to be on the safe side, I did some back-of-the-envelope estimates.
I considered these three replacements for "the true source string size, in bytes":
+    decompressed_len_utf8 = len(decompressed.encode('utf-8'))
+    decompressed_len_utf16 = len(decompressed.encode('utf-16be'))
+    decompressed_len_bitsize = ((1+len(decompressed)) * math.ceil(math.log(1+len(values), 2)) + 7) // 8

The third counts how many bits each character requires (fewer than 128
characters in the source character set = 7, fewer than 256 = 8, fewer than 512
= 9, etc., plus one string-terminating value) and is roughly representative
of the best way we would be able to store "uncompressed strings".  The Japanese
translation (largest as of writing) has just a few strings which increase by
this metric.  However, the amount of loss due to expansion in those cases is
outweighed by the cost of adding 1 bit per string to indicate whether it's
compressed or not.  For instance, in the BOARD=trinket_m0 TRANSLATION=ja build
the loss is 47 bytes over 300 strings.  Adding 1 bit to each of 300 strings will
cost about 37 bytes (300 bits), leaving just 10 bytes, or 5 two-byte Thumb
instructions, to implement the code to check and decode "uncompressed" strings
in order to break even.
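
The estimates above can be reproduced with a small standalone sketch.  This is
not code from makeqstrdata.py: the function names are hypothetical, `values` is
assumed to be the list of distinct characters in the source character set, and
the 2-byte Thumb instruction size is an assumption that holds for most
instructions on Cortex-M0 parts such as the trinket_m0.

```python
import math


def size_estimates(decompressed, values):
    """Return (utf8, utf16, bitsize) estimates of a string's size in bytes.

    `values` is assumed to be the list of distinct characters used by the
    translation (the source character set).
    """
    utf8 = len(decompressed.encode("utf-8"))
    utf16 = len(decompressed.encode("utf-16be"))
    # Bits needed per character: enough to index every value plus a terminator.
    bits_per_char = math.ceil(math.log(1 + len(values), 2))
    # Count one extra symbol for the terminator, then round up to whole bytes.
    bitsize = ((1 + len(decompressed)) * bits_per_char + 7) // 8
    return utf8, utf16, bitsize


def break_even_budget(loss_bytes, num_strings, instruction_bytes=2):
    """Return (bytes, instructions) left after paying 1 flag bit per string.

    The flag cost is rounded down to whole bytes, matching the "about 37
    bytes" figure for 300 strings.
    """
    flag_cost = num_strings // 8
    remaining = loss_bytes - flag_cost
    return remaining, remaining // instruction_bytes


if __name__ == "__main__":
    # Hypothetical 300-character set; only its length matters here.
    charset = [chr(0x3041 + i) for i in range(300)]
    print(size_estimates("こんにちは", charset))
    print(break_even_budget(47, 300))
```

With the figures from the trinket_m0 Japanese build (47 bytes lost over 300
strings), `break_even_budget` leaves 10 bytes, or 5 instructions.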
2020-08-16 20:50:48 -05:00
..
argcheck.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmarm.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmarm.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmbase.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmbase.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmthumb.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmthumb.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmx64.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmx64.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmx86.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmx86.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmxtensa.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
asmxtensa.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
bc0.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
bc.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
bc.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
binary.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
binary.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
builtin.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
builtinevex.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
builtinhelp.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
builtinimport.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
circuitpy_defns.mk sharpmemory: Implement support for Sharp Memory Displays in framebufferio 2020-08-12 07:32:18 -05:00
circuitpy_mpconfig.h sharpmemory: Implement support for Sharp Memory Displays in framebufferio 2020-08-12 07:32:18 -05:00
circuitpy_mpconfig.mk Don't define SHARPDISPLAY when !DISPLAYIO 2020-08-12 07:39:28 -05:00
compile.c Merge pull request #3222 from WarriorOfWire/pick_micropython 2020-07-29 10:54:37 -07:00
compile.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
emit.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
emitbc.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
emitcommon.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
emitglue.c add coroutine behavior for generators 2020-07-23 20:40:16 -07:00
emitglue.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
emitinlinethumb.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
emitinlinextensa.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
emitnarm.c py: Refactor how native emitter code is compiled with a file per arch. 2018-04-10 15:06:47 +10:00
emitnative.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
emitnthumb.c py: Refactor how native emitter code is compiled with a file per arch. 2018-04-10 15:06:47 +10:00
emitnx64.c py: Refactor how native emitter code is compiled with a file per arch. 2018-04-10 15:06:47 +10:00
emitnx86.c py/emitnx86: Fix 32-bit x86 native emitter build by including header. 2018-05-04 20:39:16 +10:00
emitnxtensa.c py: Refactor how native emitter code is compiled with a file per arch. 2018-04-10 15:06:47 +10:00
formatfloat.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
formatfloat.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
frozenmod.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
frozenmod.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
gc_long_lived.c Remove debug extern 2020-03-03 10:55:50 -08:00
gc_long_lived.h Introduce a long lived section of the heap. 2018-01-24 10:33:46 -08:00
gc.c Get AllocationAlarm working 2020-07-17 17:15:03 -07:00
gc.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
grammar.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
ioctl.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
lexer.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
lexer.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
makemoduledefs.py Add files via upload 2019-04-05 21:38:32 +02:00
makeqstrdata.py makeqstrdata: don't print "compression increased length" messages 2020-08-16 20:50:48 -05:00
makeqstrdefs.py Update filter and handle nested quotes 2018-08-09 14:16:28 -07:00
makeversionhdr.py Support internationalisation. 2018-08-07 14:58:57 -07:00
malloc.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
map.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
misc.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mkenv.mk Use python3 for mpy-tool 2018-10-18 10:37:42 -07:00
mkrules.mk Make stripping circuitpython optional, not the default 2020-04-14 18:24:58 -05:00
modarray.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modbuiltins.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modcmath.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modcollections.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modgc.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modio.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modmath.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modmicropython.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modstruct.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modsys.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
modthread.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
moduerrno.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mpconfig.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mperrno.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mphal.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mpprint.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mpprint.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mpstate.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mpstate.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mpthread.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
mpz.c Fix to pass mpy-cross build 2020-07-13 22:54:52 -05:00
mpz.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
nativeglue.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
nlr.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
nlr.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
nlrsetjmp.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
nlrthumb.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
nlrx64.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
nlrx86.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
nlrxtensa.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
obj.c py: mp_obj_get_type_qstr as macro saves 24 bytes 2020-08-04 14:45:45 -05:00
obj.h py: mp_obj_get_type_qstr as macro saves 24 bytes 2020-08-04 14:45:45 -05:00
objarray.c Turn off find when CPYTHON_COMPAT is off 2020-07-21 15:40:51 -07:00
objarray.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objattrtuple.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objbool.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objboundmeth.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objcell.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objclosure.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objcomplex.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objdeque.c More make_new fixes for unix build 2019-01-18 11:53:09 -08:00
objdict.c "pop from empty %q" 2020-08-04 18:42:09 -05:00
objenumerate.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objexcept.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objexcept.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objfilter.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objfloat.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objfun.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objfun.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objgenerator.c remove new char*s because m0 is way oversubscribed 2020-07-23 20:41:10 -07:00
objgenerator.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objgetitemiter.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objint_longlong.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objint_mpz.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objint.c Use qstrs to save an additional 4 bytes 2020-08-04 14:45:45 -05:00
objint.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objlist.c "pop from empty %q" 2020-08-04 18:42:09 -05:00
objlist.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objmap.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objmodule.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objmodule.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objnamedtuple.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objnamedtuple.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objnone.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objobject.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objpolyiter.c all: Remove inclusion of internal py header files. 2017-10-04 12:37:50 +11:00
objproperty.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objproperty.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objrange.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objreversed.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objset.c "pop from empty %q" 2020-08-04 18:42:09 -05:00
objsingleton.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objslice.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objstr.c Combine 'index out of range' messages 2020-08-04 14:45:45 -05:00
objstr.h Add externs. GCC10 complains about duplicate defines 2020-07-22 16:26:46 -07:00
objstringio.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objstringio.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objstrunicode.c Combine 'index out of range' messages 2020-08-04 14:45:45 -05:00
objtuple.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objtuple.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objtype.c various: Use mp_obj_get_type_qstr more widely 2020-08-04 14:45:45 -05:00
objtype.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
objzip.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
opmethods.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
parse.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
parse.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
parsenum.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
parsenum.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
parsenumbase.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
parsenumbase.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
persistentcode.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
persistentcode.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
proto.c various: Use mp_obj_get_type_qstr more widely 2020-08-04 14:45:45 -05:00
proto.h Fix up end of file and trailing whitespace. 2020-06-03 10:56:35 +01:00
py.mk Upgrade ulab 2020-07-28 16:57:48 -05:00
pystack.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
pystack.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
qstr.c sharpmemory: Implement support for Sharp Memory Displays in framebufferio 2020-08-12 07:32:18 -05:00
qstr.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
qstrdefs.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
reader.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
reader.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
reload.c move reload exception to reload.c 2018-05-14 17:41:17 -04:00
reload.h move reload exception to reload.c 2018-05-14 17:41:17 -04:00
repl.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
repl.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
ringbuf.c ringbuf tested 2020-04-21 22:40:12 -04:00
ringbuf.h ringbuf tested 2020-04-21 22:40:12 -04:00
runtime0.h add coroutine behavior for generators 2020-07-23 20:40:16 -07:00
runtime_utils.c WIP: after merge; before testing 2018-07-11 16:45:30 -04:00
runtime.c various: Use mp_obj_get_type_qstr more widely 2020-08-04 14:45:45 -05:00
runtime.h py: introduce, use mp_raise_msg_vlist 2020-08-04 13:34:29 -05:00
scheduler.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
scope.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
scope.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
sequence.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
showbc.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
smallint.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
smallint.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
stackctrl.c Fix up end of file and trailing whitespace. 2020-06-03 10:56:35 +01:00
stackctrl.h Initial merge of micropython v1.9.2 into circuitpython 2.0.0 (in development) master. 2017-08-25 22:17:07 -04:00
stream.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
stream.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
unicode.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
unicode.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
vm.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
vmentrytable.h Add license to some obvious files. 2020-07-06 19:16:25 +01:00
vstr.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00
warning.c Add license to some obvious files. 2020-07-06 19:16:25 +01:00