I was puzzled by why the dictionary words were sorted by length.
It was because TextSplitter sorted its parameter, instead of a copy.
This doesn't affect encoding size, but does affect the encoding NUMBER
of the found words. We'll deliberately restore sorting by length next,
for other reasons, but not by spooky action.
Try to accurately measure the costs of including a word in the dictionary
vs the gains from using it in messages.
This saves about 160 bytes on trinket_m0 ja, the fullest translation
for that board. Other translations on the same board all have savings,
ranging from 24 to 228 bytes.
```
Translation Before After Savings
ja 1164 1324 160
de_DE 1260 1396 136
fr 1424 1652 228
zh_Latn_pinyin 1448 1520 72
pt_BR 1584 1736 152
pl 1592 1640 48
es 1724 1816 92
ko 1724 1816 92
fil 1764 1800 36
it_IT 1896 2040 144
nl 1956 2136 180
ID 2072 2180 108
cs 2124 2148 24
sv 2340 2448 108
en_x_pirate 2644 2740 96
en_GB 2652 2752 100
el 2656 2768 112
en_US 2656 2768 112
hi 2656 2768 112
```
By comparing the address of the initial 'name' field instead of the
addresses of the objects themselves, a small amount of type safety is
added back, vs just casting to void.
In the event that some other kind of object is passed in as 't',
which happens to have a 'name' field of the right type, the construct
would be (undesirably) accepted but it would almost certainly evaluate
to false at runtime.
Prior to this commit, cache flushing for ARM native code was done only in
the assembler code asm_thumb_end_pass()/asm_arm_end_pass(), at the last
pass of the assembler. But this misses flushing the cache when loading
native code from an .mpy file, ie in persistentcode.c.
The change here makes sure the cache is always flushed/cleaned/invalidated
when assigning native code on ARM architectures.
This problem was found running tests/micropython/import_mpy_native_gc.py on
the mimxrt port.
Signed-off-by: Damien George <damien@micropython.org>