Two problems: The lead byte for 3-byte sequences was wrong, and one
mid-byte was not even filled in due to a missing "++"!
Apparently this was broken ever since the first "Compress as unicode,
not bytes" commit, but I believed I'd "tested" it by running on the
Pinyin translation.
This rendered at least the Korean and Japanese translations completely
illegible, affecting 5.0 and all later releases.