This blends two "565"-format bitmaps, including byte-swapped ones. The
source and destination bitmaps must all use the same memory format.
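
For readers unfamiliar with the format, here is a minimal sketch of what blending a single pair of 565 pixels involves. It is an illustration only, not the code in this change: the names `swap16` and `blend565`, the 0..256 integer weight, and the assumption of RGB channel order (5 bits red, 6 bits green, 5 bits blue) are all hypothetical.

```python
def swap16(pixel):
    """Exchange the two bytes of a 16-bit value."""
    return ((pixel & 0xFF) << 8) | (pixel >> 8)


def blend565(a, b, weight_a=128, swapped=False):
    """Blend two 16-bit 565 pixels (hypothetical helper).

    weight_a is an integer in 0..256: 256 returns pixel a, 0 returns pixel b.
    If swapped is True, both inputs are treated as byte-swapped and the
    result is returned byte-swapped as well, so the memory format of the
    output matches the inputs.
    """
    if swapped:
        a, b = swap16(a), swap16(b)
    weight_b = 256 - weight_a

    # Unpack each channel, mix with integer weights, then scale back down.
    red = ((a >> 11) * weight_a + (b >> 11) * weight_b) >> 8
    green = (((a >> 5) & 0x3F) * weight_a + ((b >> 5) & 0x3F) * weight_b) >> 8
    blue = ((a & 0x1F) * weight_a + (b & 0x1F) * weight_b) >> 8

    out = (red << 11) | (green << 5) | blue
    return swap16(out) if swapped else out
```

Because both inputs are unswapped before the channels are unpacked and the result is swapped back afterwards, mixing a swapped bitmap with a non-swapped one would give wrong colours, which is why the formats have to match.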
The routine takes about 63 ms on a Kaluga when operating on 320x240 bitmaps
(roughly 0.8 µs per pixel).
Of course, displaying the bitmap also takes time.
There's untested code for the L8 (8-bit greyscale) case. This can be
enabled once gifio is merged.