Add some features which are already enabled in the unix port and
default to using the Python stack for scoped allocations: this can be
more performant in cases the heap is heavily used because for example
the memory needed for storing *args and **kwargs doesn't require
scanning the heap to find a free block.