Speed LTO builds by using multiple threads

On my i5-1235U laptop this speeds LTO "partition=balanced" builds
substantially, because each "partition" can be run on a separate
CPU thread. I used "pygamer" as my test build with a parallelism of
`-j4`, and took the best elapsed time reported over 4 builds.

The improvement was from 34.6s to 24.0s (-30%).

A link-only build (rm build-pygamer/firmware.elf; make -j...) improved
from 17.4s to 5.1s (-70%).

The size of the resulting firmware is unchanged.

Boards that are nearly full use "-flto-partition=one" to improve code
size optimization. When the LTO partition is "one", this feature doesn't
help, but it doesn't seem to negatively affect anything either (tested
by building trinket_m0).
Jeff Epler 2023-06-20 11:01:46 -05:00
parent 0aaf5a4a98
commit cae02f1cdf

@@ -70,7 +70,7 @@ endif
 CIRCUITPY_LTO ?= 0
 CIRCUITPY_LTO_PARTITION ?= balanced
 ifeq ($(CIRCUITPY_LTO),1)
-CFLAGS += -flto -flto-partition=$(CIRCUITPY_LTO_PARTITION) -DCIRCUITPY_LTO=1
+CFLAGS += -flto=jobserver -flto-partition=$(CIRCUITPY_LTO_PARTITION) -DCIRCUITPY_LTO=1
 else
 CFLAGS += -DCIRCUITPY_LTO=0
 endif
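
For readers unfamiliar with `-flto=jobserver`, here is a minimal sketch of how the flag is typically wired into a GNU make build. It is not taken from the CircuitPython build system: the file names `a.o`, `b.o`, and `app` are hypothetical, and this page does not show whether the real link rule looks like this. The GCC documentation says the link recipe should be prefixed with `+` so that make hands its jobserver to the compiler driver, which can then run the LTO partitions in parallel within the `-jN` budget.

```make
# Hypothetical stand-alone Makefile, for illustration only.
CC ?= gcc
CFLAGS += -O2 -flto=jobserver -flto-partition=balanced

# Placeholder object list; a real project would generate this.
OBJS = a.o b.o

# The leading '+' marks the recipe as recursive, so GNU make shares
# its jobserver with gcc and the LTO partitions can use idle -jN slots.
app: $(OBJS)
	+$(CC) $(CFLAGS) -o $@ $^

%.o: %.c
	$(CC) $(CFLAGS) -c -o $@ $<
```

With a layout like this, `make -j4 app` would let the link step use up to four jobs in total, rather than running every partition on a single thread.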