clang's LTO build is causing havoc with its memory usage under
Linux. Under nightly, the clang-vg job is suffering from frequent
OOM issues caused by the linker processes, which hog all the
memory.
By default we set the number of make jobs to the number of available
CPUs. The affected VM is set up in the following manner:
- 10 CPUs
- 16GB RAM
When we get to the linking part, we end up with 10 instances of
ld.gold running at the same time. Each one inflates its heap up to
1-2GB and this ends up in the actual RSS: procfs, linker's --stats,
etc. all confirmed that malloc peaks around this range.
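As a rough illustration of why this exhausts the VM, the worst case
can be estimated from the figures above (10 CPUs, 16GB RAM, 1-2GB of
heap per linker). This is a back-of-the-envelope sketch only; the
function name is hypothetical and it ignores memory used by anything
other than the linkers.

```python
# Rough worst-case memory estimate for concurrent link jobs.
# Figures are taken from the description above: 10 CPUs, 16GB RAM,
# 1-2GB of heap per ld.gold instance.

def peak_link_memory_gb(jobs, per_linker_gb):
    """Total heap if all link jobs hit their peak at the same time."""
    return jobs * per_linker_gb

RAM_GB = 16
JOBS = 10

low = peak_link_memory_gb(JOBS, 1)
high = peak_link_memory_gb(JOBS, 2)
print(f"{JOBS} linkers: {low}-{high}GB needed, {RAM_GB}GB available")

# With half as many jobs, the same per-linker peak fits in RAM.
halved = peak_link_memory_gb(JOBS // 2, 2)
print(f"{JOBS // 2} linkers: up to {halved}GB needed")
```

With 10 jobs the upper bound exceeds the available RAM, while with 5
jobs it stays below it.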
The initial suspicion was mmap, since the files the linker operates
on in the context of the static build are quite huge, but the above
confirmed that wasn't the case (double-checked with various
"optimization" opts like --no-mmap-output-file, --no-map-whole-files,
etc., which didn't change the linker's behavior even a little).
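The procfs check mentioned above could look roughly like the sketch
below. It assumes the peak RSS is read from the VmHWM field of
/proc/<pid>/status; which procfs file was actually inspected is not
stated here, and the helper names are hypothetical.

```python
import re

def peak_rss_kb(status_text):
    """Return VmHWM (peak resident set size) in kB, or None if absent."""
    match = re.search(r"^VmHWM:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None

def read_peak_rss_kb(pid):
    """Read the peak RSS of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return peak_rss_kb(f.read())
```

Calling read_peak_rss_kb() on the PID of a running linker would return
its high-water mark in kB.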
Same behavior is seen with ld.lld. That said, FreeBSD doesn't show
similar symptoms (though the SPDK there is not built in "full", hence
this comparison may not be entirely fair).
To mitigate, drop the number of make jobs by half, but enable
ld.gold's threading to compensate a little - this will incur a
penalty when it comes to the build's runtime but should allow the VM
to survive.
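The mitigation can be sketched as follows. This is not the actual
change, only an illustration of the idea: build_settings() is a
hypothetical helper, and passing ld.gold's --threads option through
the compiler driver as -Wl,--threads is an assumption about how
threading would be enabled.

```python
import os

def build_settings(cpus=None):
    """Halve the make job count and ask ld.gold to run multi-threaded."""
    cpus = cpus or os.cpu_count() or 1
    jobs = max(1, cpus // 2)  # never drop below a single job
    return {
        "make_args": [f"-j{jobs}"],
        # Assumption: gold's threading is enabled via the compiler driver.
        "ldflags": "-Wl,--threads",
    }

settings = build_settings(10)
print(settings["make_args"], settings["ldflags"])
```

On the 10-CPU VM described above this yields 5 make jobs, so at most
5 linkers can run concurrently.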
This issue was seen in the past. The majority of the VMs use 12GB
(some even less) but we were forced to increase it for this
particular job a couple of times already - we can't keep doing that
as we would eventually exhaust the resources available in our VM
pool.
Some investigation is needed to determine whether this behavior of
the ld.{gold,lld} linkers is expected, since currently it forces the
user to provide quite a hefty Linux environment for the full SPDK
build with clang+LTO to even complete.
Change-Id: Ie8f78794677399ee1dac5d74c310701d140ac682
Signed-off-by: Michal Berger <michal.berger@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/21770
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>