build: PGO-enabled release builds #1409

Closed
bnoordhuis opened this issue Apr 13, 2015 · 7 comments
Labels
benchmark Issues and PRs related to the benchmark subsystem. build Issues and PRs related to build files or the CI.

Comments

@bnoordhuis
Member

This is something that has been at the bottom of my TODO list for some time now. I don't seem to get around to it so I thought I'd file this issue in the hope that someone else might. :-)

I'd like to investigate doing release builds with profile-guided optimizations enabled. Strawman Makefile change:

diff --git a/Makefile b/Makefile
index e93e817..2deffda 100644
--- a/Makefile
+++ b/Makefile
@@ -257,6 +257,13 @@ release-only:
                exit 1 ; \
        fi

+.PHONY: pgo
+pgo:   BUILDTYPE=Release
+pgo:
+       $(MAKE) CFLAGS+="-fprofile-generate" CXXFLAGS+="-fprofile-generate"
+       $(MAKE) bench-all
+       $(MAKE) CFLAGS+="-fprofile-use" CXXFLAGS+="-fprofile-use"
+
 pkg: $(PKG)

 $(PKG): release-only

The idea is to build an instrumented binary first, run the benchmarks to collect profile data that tell the compiler where and what to optimize, then build the final binary using the data from the previous step.
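The three phases described above can be sketched as a small shell script. This is only a dry-run sketch, not the project's release procedure: it assumes a GCC-style toolchain and a Makefile that honors CFLAGS, CXXFLAGS and LDFLAGS passed on the command line. The `run` and `pgo_build` helpers are hypothetical names, and the LDFLAGS entry is an addition that is not in the strawman diff. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Dry-run sketch of a three-phase PGO build (hypothetical helper names).
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command; execute it only when DRY_RUN is 0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

pgo_build() {
  # Phase 1: build an instrumented binary that records profile data.
  # LDFLAGS is an assumption here; the strawman only sets CFLAGS/CXXFLAGS.
  run make BUILDTYPE=Release "CFLAGS+=-fprofile-generate" \
      "CXXFLAGS+=-fprofile-generate" "LDFLAGS+=-fprofile-generate"
  # Phase 2: run a training workload to collect the profile data.
  run make bench-all
  # Phase 3: rebuild, letting the compiler use the collected profiles.
  run make BUILDTYPE=Release "CFLAGS+=-fprofile-use" "CXXFLAGS+=-fprofile-use"
}

pgo_build
```

Two caveats on this sketch: GCC's documentation asks for -fprofile-generate at link time as well as compile time, which is why LDFLAGS appears in phase 1. And because make does not notice changed flags, phase 3 only helps if the objects are actually recompiled while the collected .gcda files are kept; how to force that depends on the build system.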

Open questions / unresolved issues:

  • The benchmark suite consists primarily of micro-benchmarks. It's not very representative of real-world applications. We would need something better for PGO.
  • It would make the release process a lot slower, particularly on the ARM buildbots. We could of course do non-PGO builds on slow machines.
  • I couldn't get the 32-bit build to link when PGO and LTO are enabled, at least not with a 32-bit toolchain: it runs out of memory. This could be resolved for the ia32 buildbots by using a 64-bit toolchain (I suspect that's already the case), but for ARM the only option is to cross-compile, and that sucks.
  • PGO may penalize the uncommon case, i.e., it may regress performance for use cases that are not covered by our benchmarks.
  • PGO's net effect may be zero.
@bnoordhuis bnoordhuis added enhancement build Issues and PRs related to build files or the CI. benchmark Issues and PRs related to the benchmark subsystem. labels Apr 13, 2015
@rvagg
Member

rvagg commented Apr 14, 2015

I think the biggest problem here is your first point - we don't have a good set of reality-based benchmarks. I've been itching to try and get a WG spun up to focus on this but it hasn't clicked so far (btw I'm not a benchmark person, I'm just interested in seeing this happen). For now @iojs/build have been discussing some regular benchmarking system for builds but this would still require a better benchmark suite.

> PGO's net effect may be zero.

Do you have any numbers to share on experiments, or even a gut-feel on this? I've not had experience with PGO compiles.

@jbergstroem
Member

@rvagg another issue is that we (@iojs/build) don't really cater for build permutations, so we can't uphold quality assurance (or tell whether it even works across our architectures). It's slightly off topic for this issue, so I'll stop here. I'd also like to see results of this, since the PGO tests I've tried in other software haven't really shown any strong benefits. It's hard to measure though (as mentioned above).

@bnoordhuis
Member Author

> Do you have any numbers to share on experiments, or even a gut-feel on this? I've not had experience with PGO compiles.

It was a while ago and I didn't properly benchmark it. I ran an x64 PGO build through some http_simple_auto benchmarks and it was about 7-9% faster after but not consistently so. http_simple is notoriously fickle though.

The binary (after stripping) was a few 100 kB smaller though, so that may have helped. In retrospect, I should have profiled the before and after binaries with perf record and checked where the differences are. Something for the next guy or gal. :-)
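A before/after comparison along those lines could look roughly like the following. It is a dry-run sketch under assumptions: the binary names `./node-base` and `./node-pgo`, the `workload.js` script, and the `run` and `compare_profiles` helpers are all hypothetical, and Linux perf is assumed to be installed. It prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch: profile a non-PGO and a PGO binary, then diff the profiles.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical paths; adjust to wherever the two builds and the workload live.
BASE_BIN="${BASE_BIN:-./node-base}"
PGO_BIN="${PGO_BIN:-./node-pgo}"
WORKLOAD="${WORKLOAD:-workload.js}"

run() {
  # Print the command; execute it only when DRY_RUN is 0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

compare_profiles() {
  # Record one profile per binary into separate data files.
  run perf record -o perf.base.data -- "$BASE_BIN" "$WORKLOAD"
  run perf record -o perf.pgo.data -- "$PGO_BIN" "$WORKLOAD"
  # Show per-symbol differences between the two recordings.
  run perf diff perf.base.data perf.pgo.data
}

compare_profiles
```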

By the way, I turned on -Wl,--gc-sections in the non-PGO build to remove dead code. It's currently disabled because of buggy toolchains but I think we should be able to safely turn it on again. IIRC, the issue was with gcc 4.4 and/or 4.5 in combination with particular binutils versions.
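For context, a sketch of how -Wl,--gc-sections is commonly combined with compiler flags on GCC/binutils-style toolchains. The comment above only mentions the linker flag; pairing it with -ffunction-sections and -fdata-sections is an assumption added here, as are the `gc_sections_flags` helper and the `main.c` file name. The linker drops unreferenced input sections, so putting each function and data item in its own section gives it more to discard.

```shell
#!/bin/sh
# Sketch: flags commonly used together for link-time dead-code removal.
# gc_sections_flags is a hypothetical helper, not part of any build system.
gc_sections_flags() {
  case "$1" in
    # Assumption: give each function/data item its own section.
    compile) echo "-ffunction-sections -fdata-sections" ;;
    # The flag mentioned in the comment: drop unreferenced sections.
    link)    echo "-Wl,--gc-sections" ;;
    *)       echo "usage: gc_sections_flags compile|link" >&2; return 1 ;;
  esac
}

# Print example compile and link commands for a hypothetical main.c.
echo "cc $(gc_sections_flags compile) -c main.c -o main.o"
echo "cc $(gc_sections_flags link) main.o -o main"
```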

@YurySolovyov

Maybe you can pull some benchmarks from projects like express or ws, or other popular (npm top 5/10/20?) modules, to get more "real" examples.

@bnoordhuis
Member Author

/cc @nodejs/benchmarking - perhaps you can incorporate this into your roadmap? I'll close this issue.

@HyperHCl

> to get more "real" examples

Profiling against such cases should also be preferred over internal benchmarks, since they are closer to 'real-life usage'. In fact, Firefox has some very complicated PGO cases that mimic daily use.

(What's the status of this issue now?)

@bnoordhuis
Member Author

> (What's the status of this issue now?)

I don't believe anyone is or has been working on it.

octaviansoldea pushed a commit to octaviansoldea/node that referenced this issue Aug 31, 2018
This modification allows for compiling with profiled guided
optimization (pgo) using the flags
--enable-pgo-generate and --enable-pgo-use.

Refs: nodejs#21583
Refs: nodejs#1409
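A rough sketch of how the two configure flags named in this commit message might be used in sequence. Only the flag names --enable-pgo-generate and --enable-pgo-use come from the commit message; the ordering of steps, the `./node` binary path, the `training.js` workload, and the `run` and `pgo_configure_build` helpers are assumptions for illustration. It prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of a generate/train/use cycle with the configure flags.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical training workload; any representative script would do.
TRAINING_SCRIPT="${TRAINING_SCRIPT:-training.js}"

run() {
  # Print the command; execute it only when DRY_RUN is 0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

pgo_configure_build() {
  # Instrumented build.
  run ./configure --enable-pgo-generate
  run make
  # Exercise the instrumented binary to produce profile data.
  run ./node "$TRAINING_SCRIPT"
  # Rebuild using the collected profile data.
  run ./configure --enable-pgo-use
  run make
}

pgo_configure_build
```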
gabrielschulhof pushed a commit that referenced this issue Sep 4, 2018
This modification allows for compiling with profiled guided
optimization (pgo) using the flags
--enable-pgo-generate and --enable-pgo-use.

Refs: #21583
Refs: #1409
PR-URL: #21596
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
targos pushed a commit that referenced this issue Sep 5, 2018
This modification allows for compiling with profiled guided
optimization (pgo) using the flags
--enable-pgo-generate and --enable-pgo-use.

Refs: #21583
Refs: #1409
PR-URL: #21596
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
targos pushed a commit that referenced this issue Sep 27, 2018
This modification allows for compiling with profiled guided
optimization (pgo) using the flags
--enable-pgo-generate and --enable-pgo-use.

Refs: #21583
Refs: #1409
targos pushed a commit that referenced this issue Oct 3, 2018
This modification allows for compiling with profiled guided
optimization (pgo) using the flags
--enable-pgo-generate and --enable-pgo-use.

Refs: #21583
Refs: #1409
PR-URL: #21596
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
6 participants