Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include openssl and ICU statically linked #7956

Merged
merged 5 commits into from
Jun 7, 2024
Merged

Conversation

rahulinux
Copy link
Contributor

@rahulinux rahulinux commented Jun 4, 2024

Problem

Due to the upcoming End of Life (EOL) for Debian 11, we need to upgrade
the base OS for Pageservers from Debian 11 to Debian 12 for security reasons.

When deploying a new Pageserver on Debian 12 with the same binary built on
Debian 11, we encountered the following errors:

could not execute operation: pageserver error, status: 500, 
msg: Command failed with status ExitStatus(unix_wait_status(32512)): 
/usr/local/neon/v16/bin/initdb: error while loading shared libraries: 
libicuuc.so.67: cannot open shared object file: No such file or directory

and

could not execute operation: pageserver error, status: 500, 
msg: Command failed with status ExitStatus(unix_wait_status(32512)):
 /usr/local/neon/v14/bin/initdb: error while loading shared libraries: 
 libssl.so.1.1: cannot open shared object file: No such file or directory

These issues occur when creating new projects.

Summary of changes

  • To address these issues, we configured PostgreSQL build to use
    statically linked OpenSSL and ICU libraries.

  • This resolves the missing shared library errors when running the
    binaries on Debian 12.

Closes: https://github.com/neondatabase/cloud/issues/12648

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@rahulinux rahulinux force-pushed the static-build-binaries branch 2 times, most recently from e62fab2 to 599385d Compare June 4, 2024 14:07
Copy link
Contributor Author

@rahulinux rahulinux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tristan957 I'm trying to build static binary and would like to know if this option is enough to have libssl?

Copy link

github-actions bot commented Jun 4, 2024

3192 tests run: 3053 passed, 0 failed, 139 skipped (full report)


Flaky tests (2)

Postgres 16

  • test_download_remote_layers_api: debug

Postgres 14

  • test_download_remote_layers_api: release

Code coverage* (full report)

  • functions: 31.5% (6598 of 20937 functions)
  • lines: 48.5% (51057 of 105314 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
3f62675 at 2024-06-07T17:39:05.665Z :recycle:

@tristan957
Copy link
Member

I wouldn't think so. Wouldn't you need something as discussed in https://stackoverflow.com/questions/6578484/telling-gcc-directly-to-link-a-library-statically?

@rahulinux rahulinux force-pushed the static-build-binaries branch 4 times, most recently from 2f347c9 to 384cc75 Compare June 4, 2024 19:10
@tristan957
Copy link
Member

What is the output of ldd after the latest push?

@rahulinux rahulinux force-pushed the static-build-binaries branch 9 times, most recently from aa6c567 to 16e0f33 Compare June 5, 2024 15:00
@rahulinux rahulinux marked this pull request as ready for review June 5, 2024 18:18
@rahulinux rahulinux changed the title Build static binaries for debug release Include openssl and ICU statically linked Jun 5, 2024
@tristan957
Copy link
Member

tristan957 commented Jun 5, 2024

Why can we just not recompile the pageservers postgres binaries?

@rahulinux
Copy link
Contributor Author

Why can we just not recompile the pageservers?

Thtat's good point and we have encountered this libs missing issue when creating new project. its uses /usr/local/neon/v16/bin/initdb

Maybe someone from Storage team can explain this better on why its uses initdb @jcsp

@rahulinux
Copy link
Contributor Author

What is the output of ldd after the latest push?

Result from my local build:

ldd /usr/local/pgsql/bin/initdb
	linux-vdso.so.1 (0x0000ffffba1f6000)
	libpq.so.5 => /usr/local/pgsql/lib/libpq.so.5 (0x0000ffffb7d89000)
	libstdc++.so.6 => /usr/lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffffb7bb1000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffffb7b06000)
	libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffffb7ad5000)
	libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffffb7ac1000)
	libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffffb7a9d000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffffb7929000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffffba1c6000)

old build:

ldd /usr/local/neon/v16/bin/initdb
	linux-vdso.so.1 (0x00007ffc4db3b000)
	libpq.so.5 => not found
	libicuuc.so.67 => not found
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faeca0b8000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007faeca0b3000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faec9fd4000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faec9df1000)
	/lib64/ld-linux-x86-64.so.2 (0x00007faeca0f6000)

Dockerfile.build-tools Outdated Show resolved Hide resolved
@kelvich
Copy link
Contributor

kelvich commented Jun 7, 2024

So we are bringing newer openssl and icu to match versions on the destination Linux. IMO it should work with both static and dynamic linking. Static is more reliable, but if it causes troubles we can roll that back to dynamic.

@rahulinux rahulinux force-pushed the static-build-binaries branch 2 times, most recently from f0c2bd2 to cab653a Compare June 7, 2024 14:07
Makefile Outdated Show resolved Hide resolved
@rahulinux rahulinux force-pushed the static-build-binaries branch 2 times, most recently from 416b0cc to d540a4f Compare June 7, 2024 14:59
@rahulinux rahulinux force-pushed the static-build-binaries branch 5 times, most recently from b08c064 to d9a1c67 Compare June 7, 2024 15:48
Due to the upcoming End of Life (EOL) for Debian 11, we need to upgrade
the base OS for Pageservers from Debian 11 to Debian 12 for security reasons.

When deploying a new Pageserver on Debian 12 with the same binary built on
Debian 11, we encountered the following errors:

```
could not execute operation: pageserver error, status: 500,
msg: Command failed with status ExitStatus(unix_wait_status(32512)):
/usr/local/neon/v16/bin/initdb: error while loading shared libraries:
libicuuc.so.67: cannot open shared object file: No such file or directory
```

and

```
could not execute operation: pageserver error, status: 500,
msg: Command failed with status ExitStatus(unix_wait_status(32512)):
/usr/local/neon/v14/bin/initdb: error while loading shared libraries:
libssl.so.1.1: cannot open shared object file: No such file or directory
```

These issues occur when creating new projects.

Closes: #12648

- To address these issues, we configured PostgreSQL build to use
  statically linked OpenSSL and ICU libraries.

- This resolves the missing shared library errors when running the
  binaries on Debian 12.
We need to rebuild Postgres for changes in Dockerfile.build-tools as well

this commit was override by my force push, so adding it again.
@rahulinux rahulinux enabled auto-merge (squash) June 7, 2024 16:52
@rahulinux rahulinux merged commit 3b647cd into main Jun 7, 2024
61 of 62 checks passed
@rahulinux rahulinux deleted the static-build-binaries branch June 7, 2024 17:28
@hlinnaka
Copy link
Contributor

So we are bringing newer openssl and icu to match versions on the destination Linux.

OpenSSL seems fine.

ICU might be problematic. When the ICU version number changes, you will start to get a notice like this when you connect to a database that was created with the older version:

WARNING:  database "postgres" has a collation version mismatch
DETAIL:  The database was created using collation version 153.14.37, but the operating system provides version 153.120.42.
HINT:  Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.

That's probably OK, I'm not aware of any ICU collation changes that would be likely to hit (glibc updates have been more problematic in the past). But an entry in the public-facing changelog and a heads up to the support team would be in order.

problame added a commit that referenced this pull request Jun 10, 2024
problame added a commit that referenced this pull request Jun 10, 2024
Reverts #7956

Rationale: compute incompatibilties

Slack thread:
https://neondb.slack.com/archives/C033RQ5SPDH/p1718011276665839?thread_ts=1718008160.431869&cid=C033RQ5SPDH

Relevant quotes from @hlinnaka 

> If we go through with the current release candidate, but the compute
is pinned, people who create new projects will get that warning, which
is silly. To them, it looks like the ICU version was downgraded, because
initdb was run with newer version.

> We should upgrade the ICU version eventually. And when we do that,
users with old projects that use ICU will start to see that warning. I
think that's acceptable, as long as we do homework, notify users, and
communicate that properly.
> When do that, we should to try to upgrade the storage and compute
versions at roughly the same time.
problame added a commit that referenced this pull request Jun 10, 2024
Reverts #7956

Rationale: compute incompatibilties

Slack thread:
https://neondb.slack.com/archives/C033RQ5SPDH/p1718011276665839?thread_ts=1718008160.431869&cid=C033RQ5SPDH

Relevant quotes from @hlinnaka 

> If we go through with the current release candidate, but the compute
is pinned, people who create new projects will get that warning, which
is silly. To them, it looks like the ICU version was downgraded, because
initdb was run with newer version.

> We should upgrade the ICU version eventually. And when we do that,
users with old projects that use ICU will start to see that warning. I
think that's acceptable, as long as we do homework, notify users, and
communicate that properly.
> When do that, we should to try to upgrade the storage and compute
versions at roughly the same time.
problame pushed a commit that referenced this pull request Jun 17, 2024
We had to revert the earlier static linking change due to libicu version
incompatibilities:

- original PR: #7956
- revert PR: #8003

Specifically, the problem manifests for existing projects as error

```
DETAIL:  The collation in the database was created using version 153.120.42, but the operating system provides version 153.14.37.
```

So, this PR reintroduces the original change but with the exact same
libicu version as in Debian `bullseye`, i.e., the libicu version that
we're using today.
This avoids the version incompatibility.

Additional changes made by Christian:
- validation of the libicu tarball checksum
- parallel build (`-j $(nproc)`) for openssl and libicu

Ultimately, we'll have to figure out an upgrade store for libciu.
That work is tracked in epic TODO.

Details
=======

Debian bullseye has a few patches on top of libicu:
https://sources.debian.org/patches/icu/67.1-7/

Eyeballing them (we don't have the expertise to confidently judge
whether any of them affect Postgres), the following two stand out:

* https://sources.debian.org/patches/icu/67.1-7/ICU-13786_Fix_addLikelySubtags_minimizeSubtags.patch/
    * This seems to pull in the patch from unicode-org/icu#1140
    * It's a bugfix (not CVE?) for upstream issue https://unicode-org.atlassian.net/browse/ICU-13786
    * Unsure whether it's relevant to Postgres.
    * Explanation of "minimize subtags" functionality fixed in this patch: https://developer.android.com/ndk/reference/group/icu4c#uloc_minimizesubtags
* https://sources.debian.org/patches/icu/67.1-7/locid_operators.patch/
    * The patch doesn't mention why it's been added.
    * A similar patch was committed upstream as commit `2dc5bea9061b4fb05cd03e21b775dd944a0eb81d`, `ICU-21587 Fix memory bug w/ baseName`
    * Looks like this is the fix for CVE-2021-30535
    * Upstream issue tracker link: https://unicode-org.atlassian.net/browse/ICU-21587
    * We should probably have this patch. But there's no proper patch release from upstream.

Solution for integrating these patches: TBD, they're not integrated in this PR.
Ideas:
1. Build the bullseye package in bookworm (easiest way to maintain exact Debian patchstack)
2. Fork libicu on github, apply patches manually, create a release, make Dockerfile download our fork's release.

Refs
====

Co-authored-by: Christian Schwarz <christian@neon.tech>

refs neondatabase/cloud#12648
rahulinux added a commit that referenced this pull request Jun 18, 2024
We had to revert the earlier static linking change due to libicu version
incompatibilities:

- original PR: #7956
- revert PR: #8003

Specifically, the problem manifests for existing projects as error

```
DETAIL:  The collation in the database was created using version 153.120.42, but the operating system provides version 153.14.37.
```

So, this PR reintroduces the original change but with the exact same
libicu version as in Debian `bullseye`, i.e., the libicu version that
we're using today.
This avoids the version incompatibility.


Additional changes made by Christian
====================================
- `hashFiles` can take multiple arguments, use that feature
- validation of the libicu tarball checksum
- parallel build (`-j $(nproc)`) for openssl and libicu

Follow-ups
==========

Debian bullseye has a few patches on top of libicu:
https://sources.debian.org/patches/icu/67.1-7/
We still decide whether we need to include these patches or not.
=> neondatabase/cloud#14527

Eventually, we'll have to figure out an upgrade story for libicu.
That work is tracked in epic
neondatabase/cloud#14525.

The OpenSSL version in this PR is arbitrary.
We should use `1.1.1w` + Debian patches if applicable.
See neondatabase/cloud#14526.

Longer-term:
* neondatabase/cloud#14519
* neondatabase/cloud#14525

Refs
====

Co-authored-by: Christian Schwarz <christian@neon.tech>

refs neondatabase/cloud#12648

---------

Co-authored-by: Rahul Patil <rahul@neon.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-extra-build-macos When placed on a PR, tells the CI to run a build on macOS. No unit tests are run, though.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants