Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto reload TLS resources when loading from local files #12325

Merged

Conversation

zhtaoxiang
Copy link
Contributor

This is a follow up of #12277. That PR makes tls keystore/truststore swappable. This PR auto reload tls keystore/truststore when they are local files.

With this PR, users do not need to restart pinot components when they need to update keystore/truststore (when they are local files). Users only need to override the same keystore/truststore files.

@codecov-commenter
Copy link

codecov-commenter commented Jan 26, 2024

Codecov Report

Attention: 43 lines in your changes are missing coverage. Please review.

Comparison is base (7978d29) 61.62% compared to head (4ebddac) 61.69%.
Report is 9 commits behind head on master.

Files Patch % Lines
...n/java/org/apache/pinot/common/utils/TlsUtils.java 51.47% 25 Missing and 8 partials ⚠️
...pache/pinot/common/utils/grpc/GrpcQueryClient.java 0.00% 7 Missing ⚠️
...che/pinot/core/transport/grpc/GrpcQueryServer.java 0.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #12325      +/-   ##
============================================
+ Coverage     61.62%   61.69%   +0.06%     
+ Complexity     1152     1146       -6     
============================================
  Files          2421     2421              
  Lines        131809   131911     +102     
  Branches      20343    20367      +24     
============================================
+ Hits          81227    81378     +151     
+ Misses        44628    44569      -59     
- Partials       5954     5964      +10     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 61.64% <44.87%> (+0.07%) ⬆️
java-21 61.54% <44.87%> (+0.04%) ⬆️
skip-bytebuffers-false 61.68% <44.87%> (+0.07%) ⬆️
skip-bytebuffers-true 61.51% <44.87%> (+0.02%) ⬆️
temurin 61.69% <44.87%> (+0.06%) ⬆️
unittests 61.68% <44.87%> (+0.06%) ⬆️
unittests1 46.81% <44.87%> (+0.04%) ⬆️
unittests2 27.72% <0.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

String trustStoreType, String trustStorePath, String trustStorePassword,
String sslContextProtocol, SecureRandom secureRandom)
throws IOException, URISyntaxException, InterruptedException {
WatchService watchService = FileSystems.getDefault().newWatchService();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting approach :) I had a question on how we detect the change for the local files and found this code. It's a clever approach 👍 (I thought of keeping modified time or crc)

TlsUtils.registerFile(watchService, watchKeyPathMap, TLS_TRUSTSTORE_FILE_PATH);

// wait for the thread to start
Thread.sleep(1000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 sec may be too long?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updaged to 100ms

// The reloadSslFactoryWhenFileStoreChanges is a blocking call, so we need to create a new thread to run it.
// Creating a new thread to run the reloadSslFactoryWhenFileStoreChanges is costly; however, unless we
// invoke the createAutoRenewedSSLFactoryFromFileStore method crazily, this should not be a problem.
Executors.newSingleThreadExecutor().execute(() -> {
Copy link
Contributor

@snleee snleee Jan 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the usage of createAutoRenewedSSLFactoryFromFileStore() and we may hit this for every single query request? (Since we call it when we initialize GrpcQueryClient. It looks that we will create GrpcQueryClient every time we hit broker query api)

If that's the case, this can hurt the broker (imagine that we get 1000s of queries per sec and we spin up the new thread for each query for renewing ssl). Should we consider to add a background periodic task that monitors the ssl files and make the change if needed?

Let's double check if that's the case and think about the better solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point. Initially I had the same concern.

Since we call it when we initialize GrpcQueryClient. It looks that we will create GrpcQueryClient every time we hit broker query api

However, the constructor of GrpcQueryClient is only used here: https://github.com/apache/pinot/blob/master/pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/GrpcBrokerRequestHandler.java#L153, and the constructor of GrpcQueryServer is only used here https://github.com/apache/pinot/blob/master/pinot-server/src/main/java/org/apache/pinot/server/starter/ServerInstance.java#L159, they will only be created limited number of times. If no new usage pattern added, there will be no problem for the current implementation.

Let's take a step back first. In theory, there are two ways to handle files updates:

  1. Push based solution: the one that's implemented now.
  2. Pull based solution: add a background periodic task that monitors the ssl files and make the change if needed.

In either case, we need to create a task to wait for push event or pull changes. If the function is invoked numerous times, then equal number of tasks need to be added.

To solve the issue, I am thinking about reuse the same SSLFactory instance if key and trust store files are the same for different invokation of the same method. This will solve the problem if not many different key and trust store files are used - which will normally be the case in practice.

WDYT?

Copy link
Contributor

@snleee snleee Jan 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the GrpcQueryServer is only created on server instance creation, how we will be pushing the reloadSslFactoryWhenFileStoreChanges() to be called when there's a key change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed the infinite loop part.

In this case, this code approach looks good to me 👍

Copy link
Contributor

@snleee snleee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

while ((key = watchService.take()) != null) {
for (WatchEvent<?> event : key.pollEvents()) {
Path changedFile = (Path) event.context();
if (watchKeyPathMap.get(key).contains(changedFile)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be a problem if only one of the new keyStore file or trustStore file got updated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally when key store or trust store files are updated, there should be some period that both the new key/CA and old key/CA work for seamless migration (i.e., no interruption of existing connections)

Even if only one of key store and trust store files are updated, as long as they are updated properly, there should be no problem.

@xiangfu0 xiangfu0 merged commit 3ab2834 into apache:master Jan 31, 2024
19 checks passed
@xiangfu0
Copy link
Contributor

Thanks for adding this feature!
Super useful!

suyashpatel98 pushed a commit to suyashpatel98/pinot that referenced this pull request Feb 28, 2024
* auto renew TLS resources when loading from local files

* reduce waiting time in test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants