-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Pinot storage metrics for compressed tar.gz and table size w/o replicas #8358
Conversation
de8c02a
to
20a4f5f
Compare
Codecov Report
@@ Coverage Diff @@
## master #8358 +/- ##
=============================================
- Coverage 70.88% 30.52% -40.37%
=============================================
Files 1655 1643 -12
Lines 86607 86285 -322
Branches 13064 13025 -39
=============================================
- Hits 61389 26335 -35054
- Misses 20982 57583 +36601
+ Partials 4236 2367 -1869
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
20a4f5f
to
af9eb40
Compare
pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/ZKOperator.java
Outdated
Show resolved
Hide resolved
af9eb40
to
5b1eb91
Compare
What use case are we trying to solve here? |
@mcvsubbu segment size can be very useful debugging info (we keep total docs in segment metadata for similar purpose). Currently we only have APIs to read segment size on the servers (uncompressed + extra indexes), but no APIs to read the compressed segment size. The overhead of adding one entry into the segment metadata should be minimal |
I agree that one extra item is minimal in overhead. We add one at a time, though, and we need to be aware that we are expanding it. It is good to know the use case. If this is only for debugging purpose, we can always query the store to find the (compressed) size of the segment. I don't see the need for adding a metric in that case. Am I missing anything? |
@mcvsubbu Besides the debugging purpose, we are also planning to add a gauge for the total compressed data size for each table using the size info in the segment ZK metadata. Reading the size info for one segment through rest API is fine, but querying the size info for the whole table requires sending lots of requests (one per segment) to the deep store, which can potentially cause problems. The gauge metric will be integrated into an existing periodic task to avoid the overhead of reading all segments ZK metadata. |
7b0e7ac
to
530de35
Compare
530de35
to
5e437dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already read the segment ZK metadata in SegmentStatusChecker
and emit metrics for table/segment status. Let's add the new logic into that class, and we can remove the config as it does not add extra overhead
pinot-common/src/main/java/org/apache/pinot/common/metadata/segment/SegmentZKMetadata.java
Outdated
Show resolved
Hide resolved
...ava/org/apache/pinot/controller/api/resources/PinotSegmentUploadDownloadRestletResource.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/ZKOperator.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/ZKOperator.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/util/TableSizeReader.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/util/TableSizeReader.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/util/TableSizeReader.java
Outdated
Show resolved
Hide resolved
4c6336d
to
c31970b
Compare
pinot-common/src/main/java/org/apache/pinot/common/metadata/segment/SegmentZKMetadata.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java
Outdated
Show resolved
Hide resolved
...ntroller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java
Outdated
Show resolved
Hide resolved
...ntroller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/helix/SegmentStatusChecker.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/util/TableSizeReader.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/helix/SegmentStatusChecker.java
Outdated
Show resolved
Hide resolved
32f2f64
to
2c891f7
Compare
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/helix/SegmentStatusChecker.java
Outdated
Show resolved
Hide resolved
ca2b6e0
to
66d61a4
Compare
66d61a4
to
737ce64
Compare
Description
New Pinot table/segment storage metrics:
TableCompressedSize
: For computing this metric, I added a newsizeInBytes
field inSegmentZKMetadata
. Aside for being useful for debugging, this allows us to quantify the size of compressed tar.gz files that is stored in deep store. This is different from the segment size metrics that is reported using the segment sizes taken from the Pinot servers.TableSizePerReplicaOnServer
: This metric provides the table size per replica. This is calculated using he table size reported by servers divided by the number of replicas configured for the table.TableTotalSizeOnServer
: This is the table size reported by servers for the table. This metric should be used instead ofOfflineTableEstimatedSize
.OfflineTableEstimatedSize
is now deprecated.Testing
Upgrade Notes
Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)
backward-incompat
, and complete the section below on Release Notes)Does this PR fix a zero-downtime upgrade introduced earlier?
backward-incompat
, and complete the section below on Release Notes)Does this PR otherwise need attention when creating release notes? Things to consider:
release-notes
and complete the section on Release Notes)Release Notes
segment.size.in.byte
, was added to SegmentZKMetadata which is the size of the compressed segment tar.gz file.TableCompressedSize
: The total size in bytes of the compressed segments per tableTableSizePerReplicaOnServer
: The table size reported by servers divided by the number of replicas configured for the table.TableTotalSizeOnServer
: The total table size reported by servers. This metric should be used instead ofOfflineTableEstimatedSize
.OfflineTableEstimatedSize
is now deprecated.