Delta housekeeping initial version #101

Open
wants to merge 26 commits into base: master
Changes from 18 commits
Commits
5d7889e
delta housekeeping initial commit
lorenzorubi-db Dec 18, 2023
90bab27
debugging initial version
Dec 18, 2023
94629e0
convert output to pandas
lorenzorubi-db Dec 18, 2023
543f852
debugging -convert output to pandas
lorenzorubi-db Dec 18, 2023
567b303
DeltaHousekeepingActions object and tests
lorenzorubi-db Dec 19, 2023
bded305
added more insights to housekeeping and refactored tests
lorenzorubi-db Dec 21, 2023
cf4ef07
regression and cleanup
Dec 21, 2023
bc303cd
move implementation of map_chunked to a separated branch
lorenzorubi-db Jan 3, 2024
feeafaf
readability, cleanup, follow discoverx patterns
lorenzorubi-db Jan 3, 2024
e8a1b66
debugging on cluster + adding spark session to `DeltaHousekeepingActi…
Jan 3, 2024
e177ef4
simplify scan implementation & remove dependency to BeautifulSoup
lorenzorubi-db Jan 3, 2024
023b02f
faster implementation + unit tests
lorenzorubi-db Jan 5, 2024
c2b028f
cleanup
lorenzorubi-db Jan 5, 2024
0e4c8e5
cleanup and PR comments
lorenzorubi-db Jan 9, 2024
9a9fe6b
proper use of dbwidgets
lorenzorubi-db Jan 12, 2024
6c5ecf2
refactoring apply to return a single dataframe
lorenzorubi-db Jan 28, 2024
5359876
add test datasets for all housekeeping checks + bug fixes
lorenzorubi-db Feb 4, 2024
613a290
Merge branch 'master' into delta-housekeeping-notebooks
lorenzorubi-db Feb 4, 2024
9758a00
fix explain / apply methods
Feb 10, 2024
59760f9
refactoring to control output column names
Feb 10, 2024
a0d434e
refactoring to spark API -intermediate commit
Feb 11, 2024
0abe9a2
tests with DBR -nan's & timestamps
Feb 11, 2024
16e7ec6
failing test + cleanup
Feb 11, 2024
24edacb
cleanup
Feb 11, 2024
1b1de40
cleanup
Feb 11, 2024
aa671a2
remove 'reason' column from the output dfs
lorenzorubi-db Feb 16, 2024
8 changes: 6 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -59,6 +59,11 @@ The properties available in table_info are
 * **Maintenance**
   * [VACUUM all tables](docs/Vacuum.md) ([example notebook](examples/vacuum_multiple_tables.py))
   * Detect tables having too many small files ([example notebook](examples/detect_small_files.py))
+  * Delta housekeeping analysis ([example notebook](examples/exec_delta_housekeeping.py)), which provides:
+    * stats (size of tables and number of files, timestamps of latest OPTIMIZE & VACUUM operations, stats of OPTIMIZE)
+    * recommendations on tables that need to be OPTIMIZED/VACUUM'ed
+    * whether tables are OPTIMIZED/VACUUM'ed often enough
+    * tables that have small files / tables for which ZORDER is not being effective
   * Deep clone a catalog ([example notebook](examples/deep_clone_schema.py))
 * **Governance**
   * PII detection with Presidio ([example notebook](examples/pii_detection_presidio.py))
@@ -91,7 +96,7 @@ from discoverx import DX
 dx = DX(locale="US")
 ```

-You can now run operations across multiple tables.
+You can now run operations across multiple tables.

 ## Available functionality

@@ -128,4 +133,3 @@ After a `with_sql` or `unpivot_string_columns` command, you can apply the follow
Please note that all projects in the /databrickslabs github account are provided for your exploration only, and are not formally supported by Databricks with Service Level Agreements (SLAs). They are provided AS-IS and we do not make any guarantees of any kind. Please do not submit a support ticket relating to any issues arising from the use of these projects.

Any issues discovered through the use of this project should be filed as GitHub Issues on the Repo. They will be reviewed as time permits, but there are no formal SLAs for support.

491 changes: 491 additions & 0 deletions discoverx/delta_housekeeping.py

Large diffs are not rendered by default.

14 changes: 12 additions & 2 deletions discoverx/explorer.py
@@ -1,7 +1,8 @@
 import concurrent.futures
 import copy
 import re
-from typing import Optional, List
+import pandas as pd
+from typing import Optional, List, Callable, Iterable
 from discoverx import logging
 from discoverx.common import helper
 from discoverx.discovery import Discovery
@@ -11,6 +12,7 @@
 from pyspark.sql import DataFrame, SparkSession
 from pyspark.sql.functions import lit
 from discoverx.table_info import InfoFetcher, TableInfo
+from discoverx.delta_housekeeping import DeltaHousekeeping, DeltaHousekeepingActions


 logger = logging.Logging()
@@ -182,7 +184,7 @@ def scan(
         discover.scan(rules=rules, sample_size=sample_size, what_if=what_if)
         return discover

-    def map(self, f) -> list[any]:
+    def map(self, f: Callable) -> list[any]:
         """Runs a function for each table in the data explorer

         Args:
@@ -214,6 +216,14 @@ def map(self, f) -> list[any]:

         return res

+    def delta_housekeeping(self) -> pd.DataFrame:
+        """
+        Gathers stats and recommendations on Delta Housekeeping
+        """
+        dh = DeltaHousekeeping(self._spark)
+        dfs_pd: Iterable[pd.DataFrame] = self.map(dh.scan)
+        return DeltaHousekeepingActions(dfs_pd, spark=self._spark)
+


 class DataExplorerActions:
     def __init__(
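The new `delta_housekeeping` method applies `dh.scan` to every table through `map` and hands the per-table results to `DeltaHousekeepingActions`. Note that the return annotation reads `pd.DataFrame` while the body returns a `DeltaHousekeepingActions` object. The body of `map` is not shown in this diff, so the sketch below only illustrates the general map-then-collect pattern using `concurrent.futures` (which `explorer.py` imports). `map_tables`, `fake_scan`, and `max_workers` are hypothetical names for illustration, not discoverx APIs.

```python
import concurrent.futures
from typing import Any, Callable, Dict, Iterable, List


def map_tables(tables: Iterable[str], f: Callable[[str], Any], max_workers: int = 4) -> List[Any]:
    """Run f once per table name and collect the results in input order."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the inputs
        return list(executor.map(f, tables))


def fake_scan(table: str) -> Dict[str, str]:
    """Hypothetical stand-in for a per-table scan: splits a three-part name."""
    catalog, database, table_name = table.split(".")
    return {"catalog": catalog, "database": database, "tableName": table_name}


results = map_tables(
    ["lorenzorubi.default.click_sales", "lorenzorubi.tpch.orders"],
    fake_scan,
)
```

Each element of `results` plays the role of one per-table result that a wrapper object could later combine into a single report.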
65 changes: 65 additions & 0 deletions examples/exec_delta_housekeeping.py
@@ -0,0 +1,65 @@
# Databricks notebook source
# MAGIC %md
# MAGIC # Run Delta Housekeeping across multiple tables
# MAGIC Analysis that provides stats on Delta tables / recommendations for improvements, including:
# MAGIC - stats (size of tables and number of files, timestamps of latest OPTIMIZE & VACUUM operations, stats of OPTIMIZE)
# MAGIC - recommendations on tables that need to be OPTIMIZED/VACUUM'ed
# MAGIC - whether tables are OPTIMIZED/VACUUM'ed often enough
# MAGIC - tables that have small files / tables for which ZORDER is not being effective
# MAGIC

# COMMAND ----------

# MAGIC %pip install dbl-discoverx

# COMMAND ----------

# MAGIC %md
# MAGIC ### Declare Variables

# COMMAND ----------

dbutils.widgets.text("catalogs", "*", "Catalogs")
dbutils.widgets.text("schemas", "*", "Schemas")
dbutils.widgets.text("tables", "*", "Tables")

# COMMAND ----------

catalogs = dbutils.widgets.get("catalogs")
schemas = dbutils.widgets.get("schemas")
tables = dbutils.widgets.get("tables")
from_table_statement = ".".join([catalogs, schemas, tables])

# COMMAND ----------

from discoverx import DX

dx = DX()

# COMMAND ----------

# DBTITLE 1,Run the discoverx DeltaHousekeeping operation -generates an output object on which you can run operations
output = (
    dx.from_tables(from_table_statement)
    .delta_housekeeping()
)

# COMMAND ----------

# DBTITLE 1,apply() operation generates a spark dataframe with recommendations
result = output.apply()
result.select("catalog", "database", "tableName", "recommendation").display()

# COMMAND ----------

# DBTITLE 1,display() runs apply and displays the full result (including stats per table)
output.display()

# COMMAND ----------

# DBTITLE 1,explain() outputs the DeltaHousekeeping recommendations in HTML format
output.explain()

# COMMAND ----------
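
The notebook joins the three widget values with dots to form the pattern passed to `from_tables`, so the default widget values give `*.*.*`. A minimal sketch of that step as a standalone helper follows. `build_table_pattern` is a hypothetical name, and rejecting empty parts or parts that contain a dot is an assumption added here, not behaviour taken from the notebook.

```python
def build_table_pattern(catalogs: str = "*", schemas: str = "*", tables: str = "*") -> str:
    """Join catalog, schema and table patterns into one three-part pattern."""
    parts = [catalogs.strip(), schemas.strip(), tables.strip()]
    for part in parts:
        # An empty part or a part containing a dot would change the number of levels
        if not part or "." in part:
            raise ValueError(f"invalid pattern part: {part!r}")
    return ".".join(parts)


default_pattern = build_table_pattern()
narrow_pattern = build_table_pattern("lorenzorubi", "default", "click_*")
```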


2 changes: 2 additions & 0 deletions tests/unit/data/delta_housekeeping/dd_click_sales.csv
@@ -0,0 +1,2 @@
catalog,database,tableName,number_of_files,bytes
lorenzorubi,default,click_sales,6,326068799
@@ -0,0 +1,2 @@
catalog,database,tableName,number_of_files,bytes
lorenzorubi,default,housekeeping_summary,1,192917
4 changes: 4 additions & 0 deletions tests/unit/data/delta_housekeeping/dh_click_sales.csv
@@ -0,0 +1,4 @@
catalog,database,tableName,operation,timestamp,min_file_size,p50_file_size,max_file_size,z_order_by
lorenzorubi,default,click_sales,VACUUM END,2023-12-06T16:40:28Z,null,null,null,null
lorenzorubi,default,click_sales,VACUUM END,2023-12-05T01:19:47Z,null,null,null,null
lorenzorubi,default,click_sales,VACUUM END,2023-11-25T04:03:41Z,null,null,null,null
25 changes: 25 additions & 0 deletions tests/unit/data/delta_housekeeping/dh_housekeeping_summary.csv
@@ -0,0 +1,25 @@
catalog,database,tableName,operation,timestamp,min_file_size,p50_file_size,max_file_size,z_order_by
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T05:50:14Z,192917,192917,192917,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T05:21:22Z,184203,184203,184203,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T04:37:19Z,176955,176955,176955,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T04:10:26Z,168560,168560,168560,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T03:11:02Z,161710,161710,161710,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T02:44:41Z,154166,154166,154166,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T02:18:54Z,145990,145990,145990,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T01:42:12Z,137677,137677,137677,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T01:09:19Z,130864,130864,130864,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T00:53:33Z,123702,123702,123702,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T00:43:44Z,118806,118806,118806,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T00:28:00Z,111983,111983,111983,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T00:14:21Z,104790,104790,104790,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T23:47:02Z,97314,97314,97314,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T23:18:17Z,91509,91509,91509,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T22:14:48Z,84152,84152,84152,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T21:57:53Z,76464,76464,76464,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T21:30:49Z,67498,67498,67498,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T21:18:59Z,59412,59412,59412,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T20:30:48Z,51173,51173,51173,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T20:12:59Z,42346,42346,42346,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T19:35:05Z,34463,34463,34463,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T19:30:46Z,28604,28604,28604,[]
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-04T19:06:51Z,8412,17592,17592,[]
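
The expected-result fixtures further down carry `max_optimize_timestamp` and `2nd_optimize_timestamp` columns, which for this table match the two most recent OPTIMIZE rows above. The implementation in `discoverx/delta_housekeeping.py` is not rendered on this page, so the following is only a sketch of how such a pair could be derived from history rows using the standard library. `parse_ts` and `latest_two` are hypothetical helpers, and the sample is the first three fixture rows trimmed to the columns used.

```python
import csv
import io
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# First three rows of dh_housekeeping_summary.csv, trimmed to the columns used here
HISTORY_CSV = """catalog,database,tableName,operation,timestamp
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T05:50:14Z
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T05:21:22Z
lorenzorubi,default,housekeeping_summary,OPTIMIZE,2023-12-05T04:37:19Z
"""


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2023-12-05T05:50:14Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_two(rows: List[Dict[str, str]], operation: str) -> Tuple[Optional[str], Optional[str]]:
    """Return the latest and second-latest timestamps for one operation."""
    stamps = sorted(
        (row["timestamp"] for row in rows if row["operation"] == operation),
        key=parse_ts,
        reverse=True,
    )
    top: List[Optional[str]] = list(stamps[:2])
    top += [None] * (2 - len(top))  # pad when fewer than two operations exist
    return top[0], top[1]


rows = list(csv.DictReader(io.StringIO(HISTORY_CSV)))
max_optimize, second_optimize = latest_two(rows, "OPTIMIZE")
```

For an operation with no history rows, such as `VACUUM END` in this sample, both values come back as `None`, which corresponds to the empty timestamp cells in the fixtures.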
20 changes: 20 additions & 0 deletions tests/unit/data/delta_housekeeping/dhk_pandas_result.csv
@@ -0,0 +1,20 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error
lorenzorubi,default,housekeeping_summary_v3,1,3787,null,null,null,null,null,null,null,null,null
lorenzorubi,maxmind_geo,gold_ipv6,1,4907069,null,null,null,null,null,null,null,null,null
lorenzorubi,default,click_sales,6,326068799,null,null,2023-12-06T16:40:28Z,2023-12-05T01:19:47Z,null,null,null,null,null
lorenzorubi,default,housekeeping_summary,1,192917,2023-12-05T05:50:14Z,2023-12-05T05:21:22Z,null,null,192917,192917,192917,["a"],null
lorenzorubi,default,housekeeping_summary_v2,3,12326,2023-12-18T11:25:35Z,null,null,null,5273,5273,5273,[],null
lorenzorubi,maxmind_geo,raw_locations,1,6933,null,null,null,null,null,null,null,null,null
lorenzorubi,tpch,customer,1,61897021,null,null,null,null,null,null,null,null,null
lorenzorubi,tpch,nation,1,3007,null,null,null,null,null,null,null,null,null
lorenzorubi,maxmind_geo,raw_ipv6,1,1783720,null,null,null,null,null,null,null,null,null
lorenzorubi,maxmind_geo,gold_ipv4,1,7220024,null,null,null,null,null,null,null,null,null
lorenzorubi,dais_dlt_2023,enriched_orders,null,null,null,null,null,null,null,null,null,null,[UNSUPPORTED_VIEW_OPERATION.WITHOUT_SUGGESTION] The view `lorenzorubi`.`dais_dlt_2023`.`enriched_orders` does not support DESCRIBE DETAIL. ; line 2 pos 20
lorenzorubi,default,click_sales_history,1,7710,null,null,null,null,null,null,null,null,null
lorenzorubi,tpch,orders,2406,317120666,null,null,null,null,null,null,null,null,null
lorenzorubi,default,complete_data,6,326060019,null,null,2023-12-06T16:40:36Z,2023-12-05T01:19:25Z,null,null,null,null,null
lorenzorubi,maxmind_geo,raw_ipv4,1,3115269,null,null,null,null,null,null,null,null,null
lorenzorubi,gcp_cost_analysis,sku_prices,1,835,null,null,null,null,null,null,null,null,null
lorenzorubi,dais_dlt_2023,daily_totalorders_by_nation,null,null,null,null,null,null,null,null,null,null,[UNSUPPORTED_VIEW_OPERATION.WITHOUT_SUGGESTION] The view `lorenzorubi`.`dais_dlt_2023`.`daily_totalorders_by_nation` does not support DESCRIBE DETAIL. ; line 2 pos 20
lorenzorubi,gcp_cost_analysis,project_ids,2,1774,null,null,null,null,null,null,null,null,null
lorenzorubi,dais_dlt_2023,daily_2nd_high_orderprice,null,null,null,null,null,null,null,null,null,null,[UNSUPPORTED_VIEW_OPERATION.WITHOUT_SUGGESTION] The view `lorenzorubi`.`dais_dlt_2023`.`daily_2nd_high_orderprice` does not support DESCRIBE DETAIL. ; line 2 pos 20
@@ -0,0 +1,3 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason,rec_vacuum,rec_vacuum_reason,rec_misc,rec_misc_reason
lorenzorubi,default,housekeeping_summary,1.0,192917.0,2023-12-05T05:50:14Z,2023-12-05T05:21:22Z,,,192917.0,192917.0,192917.0,[],,True, | Tables that are not OPTIMIZED often enough | Tables that are OPTIMIZED too often | Tables that are too small to be OPTIMIZED,True,The table has never been VACUUM'ed | | ,False, |
lorenzorubi,default,housekeeping_summary_v2,3.0,12326.0,2023-12-18T11:25:35Z,,,,5273.0,5273.0,5273.0,[],,True, | Tables that are not OPTIMIZED often enough | | Tables that are too small to be OPTIMIZED,True,The table has never been VACUUM'ed | | ,True,Tables that need more analysis -small_files |
2 changes: 2 additions & 0 deletions tests/unit/data/delta_housekeeping/expected_need_analysis.csv
@@ -0,0 +1,2 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason,rec_vacuum,rec_vacuum_reason,rec_misc,rec_misc_reason
lorenzorubi,default,housekeeping_summary_v2,3.0,12326.0,2023-12-18T11:25:35Z,,,,5273.0,5273.0,5273.0,[],,True, | Tables that are not OPTIMIZED often enough | | ,True,The table has never been VACUUM'ed | | ,True,Tables that need more analysis -small_files
4 changes: 4 additions & 0 deletions tests/unit/data/delta_housekeeping/expected_need_optimize.csv
@@ -0,0 +1,4 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason
lorenzorubi,default,click_sales,6.0,326068799.0,,,2023-12-06T16:40:28Z,2023-12-05T01:19:47Z,,,,,,True,The table has not been OPTIMIZED and would benefit from it
lorenzorubi,tpch,orders,2406.0,317120666.0,,,,,,,,,,True,The table has not been OPTIMIZED and would benefit from it
lorenzorubi,default,complete_data,6.0,326060019.0,,,2023-12-06T16:40:36Z,2023-12-05T01:19:25Z,,,,,,True,The table has not been OPTIMIZED and would benefit from it
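
The three rows in this fixture are all tables with no OPTIMIZE timestamp and more than 300 MB of data, while smaller never-optimized tables from `dhk_pandas_result.csv` (for example `tpch.customer` at 61897021 bytes) are absent. A rule consistent with that is sketched below. The actual rule lives in `discoverx/delta_housekeeping.py`, which is not rendered here, so `needs_first_optimize` and the 128 MiB `MIN_TABLE_SIZE_OPTIMIZE` threshold are assumptions chosen only to agree with the fixture.

```python
from typing import Optional

# Hypothetical threshold in bytes; any value between the largest unflagged
# and the smallest flagged table in the fixture would reproduce it
MIN_TABLE_SIZE_OPTIMIZE = 128 * 1024 * 1024


def needs_first_optimize(
    table_bytes: Optional[int],
    max_optimize_timestamp: Optional[str],
    min_size: int = MIN_TABLE_SIZE_OPTIMIZE,
) -> bool:
    """Flag tables that were never OPTIMIZED and are large enough to benefit."""
    if table_bytes is None:
        # No size information (for example a view): nothing to recommend
        return False
    return max_optimize_timestamp is None and table_bytes > min_size


flag_click_sales = needs_first_optimize(326068799, None)
flag_customer = needs_first_optimize(61897021, None)
flag_summary = needs_first_optimize(192917, "2023-12-05T05:50:14Z")
```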
18 changes: 18 additions & 0 deletions tests/unit/data/delta_housekeeping/expected_need_vacuum.csv
@@ -0,0 +1,18 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason,rec_vacuum,rec_vacuum_reason
lorenzorubi,default,housekeeping_summary_v3,1.0,3787.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,maxmind_geo,gold_ipv6,1.0,4907069.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,default,housekeeping_summary,1.0,192917.0,2023-12-05T05:50:14Z,2023-12-05T05:21:22Z,,,192917.0,192917.0,192917.0,[],,False,,True,The table has never been VACUUM'ed
lorenzorubi,default,housekeeping_summary_v2,3.0,12326.0,2023-12-18T11:25:35Z,,,,5273.0,5273.0,5273.0,[],,False,,True,The table has never been VACUUM'ed
lorenzorubi,maxmind_geo,raw_locations,1.0,6933.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,tpch,customer,1.0,61897021.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,tpch,nation,1.0,3007.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,maxmind_geo,raw_ipv6,1.0,1783720.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,maxmind_geo,gold_ipv4,1.0,7220024.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,dais_dlt_2023,enriched_orders,,,,,,,,,,,[UNSUPPORTED_VIEW_OPERATION.WITHOUT_SUGGESTION] The view `lorenzorubi`.`dais_dlt_2023`.`enriched_orders` does not support DESCRIBE DETAIL. ; line 2 pos 20,False,,True,The table has never been VACUUM'ed
lorenzorubi,default,click_sales_history,1.0,7710.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,tpch,orders,2406.0,317120666.0,,,,,,,,,,True,The table has not been OPTIMIZED and would benefit from it,True,The table has never been VACUUM'ed
lorenzorubi,maxmind_geo,raw_ipv4,1.0,3115269.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,gcp_cost_analysis,sku_prices,1.0,835.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,dais_dlt_2023,daily_totalorders_by_nation,,,,,,,,,,,[UNSUPPORTED_VIEW_OPERATION.WITHOUT_SUGGESTION] The view `lorenzorubi`.`dais_dlt_2023`.`daily_totalorders_by_nation` does not support DESCRIBE DETAIL. ; line 2 pos 20,False,,True,The table has never been VACUUM'ed
lorenzorubi,gcp_cost_analysis,project_ids,2.0,1774.0,,,,,,,,,,False,,True,The table has never been VACUUM'ed
lorenzorubi,dais_dlt_2023,daily_2nd_high_orderprice,,,,,,,,,,,[UNSUPPORTED_VIEW_OPERATION.WITHOUT_SUGGESTION] The view `lorenzorubi`.`dais_dlt_2023`.`daily_2nd_high_orderprice` does not support DESCRIBE DETAIL. ; line 2 pos 20,False,,True,The table has never been VACUUM'ed
@@ -0,0 +1,3 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason,rec_vacuum,rec_vacuum_reason
lorenzorubi,default,housekeeping_summary,1.0,192917.0,2023-12-05T05:50:14Z,2023-12-05T05:21:22Z,,,192917.0,192917.0,192917.0,[],,True, | Tables that are not OPTIMIZED often enough,True,The table has never been VACUUM'ed
lorenzorubi,default,housekeeping_summary_v2,3.0,12326.0,2023-12-18T11:25:35Z,,,,5273.0,5273.0,5273.0,[],,True, | Tables that are not OPTIMIZED often enough,True,The table has never been VACUUM'ed
@@ -0,0 +1,3 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason,rec_vacuum,rec_vacuum_reason
lorenzorubi,default,click_sales,6.0,326068799.0,,,2023-12-06T16:40:28Z,2023-12-05T01:19:47Z,,,,,,True,The table has not been OPTIMIZED and would benefit from it | ,True, | Tables that are not VACUUM'ed often enough
lorenzorubi,default,complete_data,6.0,326060019.0,,,2023-12-06T16:40:36Z,2023-12-05T01:19:25Z,,,,,,True,The table has not been OPTIMIZED and would benefit from it | ,True, | Tables that are not VACUUM'ed often enough
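
The fixtures use two VACUUM-related reason strings: "The table has never been VACUUM'ed" for rows without a VACUUM timestamp, and "Tables that are not VACUUM'ed often enough" for rows whose last VACUUM is some time in the past. A sketch of how those two cases could be told apart follows. The 31-day `MIN_DAYS_NOT_VACUUMED` value, the explicit `now` argument, and the `vacuum_reason` helper are assumptions for illustration and are not taken from the PR.

```python
from datetime import datetime, timezone
from typing import Optional

MIN_DAYS_NOT_VACUUMED = 31  # hypothetical threshold in days


def vacuum_reason(
    max_vacuum_timestamp: Optional[str],
    now: datetime,
    min_days: int = MIN_DAYS_NOT_VACUUMED,
) -> Optional[str]:
    """Return a reason string when VACUUM is recommended, otherwise None."""
    if max_vacuum_timestamp is None:
        return "The table has never been VACUUM'ed"
    last = datetime.fromisoformat(max_vacuum_timestamp.replace("Z", "+00:00"))
    if (now - last).days > min_days:
        return "Tables that are not VACUUM'ed often enough"
    return None


# `now` must be timezone-aware because the parsed timestamps are in UTC
reference_time = datetime(2024, 2, 1, tzinfo=timezone.utc)
reason_click_sales = vacuum_reason("2023-12-06T16:40:28Z", reference_time)
reason_never = vacuum_reason(None, reference_time)
```

Passing `now` explicitly rather than reading the clock inside the function keeps the rule deterministic, which is convenient for fixture-based tests like the ones in this PR.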
@@ -0,0 +1,2 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason,rec_vacuum,rec_vacuum_reason,rec_misc,rec_misc_reason
lorenzorubi,default,housekeeping_summary,1.0,192917.0,2023-12-05T05:50:14Z,2023-12-05T05:21:22Z,,,192917.0,192917.0,192917.0,[],,True, | Tables that are not OPTIMIZED often enough | Tables that are OPTIMIZED too often | ,True,The table has never been VACUUM'ed | | ,False, |
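
In this fixture `housekeeping_summary` is flagged as "OPTIMIZED too often", and its two most recent OPTIMIZE timestamps are under half an hour apart. `housekeeping_summary_v2`, which has only one OPTIMIZE timestamp, does not carry that reason in the other fixtures. One way to express such a check is sketched below. The two-day `min_gap` default and the `optimized_too_often` helper are assumptions and are not confirmed by the diff.

```python
from datetime import datetime, timedelta
from typing import Optional


def optimized_too_often(
    max_optimize_timestamp: Optional[str],
    second_optimize_timestamp: Optional[str],
    min_gap: timedelta = timedelta(days=2),  # hypothetical minimum spacing
) -> bool:
    """Flag tables whose two latest OPTIMIZE runs are closer than min_gap."""
    if max_optimize_timestamp is None or second_optimize_timestamp is None:
        # Fewer than two OPTIMIZE runs: frequency cannot be judged
        return False
    latest = datetime.fromisoformat(max_optimize_timestamp.replace("Z", "+00:00"))
    previous = datetime.fromisoformat(second_optimize_timestamp.replace("Z", "+00:00"))
    return latest - previous < min_gap


too_often_summary = optimized_too_often("2023-12-05T05:50:14Z", "2023-12-05T05:21:22Z")
too_often_v2 = optimized_too_often("2023-12-18T11:25:35Z", None)
```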
@@ -0,0 +1,2 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by
lorenzorubi,default,click_sales,6,326068799,,,2023-12-06T16:40:28Z,2023-12-05T01:19:47Z,,,,
@@ -0,0 +1,2 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by
lorenzorubi,default,housekeeping_summary,1,192917,2023-12-05T05:50:14Z,2023-12-05T05:21:22Z,,,192917,192917,192917,[]
@@ -0,0 +1,3 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason,rec_vacuum,rec_vacuum_reason,rec_misc,rec_misc_reason
lorenzorubi,default,click_sales,6.0,326068799.0,,,2023-12-06T16:40:28Z,2023-12-05T01:19:47Z,,,,,,True,The table has not been OPTIMIZED and would benefit from it | | | ,True, | Tables that are not VACUUM'ed often enough | Tables that are VACUUM'ed too often,False, |
lorenzorubi,default,complete_data,6.0,326060019.0,,,2023-12-06T16:40:36Z,2023-12-05T01:19:25Z,,,,,,True,The table has not been OPTIMIZED and would benefit from it | | | ,True, | Tables that are not VACUUM'ed often enough | Tables that are VACUUM'ed too often,False, |
@@ -0,0 +1,2 @@
catalog,database,tableName,number_of_files,bytes,max_optimize_timestamp,2nd_optimize_timestamp,max_vacuum_timestamp,2nd_vacuum_timestamp,min_file_size,p50_file_size,max_file_size,z_order_by,error,rec_optimize,rec_optimize_reason,rec_vacuum,rec_vacuum_reason,rec_misc,rec_misc_reason
lorenzorubi,default,housekeeping_summary,1.0,192917.0,2023-12-05T05:50:14Z,2023-12-05T05:21:22Z,,,192917.0,192917.0,192917.0,"[""a""]",,True, | Tables that are not OPTIMIZED often enough | Tables that are OPTIMIZED too often | Tables that are too small to be OPTIMIZED,True,The table has never been VACUUM'ed | | ,True, | Tables for which ZORDER is not being effective