`map_chunked` implementation #99

lorenzorubi-db · 2024-01-03T18:03:11Z

Adds a new `DataExplorer` method, `map_chunked`, as an alternative to `map`:

- `map` processes the tables one by one
- `map_chunked` processes the tables in chunks of size `tables_per_chunk`
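
To make the difference concrete, here is a minimal sketch of the two behaviours on a plain list of table names. The names `map_one_by_one` and `map_in_chunks` are hypothetical stand-ins, not the `DataExplorer` methods, and it is an assumption that the chunked variant hands each chunk to `f` as a list.

```python
from typing import Any, Callable, List


def map_one_by_one(tables: List[str], f: Callable[[str], Any]) -> List[Any]:
    """Hypothetical stand-in for map: call f once per table."""
    return [f(table) for table in tables]


def map_in_chunks(
    tables: List[str], f: Callable[[List[str]], Any], tables_per_chunk: int
) -> List[Any]:
    """Hypothetical stand-in for map_chunked: call f once per chunk of tables."""
    if tables_per_chunk < 1:
        raise ValueError("tables_per_chunk must be at least 1")
    return [
        f(tables[start : start + tables_per_chunk])
        for start in range(0, len(tables), tables_per_chunk)
    ]


tables = ["t1", "t2", "t3", "t4", "t5"]
per_table = map_one_by_one(tables, len)    # five calls, one per table name
per_chunk = map_in_chunks(tables, len, 2)  # three calls: two full chunks and one remainder
```

With five tables and `tables_per_chunk=2`, `f` is called three times instead of five, and the last chunk holds the single leftover table.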

nfx · 2024-02-02T13:50:24Z

discoverx/explorer.py

@@ -197,6 +198,39 @@ def map(self, f) -> list[any]:

        return res

+    def map_chunked(self, f: Callable, tables_per_chunk: int, **kwargs) -> list[any]:


Suggested change

-    def map_chunked(self, f: Callable, tables_per_chunk: int, **kwargs) -> list[any]:
+    def map_chunked(self, f: Callable, tables_per_chunk: int, **kwargs) -> list[Any]:

`any` is a built-in function, not a type; the return annotation should use `typing.Any`.
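
A short illustration of the point, as a sketch only: `apply_all` is a hypothetical helper, not code from this PR, and `typing.List` is used so the annotation also works on older Python versions.

```python
from typing import Any, Callable, List

# The lowercase built-in is a function that tests truthiness of an iterable.
builtin_result = any([False, True])
builtin_is_not_a_type = not isinstance(any, type)


def apply_all(f: Callable[[str], Any], items: List[str]) -> List[Any]:
    """Hypothetical helper annotated with typing.Any instead of the built-in any."""
    return [f(item) for item in items]


upper = apply_all(str.upper, ["a", "b"])
```

On recent Python versions `list[any]` does not fail at runtime, which is why the mistake is easy to miss; static type checkers are what reject it.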

nfx · 2024-02-02T13:51:20Z

setup.py

@@ -34,6 +34,7 @@
    "delta-spark>=2.2.0",
    "pandas<2.0.0",  # From 2.0.0 onwards, pandas does not support iteritems() anymore, spark.createDataFrame will fail
    "numpy<1.24",  # From 1.24 onwards, module 'numpy' has no attribute 'bool'.
+    "more_itertools",


Create an LPP ticket for this; otherwise re-implement the single function you need. Don't add a whole library for the sake of one function.
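
For reference, a chunking helper can be written with the standard library alone. This is a sketch under the assumption that only a `chunked`-style function was needed from `more_itertools`; the name and signature here are illustrative and not necessarily what the PR ended up with.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(iterable: Iterable[T], n: int) -> Iterator[List[T]]:
    """Yield successive lists of at most n items from iterable."""
    if n < 1:
        raise ValueError("n must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice consumes up to n items from the shared iterator on each pass.
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


chunks = list(chunked(range(5), 2))
```

Because it works on any iterable and yields lazily, the helper does not need the full list of tables in memory before chunking starts.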

lorenzorubi-db added 2 commits January 3, 2024 10:47

map_chunked initial implementation

6ebe982

added tests for map_chunked

91355c2

lorenzorubi-db mentioned this pull request Jan 3, 2024

Delta housekeeping notebooks #95

Closed

nfx requested changes Feb 2, 2024

do not use more_itertools + improve function hints

b45c4ed

lorenzorubi-db requested a review from nfx February 3, 2024 20:58
