feat: make the # of CPU threads configurable and document cpu/memory patterns #2773

westonpace · 2024-08-22T15:38:55Z

This gets rid of all usage of num_cpus (except in one spot to determine the default # of CPU threads) and instead uses ObjectStore::io_parallelism or get_num_compute_intensive_cpus. Admittedly, many of our operations do not split out I/O and compute. For example:

async move {
  let batch = read_batch(...).await?;
  let transformed = transform_batch(batch);
}

However, we can just pick one and clean these up as we go. For now, I made a best-effort guess: most places where we were applying a multiplier to num_cpus now use the I/O count, and other places use the compute count.
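As a rough illustration of the idea (not the code in this PR), the sketch below resolves a compute thread count from an optional override and falls back to the detected core count, then picks either the I/O or the compute limit for a task. The names `resolve_compute_threads`, `Bound`, and `concurrency_for` are hypothetical and exist only for this example; the actual helpers in the PR are `ObjectStore::io_parallelism` and `get_num_compute_intensive_cpus`, whose implementations are not shown here.

```rust
use std::thread;

/// Hypothetical helper: resolve the number of compute threads from an
/// optional override string (for example, the value of an environment
/// variable). Invalid or zero values fall back to the detected core count.
fn resolve_compute_threads(override_value: Option<&str>) -> usize {
    // Detected parallelism; fall back to 1 if it cannot be queried.
    let detected = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    override_value
        .and_then(|raw| raw.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(detected)
}

/// Hypothetical tag for which resource dominates a mixed task.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Bound {
    Io,
    Compute,
}

/// Pick the concurrency limit that matches the dominant resource.
fn concurrency_for(bound: Bound, io_parallelism: usize, compute_threads: usize) -> usize {
    match bound {
        Bound::Io => io_parallelism,
        Bound::Compute => compute_threads,
    }
}
```

A task that mostly waits on reads would call `concurrency_for(Bound::Io, ...)`, while a task dominated by decoding or transformation would use `Bound::Compute`. Tasks that mix both, like the snippet above, have to pick one until they are split.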

codecov-commenter · 2024-08-22T16:05:45Z

Codecov Report

Attention: Patch coverage is 94.25287% with 5 lines in your changes missing coverage. Please review.

Project coverage is 79.30%. Comparing base (eb87bfa) to head (57d076d).
Report is 1 commits behind head on main.

Files	Patch %	Lines
rust/lance-core/src/utils/tokio.rs	50.00%	1 Missing and 1 partial ⚠️
rust/lance-io/src/object_store.rs	71.42%	1 Missing and 1 partial ⚠️
rust/lance/src/index/vector/ivf/v2.rs	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2773      +/-   ##
==========================================
+ Coverage   79.26%   79.30%   +0.03%     
==========================================
  Files         227      227              
  Lines       68245    68283      +38     
  Branches    68245    68283      +38     
==========================================
+ Hits        54096    54153      +57     
+ Misses      11023    11002      -21     
- Partials     3126     3128       +2

Flag	Coverage Δ
unittests	`79.30% <94.25%> (+0.03%)`	⬆️


eddyxu · 2024-08-22T16:53:40Z

rust/lance-file/src/reader.rs

@@ -322,7 +326,7 @@ impl FileReader {
                        .await
                }
            })
-            .buffered(num_cpus::get() * 4)
+            .buffered(self.io_parallelism() as usize)


Can io_parallelism return usize so we don't need to cast all of them?

Good idea. Done.
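To illustrate why returning `usize` removes the casts, here is a minimal sketch with a hypothetical `ExampleStore` type (a stand-in, not the Lance `ObjectStore`). Size and concurrency bounds in Rust APIs, such as `slice::chunks` in the standard library or `StreamExt::buffered` in the `futures` crate, take `usize`, so a `usize` getter can be passed through directly.

```rust
/// Hypothetical stand-in for an object store; not the Lance type.
struct ExampleStore {
    io_parallelism: usize,
}

impl ExampleStore {
    /// Returning usize lets callers use the value as a bound without `as usize`.
    fn io_parallelism(&self) -> usize {
        self.io_parallelism
    }
}

/// Group byte ranges into batches of at most `io_parallelism` requests.
fn plan_batches(store: &ExampleStore, ranges: &[(u64, u64)]) -> Vec<Vec<(u64, u64)>> {
    // `chunks` panics on a chunk size of 0, so clamp to at least 1.
    let limit = store.io_parallelism().max(1);
    ranges.chunks(limit).map(|chunk| chunk.to_vec()).collect()
}
```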

westonpace requested review from wjones127, chebbyChefNEQ and eddyxu August 22, 2024 15:39

github-actions bot added the enhancement and python labels Aug 22, 2024

westonpace requested a review from BubbleCal August 22, 2024 15:39

eddyxu reviewed Aug 22, 2024

wjones127 approved these changes Aug 22, 2024

chebbyChefNEQ approved these changes Aug 22, 2024

westonpace added 4 commits August 25, 2024 07:54

Make the # of CPU threads configurable. Document CPU & Memory patterns

eed30f3

Minor cleanup. Add note for devs

11f1ec6

Change io_parallelism method to return usize instead of u32 to reduce casting

0bfe351

Missed a spot where ? is no longer needed

0516bd8

westonpace force-pushed the feat/configurable-cpu-threads branch from 52a50c1 to 0516bd8 August 25, 2024 15:11

westonpace added 2 commits August 26, 2024 05:51

Make num_threads in CompactionOptions optional

8bb8ab2

Address clippy suggestion

57d076d

westonpace merged commit 1e6ee60 into lancedb:main Aug 26, 2024
22 of 23 checks passed

westonpace mentioned this pull request Aug 28, 2024

Limit parallelism in Dataset.cleanup_old_versions #2805

Open
