
perf: use spdk mempool per-core cache for io objects pool #1612

Merged · 1 commit merged into openebs:develop on Mar 26, 2024

Conversation

@dsharma-dc (Contributor):

No description provided.

Signed-off-by: Diwakar Sharma <diwakar.sharma@datacore.com>
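
The PR has no written description, so for context: the title refers to SPDK's per-core mempool cache, enabled by passing a non-zero cache_size when the io-object pool is created. Below is a minimal sketch of the API involved; the actual io-engine call site, pool name, and sizing are assumptions for illustration.

```c
/* Sketch only: illustrates the SPDK mempool API the PR title refers to.
 * The pool name, count, and object size here are hypothetical. */
#include <spdk/env.h>

struct spdk_mempool *
create_io_obj_pool(size_t count, size_t obj_size)
{
    /* A non-zero cache_size makes the mempool keep a per-core cache of free
     * objects, so most get/put operations avoid touching the shared ring.
     * SPDK_MEMPOOL_DEFAULT_CACHE_SIZE asks SPDK to pick its default size
     * (the discussion below talks about a 512-object cache). */
    return spdk_mempool_create("io_obj_pool",
                               count,
                               obj_size,
                               SPDK_MEMPOOL_DEFAULT_CACHE_SIZE,
                               SPDK_ENV_SOCKET_ID_ANY);
}
```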
@dsharma-dc (Contributor, Author):

With a simple fio run locally against a malloc'ed pool (4k block size, 8 fio jobs, iodepth 32, randwrite workload), I see throughput improvements as shown below. This particular snippet shows a stark improvement; the gap isn't always this large, but across a number of runs the cached version is mostly better. (A representative fio invocation follows the results.)

With per-core cache:
Jobs: 8 (f=8): [w(8)][79.2%][w=834MiB/s][w=213k IOPS][eta 00m:25s]
Jobs: 8 (f=8): [w(8)][89.9%][w=787MiB/s][w=201k IOPS][eta 00m:12s]

Vanilla:
Jobs: 8 (f=8): [w(8)][76.7%][w=576MiB/s][w=148k IOPS][eta 00m:28s]
Jobs: 8 (f=8): [w(8)][89.9%][w=537MiB/s][w=137k IOPS][eta 00m:12s]
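
For reference, a representative fio invocation matching the parameters above (a sketch: the ioengine, direct-I/O flag, 60s runtime, and target path are assumptions, not taken from the comment):

```bash
# Hypothetical target path; the actual device under test is not given above.
fio --name=randw --filename=/dev/nbd0 \
    --rw=randwrite --bs=4k --numjobs=8 --iodepth=32 \
    --ioengine=libaio --direct=1 --time_based --runtime=60 --group_reporting
```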

@tiagolobocastro (Contributor):

Hmm, I see slightly slower casperf performance with a single-core null device. For example:
Total : 10909417 IO/s 5326: MB/s
Total : 11440796 IO/s 5586: MB/s
I haven't checked multi-core yet; I think I've got a patch somewhere to test that.

@dsharma-dc (Contributor, Author):

> Hmm, I see slightly slower casperf performance with a single-core null device […]

I wouldn't expect this to show improvements with a single core, because the caching is unnecessary in the single-core case; since the cache holds only 512 objects, there will be more cache misses. With multiple cores it will help by keeping threads from dipping their hands into the common pool and contending with each other.
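
To make that argument concrete, here is a minimal conceptual model of a per-core cache in front of a shared pool (illustration only, not the actual SPDK/DPDK implementation; all names are hypothetical, and a mutex stands in for the real lock-free shared ring):

```c
#include <stddef.h>
#include <pthread.h>

#define CACHE_SIZE 512 /* per-core cache capacity, matching the 512 objects above */

struct shared_pool {
    pthread_mutex_t lock;  /* stand-in for the contended shared structure */
    void          **objs;
    size_t          count;
};

struct core_cache {
    void  *objs[CACHE_SIZE];
    size_t count;
};

/* Allocate: serve from the local cache when possible; only on a miss does a
 * core touch the shared pool and contend with the other cores. */
void *pool_get(struct shared_pool *p, struct core_cache *c)
{
    if (c->count > 0)
        return c->objs[--c->count];  /* fast path: core-local, no locking */

    void *obj = NULL;
    pthread_mutex_lock(&p->lock);    /* slow path: dip into the common pool */
    if (p->count > 0)
        obj = p->objs[--p->count];
    pthread_mutex_unlock(&p->lock);
    return obj;
}

/* Free: return to the local cache; spill to the shared pool only when full.
 * The quicker objects come back, the likelier the next pool_get() hits the
 * fast path. */
void pool_put(struct shared_pool *p, struct core_cache *c, void *obj)
{
    if (c->count < CACHE_SIZE) {
        c->objs[c->count++] = obj;   /* fast path */
        return;
    }
    pthread_mutex_lock(&p->lock);
    p->objs[p->count++] = obj;       /* spill back to the common pool */
    pthread_mutex_unlock(&p->lock);
}
```

On a single core there is no contention to avoid, so the extra layer only adds bookkeeping, which is consistent with the flat or slightly worse single-core numbers reported here.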

@tiagolobocastro (Contributor):

> I wouldn't expect this to show improvements with a single core, because the caching is unnecessary in the single-core case […]

Yes, but it seems it's decreasing single-core performance. Or am I missing something?

@dsharma-dc (Contributor, Author) commented on Mar 25, 2024:

> Yes, but it seems it's decreasing single-core performance. Or am I missing something?

Hmm, I'm unclear on what our most reliable benchmark is. I ran the same test as earlier, now on a single-core io-engine instance (3 runs), and I see this:

With cache:
WRITE: bw=417MiB/s (438MB/s), IOPS ~107k, 48.6MiB/s-56.0MiB/s (51.0MB/s-58.7MB/s), io=24.5GiB (26.3GB), run=60001-60002msec
WRITE: bw=418MiB/s (438MB/s), IOPS ~108k, 49.1MiB/s-57.6MiB/s (51.5MB/s-60.4MB/s), io=24.5GiB (26.3GB), run=60001-60001msec
WRITE: bw=423MiB/s (443MB/s), IOPS ~108k, 45.7MiB/s-60.1MiB/s (48.0MB/s-63.0MB/s), io=24.8GiB (26.6GB), run=60001-60002msec

Vanilla:
WRITE: bw=404MiB/s (424MB/s), IOPS ~103k, 46.3MiB/s-53.9MiB/s (48.6MB/s-56.6MB/s), io=23.7GiB (25.4GB), run=60001-60002msec
WRITE: bw=410MiB/s (430MB/s), IOPS ~105k, 46.1MiB/s-57.4MiB/s (48.4MB/s-60.2MB/s), io=24.0GiB (25.8GB), run=60001-60002msec
WRITE: bw=411MiB/s (431MB/s), IOPS ~105k, 47.9MiB/s-53.8MiB/s (50.2MB/s-56.5MB/s), io=24.1GiB (25.9GB), run=60001-60001msec

@dsharma-dc (Contributor, Author):

Also, theoretically I would think that read workloads should see more improvement: reads generally have a shorter path length, which means cached objects are returned to the cache sooner, and hence there is less chance of dipping into the common pool.

@tiagolobocastro (Contributor):

With fio I seem to get consistent results for multi-core: always ~10k IOPS more with the cache, at a cost of 3x2MiB hugepages (4-core config), so the tradeoff seems worth it!
Single core with fio seems a little more volatile, sometimes better, sometimes worse, so perhaps it was just noise after all.
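
For reference, a back-of-the-envelope check on that memory cost (an assumption: the extra hugepages cover the additional elements reserved so the pool stays usable while objects sit in the per-core caches):

$$
\frac{3 \times 2\,\mathrm{MiB}}{4\ \text{cores} \times 512\ \text{objects/core}} = \frac{6\,\mathrm{MiB}}{2048\ \text{objects}} \approx 3\,\mathrm{KiB}\ \text{per cached object}
$$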

@dsharma-dc (Contributor, Author):

bors merge

@bors-openebs-mayastor (bot):

Build succeeded:

bors-openebs-mayastor (bot) merged commit bf6450d into openebs:develop on Mar 26, 2024. 4 checks passed.