Performance improvement
Compared with skiplist, reads and writes are 1.8x faster, and scan performance is on par.
Motivation
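Of the five mem table types officially supported by rocksdb, the hash-based structures give fast lookups and the vector-based structure gives fast writes, but they share a further drawback: none of them supports range scan. In practice SkipList is therefore used more often, but SkipList does not perform well under concurrent reads and writes.
A new structure (hash_sorted_vector) is needed that combines the lookup speed of hash with the write speed of vector and also supports range scan.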
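As shown in the figure above, hash_sorted_vector writes every entry to both the hash and the vector. The hash consists of a large number of buckets; each bucket holds a linked list, and locking is per bucket, which reduces the performance cost of locking. sorted_vector consists of a number of sorted vectors plus an unsorted vector at the head; the sorted vectors are read-only, and data is written only to the unsorted vector. A scan reads only from the sorted vectors, using a multi-way merge sort. Each scan sorts the current unsorted vector, adds it to the set of sorted vectors, and creates a new unsorted vector.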
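The description above can be sketched in a few dozen lines of C++. This is a minimal illustration under stated assumptions, not the code in this PR: the class name `HashSortedVector`, its `Put`/`Get`/`Scan` methods, and the single mutex guarding the vector side are all hypothetical, and the sketch ignores RocksDB's `MemTableRep` interface, sequence numbers, and deletions. It only shows the dual write path, per-bucket locking, and the seal-then-merge scan.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <mutex>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; names are illustrative and not RocksDB APIs.
class HashSortedVector {
 public:
  using Entry = std::pair<std::string, std::string>;

  explicit HashSortedVector(std::size_t num_buckets = 1024)
      : buckets_(num_buckets) {
    assert(num_buckets > 0);
  }

  // Write path: the entry goes to both the hash bucket and the unsorted vector.
  void Put(const std::string& key, const std::string& value) {
    Bucket& b = BucketFor(key);
    {
      std::lock_guard<std::mutex> guard(b.mu);  // lock covers one bucket only
      b.chain.emplace_front(key, value);        // newest entry first
    }
    std::lock_guard<std::mutex> guard(vec_mu_);
    unsorted_.emplace_back(key, value);
  }

  // Point lookup touches only the hash side.
  bool Get(const std::string& key, std::string* value) {
    Bucket& b = BucketFor(key);
    std::lock_guard<std::mutex> guard(b.mu);
    for (const Entry& e : b.chain) {
      if (e.first == key) {
        *value = e.second;
        return true;
      }
    }
    return false;
  }

  // Scan: seal the unsorted vector into a sorted run, then merge all runs.
  // Duplicate keys are not collapsed in this sketch.
  std::vector<Entry> Scan() {
    std::lock_guard<std::mutex> guard(vec_mu_);
    if (!unsorted_.empty()) {
      std::stable_sort(unsorted_.begin(), unsorted_.end(),
                       [](const Entry& a, const Entry& b) {
                         return a.first < b.first;
                       });
      sorted_runs_.emplace_back();
      sorted_runs_.back().swap(unsorted_);  // unsorted_ is now a fresh vector
    }

    struct Cursor {
      std::size_t run;
      std::size_t pos;
    };
    // Min-heap on the key each cursor currently points at.
    auto greater = [this](const Cursor& a, const Cursor& b) {
      return sorted_runs_[a.run][a.pos].first > sorted_runs_[b.run][b.pos].first;
    };
    std::priority_queue<Cursor, std::vector<Cursor>, decltype(greater)> heap(greater);
    for (std::size_t r = 0; r < sorted_runs_.size(); ++r) {
      heap.push(Cursor{r, 0});  // sealed runs are never empty
    }

    std::vector<Entry> out;
    while (!heap.empty()) {
      Cursor c = heap.top();
      heap.pop();
      out.push_back(sorted_runs_[c.run][c.pos]);
      if (++c.pos < sorted_runs_[c.run].size()) heap.push(c);
    }
    return out;
  }

 private:
  struct Bucket {
    std::mutex mu;
    std::list<Entry> chain;
  };

  Bucket& BucketFor(const std::string& key) {
    return buckets_[std::hash<std::string>()(key) % buckets_.size()];
  }

  std::vector<Bucket> buckets_;
  std::mutex vec_mu_;  // assumption: one lock for the vector side, for simplicity
  std::vector<Entry> unsorted_;
  std::vector<std::vector<Entry>> sorted_runs_;
};
```

Because `Get` never touches the vectors and `Scan` never touches the buckets, point lookups and range scans do not contend with each other in this sketch; only writers touch both sides.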
Benchmarks
I ran rocksdb's built-in `readrandomwriterandom` and `newiterator` benchmarks in an ubuntu docker container hosted on a macbook pro with an m1 pro chip.
Screenshots below: