
V0.2.0 Release

@pdgetrf pdgetrf released this 03 Oct 22:00
· 1 commit to main since this release
b37c505

This is the V0.2.0 release of the Centaurus Regionless Storage Service (RKV).

In this release, one of the main focuses is improving I/O performance (latency and throughput) when the backend storage instances (e.g. Redis) are geo-distributed across multiple availability zones and/or regions.

Release Features

  • A novel sharding design that improves I/O latency and throughput while maintaining storage load balancing and high availability when storage instances are distributed in multiple data centers (e.g. availability zones and/or regions).

  • Adoption of asynchronous replication for sequential consistency.

  • Strong consistency validated programmatically via the linearizability checker Porcupine (similar to Jepsen).

  • A new in-memory storage type for more effective and cost-efficient development.

  • Improved the "one-key" deployment scripts for multi-region and large-scale testing.

  • Improved throughput by removing server-scope locking.
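As a rough illustration of the zone-aware sharding idea above, replica placement can spread each key's replicas across distinct zones while preferring the least-loaded instance in each zone. This is a hypothetical sketch, not RKV's actual code; the `Instance` type and `placeReplicas` function are assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Instance is a hypothetical storage instance with its zone and current key count.
type Instance struct {
	Addr string
	Zone string
	Load int
}

// placeReplicas picks `replicas` instances for a key: at most one per zone,
// preferring the least-loaded instance in each zone, so capacity stays balanced
// while the replica set survives a zone outage.
func placeReplicas(key string, pool []Instance, replicas int) []*Instance {
	byZone := map[string][]*Instance{}
	for i := range pool {
		byZone[pool[i].Zone] = append(byZone[pool[i].Zone], &pool[i])
	}
	zones := make([]string, 0, len(byZone))
	for z := range byZone {
		zones = append(zones, z)
	}
	sort.Strings(zones)
	// Rotate the zone order by the key's hash so placement spreads across zones.
	h := fnv.New32a()
	h.Write([]byte(key))
	start := int(h.Sum32()) % len(zones)
	var placed []*Instance
	for i := 0; i < len(zones) && len(placed) < replicas; i++ {
		zone := byZone[zones[(start+i)%len(zones)]]
		sort.Slice(zone, func(a, b int) bool { return zone[a].Load < zone[b].Load })
		zone[0].Load++ // track load so the next key prefers a different instance
		placed = append(placed, zone[0])
	}
	return placed
}

func main() {
	pool := []Instance{
		{"r1", "us-east-1a", 0}, {"r2", "us-east-1a", 0},
		{"r3", "us-east-1b", 0}, {"r4", "us-west-2a", 0},
	}
	for _, key := range []string{"user:1", "user:2", "user:3"} {
		for _, inst := range placeReplicas(key, pool, 3) {
			fmt.Printf("%s -> %s (%s)\n", key, inst.Addr, inst.Zone)
		}
	}
}
```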

Performance

The following graph shows the KPIs for this release:

(figure: KPI graph for the V0.2.0 release)

The storage capacity goal of 100 million keys with 3 replicas was achieved using 54 m5a.2xlarge VMs on AWS, distributed across 4 availability zones in 3 regions, with 2 regions on the east coast and 1 on the west coast. With sequential consistency, write latency falls within 20-30 ms, limited by the network latency between regions on the same coast. Meanwhile, read latency falls within the same-region range of less than 10 ms, because reads prefer geographically close-by replicas. Storage load was evenly distributed across all 54 VMs.

With the various bug fixes in this release, the concurrency of RKV has also increased at least 5-fold.

Known Issues

  • A manually set latency threshold is needed for grouping and selecting storage instances. Because network latency varies, this can occasionally leave too few "remote replication hosts" available at RKV startup.
  • In large-scale tests, the YCSB host becomes a scaling bottleneck due to connection-pool limitations.

Looking forward

  • Expand RKV servers from a single region to multiple regions.
  • List-watch capability.
  • Global reads with multi-region RKV, optimized with smart caching.
  • Add integration tests to CI/CD.