Merge pull request #4 from YumingxuanGuo/development/lsm-tree
Development/lsm tree
YumingxuanGuo committed Mar 17, 2023
2 parents 50fafdd + 0e276d8 commit 1e47dfd
Showing 12 changed files with 3,807 additions and 118 deletions.
502 changes: 495 additions & 7 deletions Cargo.lock

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions Cargo.toml
@@ -8,16 +8,22 @@ edition = "2021"
[dependencies]
bincode = "~1.3.3"
bytes = "1.4.0"
crossbeam-epoch = "0.9"
crossbeam-skiplist = "0.1"
config = "0.13.3"
futures = "~0.3.15"
futures-util = "~0.3.15"
log = "~0.4.14"
moka = "0.10.0"
parking_lot = "0.12"
ouroboros = "0.15"
rand = { version = "0.8.5", features = ["small_rng"] }
regex = "1.5.4"
rustyline = "11.0.0"
rustyline-derive = "0.8.0"
serde = "~1.0.126"
serde_derive = "~1.0.126"
tempfile = "3"
tokio = { version = "1.26.0", features = ["macros", "rt", "rt-multi-thread", "net", "io-util", "time", "sync"] }
tokio-serde = { version = "~0.8", features = ["bincode"] }
tokio-stream = { version = "~0.1.6", features = ["net"]}
49 changes: 43 additions & 6 deletions README.md
@@ -1,22 +1,59 @@
# FeatherDB

Version: 0.1.0
Version: 0.2.0

## Introduction

FeatherDB is an in-memory, non-persistent, single-threaded, non-transactional, non-relational, centralized database.
FeatherDB is an on-disk, persistent, concurrent, unreliable, non-transactional, non-relational, centralized database.

## What's New
In the most recent release, FeatherDB evolved from a purely in-memory store into a disk-based one with persistence.
We chose an LSM-tree as the main key-value storage engine for its strong write performance.

Five methods are provided as the interface:
```Rust
/// Sets a value for a key, replacing the existing value if any.
fn set(&self, key: &[u8], value: Vec<u8>) -> Result<()>;

/// Gets a value for a key, if it exists.
fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>>;

/// Deletes a key, doing nothing if it does not exist.
fn delete(&self, key: &[u8]) -> Result<()>;

/// Iterates over an ordered range of key/value pairs.
fn scan(&self, range: Range) -> Result<KvScan>;

/// Flushes any buffered data to the underlying storage medium.
fn flush(&self) -> Result<()>;
```
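
As a rough illustration of the semantics documented above, the following sketch implements `set`, `get`, and `delete` on a toy store. The `Store` trait, the `ToyStore` type, the `String` error type, and the `demo` function are all assumptions made for this example; the internal `RwLock<BTreeMap>` is only a std-only stand-in and is not how FeatherDB stores data.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

/// Stand-in result type (assumption); the real crate defines its own `Result`.
type Result<T> = std::result::Result<T, String>;

/// Hypothetical trait grouping three of the five interface methods.
trait Store {
    fn set(&self, key: &[u8], value: Vec<u8>) -> Result<()>;
    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>>;
    fn delete(&self, key: &[u8]) -> Result<()>;
}

/// Toy in-memory store: an ordered map behind an internal lock.
#[derive(Default)]
struct ToyStore {
    map: RwLock<BTreeMap<Vec<u8>, Vec<u8>>>,
}

impl Store for ToyStore {
    fn set(&self, key: &[u8], value: Vec<u8>) -> Result<()> {
        let mut map = self.map.write().map_err(|e| e.to_string())?;
        map.insert(key.to_vec(), value); // replaces any existing value
        Ok(())
    }

    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>> {
        let map = self.map.read().map_err(|e| e.to_string())?;
        Ok(map.get(key).cloned())
    }

    fn delete(&self, key: &[u8]) -> Result<()> {
        let mut map = self.map.write().map_err(|e| e.to_string())?;
        map.remove(key); // does nothing if the key is absent
        Ok(())
    }
}

/// Returns the value of "k" before and after deleting it.
fn demo() -> Result<(Option<Vec<u8>>, Option<Vec<u8>>)> {
    let store = ToyStore::default();
    store.set(b"k", b"v1".to_vec())?;
    store.set(b"k", b"v2".to_vec())?; // overwrites "v1"
    let before = store.get(b"k")?;
    store.delete(b"k")?;
    store.delete(b"missing")?; // deleting an absent key is not an error
    let after = store.get(b"k")?;
    Ok((before, after))
}
```

Because every method takes `&self`, callers never need a mutable handle to the store; any synchronization is an internal detail of the implementation.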

An LSM-tree contains two different data structures: the memtable for in-memory data, and the sstable (sorted string table) for on-disk data.
Once data are flushed to disk as sstables, they become immutable, so mutations are only performed on the memtables.
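
The split between a mutable memtable and immutable sstables can be sketched as follows. This is a minimal model under stated assumptions, not FeatherDB's code: `ToyLsm`, `Entry`, and `demo` are hypothetical names, the "sstables" are sorted in-memory vectors rather than files, and the methods take `&mut self` for brevity even though the real interface takes `&self`.

```rust
use std::collections::BTreeMap;

/// `None` marks a deleted key (a tombstone).
type Entry = Option<Vec<u8>>;

/// Toy LSM store: one mutable memtable plus immutable sorted runs.
#[derive(Default)]
struct ToyLsm {
    memtable: BTreeMap<Vec<u8>, Entry>,
    /// In-memory stand-ins for on-disk sstables, oldest first.
    sstables: Vec<Vec<(Vec<u8>, Entry)>>,
}

impl ToyLsm {
    fn set(&mut self, key: &[u8], value: Vec<u8>) {
        self.memtable.insert(key.to_vec(), Some(value));
    }

    fn delete(&mut self, key: &[u8]) {
        // Flushed runs cannot be edited, so record a tombstone instead.
        self.memtable.insert(key.to_vec(), None);
    }

    /// Freezes the memtable into a new sorted run that is never modified again.
    fn flush(&mut self) {
        if self.memtable.is_empty() {
            return;
        }
        let run: Vec<_> = std::mem::take(&mut self.memtable).into_iter().collect();
        self.sstables.push(run);
    }

    /// Checks the memtable first, then the runs from newest to oldest.
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        if let Some(entry) = self.memtable.get(key) {
            return entry.clone();
        }
        for run in self.sstables.iter().rev() {
            if let Ok(i) = run.binary_search_by(|(k, _)| k.as_slice().cmp(key)) {
                return run[i].1.clone();
            }
        }
        None
    }
}

/// Writes two keys, flushes, then updates one and deletes the other.
fn demo() -> (Option<Vec<u8>>, Option<Vec<u8>>, usize) {
    let mut db = ToyLsm::default();
    db.set(b"a", b"1".to_vec());
    db.set(b"b", b"2".to_vec());
    db.flush(); // "a" and "b" now live in an immutable run
    db.set(b"a", b"3".to_vec()); // newer value shadows the flushed one
    db.delete(b"b"); // tombstone shadows the flushed value
    (db.get(b"a"), db.get(b"b"), db.sstables.len())
}
```

Reads consult the newest data first, which is why an update or a tombstone in the memtable hides the older entry in a flushed run without that run being touched.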

We used a lock-free skiplist as the underlying implementation of the memtable, ensuring thread safety under concurrent operations.
As a plus, all the methods in the interface take immutable references, allowing us to access the storage through a plain `Arc<LsmTree>` instead of `Arc<RwLock<LsmTree>>`.
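
The sharing pattern described here might look like the sketch below. `ToyTree` and `concurrent_writes` are hypothetical names, and the `Mutex<BTreeMap>` inside `ToyTree` is only a std-only substitute for the lock-free skiplist; the point is that callers hold a bare `Arc` and call `&self` methods from several threads without wrapping the tree in an outer lock.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Toy stand-in for `LsmTree`. The real memtable is a lock-free skiplist;
/// a `Mutex<BTreeMap>` is used here only to keep the sketch std-only.
#[derive(Default)]
struct ToyTree {
    memtable: Mutex<BTreeMap<Vec<u8>, Vec<u8>>>,
}

impl ToyTree {
    fn set(&self, key: &[u8], value: Vec<u8>) {
        self.memtable.lock().unwrap().insert(key.to_vec(), value);
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.memtable.lock().unwrap().get(key).cloned()
    }
}

/// Spawns `writers` threads that each insert one key through a shared `Arc`.
fn concurrent_writes(writers: u8) -> Arc<ToyTree> {
    let tree = Arc::new(ToyTree::default());
    let handles: Vec<_> = (0..writers)
        .map(|i| {
            let tree = Arc::clone(&tree); // no outer RwLock around the tree
            thread::spawn(move || tree.set(&[i], vec![i; 2]))
        })
        .collect();
    for handle in handles {
        handle.join().expect("writer thread panicked");
    }
    tree
}
```

Calling `concurrent_writes(4)` returns a tree in which the keys `[0]` through `[3]` are all present, regardless of the order in which the threads ran.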

We also provide iterator functionality through the `scan` method.
This offers efficient key-range traversal and lays the groundwork for possible future SQL query support.
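
To illustrate what an ordered key-range traversal looks like, here is a small sketch over a single `BTreeMap`. It is an assumption-laden simplification: the free function `scan` and `demo` are hypothetical, the crate's own `Range` and `KvScan` types are not reproduced, and a real scan would also have to merge results from the memtable and the sstables.

```rust
use std::collections::BTreeMap;
use std::ops::RangeBounds;

/// Hypothetical helper: collects the key/value pairs inside `range`, in key order.
fn scan<R>(table: &BTreeMap<Vec<u8>, Vec<u8>>, range: R) -> Vec<(Vec<u8>, Vec<u8>)>
where
    R: RangeBounds<Vec<u8>>,
{
    table
        .range::<Vec<u8>, _>(range)
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

/// Scans the half-open range ["b", "d") of a four-key table and returns the keys.
fn demo() -> Vec<Vec<u8>> {
    let mut table = BTreeMap::new();
    for key in [b"a", b"b", b"c", b"d"] {
        table.insert(key.to_vec(), b"v".to_vec());
    }
    scan(&table, b"b".to_vec()..b"d".to_vec())
        .into_iter()
        .map(|(k, _)| k)
        .collect()
}
```

Because the underlying structure is sorted by key, the traversal seeks to the lower bound and stops at the upper bound instead of visiting every entry.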

## Usage

Currently, there is no user interface available.
Currently, there is no command-line interface available.

## What's Next

Upcoming developments include:
Some of the most imminent developments include:

* Write-ahead log (WAL): a fail-safe that guarantees all operations are eventually applied, even after an unexpected crash.

* Level compaction: the "merge" part of the LSM-tree that reduces the storage overhead for outdated and deleted entries.

* Bloom filter: an optimization technique that improves key searching performance.

* Persistence: a disk manager that communicates between memory and disk.
* Transaction: a transactional engine that supports ACID transactions under different isolation levels.

* Transaction: an ACID-compliant transaction engine with different isolation levels.
Further upcoming ones include:

* Relational model: an SQL interface including projections, filters, joins, aggregates, and transactions.
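
The level-compaction item above is not implemented yet, but the idea of merging runs to discard outdated and deleted entries can be sketched as follows. Everything here is an assumption for illustration: `Entry`, `Run`, `compact`, and `demo` are hypothetical names, runs are in-memory vectors, and the merge goes through a `BTreeMap` instead of the streaming k-way merge a real engine would likely use.

```rust
use std::collections::BTreeMap;

/// `None` marks a deleted key (a tombstone).
type Entry = Option<Vec<u8>>;
/// A sorted run of entries, standing in for one sstable.
type Run = Vec<(Vec<u8>, Entry)>;

/// Merges runs (ordered oldest to newest) into a single sorted run.
/// Only the newest entry per key survives. Tombstones are dropped, which is
/// safe only because every run is merged here and no older run remains.
fn compact(runs: &[Run]) -> Run {
    let mut merged: BTreeMap<Vec<u8>, Entry> = BTreeMap::new();
    for run in runs {
        for (key, entry) in run {
            merged.insert(key.clone(), entry.clone()); // later runs overwrite
        }
    }
    merged
        .into_iter()
        .filter(|(_, entry)| entry.is_some())
        .collect()
}

/// Compacts an old run with a newer one that updates "a" and deletes "b".
fn demo() -> Run {
    let old: Run = vec![
        (b"a".to_vec(), Some(b"1".to_vec())),
        (b"b".to_vec(), Some(b"2".to_vec())),
    ];
    let new: Run = vec![
        (b"a".to_vec(), Some(b"3".to_vec())),
        (b"b".to_vec(), None),
    ];
    compact(&[old, new])
}
```

In this example, four stored entries shrink to one: the stale value of `a` and both entries for `b` are discarded.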
