Log Structured Merge Tree (LSM-tree) Implementations (a Demo and LevelDB)
Background
Log structured merge tree, or LSM-tree, is a famous data structure that has been widely adopted by many modern “big data” products, such as BigTable, HBase, LevelDB, etc. Its core idea is very simple and perhaps somewhat counterintuitive if you’re used to the traditional database architecture.
LST-tree keeps data both in memory and on disk. The in-memory part is a conventional update-in-place data structure with necessary indexing. The on-disk part, however, consists of immutable sorted string tables (SSTable). The SSTable stores key-value pairs. Both the key and the value are of string type. You can also treat them as general byte arrays. The SSTable file format is conceptually a list of consecutive key-value pairs sorted in key order.
When the memory fills up, the data is flushed to disk. The SSTables on disk are never mutated. Periodically, they’re merged and compacted into bigger SSTables. The merge works like an external merge sort. The compaction keeps the latest value of a key by collapsing multiple updates of that same key, and discards deleted keys (marked by a tombstone bit).
The benefit of this data structure is that it supports high write throughput because random updates only happen in memory and all disk writes are sequential append. This is reminiscent of appending to log, and that’s where it got its name. Research has shown that sequential disk operations can run even faster than…