面向持久内存的高效写优化哈希索引方案
摘要
NVM non-volatile memory 因为 data consistency and hardware limitations of NVM,传统为 DRAM 设计的索引技术不适用 inefficient,为了高效索引 NVM 上的数据,本论文提出一种写优化的高性能哈希索引技术 write-optimized and high-performance hashing index,称为 level hashing
- write-optimized
- high-performance hashing
- low-overhead consistency guarantee
- and cost-efficient resizing.
为了保证 consistency with low overhead
- level hashing leverages log-free consistency schemes for insertion, deletion, and resizing operations, and an opportunistic log-free scheme for update operation.
To cost-efficiently resize this hash table,
- level hashing leverages an in- place resizing scheme that only needs to rehash 1/3 of buckets instead of the entire table, thus significantly reducing the number of rehashed buckets and improving the resizing performance.
2. Background and Motivation
2.1 Data Consistency in NVM
- CPU cache lines
- CPU and memory controller may reorder memory writes.
- cache line flush instruction (CLFLUSH for short) / memory fence instruction (MFENCE for short), to ensure the ordering of memory writes
- atomic memory write of NVM is 8 bytes, which is equal to the memory bus width for 64-bit CPUs
Existing techniques, such as logging and copy-on-write (CoW), are used to guarantee consistency of the data whose sizes are larger than an atomic-write size.
- The logging technique first stores the old data (undo logging) or new data (redo logging) into a log and then updates the old data in place.
- The CoW first creates a new copy of data and then performs updates on the copy. The pointers that point to the old data are finally modified.
- Nevertheless, logging and CoW have to write twice for each updated data.
Code
RECIPE : high-performance, concurrent indexes for persistent memory (SOSP 2019)