面向持久内存的高效写优化哈希索引方案

论文地址：https://www.usenix.org/system/files/osdi18-zuo.pdf

摘要

NVM non-volatile memory 因为 data consistency and hardware limitations of NVM，传统为 DRAM 设计的索引技术不适用 inefficient，为了高效索引 NVM 上的数据，本论文提出一种写优化的高性能哈希索引技术 write-optimized and high-performance hashing index，称为 level hashing

write-optimized
high-performance hashing
low-overhead consistency guarantee
and cost-efficient resizing.
为了保证 consistency with low overhead
- level hashing leverages log-free consistency schemes for insertion, deletion, and resizing operations, and an opportunistic log-free scheme for update operation.
To cost-efficiently resize this hash table,
- level hashing leverages an in- place resizing scheme that only needs to rehash 1/3 of buckets instead of the entire table, thus significantly reducing the number of rehashed buckets and improving the resizing performance.

2. Background and Motivation

2.1 Data Consistency in NVM

CPU cache lines
CPU and memory controller may reorder memory writes.
cache line flush instruction (CLFLUSH for short) / memory fence instruction (MFENCE for short), to ensure the ordering of memory writes
atomic memory write of NVM is 8 bytes, which is equal to the memory bus width for 64-bit CPUs
Existing techniques, such as logging and copy-on-write (CoW), are used to guarantee consistency of the data whose sizes are larger than an atomic-write size.
- The logging technique first stores the old data (undo logging) or new data (redo logging) into a log and then updates the old data in place.
- The CoW first creates a new copy of data and then performs updates on the copy. The pointers that point to the old data are finally modified.
- Nevertheless, logging and CoW have to write twice for each updated data.

Code

RECIPE : high-performance, concurrent indexes for persistent memory (SOSP 2019)

https://github.com/utsaslab/RECIPE