Heterogeneous-Memory Storage Engine Designed for SSDs

Description

HSE is an embeddable key-value store designed for SSDs based on NAND flash or persistent memory. HSE optimizes performance and endurance by orchestrating data placement across DRAM and multiple classes of SSDs or other solid-state storage.

HSE is ideal for powering NoSQL, Software-Defined Storage (SDS), High-Performance Computing (HPC), Big Data, Internet of Things (IoT), and Artificial Intelligence (AI) solutions.

Key Features

  • Standard and advanced key-value operators
  • Full transactions with snapshot-isolation spanning multiple independent key-value collections
  • Cursors for iterating over snapshot views
  • Data model for optimizing mixed use-case workloads in a single data store
  • Flexible durability controls
  • Configurable data orchestration schemes
  • C API library that can be embedded in any application (see the sketch following this list)
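
The sketch below illustrates what embedding HSE typically looks like: initialize the library, open a KVDB (the store) and a KVS (a key-value collection within it), then issue put/get operations. It is modeled on the HSE 2.x C API described in the HSE Wiki; the function names, signatures, and KVDB home path used here are assumptions to verify against the version you build, and the transactional and cursor variants of these calls take additional handles per the HSE documentation.

    /* Minimal sketch of an application embedding HSE (assumes the HSE 2.x C API;
     * verify names and signatures against your installed version). */
    #include <stdio.h>
    #include <stdbool.h>

    #include <hse/hse.h>

    int main(void)
    {
        struct hse_kvdb *kvdb = NULL;
        struct hse_kvs  *kvs = NULL;
        char             val[32];
        size_t           vlen = 0;
        bool             found = false;
        hse_err_t        err;

        /* Initialize the HSE subsystem once per process. */
        err = hse_init(NULL, 0, NULL);
        if (err)
            return 1;

        /* Open an existing KVDB; "/var/lib/hse/db0" is a placeholder home directory. */
        err = hse_kvdb_open("/var/lib/hse/db0", 0, NULL, &kvdb);
        if (err)
            goto out_fini;

        /* Open a key-value collection (KVS) named "kvs1" within the KVDB. */
        err = hse_kvdb_kvs_open(kvdb, "kvs1", 0, NULL, &kvs);
        if (err)
            goto out_close;

        /* Standard put/get operators (default flags, no transaction). */
        err = hse_kvs_put(kvs, 0, NULL, "greeting", 8, "hello", 5);
        if (!err)
            err = hse_kvs_get(kvs, 0, NULL, "greeting", 8, &found,
                              val, sizeof(val), &vlen);

        if (!err && found)
            printf("greeting = %.*s\n", (int)vlen, val);

        hse_kvdb_kvs_close(kvs);
    out_close:
        hse_kvdb_close(kvdb);
    out_fini:
        hse_fini();
        return err ? 1 : 0;
    }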

Benefits

  • Scales to terabytes of data and hundreds of billions of keys per store
  • Efficiently handles thousands of concurrent operations
  • Dramatically improves throughput, latency, write-amplification, and read-amplification versus common alternatives for many workloads
  • Optionally combines multiple classes of solid-state storage to optimize performance and endurance

The HSE Wiki contains all the information you need to get started with HSE.

YCSB (Yahoo!® Cloud Serving Benchmark) is an industry-standard benchmark for databases and storage engines supporting key-value workloads. The following table summarizes several YCSB workload mixes, with application examples taken from the YCSB documentation.

YCSB Workload   Operations              Application Example
A               50% Read; 50% Update    Session store recording user-session activity
B               95% Read; 5% Update     Photo tagging
C               100% Read               User profile cache
D               95% Read; 5% Insert     User status updates

We integrated HSE with YCSB to make it easy to compare its performance and scalability to that of other storage engines for YCSB workloads. Below are throughput results from running YCSB with HSE.

For comparison, we include results from RocksDB, a popular and widely-deployed key-value store. For these YCSB workloads, HSE delivered up to nearly 6x more throughput than RocksDB.

System configuration details and additional performance results can be found in the YCSB section of the HSE Wiki.

We also integrated HSE with MongoDB®, a popular NoSQL database, to validate its benefits within a real-world storage application. Below are throughput results from running YCSB with MongoDB using HSE (MongoDB / HSE).

For comparison, we include results from MongoDB using the default WiredTiger storage engine (MongoDB / WiredTiger). For these YCSB workloads, MongoDB / HSE delivered up to nearly 8x more throughput than MongoDB / WiredTiger.

System configuration details and additional performance results can be found in the MongoDB section of the HSE Wiki.

  


Reviews

  1. haneefmubarak

    Looks pretty cool when you make it to the GitHub (github.com/hse-project). Order of magnitude performance gains! I imagine most of that comes from skipping the filesystem layer and hitting the raw block layer directly.

    I am curious about the durability and how well tested all of that is though. On the one hand, filesystems put a lot of work towards ensuring that bytes written to disk and synced are most likely durable, but OTOH Micron is a native SSD vendor so they've probably thought of that.

    I'm also curious whether RAIDing multiple SSDs together at the block layer and running HSE on top of that would be faster, or whether running multiple HSE instances (not the right word, it's a library, but you get what I mean), one per drive, and executing redundantly across them would be faster. The argument for the former is that each instance would otherwise have to redo the management work; the argument for the latter is that there's probably synchronization overhead within the library, so running more instances in parallel should allow for concurrency and parallelism gains.

  2. aloknnikhil

    > github.com/hse-project/hse

    Their benchmarks show significant gains compared to RocksDB.

    > github.com/spdk/rocksdb

    But what I'd really like to see is a comparison against RocksDB using SPDK.

    > dqtibwqq6s6ux.cloudfront.net/download/papers/Hitachi_SPDK_NVMe_oF_Performance_Report.pdf

    Based on these results, SPDK performs significantly better than the kernel, requiring only 1-2 cores to saturate IOPS on an NVMe SSD (compared to 16 for the kernel).

