Caching And Tiering: Intel Optane Memory H20 and Enmotus FuzeDrive SSD Reviewed
by Billy Tallis on May 18, 2021 2:00 PM EST
The latest iteration of Intel's Optane Memory SSD caching is here. The new Optane Memory H20 is two NVMe drives in one, combining a 1TB QLC drive (derived from their recent 670p) with an updated 32GB Optane cache drive, all on one M.2 card. We're also taking a look at the Enmotus FuzeDrive SSD, a different take on the two-drives-in-one idea that augments its QLC with a dedicated pool of fast SLC NAND flash. Each of these drives is paired with software to intelligently manage data placement, putting heavily-used data on the faster, higher-endurance storage media. The overall goal of the two products is the same: to combine the affordable capacity of QLC NAND with the high-end performance and write endurance of SLC NAND or 3D XPoint memory.
SSD Caching History
There is a long history behind the general idea of combining fast and slow storage devices into one pool of storage that doesn't require end users to manually manage data placement. Caching of data in RAM is ubiquitous, with CPUs having multiple levels of cache and hard drives and some SSDs also having their own RAM caches, but all of those are temporary by nature. Persistent caches using a faster form of non-volatile storage have never been quite as pervasive, but there have been plenty of examples over the years.
In the consumer space, caching was of great interest when SSDs first started to go mainstream: they were far faster than hard drives, but not yet large enough to be used as a complete replacement for hard drives. Intel implemented Smart Response Technology (SRT) into their Rapid Storage Technology (RST) drivers starting a decade ago with the Z68 chipset for Sandy Bridge. Hard drive manufacturers also introduced hybrid drives, but with such pitifully small NAND flash caches that they weren't of much use.
More recently, the migration of SSDs to store more bits of data per physical memory cell has led to consumer SSDs implementing their own transparent caching. All consumer SSDs using TLC or QLC NAND manage a cache layer that operates a portion of the storage as SLC (or occasionally MLC)—less dense, but faster.
Intel made another big push for SSD caching with their first Optane devices to hit the consumer market: tiny M.2 drives equipped with the promising new 3D XPoint memory, and rather confusingly branded Optane Memory as if they were DRAM alternatives instead of NVMe SSDs. Intel initially pitched these as cache devices for use in front of hard drives. The implementation of Optane Memory built on their RST work, but came with new platform requirements: motherboard firmware had to be able to understand the caching system in order to properly load an operating system from a cached volume, and that firmware support was only provided on Kaby Lake and newer platforms. The Optane + hard drive strategy never saw huge success; the continuing transition to TLC NAND meant SSDs that were big enough and fast enough became widely affordable. Multiple-drive caching setups were also a poor fit for the size and power constraints of notebooks. Optane caching in front of TLC NAND was possible, but not really worth the cost and complexity, especially with SLC caching working pretty well for mainstream single-drive setups.
QLC NAND provided a new opportunity for Optane caching, leading to the Optane Memory H10 and the new Optane Memory H20 we're reviewing today. These squeeze Intel's consumer QLC drives (660p and 670p respectively) and one of their Optane Memory cache drives onto a single M.2 card. This requires a somewhat non-standard interface; most systems cannot detect both devices and will be able to access either the QLC or the Optane side of the drive, but not both. Some Intel consumer platforms starting with Coffee Lake have the capability to detect these drives and configure the PCIe x4 link to an M.2 slot as two separate x2 links.
The caching system for Optane Memory H20 works pretty much the same as when using separate Optane and slow drives, though Intel has continued to refine their heuristics for data placement with successive releases of their RST drivers. One notable downside is that splitting the M.2 slot's four PCIe lanes into two x2 links means there's a bottleneck on the QLC side; the Silicon Motion SSD controllers Intel uses support four lanes, but only two can be wired up on the H10 and H20. For the H10, this hardly mattered because the QLC portion of that drive (equivalent to the Intel SSD 660p) could only rarely provide more than 2GB/s, so limiting it to PCIe 3.0 x2 had only a minor impact. Intel's 670p is quite a bit faster thanks to more advanced QLC and a much-improved controller, so limiting it to PCIe 3.0 x2 on the Optane Memory H20 actually hurts.
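The x2 bottleneck can be put in rough numbers. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes the standard PCIe signaling rates (8 GT/s per lane for Gen3, 16 GT/s for Gen4) and 128b/130b line encoding, and it ignores packet and protocol overhead, so real drives will see somewhat less. The function name `link_bandwidth_gb_s` is ours, chosen for illustration.

```python
# Approximate one-direction PCIe link bandwidth from lane count and generation.
# Assumption: only 128b/130b line encoding is accounted for; packet/protocol
# overhead is ignored, so these are upper bounds rather than achievable speeds.
GT_PER_S = {3: 8.0, 4: 16.0}   # per-lane transfer rate in GT/s
ENCODING = 128 / 130           # 128b/130b encoding used by PCIe 3.0 and 4.0


def link_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Return the approximate link bandwidth in decimal GB/s."""
    bits_per_s = GT_PER_S[gen] * 1e9 * ENCODING * lanes
    return bits_per_s / 8 / 1e9


for gen, lanes in [(3, 4), (3, 2), (4, 2)]:
    print(f"PCIe {gen}.0 x{lanes}: {link_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

A PCIe 3.0 x2 link tops out just under 2GB/s, which is why a QLC drive that rarely exceeded 2GB/s lost little on the H10, while a faster drive is held back on the H20. Two lanes at Gen4 speed would offer the same ceiling as four lanes at Gen3.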
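|Intel Optane Memory H20 Specifications|
|Model|Optane Memory H20|Optane Memory H20|Optane Memory H10|Optane Memory H10|Optane Memory H10|
|Form Factor|single-sided M.2 2280|single-sided M.2 2280|single-sided M.2 2280|single-sided M.2 2280|single-sided M.2 2280|
|NAND Controller|Silicon Motion SM2265|Silicon Motion SM2265|Silicon Motion SM2263|Silicon Motion SM2263|Silicon Motion SM2263|
|NAND Flash|Intel 144L 3D QLC|Intel 144L 3D QLC|Intel 64L 3D QLC|Intel 64L 3D QLC|Intel 64L 3D QLC|
|Optane Controller|Intel SLL3D|Intel SLL3D|Intel SLL3D|Intel SLL3D|Intel SLL3D|
|Optane Media|Intel 128Gb 3D XPoint|Intel 128Gb 3D XPoint|Intel 128Gb 3D XPoint|Intel 128Gb 3D XPoint|Intel 128Gb 3D XPoint|
|QLC NAND Capacity|512 GB|1024 GB|256 GB|512 GB|1024 GB|
|Optane Capacity|32 GB|32 GB|16 GB|32 GB|32 GB|
|Sequential Read|up to 3300 MB/s|up to 3300 MB/s|1450 MB/s|2300 MB/s|2400 MB/s|
|Sequential Write|up to 2100 MB/s|up to 2100 MB/s|650 MB/s|1300 MB/s|1800 MB/s|
|Random Read IOPS|65k (QD1)|65k (QD1)|230k|320k|330k|
|Random Write IOPS|40k (QD1)|40k (QD1)|150k|250k|250k|
|Launched|May 2021|May 2021|April 2019|April 2019|April 2019|
|Platform Requirements|11th Gen Core CPU, 500 Series Chipset, RST Driver 18.1|11th Gen Core CPU, 500 Series Chipset, RST Driver 18.1|8th Gen Core CPU, 300 Series Chipset, RST Driver 17.2|8th Gen Core CPU, 300 Series Chipset, RST Driver 17.2|8th Gen Core CPU, 300 Series Chipset, RST Driver 17.2|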
Both the Optane Memory H10 and H20 are rated for peak throughput in excess of what either the Optane or QLC portion can provide on its own. To achieve this, Intel's caching software has to be capable of doing some RAID0-like striping of data between the two sub-devices; it can't simply send requests to the Optane portion while falling back on the QLC only when strictly necessary.
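To illustrate why striping is needed to exceed either sub-device's throughput, here is a minimal sketch of bandwidth-proportional request splitting. This is not Intel's RST algorithm, whose details are not public; the function names and the per-device bandwidth figures are placeholders chosen purely for illustration.

```python
# Sketch of RAID0-like splitting of one large transfer across two sub-devices.
# Assumption: each device sustains a fixed bandwidth and both run in parallel.
# The bandwidth values used below are illustrative placeholders, not Intel specs.


def split_request(total_bytes: int, bandwidths_mb_s: dict[str, float]) -> dict[str, int]:
    """Split a transfer in proportion to device bandwidth so both finish together."""
    total_bw = sum(bandwidths_mb_s.values())
    shares = {
        name: int(total_bytes * bw / total_bw)
        for name, bw in bandwidths_mb_s.items()
    }
    # Hand any rounding remainder to the fastest device.
    fastest = max(bandwidths_mb_s, key=bandwidths_mb_s.get)
    shares[fastest] += total_bytes - sum(shares.values())
    return shares


def effective_mb_s(total_bytes: int, bandwidths_mb_s: dict[str, float]) -> float:
    """Aggregate throughput when the slower-finishing device sets the total time."""
    shares = split_request(total_bytes, bandwidths_mb_s)
    seconds = max(shares[n] / (bandwidths_mb_s[n] * 1e6) for n in shares)
    return total_bytes / seconds / 1e6


hypothetical = {"qlc": 1900.0, "optane": 1300.0}  # placeholder MB/s figures
print(split_request(1_000_000_000, hypothetical))
print(f"{effective_mb_s(1_000_000_000, hypothetical):.0f} MB/s combined")
```

With a proportional split, the combined rate approaches the sum of the two device rates, which a pure cache-then-fallback policy could never do since it only ever reads from one device at a time.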
At first glance, the Optane Memory H20 looks like a rehash of the H10, but it is a substantially upgraded product. The Optane portion of the H20 is a bit faster than previous Optane Memory products including the Optane portion of the H10. Intel didn't give specifics on how they improved performance here, but they are still using first-generation 3D XPoint memory rather than the second-generation 3DXP that is now shipping in the enterprise Optane P5800X SSD.
The QLC side of the drive gets a major upgrade from 64L to 144L QLC NAND and a controller upgrade from the Silicon Motion SM2263 to the SM2265. The new controller is an Intel-specific custom part for the 670p and the H20, derived from the SM2267 controller but lacking the PCIe 4.0 capability. Cutting out the PCIe 4.0 support was reasonable for the Intel 670p because the QLC isn't fast enough to go beyond PCIe 3.0 speeds anyway, and Intel can reduce power draw and maybe save a bit of money with the SM2265 instead of the SM2267. But for the Optane Memory H20 and its PCIe x2 limitation for the QLC portion, it would have been nice to be able to run those two lanes at Gen4 speed.
The Optane Memory H10 was initially planned for both OEM and retail sales, but the retail version was cancelled before release and the (somewhat spotty) support for H10 that was provided by retail Coffee Lake motherboards ended up being useless to consumers. The H20 is launching as an OEM-only product from the outset, which ensures it will only be used in compatible Intel-based systems. This allows Intel to largely avoid any issues with end-users needing to install and configure the caching software, because OEMs will take care of that. The Optane Memory H20 is planned to begin shipping in new systems in June.
powerarmour - Tuesday, May 18, 2021
QLC garbage again, I can hardly contain myself.
Samus - Wednesday, May 19, 2021
Understanding QLC's place in the market (cheap bulk flash storage), I'm also struggling to understand who these premium-priced QLC products are for. Seriously, who is going to pay 23-25¢/GB for something like this when its only crutch is high read throughput that has zero real-world advantage for virtually all PC users?
Wereweeb - Wednesday, May 19, 2021
These products are both proofs of concept and an advertisement for the importance of caching/tiering.
Enmotus managed to get 3600 TBW out of a 2TB QLC SSD by reducing its available capacity by a bit and using their software.
philehidiot - Wednesday, May 19, 2021
There is definitely the endurance advantage, but you don't need a commercial product for proof of concept. Indeed, I'd say releasing a commercial product just to prove it can be done where there is no real use for it is a bit daft. Unless they plan to inflict it upon customers in a data collection exercise, using their muscle to force it into laptops. We have already seen the advantages of this kind of tech when smaller SSDs were placed as a cache / tier into HDDs.
If their plan is to build this into an industrial product, their proof of concept should be a bunch of engineering samples tested for endurance, not a bodged consumer grade product which seems as though it's going to do more to show you can have a very complex and bodged product and have it just about compete with what's already established on the market.
As for advertising, I'd say this is a pretty poor advert. Someone mentioned that Intel's storage division has been held back and it strikes me this is the case. This isn't a new and exciting product, it's two technologies being put together with an inadequate hardware interface and terrible software.
It has potential, but the people who will accept QLC NAND won't know or care what this is and the people who might benefit from the high DWPD won't touch it with a barge pole.
This should have stayed in R&D until it could add something to the market.
Samus - Thursday, May 20, 2021
I'll believe it when it's independently tested. No level of software trickery will enable massive gains in TBW. If you fully write to a drive, the physical cells are fully utilized. Sure, you can mask this with a large spare area and aggressive wear leveling, but even a 2TB QLC SSD with 4TB of physical NAND (so 2TB spare area) will only yield 4x the endurance, and that's the best-case scenario.
Enmotus can't break the laws of physics with intelligent software unless they've come up with some revolutionary hardware deduplication\compression algorithm that is limiting physical changes to NAND by many orders of magnitude, while also eliminating write amplification that is essential to modern ECC for data integrity.
Billy Tallis - Thursday, May 20, 2021
The key advantage the Enmotus drive has over regular QLC drives is that the static SLC portion can be used for far more P/E cycles. On a regular QLC drive, which blocks are used for the dynamic SLC cache is constantly changing, and the fact that a block that's currently operating as SLC may soon be repurposed as QLC effectively prevents it from being rated for more P/E cycles than QLC usage can permit. But with a large pool of permanent SLC, the drive can safely re-use those cells long past the point where they would be unusable as QLC. 128GiB at 30k P/E cycles can on its own handle more total writes than the drive as a whole is rated for.
As long as the tiering software does a good job of preventing most writes and write amplification from ever getting to the QLC part of the drive, the endurance rating is completely realistic. The tiering software won't be able to keep the wear confined to the SLC if you are using the drive as a giant circular buffer for video recording or something else that keeps the drive full and constantly modifies all of the data. But most real consumer workloads have a small amount of hot data that's frequently changing and a large amount of cold data that doesn't get rewritten often enough to pose a problem for QLC.
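The 128GiB claim above can be checked with quick arithmetic. This is only a sketch: it assumes the 3600 TBW rating quoted earlier in the thread is in decimal terabytes, and that every P/E cycle of the SLC pool absorbs one full pool's worth of host writes (that is, no write amplification inside the SLC tier).

```python
# Total writes a static SLC pool could absorb, compared with the drive's TBW rating.
# Assumptions: write amplification of 1 within the SLC tier; TBW uses decimal TB.
GIB = 2**30
TB = 10**12

slc_capacity_bytes = 128 * GIB   # static SLC pool size from the comment above
pe_cycles = 30_000               # P/E cycle figure from the comment above
rated_tbw = 3600                 # drive-level rating quoted earlier in the thread

slc_writes_tb = slc_capacity_bytes * pe_cycles / TB
print(f"SLC pool alone: about {slc_writes_tb:.0f} TB of writes")
print(f"Exceeds the {rated_tbw} TBW rating: {slc_writes_tb > rated_tbw}")
```

Under those assumptions the SLC pool alone covers roughly 4100 TB of writes, so the rating holds as long as the tiering software keeps most write traffic off the QLC.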
Spunjji - Wednesday, May 19, 2021
Agreed - this would really need to show a serious performance benefit at a similar cost to a TLC drive, or lower cost and similar performance. As it is, it does neither. I'm sure OEMs will lap it up at whatever knockdown price Intel offers it to them to clear the shelves.
Spunjji - Wednesday, May 19, 2021
Derped there and confused the price of the Enmotus with the H20... the Enmotus product really does seem to be in a bad place for price vs. consumer appeal without the benefit of Intel's cosy relationship with OEMs.
Morawka - Friday, May 21, 2021
The Enmotus product is perfect for Chia miners. Plotting on Chia absolutely destroys consumer-grade SSDs. A 980 Pro will get smoked in around 3 months, whereas this Enmotus drive, even though it's pricier, will last 3-5x longer.
Billy Tallis - Friday, May 21, 2021
I think Chia plotting requires more space than the SLC portion of the Enmotus drive, and plotting is an example of the kinds of workloads that would not be handled well by the Enmotus tiering software unless the plotting could fit entirely in the SLC tier.