Lenovo's announcement today of a new generation of ThinkPads based on Intel's Kaby Lake platform includes brief but tantalizing mention of Optane, Intel's brand for devices using the 3D XPoint non-volatile memory technology they co-developed with Micron. Lenovo's new ThinkPads and competing high-end Kaby Lake systems will likely be the first appearance of 3D XPoint memory in the consumer PC market.

Several of Lenovo's newly announced ThinkPads will offer 16GB Optane SSDs in the M.2 2242 form factor paired with hard drives, as an alternative to using a single NVMe SSD with NAND flash memory (usually TLC NAND, with a portion used as SLC cache). The new Intel Optane devices mentioned by Lenovo are most likely the NVMe PCIe 3 x2 drives codenamed Stony Beach that were featured in a roadmap leaked back in July. More recent leaks have indicated that these will be branded as the Intel Optane Memory 8000p series, with a 32GB capacity in addition to the 16GB Lenovo will be using. Since Intel's 3D XPoint memory is being manufactured as a two-layer 128Gb (16GB) die, these Optane products will require just one or two dies and will have no trouble fitting onto a short M.2 2242 card alongside a controller chip.
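The die math works out neatly; here is a quick back-of-the-envelope check using only the figures quoted above (128Gb per die, 16GB and 32GB module capacities):

```python
# Sanity check on the die count, using the capacities quoted above.
DIE_GBIT = 128                 # two-layer 3D XPoint die: 128 gigabits
die_gbyte = DIE_GBIT // 8      # = 16 GB of storage per die

for module_gb in (16, 32):
    dies = module_gb // die_gbyte
    print(f"{module_gb} GB Optane module -> {dies} die(s)")
# prints:
# 16 GB Optane module -> 1 die(s)
# 32 GB Optane module -> 2 die(s)
```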

The new generation of ThinkPads will be hitting the market in January and February 2017, but Lenovo and Intel haven't indicated when the configurations with Optane will be available. Other sources in the industry are telling us that Optane is still suffering from delays, so while we hope to see a working demo at CES, the Optane-equipped notebooks may not actually launch until much later in the year. We also expect the bulk of the initial supply of 3D XPoint memory to go to the enterprise market, just like virtually all of Intel and Micron's 3D MLC NAND output has been used for enterprise SSDs so far.

Support for Intel Optane branded devices based on 3D XPoint memory technology has long been bandied about as a new feature of the Kaby Lake generation of CPUs and chipsets, but Intel has not officially clarified what that means. The plan of record has always been for the first Optane products to be NVMe SSDs, but NVMe is already thoroughly supported by current platforms and software. Because Optane SSDs will have a significantly higher price per GB than NAND flash based SSDs, the natural role for Optane SSDs is to act as a small cache device for larger and slower storage devices. The "Optane support" that Kaby Lake brings is almost certainly just the addition of the ability to use NVMe SSDs (including Optane SSDs) as cache devices.

At a high level, using Optane SSDs as a cache for hard drives is no different from the SSD caching Intel first introduced in 2011 with the Z68 chipset for Sandy Bridge processors and version 10.5 of the Intel Rapid Storage Technology (RST) driver. Branded by Intel as Smart Response Technology (SRT), their SSD caching implementation built on the existing RAID capabilities of RST to use an SSD as a block-level cache for a hard drive, operating as a write-back or write-through cache depending on the user's preference. For SATA devices, no special hardware features were required, but booting from RST RAID or cache volumes requires support in the motherboard firmware, and Intel's drivers have used RAID and SRT SSD caching to provide product segmentation between different chipsets.
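The block-level caching described above can be illustrated with a toy model. This is a sketch for illustration only, not Intel's implementation (the class and its methods are invented); it shows just the LRU structure and the write-back vs. write-through distinction:

```python
from collections import OrderedDict

class ToyBlockCache:
    """Toy LRU block cache in front of a slow backing store.

    Illustrative only: real SRT tracks access frequency and lets
    sequential writes bypass the cache, among other refinements.
    """
    def __init__(self, capacity_blocks, backing, write_back=True):
        self.capacity = capacity_blocks
        self.backing = backing          # dict-like: block number -> data
        self.write_back = write_back
        self.cache = OrderedDict()      # block number -> (data, dirty flag)

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)       # cache hit: refresh recency
            return self.cache[block][0]
        data = self.backing.get(block)          # miss: fetch from slow store
        self._insert(block, data, dirty=False)
        return data

    def write(self, block, data):
        if self.write_back:
            self._insert(block, data, dirty=True)   # defer the backing write
        else:
            self.backing[block] = data              # write-through immediately
            self._insert(block, data, dirty=False)

    def _insert(self, block, data, dirty):
        self.cache[block] = (data, dirty)
        self.cache.move_to_end(block)
        while len(self.cache) > self.capacity:
            victim, (vdata, vdirty) = self.cache.popitem(last=False)
            if vdirty:
                self.backing[victim] = vdata    # flush dirty block on eviction

# Usage: with write-back, data reaches the backing store only on eviction.
store = {}
cache = ToyBlockCache(capacity_blocks=2, backing=store, write_back=True)
cache.write(0, "a")
cache.write(1, "b")
cache.write(2, "c")   # evicts block 0 and flushes it to the backing store
print(store)          # {0: 'a'}
```

The write-back mode trades durability for speed: a crash before eviction loses the dirty blocks, which is why SRT offers the safer write-through option as well.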

With the release of Skylake processors and the 100-series chipsets, Intel brought support for NVMe RAID to RST version 15. This was not as straightforward to implement as RAID and SRT for SATA drives, owing to the fact that the SATA drives in an RST RAID or SRT volume are all conveniently connected through Intel's own SATA controller and managed by the same driver. NVMe SSDs, by contrast, each connect to the system through general-purpose PCIe lanes and can use either the operating system's NVMe driver or a driver provided by the SSD manufacturer. In order to bring NVMe devices under the purview of Intel's RST driver, 100-series chipsets have an unusual trick: when the SATA controller is put in RAID mode instead of plain AHCI mode, NVMe devices that are connected to the PCH have their PCI registers re-mapped to appear within the AHCI controller's register space, and the NVMe devices are no longer detectable as PCIe devices in their own right. This makes the NVMe SSDs inaccessible to any driver other than Intel's RST.

Intel has provided very little public documentation of this feature, and its operation is usually very poorly described by the UEFI configuration interfaces on supporting machines. This has caused quite a few tech support headaches on machines that enable this feature by default, as it is seldom obvious how to put the machine back into a mode where standard NVMe drivers can be used. Worse, some machines such as the Lenovo Yoga 900 and IdeaPad 710 shipped with the chipset locked in RAID mode despite only having a single SSD. After public outcry from would-be Linux users, Lenovo released firmware updates that added the option of using the standard AHCI mode that leaves NVMe devices alone.

(excerpt from Intel 100-series chipset datasheet)
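One observable consequence of the remapping is that an affected NVMe SSD simply vanishes from ordinary PCI enumeration. On Linux, a sysfs scan along these lines (a diagnostic sketch, not an official tool) will list NVMe-class devices only when the chipset is not hiding them:

```python
import glob

def visible_nvme_devices():
    """Return PCI addresses of devices exposing the NVMe class code.

    On a machine with the chipset locked in RAID mode, an NVMe SSD
    behind the PCH will be absent from this list even though it is
    physically installed.
    """
    found = []
    for path in glob.glob("/sys/bus/pci/devices/*/class"):
        with open(path) as f:
            # PCI class 0x0108xx: mass storage controller, NVM subclass
            if f.read().strip().startswith("0x0108"):
                found.append(path.split("/")[-2])
    return sorted(found)

print(visible_nvme_devices())
```

An empty list on a machine known to contain an NVMe drive is the telltale sign that the remapping (or a similar firmware setting) is in effect.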

In spite of the limitations and rough edges, Intel's solution does ensure reliable operation in RAID mode, free of interference from third-party drivers. It's certainly less work than the alternative of writing a more general-purpose software RAID and caching system for Windows that can handle a variety of underlying drivers. It also lays the groundwork for adding support for NVMe cache devices to Intel's SRT caching system. Intel's SRT already has caching algorithms tuned for 16GB to 64GB caches in front of hard drives, so now that they have a solution for mediating access to NVMe SSDs it is simple to enable using both features simultaneously. The changes do need to be added to both the RST driver and to the motherboard firmware if booting from a cached volume is to be supported. Backporting and deploying the firmware changes to Skylake motherboards should be possible but is unlikely to happen.

In the years since Intel introduced SRT caching, another form of tiered storage has taken over: TLC NAND SSDs with SLC caching. NAND flash suffers from write times that are much longer than read times, and storing multiple bits per cell requires multiple passes of writes. To alleviate this, most TLC NAND-based SSDs for client PC usage treat a portion of their flash as SLC, storing just one bit per cell instead of three. This SLC is used as a cache to absorb bursts of writes, which are consolidated into TLC NAND when the drive is idle (or when the SLC cache fills up). Even TLC NAND has reasonably high read performance, so there is little need to use SLC to cache read operations. By contrast, Intel's Smart Response Technology has to cache access to hard drives, where both read and write latencies are painfully high. This means SRT has to balance keeping frequently-read data in the cache against making room for a burst of writes. A lot of static data hanging around on the cache device would cause wear leveling to incur significant write amplification, but SRT already reduces the write load by having sequential writes bypass the cache. Taking into account that 3D XPoint memory can handle millions of write operations per cell, even a small 16GB cache device should have no trouble with endurance.
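That endurance claim holds up to rough arithmetic. Assuming one million write cycles per cell (a conservative reading of "millions") and ideal wear leveling:

```python
# Rough endurance estimate for a 16 GB cache device, assuming
# 1,000,000 write cycles per cell and ideal wear leveling.
capacity_gb = 16
cycles_per_cell = 1_000_000

total_pb = capacity_gb * cycles_per_cell / 1_000_000   # total petabytes written
print(f"~{total_pb:.0f} PB total write endurance")      # ~16 PB

# Even at a heavy 100 GB of cache writes per day:
years = capacity_gb * cycles_per_cell / 100 / 365
print(f"~{years:,.0f} years at 100 GB/day")             # ~438 years
```

Real-world write amplification would eat into that figure, but the margin is so large that endurance is unlikely to be a practical concern for a cache of this size.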

Comments Locked

  • Kakti - Wednesday, December 28, 2016 - link

    Phones may actually be a pretty good scenario for 3D-Xpoint/Optane replacement.

    Along a similar vein then are those super cheapo netbooks/chromebooks that still use eMMC for storage. I'm hoping this year we finally see the end of those god awful 16gb and 32gb eMMC drives. But then again NAND may just end up replacing eMMC for these low drives.

    At the end of the day, we really need more information on how big/dense these things will be. Are we going to continue seeing DIMM-esque drives and modules that run in the dozens of MB each, or will we be seeing multi-terabyte size products in the consumer space? Are these always going to play the role of "smaller, faster cache" in a RAM > Optane > NAND design, or do they plan (hope) that this will replace NAND eventually?

    Moving offtopic for a second, in regards to DIMMS vs NAND, is it something inherent in the physical design and architecture of DIMM Memory that makes it so much less dense than comparable NAND? Why does a given size of a DIMM only have say 8gb, 16gb or even "crazy" 128gb of RAM, when an M.2 drive can have 512gb-1tb? I assume they're on same or at least very similar lithography so assume it's just a simple design difference that makes RAM so much more volume hungry?
  • extide - Thursday, December 29, 2016 - link

    eMMC is just a protocol -- eMMC drives still use NAND flash just like sd cards and everything else
  • extide - Thursday, December 29, 2016 - link

    With DRAM you basically have a capacitor for each bit, and whether it's charged or not is how it knows if it's a 1 or a 0, but the charge in that capacitor drains out very quickly, which is what makes it volatile and that's why you need to keep power because it has to constantly read the cell and then refresh the charge in the cell. Requiring a cap means that it's hard to make the cell sizes small. For Example: a single die of modern top end latest lithography DRAM has about 8 or maybe 16Gbit capacity. NAND can store 384Gbit in the ~same space right now with IMFlash's 3D NAND.
  • Kakti - Friday, December 30, 2016 - link

    Gotcha, thank you for the reply extide.
  • tuxRoller - Wednesday, December 28, 2016 - link

    This explains the new instructions and platform support for pmem:

  • Billy Tallis - Wednesday, December 28, 2016 - link

    That only applies to NVDIMMs, not PCIe NVMe devices even if they have Optane memory on board. Also, it's out of date: https://software.intel.com/en-us/blogs/2016/09/12/...
  • tuxRoller - Thursday, December 29, 2016 - link

    The nvdimm form factor is irrelevant (lightnvm works similarly). This is speaking about the challenges of pmem, and yes, it's a bit old but it actually provides some details as to why Intel is requiring a platform to support optane.
    Also, the clwb command was relegated to the SDK in the October update.
  • Billy Tallis - Thursday, December 29, 2016 - link

    LightNVM has approximately nothing to do with NVDIMMs. LightNVM is about moving the flash translation layer to the host CPU but otherwise still operating as a block device rather than memory mapped.
  • tuxRoller - Friday, December 30, 2016 - link

    Oh boy, this will be fun.


    I mentioned Lightnvm BECAUSE those pmem libraries are useful where the host is responsible for ensuring data persistence.
    Also, the dimm ff was, tmk, the first available and had been used for os support bringup wrt pmem.
    This is about creating a "native" solution to that problem instead of using dax. Obviously there are still some folks left to convince.
  • Byte - Wednesday, December 28, 2016 - link

    Wooo moar level of cache!

    L2, L3, L4/eDram, RAM, Optane, SSD RAM buffer, SLC Buffer, TLC/MLC Nand
