Mixed IO Performance

For details on our mixed IO tests, please see the overview of our 2021 Consumer SSD Benchmark Suite.

[Charts: Mixed IO Performance; Mixed Random IO (Throughput, Power Efficiency); Mixed Sequential IO (Throughput, Power Efficiency)]

The Inland Performance Plus with the Phison E18 controller sets a new record for performance on our mixed sequential IO test, and it provides pretty good power efficiency on that test. It has somewhat disappointing performance on the mixed random IO test, with a few Gen3 TLC drives delivering better performance, and most of the 8-channel TLC drives delivering better efficiency.

[Charts: Mixed Random IO; Mixed Sequential IO]

On the sequential IO test, the Inland Performance Plus is a bit slow to start when the workload is very read-heavy, but quickly ramps up to about 6GB/s. As with many drives, performance is low to begin with because these drives aren't exactly optimized for juggling several parallel streams of sequential reads. Once the workload has shifted to include a significant amount of writes, caching makes things easier for the drives to manage and performance tends to improve. The E18 controller makes that transition early and with as big a performance gain as any drive, and things hold relatively steady around 6GB/s through the rest of the test.

On the random IO test, the Performance Plus is less consistent. After the typical initial performance drop that comes from adding the first bit of writes to the mix, the Performance Plus generally keeps slowing down but there's quite a bit of variability. The higher power consumption during phases where performance is lower indicates that there's background work to clean up the SLC cache interfering with benchmark performance. Things settle down during the last third of the test.

 

Power Management Features

Real-world client storage workloads leave SSDs idle most of the time, so the active power measurements presented earlier in this review only account for a small part of what determines a drive's suitability for battery-powered use. Especially under light use, the power efficiency of an SSD is determined mostly by how well it can save power when idle.

For many NVMe SSDs, the closely related matter of thermal management can also be important. M.2 SSDs can concentrate a lot of power in a very small space. They may also be used in locations with high ambient temperatures and poor cooling, such as tucked under a GPU on a desktop motherboard, or in a poorly-ventilated notebook.

Inland Performance Plus 2TB
NVMe Power and Thermal Management Features
Controller: Phison E18
Firmware: EIFM21.1

NVMe Version   Feature                                          Status
1.0            Number of operational (active) power states      3
1.1            Number of non-operational (idle) power states    2
               Autonomous Power State Transition (APST)         Supported
1.2            Warning Temperature                              70 °C
               Critical Temperature                             110 °C
1.3            Host Controlled Thermal Management               Supported
               Non-Operational Power State Permissive Mode      Supported

The Phison E18 as used in the Inland Performance Plus supports the full range of NVMe power and thermal management features, though with a somewhat implausible 110 °C critical temperature threshold. The deepest idle power state also claims only a 30% reduction in power compared to the shallower idle state, at the cost of much higher entry and exit latencies. Fortunately, as shown below, the lowest idle power state saves a lot more power than indicated by this firmware information.

Inland Performance Plus 2TB
NVMe Power States
Controller: Phison E18
Firmware: EIFM21.1

Power State   Maximum Power   Active/Idle   Entry Latency   Exit Latency
PS 0          8.8 W           Active        -               -
PS 1          7.1 W           Active        -               -
PS 2          5.2 W           Active        -               -
PS 3          62 mW           Idle          2 ms            2 ms
PS 4          44 mW           Idle          25 ms           25 ms

Note that the above tables reflect only the information provided by the drive to the OS. The power and latency numbers are often very conservative estimates, but they are what the OS uses to determine which idle states to use and how long to wait before dropping to a deeper idle state.
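
To make the role of these advertised numbers concrete, here is a minimal C++ sketch of how a host might turn the PS 0 to PS 4 rows above into an APST plan. The policy is an assumption for illustration only: it skips any idle state whose combined entry and exit latency exceeds a tolerance, and waits a multiple of that latency before entering a state. The names `PowerState`, `ApstStep`, `plan_apst` and `advertised_saving`, and the tolerance and multiplier values mentioned afterwards, are hypothetical and are not taken from any particular driver.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One row of the drive's advertised power state table.
struct PowerState {
    int index;                       // PS number
    double max_power_w;              // advertised maximum power, in watts
    bool operational;                // true = active state, false = idle state
    std::uint32_t entry_latency_us;  // advertised entry latency, microseconds
    std::uint32_t exit_latency_us;   // advertised exit latency, microseconds
};

// Values transcribed from the NVMe Power States table above.
inline std::vector<PowerState> inland_e18_states() {
    return {
        {0, 8.8,   true,  0,     0},
        {1, 7.1,   true,  0,     0},
        {2, 5.2,   true,  0,     0},
        {3, 0.062, false, 2000,  2000},
        {4, 0.044, false, 25000, 25000},
    };
}

// One entry of a (hypothetical) APST plan: which idle state to enter,
// and how long the drive must have been idle before entering it.
struct ApstStep {
    int target_state;
    std::uint64_t idle_before_entry_ms;
};

// Assumed policy, for illustration only: use every idle state whose
// entry + exit latency fits within max_total_latency_us, and require an
// idle period of idle_multiplier times that latency before entering it.
inline std::vector<ApstStep> plan_apst(const std::vector<PowerState>& states,
                                       std::uint64_t max_total_latency_us,
                                       std::uint64_t idle_multiplier) {
    assert(idle_multiplier > 0);
    std::vector<ApstStep> plan;
    for (const PowerState& ps : states) {
        if (ps.operational) continue;  // APST only targets idle states
        const std::uint64_t total_us =
            std::uint64_t{ps.entry_latency_us} + ps.exit_latency_us;
        if (total_us > max_total_latency_us) continue;  // too slow to wake
        plan.push_back({ps.index, total_us * idle_multiplier / 1000});
    }
    return plan;
}

// Fraction of advertised power saved by moving from one state to another.
inline double advertised_saving(const PowerState& from, const PowerState& to) {
    return 1.0 - to.max_power_w / from.max_power_w;
}
```

With an assumed tolerance of 100 ms and a multiplier of 50, both idle states qualify under this sketch; tightening the tolerance to 10 ms leaves only PS 3. `advertised_saving` applied to PS 3 and PS 4 gives roughly 0.29, which is the "only a 30% reduction" the firmware claims for the deepest state.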

Idle Power Measurement

SATA SSDs are tested with SATA link power management disabled to measure their active idle power draw, and with it enabled for the deeper idle power consumption score and the idle wake-up latency test. Our testbed, like any ordinary desktop system, cannot trigger the deepest DevSleep idle state.

Idle power management for NVMe SSDs is far more complicated than for SATA SSDs. NVMe SSDs can support several different idle power states, and through the Autonomous Power State Transition (APST) feature the operating system can set a drive's policy for when to drop down to a lower power state. There is typically a tradeoff in that lower-power states take longer to enter and wake up from, so the choice about what power states to use may differ for desktops and notebooks, and depending on which NVMe driver is in use. Additionally, there are multiple degrees of PCIe link power savings possible through Active State Power Management (ASPM).

We report three idle power measurements. Active idle is representative of a typical desktop, where none of the advanced PCIe link or NVMe power saving features are enabled and the drive is immediately ready to process new commands. Our Desktop Idle number represents what can usually be expected from a desktop system that is configured to enable SATA link power management, PCIe ASPM and NVMe APST, but where the lowest PCIe L1.2 link power states are not available. The Laptop Idle number represents the maximum power savings possible with all the NVMe and PCIe power management features in use—usually the default for a battery-powered system but rarely achievable on a desktop even after changing BIOS and OS settings. Since we don't have a way to enable SATA DevSleep on any of our testbeds, SATA drives are omitted from the Laptop Idle charts.

[Charts: Idle Power Consumption - No PM; Idle Power Consumption - Desktop; Idle Power Consumption - Laptop]

The active idle power from the E18 drive is well under 1W, a clear improvement over other Gen4 drives and many of the top-performing Gen3 drives (note: all Gen4 drives are operating at Gen3 speeds during this test, because we can't get idle power management working properly on our Gen4 testbeds; on a Gen4 system we expect active idle power to be a bit higher). The desktop idle almost exactly matches what the drive claims, and the lowest laptop idle power is great at just 3mW.

Unfortunately, wake-up times are a bit slow: wake-up from the desktop idle state is already 44ms and wake-up from the laptop idle state is a whopping 371ms, which is enough to cause noticeable delays if this power state is used frequently by the OS.

[Chart: Idle Wake-Up Latency]

118 Comments

  • mode_13h - Sunday, May 16, 2021 - link

    > programs were doing their own thing, till OS's began to clamp down.

    DOS was really PCs' biggest Achilles heel. It wasn't until Windows 2000 that MS finally offered a mainstream OS that really provided all the protections available since the 386 (some, even dating back to the 286).

    Even then, it took them 'till Vista to figure out that ordinary users having admin privileges was a bad idea.

    In the Mac world, Apple was doing even worse. I was shocked to learn that MacOS had *no* memory protection until OS X! Of course, OS X is BSD-derived and a fully-decent OS.
  • FunBunny2 - Monday, May 17, 2021 - link

    " I was shocked to learn that MacOS had *no* memory protection until OS X! "

    IIRC, until Apple went the *nix way, it was just co-operative multi-tasking, which is worth a box of Kleenex.
  • Oxford Guy - Tuesday, May 18, 2021 - link

    Apple had protected memory long before Microsoft did — and before Motorola had made a non-buggy well-functioning MMU to get it working at good speed.

    One of the reasons the Lisa platform was slow was because Apple had to kludge protected memory support.

    The Mac was originally envisioned as a $500 home computer, which was just above toy pricing in those days. It wasn’t designed to be a minicomputer on one’s desk like the Lisa system, which also had a bunch of other data-safety features like ECC and redundant storage of file system data/critical files — for hard disks and floppies.

    The first Mac had a paltry amount of RAM, no hard disk support, no multitasking, no ECC, no protected memory, worse resolution, a poor-quality file system, etc. But, it did have a GUI that was many many years ahead of what MS showed itself to be capable of producing.
  • mode_13h - Tuesday, May 18, 2021 - link

    > Apple had protected memory long before Microsoft did

    I mean in a mainstream product, enabled by default. Through MacOS 8, Apple didn't even enable virtual memory by default!

    > The first Mac

    I'm not talking about the first Mac. I'm talking about the late 90's, when Macs were PowerPC-based and MS had Win 9x & NT 4. Linux was already at 2.x (with SMP-support), BeOS was shipping, and OS/2 was sadly well on its way out.
  • mode_13h - Sunday, May 16, 2021 - link

    > C has been described as the universal assembler.

    It was created as a cross-platform alternative to writing operating systems in assembly language!

    > a C program can be blazingly fast, if the code treats the machine as a Control Program would.

    No, that's just DOS. C came out of the UNIX world, where C programs are necessarily as well-behaved as anything else. The distinction you're thinking of is really DOS vs. real operating systems!

    > I'm among those who spent more time than I wanted, editing with Norton Disk Doctor.

    That's cuz you be on those shady BBS' dog!
  • mode_13h - Sunday, May 16, 2021 - link

    > I think there's been a view inculcated against C++

    C++ is a messy topic, because it's been around for so long. It's a little hard to work out what someone means by it. STL, C++11, and generally modern C++ style have done a lot to alleviate the grievances many had with it. Before the template facility worked well, inheritance was the main abstraction mechanism. That forced more heap allocations, and the use of virtual functions often defeated compilers' ability to perform function inlining.

    It's still the case that C++ tends to hide lots of heap allocations. Where a C programmer would tend to use stack memory for string buffers (simply because it's easiest), the easiest thing in C++ is basically to put it on the heap. Now, an interesting twist is that heap overrun bugs are both easier to find and less susceptible to exploits than on stack. So, what used to be seen as a common inefficiency of C++ code is now regarded as providing reliability and security benefits.

    Another thing I've noticed about C code is that it tends to do a lot of work in-place, whereas C++ does more copying. This makes C++ easier to debug, and compilers can optimize away some of those copies, but it does work to the benefit of C. The reason is simple: if a C programmer wants to copy anything beyond a built-in datatype, they have to explicitly write code to do it. In C++ the compiler generally emits that code for you.

    The last point I'll mention is restricted pointers. C has them (since C99), while C++ left them out. Allegedly, nearly all of the purported performance benefits of Fortran disappear, when compared against C written with restricted pointers. That said, every C++ compiler I've used has a non-standard extension for enabling them.

    > if C++, do things in an excessive object-oriented way

    Before templates came into more common use, and especially before C++11, you would typically see people over-relying on inheritance. Since then, it's a lot more common to see functional-style code. When the two styles are mixed judiciously, the combination can be very powerful.
  • GeoffreyA - Monday, May 17, 2021 - link

    Yes! I was brought up like that, using inheritance, though templates worked as well. Generally, if a class had some undefined procedure, it seemed natural to define it as a pure virtual function (or even a blank body), and let the inherited class define what it did. Passing a function object, using templates, was possible but felt strange. And, as you said, virtual functions came at a cost, because they had to be resolved at run-time.

    Concerning allocation on the heap, oh yes, another concern back then because of its overhead. Arrays on the stack are so fast (and combine those buggers with memcpy or memmove, and one's code just burns). I first started off using string classes, but as I went on, switched to char/wchar_t buffers as much as possible---and that meant you ended up writing a lot of string functions to do x, y, z. And learning about buffer overruns, had to go back and rewrite everything, so buffer sizes were respected. (Unicode brought more hassle too.)

    "whereas C++ does more copying"

    I think it's a tendency in C++ code, too much is returned by value/copy, simply because of ease. One can even be guilty of returning a whole container by value, when the facility is there to pass by reference or pointer. But I think the compiler can optimise a lot of that away. Still, not good practice.
  • mode_13h - Tuesday, May 18, 2021 - link

    > though templates worked as well

    It actually took a while for compilers (particularly MSVC) to be fully-conformant in their template implementations. That's one reason they took longer to catch on -- many programmers had gotten burned in early attempts to use templates.

    > Passing a function object, using templates, was possible but felt strange.

    Templates give you another way to factor out common code, so that you don't have to force otherwise unrelated data types into an inheritance relationship.

    > I think it's a tendency in C++ code, too much is returned by value/copy, simply because of ease.

    Oh yes. It's clean, side effect-free and avoids questions about what happens to any existing container elements.

    > One can even be guilty of returning a whole container by value, when the facility is there
    > to pass by reference or pointer. But I think the compiler can optimise a lot of that away.

    It's called (N)RVO and C++11 took it to a new level, with the introduction of move-constructors.

    > Still, not good practice.

    In a post-C++11 world, it's now preferred. The only time I avoid it is when I need a function to append some additional values to a container. Then, it's most efficient to pass in a reference to the container.
  • GeoffreyA - Wednesday, May 19, 2021 - link

    "many programmers had gotten burned in early attempts to use templates"

    It could be tricky getting them to work with classes and compile. If I remember rightly, the notation became quite unwieldy.

    "C++11 took it to a new level, with the introduction of move-constructors"

    Interesting. I suppose those are the counterparts of copy constructors for an object that's about to sink into oblivion. Likely, just a copying over of the pointers (or of all the variables if the compiler handles it)?
  • mode_13h - Thursday, May 20, 2021 - link

    > > "many programmers had gotten burned in early attempts to use templates"

    > It could be tricky getting them to work with classes and compile.

    I meant that early compiler implementations of C++ templates were riddled with bugs. After people started getting bitten by some of these bugs, I think templates got a bad reputation, for a while.

    Apart from that, it *is* a complex language feature that probably could've been done a bit better. Most people are simply template consumers and maybe write a few simple ones.

    If you really get into it, templates can do some crazy stuff. Looking up SFINAE will quickly take you down the rabbit hole.

    > If I remember rightly, the notation became quite unwieldy.

    I always used a few typedefs, to deal with that. Now, C++ expanded the "using" keyword to serve as a sort of templatable typedef. The repurposed "auto" keyword is another huge help, although some people definitely use it too liberally.
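
Editor's sketch: several of the C++ points raised in this thread can be shown in a few lines. The example below is a minimal illustration, not code from any commenter; the names `Table`, `make_squares`, `append_squares`, `sum_mapped` and `move_transfers_storage` are hypothetical. It covers returning a container by value, appending through a reference, passing a callable through a template instead of a virtual function, an alias template with `using`, and what a move constructor does to a `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Alias template: the "templatable typedef" use of the `using` keyword.
template <typename T>
using Table = std::vector<std::vector<T>>;

// Returning a container by value. The local is eligible for NRVO; if the
// compiler does not elide the copy, the vector is moved, not deep-copied.
inline std::vector<int> make_squares(int n) {
    assert(n >= 0);
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) out.push_back(i * i);
    return out;
}

// Appending to an existing container: here a reference parameter avoids
// building a temporary and keeps the elements already in dst.
inline void append_squares(std::vector<int>& dst, int n) {
    for (int i = 0; i < n; ++i) dst.push_back(i * i);
}

// A template parameter accepts any callable, so unrelated types do not
// need a common base class with a virtual function, and the call can
// be inlined by the compiler.
template <typename Fn>
int sum_mapped(const std::vector<int>& v, Fn fn) {
    int total = 0;
    for (int x : v) total += fn(x);
    return total;
}

// Move construction of a std::vector hands over the heap buffer: the new
// vector owns the same storage and the moved-from vector is left empty.
inline bool move_transfers_storage() {
    std::vector<int> a = make_squares(8);
    const int* storage = a.data();
    std::vector<int> b = std::move(a);
    return b.data() == storage && a.empty() && b.size() == 8;
}
```

For example, `auto squares = make_squares(4);` yields the values 0, 1, 4, 9 without an explicit copy, and `sum_mapped(squares, [](int x) { return x + 1; })` applies a lambda to each element with no inheritance involved.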
