Performance Consistency

Over the years we've seen SSD makers shift focus from sequential performance, to random performance and more recently towards IO consistency. Prolonged random IO can create a ton of fragmentation both in the NAND and, if applicable, in an SSD's indirection table. A drive without a powerful enough controller, a smart enough algorithm and/or enough spare area will see tremendous variance between even adjacent IOs. Inconsistent IO latency can result in inconsistent application performance, which for a web-facing server can manifest as inconsistent page load times for your users. Even more visible are situations where your server is delivering visual content to your users. Inconsistent drive latency here can appear, in the worst case, as stutters in playback or even loss of quality (if your application dynamically adjusts streaming quality based on server performance). If you're running a large array of SSDs, the total performance of the array is bound by the slowest drive - in the case of an array of drives with highly variable performance, that minimum level of performance can actually end up being far slower than you're expecting.
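
The slowest-drive bound can be illustrated with a toy calculation (the IOPS figures below are hypothetical, purely for illustration):

```python
# Toy illustration (hypothetical numbers): in a striped array, every drive must
# complete its share of each stripe, so the array waits on the slowest member.
per_drive_iops = [35000, 34000, 33500, 4000]  # one drive hit a long GC pause

# Naive expectation: sum of the individual throughputs.
naive = sum(per_drive_iops)

# Stripe-bound estimate: each drive is limited to the slowest member's rate.
bound = min(per_drive_iops) * len(per_drive_iops)

print(naive, bound)  # 106500 vs 16000
```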

Performance consistency matters, but it requires a somewhat different methodology to analyze. We focus on random write latency as that's really the worst case scenario for IO variance. Reads are fairly consistent, as are sequential writes. It's random writes that cause problems.

To generate the data below I took a freshly secure erased SSD and filled it with sequential data. This ensures that all user accessible LBAs have data associated with them. Next I kicked off a 4KB random write workload at a queue depth of 32 using incompressible data. I ran the test for just over half an hour, nowhere near as long as we run our steady state tests, but long enough to give me a good look at drive behavior once all spare area had filled up.
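
A workload along these lines can be approximated with a fio job file. This is a sketch, not the actual tool used for the article; the device path, runtime, and log name are placeholders (fio's buffers are incompressible by default):

```ini
; sketch of the workload described above -- /dev/sdX is a placeholder
[global]
filename=/dev/sdX
direct=1
ioengine=libaio
iodepth=32

; pass 1: sequentially fill all user-accessible LBAs after a secure erase
[precondition]
rw=write
bs=128k

; pass 2: 4KB random writes at QD32
[rand-write-4k]
; stonewall makes this job wait for the fill pass to finish
stonewall
rw=randwrite
bs=4k
time_based=1
runtime=2000
; average the IOPS log over 1-second intervals
log_avg_msec=1000
write_iops_log=rand-write-4k
```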

I recorded instantaneous IOPS every second for the duration of the test. I then plotted IOPS vs. time and generated the scatter plots below. Each set of graphs features the same scale. The first two sets use a log scale for easy comparison, while the last set of graphs uses a linear scale that tops out at 40K IOPS for better visualization of differences between drives.
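
The logging and plotting step can be reproduced with a short script. This is a sketch, not the article's actual tooling; it assumes a simple two-column log format (elapsed seconds, IOPS for that second):

```python
def load_iops_log(path):
    """Parse an assumed two-column log: elapsed seconds, IOPS for that second."""
    times, iops = [], []
    with open(path) as f:
        for line in f:
            t, v = line.split()[:2]
            times.append(float(t))
            iops.append(float(v))
    return times, iops

def scatter_iops(times, iops, out_png, log_scale=True):
    """Scatter IOPS vs. time: log scale for cross-drive comparison,
    or a linear scale capped at 40K IOPS, as in the article's third set."""
    import matplotlib.pyplot as plt  # imported lazily; plotting is optional
    fig, ax = plt.subplots()
    ax.scatter(times, iops, s=4)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("IOPS")
    if log_scale:
        ax.set_yscale("log")
    else:
        ax.set_ylim(0, 40000)
    fig.savefig(out_png)
```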

The first set of graphs shows the performance data over the entire 2000 second test period. In these charts you'll notice an early period of very high performance followed by a sharp dropoff. What you're seeing in that case is the drive allocating new blocks from its spare area, then eventually using up all free blocks and having to perform a read-modify-write for all subsequent writes (write amplification goes up, performance goes down).
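
The read-modify-write penalty is commonly expressed as write amplification, the ratio of data actually written to NAND to data the host asked to write (the byte counts below are hypothetical, just to show the shape of the effect):

```python
def write_amplification(host_bytes, nand_bytes):
    """WA = bytes physically written to NAND / bytes the host requested."""
    return nand_bytes / host_bytes

# Fresh drive with free blocks: a 4KB host write lands directly in a free page.
fresh = write_amplification(4096, 4096)

# Steady state: the same 4KB host write can force the controller to relocate
# still-valid data from the block being rewritten (hypothetical 16KB moved).
steady = write_amplification(4096, 4096 + 16384)

print(fresh, steady)  # 1.0 vs 5.0
```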

The second set of graphs zooms in to the beginning of steady state operation for the drive (t=1400s). The third set also looks at the beginning of steady state operation but on a linear performance scale. Click the buttons below each graph to switch source data.

At a high level, the P400m actually looks really good. Thanks to tons of spare area on the drive by default, performance until the 600 second marker is incredibly consistent. After that point things get a bit more erratic but compared to the P400e this drive is in a completely different league of performance consistency. Intel's S3700 is a bit better here, but the P400m is a clear step in the right direction.

The next set of charts looks at the steady state (for most drives) portion of the curve. Here we'll get some better visibility into how everyone will perform over the long run.

Looking closer we see a clear improvement in consistency over the P400e. The P400m is truly an upgrade for customers who have write heavy workloads (e.g. high traffic database servers).

The final set of graphs abandons the log scale entirely and just looks at a linear scale that tops out at 40K IOPS. We're also only looking at steady state (or close to it) performance here:

It's interesting that consistency seems to improve over time after falling behind a bit. We really hit the worst inconsistency early on here, and even then it's nowhere near as bad as on the P400e. Intel's S3700 maintains a tighter distribution of IOPS, however.
