mooreslawflashThe industry is finally seeing widespread adoption of All Flash Arrays (AFA) now that the cost of flash technology has made these things reasonably affordable for Enterprise customers.  These represent the next technological jump in storage technology that will cause storage professionals to unlearn what we have learned about performance, cost to value calculations, and capacity planning.

The first arrays to market used existing architectures just without any moving parts.

notdesigned

Not much thought was put into their designs, and flash just replaced spinning disk to achieve questionable long term results.  They continue to use inefficient legacy RAID schemes and hot spares. They continue to use legacy processor + drive shelf architectures that limit scalability.  If they introduced deduplication (a natural fit for AFAs) it was an afterthought post process that didn’t always execute under load.

IDC has recently released a report titled All – Flash Array Performance Testing Framework written by Dan Iacono that very clearly outlines the new performance gotchas storage professionals need to watch out for when evaluating AFAs.  It’s easy to think AFAs are a panacea of better performance.  While it’s hard not to achieve results better than spinning media arrays, IDC does a great job outlining the limitations of flash and how to create performance test frameworks that uncover the “horror stories” as IDC puts it in the lab before purchases are made for production.

Defining characteristics of flash based storage

You can’t overwrite a cell of flash memory like you’d overwrite a sector of a magnetic disk drive.  The first time a cell is written, the operation occurs very quickly.  The data is simply stored in the cell, basically as fast as a read operation.  Every subsequent re-write though, first you must erase a block of cells, and then program them again with new data to be stored.  This creates a latency for incoming write IO after the first, and should be accounted for in testing to make sure enough re-writes are occurring to uncover the performance of the device over time.

Flash wears out over time.  Each time a flash cell is erased and re-written it incurs a bit of damage or “wear.”  Flash media is rated by how many of these program erase (PE) cycles can occur before the cell is rendered inoperable.  SLC flash typically is rated at 100,000 PE cycles.  Consumer MLC (cMLC) is rated around 3,000, where enterprise MLC (eMLC) must pass higher quality standards to be rated for 30,000 PE cycles.  Most drives provide a wear-levelling algorithm that causes writes to be spread evenly across the drive to mitigate this.  Workload patterns, though might cause certain cells to be overwritten more than others, however so this is not a panacea in all cases.

Erase before write activity can lock out reads for the same blocks of cells until the write completes.  Different AFA vendors handle data protection in different ways, but in many cases, mixed read/write workload environments will exhibit greatly reduced IOPS and higher latencies than the 100% read hero numbers most vendors espouse.  This is yet another reason to do realistic workload testing to reset your own expectations prior to production usage.

How these flash limitations manifest in AFA solutions.

Performance degrades over time.  Some AFA solutions will have great performance when capacity is lightly consumed, but over time, performance will greatly diminish unless the implementation overcomes the erase-before-write and cell locking issues.  Look for technologies that are designed to scale, with architectures that overcome the cell locking issues inherent to native flash.

Garbage collection routines crush the array.  Under load, some garbage collection routines that are used to clean up cells marked for erasure, etc. if not handled properly can crush array performance.  In IDC’s testing, this lead to wildly fluctuating AFA performance — sometimes good, sometimes horrible.  Not all arrays exhibit this behavior, and only testing will show the good from the bad (because the vendors won’t tell you).

$ per usable GB is surprisingly inflated due to inefficient thin + dedup or best practice requirements to leave unused capacity in the array.  Comparing the cost of the raw installed capacity of each array is the wrong way to measure the true cost of the array.  Make sure you look at the true usable capacity expectations after RAID protection, thin provisioning, deduplication, spare capacity, requirements to leave free space available, or other mysterious system capacity overheads imposed but undisclosed by the vendor.  The metric you’re after is dollar per usable GB.

Check out the IDC report. It’s a great education about AFAs, and provides a fantastic blueprint to use when testing AFA vendors against each other.