Enmotus MiDrive: Rethinking SLC Caching For QLC SSDs
For consumer storage, CES 2020 brought a new wave of competition for PCIe 4.0 SSDs and the promise of faster portable SSDs, but the most intriguing product demo was from Enmotus. They are planning a profound change to how consumer SSDs work, ditching drive-managed SLC caching in favor of host-managed tiered storage.
Enmotus is a well-established provider of storage management software. Their most familiar product to consumers is probably FuzeDrive, a limited edition of which is bundled with recent generations of AMD motherboards as AMD StoreMI. This serves as AMD's answer to Intel's Smart Response Technology (SRT) and Optane Memory storage caching systems. Enmotus also has enterprise-oriented products in the same vein. Their new MiDrive technology builds on their existing tiering software to manage a combination of SLC and QLC NAND on a single consumer SSD.
Caching and Tiering Challenges
All software-driven caching or tiering solutions tend to have limited consumer appeal due to the complexity of setting up the system. At least two physical drives are required, and the OS needs to load an extra driver to manage data placement. Any compatibility issue or other glitch can easily render a PC unbootable, and data recovery isn't as straightforward as for a single drive. These hurdles don't scare off enthusiasts and power users, but PC OEMs aren't eager to market and support these configurations.
Without some form of caching or tiering, however, consumer SSDs would be limited to the raw performance of TLC or QLC NAND. SLC caching managed transparently by the SSD's firmware has been adopted by almost all consumer SSDs in order to improve burst performance, and it has proven to be very effective for consumer workloads. The fundamental limitation of this strategy is that the SSD must work with limited information about the nature and purpose of the user data it is reading and writing.
Most SSDs rely on fairly simple procedures for managing their SLC caches: sending all writes to the cache unless it's full, and using idle time to fold data from SLC into a more compact TLC or QLC representation, freeing up cache space for future bursts of writes. There are still some choices to be made in implementing SLC caching for consumer SSDs: whether to use a fixed-size or dynamically sized cache, and whether to stall when the cache fills up or divert writes straight to TLC/QLC. As QLC drives become more common, we're also seeing drives that prefer to keep data in the SLC cache long-term until the drive starts to fill up, so that the cache can help with read performance in addition to write performance.
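The basic policy described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a fixed-size cache, writes that overflow the cache diverted straight to TLC/QLC rather than stalled, and idle time that always folds the whole cache. The class and method names (`SlcCacheModel`, `write`, `idle_fold`) are hypothetical and do not correspond to any real SSD firmware.

```python
class SlcCacheModel:
    """Toy model of a fixed-size, drive-managed SLC write cache (illustrative only)."""

    def __init__(self, cache_gb: float):
        self.cache_gb = cache_gb   # total SLC cache capacity
        self.used_gb = 0.0         # data currently held in SLC
        self.direct_gb = 0.0       # data that bypassed the cache and went to TLC/QLC

    def write(self, size_gb: float) -> None:
        """Send a write to SLC while space remains; divert the overflow to TLC/QLC."""
        free_gb = self.cache_gb - self.used_gb
        to_slc = min(size_gb, free_gb)
        self.used_gb += to_slc
        self.direct_gb += size_gb - to_slc

    def idle_fold(self) -> float:
        """Fold everything in SLC into TLC/QLC during idle time; return the amount moved."""
        folded = self.used_gb
        self.used_gb = 0.0
        return folded


cache = SlcCacheModel(cache_gb=32)
cache.write(40)                    # burst larger than the cache
print(cache.used_gb, cache.direct_gb)
print(cache.idle_fold())           # data rewritten a second time during idle
```

Note that every gigabyte returned by `idle_fold` is data that gets written to flash twice, which is the extra data movement a host-managed tier tries to avoid.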
Enmotus FuzeDrive manual data placement controls
Host-managed caching or tiering opens the door to more intelligent management of data placement, since the host OS has better information about which chunks of data belong to what file, and about the processes and users that interact with those files. It is easier for the host OS to accurately track the history of access patterns for hot vs. cold files. It is also possible to expose manual control of data placement directly to the user.
Two Drives In One
The Enmotus MiDrive technology allows one SSD to present the host with access to two separate pools of flash storage: QLC and SLC managed by the same SSD controller. To implement this, they have partnered with Phison to modify SSD controller firmware. For server products, a single NVMe SSD would expose two separate NVMe namespaces that Linux treats as different block devices. But for consumers, Enmotus has chosen to maximize backwards compatibility by having the MiDrive present itself as a single block device, with the first 32 or 64 GB initially mapped to SLC NAND and the rest of the drive mapped to QLC NAND. This makes it possible (and fast!) to install an OS to a MiDrive without needing any special Enmotus software or drivers. Once the Enmotus driver has been loaded, it takes over the management of data placement using vendor-specific commands that instruct the SSD to promote or demote ranges of Logical Block Addresses (LBAs) between the QLC and SLC pools of flash. (The initialization process for this tiering currently takes about a quarter of a second, because very little data needs to be moved until there's history indicating what should be in QLC vs SLC.)
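As a rough illustration of the initial layout described above, the following sketch classifies an LBA by where it falls on the drive before the Enmotus driver has changed anything. The 512-byte sector size, the choice of the 32 GB option interpreted as 32 GiB, and the function name `initial_tier` are all assumptions made for the example, not details confirmed by Enmotus.

```python
SECTOR_SIZE = 512            # assumed logical sector size in bytes
SLC_BYTES = 32 * 2**30       # assumed size of the initial SLC region (32 GiB)


def initial_tier(lba: int, slc_bytes: int = SLC_BYTES,
                 sector_size: int = SECTOR_SIZE) -> str:
    """Return the tier an LBA maps to before any promotion or demotion.

    The first `slc_bytes` of the logical address space start out on SLC;
    everything after that starts out on QLC.
    """
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    return "SLC" if lba * sector_size < slc_bytes else "QLC"


boundary_lba = SLC_BYTES // SECTOR_SIZE
print(initial_tier(0))                 # start of the drive
print(initial_tier(boundary_lba - 1))  # last sector of the SLC region
print(initial_tier(boundary_lba))      # first sector of the QLC region
```

Because an OS installer tends to write near the start of the disk first, this layout is what lets an installation benefit from SLC without any special driver.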
Enmotus MiDrive 800GB appearing as a single device
This is a lot simpler for the host side than the strategy Intel uses for their Optane Memory H10, which is two separate PCIe devices on one M.2 card and requires special motherboard support to properly detect both halves before the caching software can even get involved. Enmotus is working to make MiDrive even simpler by having Microsoft distribute the Enmotus driver with Windows, so that a MiDrive will be automatically detected and managed by the Enmotus software without requiring any user intervention. For now, Windows will default to using its standard NVMe driver for a MiDrive, but that should change by the time products hit the shelves.
Example of how MiDrive LBA allocation will change with use
(for illustration purposes only, not based on real testing)
Enmotus supports assigning data to SLC or QLC in 4MB chunks, which is probably the size of a single NAND flash erase block in SLC mode, and thus the smallest chunk size that can easily be remapped between the QLC and SLC portions of the drive without contributing to unnecessary write amplification. That 4MB block size means that a small file moved to SLC is likely to bring along other nearby files, which will often contain related data that may also benefit from being in SLC. It also means that large files can be partially resident in SLC and partially in QLC. Since this process doesn't change the logical block addresses a file occupies, Enmotus MiDrive doesn't need to change anything about how NTFS organizes data, and it doesn't need to behave like an advanced disk defragmenter that tries to move important data toward the beginning of the disk. The MiDrive software only needs to look up what LBAs are used by a file and tell the SSD whether to move that data to SLC or QLC blocks. The only side-effect visible to the rest of the OS is a change in the performance characteristics for accessing that part of the SSD.
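To make the chunk granularity concrete, here is a minimal sketch of how a host-side tool might work out which 4MB chunks a file extent touches. It assumes 512-byte sectors and chunks aligned to multiples of 4 MiB from LBA 0; the function name `chunks_for_extent` is hypothetical, and the actual Enmotus driver logic and vendor-specific commands are not public.

```python
SECTOR_SIZE = 512                 # assumed logical sector size in bytes
CHUNK_BYTES = 4 * 2**20           # 4MB tiering granularity, taken as 4 MiB here
SECTORS_PER_CHUNK = CHUNK_BYTES // SECTOR_SIZE


def chunks_for_extent(start_lba: int, sector_count: int) -> range:
    """Return the indices of the tiering chunks that an LBA extent overlaps."""
    if start_lba < 0 or sector_count <= 0:
        raise ValueError("extent must have a non-negative start and positive length")
    first = start_lba // SECTORS_PER_CHUNK
    last = (start_lba + sector_count - 1) // SECTORS_PER_CHUNK
    return range(first, last + 1)


# A 4 KiB file (8 sectors) still occupies part of a whole 4MB chunk,
# so promoting it brings its neighbours in that chunk along.
print(list(chunks_for_extent(start_lba=0, sector_count=8)))

# A small extent straddling a chunk boundary touches two chunks.
print(list(chunks_for_extent(start_lba=8190, sector_count=4)))
```

A large file simply maps to a longer run of chunk indices, each of which can be placed on SLC or QLC independently.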
The SLC portion of an Enmotus MiDrive differs from a traditional SLC cache not only by being host-managed, but also in how the SSD treats it for wear leveling purposes. A typical SSD's SLC cache may have a static or dynamic size, but in either case when new write commands arrive the SSD will write the data to whatever NAND flash block is currently empty. When the cache is flushed, data from several SLC blocks will be rewritten in TLC or QLC mode to a different empty block, and the SLC blocks are then free to be erased and put back into the pool of available blocks. Managing just one pool of empty blocks means that the actual physical location of the SLC cache can move around over time, and a block that was last used as TLC might end up being used as SLC the next time data is written to it.
By contrast, Enmotus MiDrive technology has the SSD track two entirely separate pools. The SLC portion is permanently allocated when the drive is manufactured. Any physical NAND pages and blocks that are used as SLC will always be treated as SLC for the lifetime of the drive, and the same goes for the QLC portion. The two pools of flash are subject to completely independent wear leveling, even though SLC and QLC portions will exist side by side on each physical flash chip on the drive. This means that the QLC blocks will never be subjected to the short-term Program/Erase cycles of SLC cache filling and flushing. For the SLC blocks, the error correction can be tuned specifically to SLC usage, and that allows Enmotus to achieve around 30k Program/Erase cycles for the SLC portion of the drive (based on Micron QLC NAND). MiDrives will expose separate SMART indicators for the SLC and QLC portions of the drive, so monitoring software will need to be updated to properly interpret this information.
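For a sense of scale, the quoted figure of roughly 30k Program/Erase cycles can be turned into a back-of-the-envelope write budget for the SLC pool. This is only an upper-bound estimate that assumes perfectly even wear leveling and no write amplification; the function name `slc_write_budget_tb` is made up for this example and the result is not an Enmotus endurance rating.

```python
def slc_write_budget_tb(slc_gb: int, pe_cycles: int = 30_000) -> float:
    """Idealized total writes (in TB) the SLC pool could absorb.

    Assumes every block reaches `pe_cycles` and ignores write amplification,
    so real-world figures would be lower.
    """
    return slc_gb * pe_cycles / 1000  # GB to TB, decimal units


for size_gb in (32, 64):
    print(size_gb, "GB SLC ->", slc_write_budget_tb(size_gb), "TB")
```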
In principle, it would be possible for either the SLC or QLC portion of the drive to be worn out prematurely, but in practice Enmotus is confident that their tiered storage management software will lead to longer overall drive lifespans than drive-managed SLC caching. Files that are known to be frequently modified will permanently reside on SLC and not be automatically flushed out to QLC during idle time. If the Enmotus software is smart enough, it will also be able to determine which files should skip the SLC and go straight to QLC until it becomes clear that a file is frequently accessed. For example, a file download coming into the machine over gigabit Ethernet will not initially need SLC performance because raw QLC can generally handle sequential writes at that speed (especially with no background SLC cache flushing to slow things down). And if that file is a movie which is infrequently accessed and only read sequentially, there's no reason for it to ever be promoted up to SLC. In general, the tiered storage management done by Enmotus should result in less data movement between SLC and QLC, and therefore less write amplification than traditional SLC caching causes.
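The kind of placement decision described above might be sketched as a simple rule set. Everything here is hypothetical: the `FileStats` fields, the thresholds, and the `choose_tier` function are invented to illustrate the reasoning in the text, and the real Enmotus heuristics are not disclosed.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file access history gathered by the host."""
    rewrites_per_day: float   # how often the file's contents are modified
    reads_per_day: float      # how often the file is read
    sequential_only: bool     # True if the file is only ever accessed sequentially


def choose_tier(stats: FileStats, hot_rewrites: float = 1.0,
                hot_reads: float = 10.0) -> str:
    """Pick a tier for a file using illustrative, made-up thresholds."""
    # Frequently modified files stay on SLC so QLC is spared the rewrites.
    if stats.rewrites_per_day >= hot_rewrites:
        return "SLC"
    # Purely sequential, rarely modified data (e.g. a movie) is fine on QLC.
    if stats.sequential_only:
        return "QLC"
    # Otherwise promote only if the file is read often enough to matter.
    return "SLC" if stats.reads_per_day >= hot_reads else "QLC"


movie = FileStats(rewrites_per_day=0.0, reads_per_day=0.1, sequential_only=True)
database = FileStats(rewrites_per_day=50.0, reads_per_day=200.0, sequential_only=False)
print(choose_tier(movie), choose_tier(database))
```

A newly downloaded file would start with empty history and so land on QLC under these rules, matching the gigabit Ethernet example: 1 Gb/s is at most 125 MB/s, a rate the text says raw QLC can generally sustain for sequential writes.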
Since the SLC portion of an Enmotus MiDrive is a slice carved out of regular QLC NAND, it cannot offer all the benefits of specialized low-latency SLC NAND like Samsung's Z-NAND or Kioxia/Toshiba XL-Flash. The SLC portion of a MiDrive won't be appreciably faster than the SLC cache of a traditional consumer SSD, but that performance will be more consistent and predictable when working with files that are kept entirely on the SLC portion of the drive.
The Business Model
Enmotus MiDrive is currently implemented as a combination of Windows driver software and custom SSD firmware for Phison NVMe controllers, but it does not require any custom hardware. This means that any vendor currently selling Phison E12 NVMe SSDs can make a MiDrive-based product by licensing and shipping Enmotus firmware. PC OEMs can adopt MiDrives by switching to drives with Enmotus firmware and ensuring that they either include the Enmotus drivers in their Windows images or rely on them being distributed through Windows Update. No motherboard firmware or hardware modifications are required, nor are any changes to the process of provisioning a machine and preparing it for delivery to the end user. Enmotus is engaging both with PC OEMs and vendors of retail SSDs, so we can expect pre-built systems with Enmotus MiDrive technology and upgrade options usable on any Windows 10 PC that already supports standard M.2 NVMe SSDs. Enmotus is optimistic about uptake from PC OEMs, expecting MiDrive to get a much better reception than Intel's Optane H10 did.
The basic MiDrive products will be fully automatic, with the Enmotus driver pre-installed or installed automatically when a MiDrive is detected. Data placement decisions will be completely behind-the-scenes. For enthusiasts, there will also be a premium tier similar to their current FuzeDrive software, which includes Windows Explorer shell integration so that individual files can be manually promoted or demoted, either permanently or for a limited period of time. Enmotus will also be providing a drive health monitoring tool that will include their estimate for how much extra drive lifetime has been won by using their tiering instead of ordinary SLC caching.
Mockup of Enmotus MiDrive SSD health monitoring tool
Enmotus expects SSDs with MiDrive technology to mostly use either 32GB or 64GB SLC portions and offer total capacities from about 400GB up to around 2TB, but the exact configurations will be determined by what their partners want to bring to market. Enmotus is also planning enthusiast-oriented solutions supporting RAID-0 style striping across multiple physical drives, and solutions for single-package BGA SSDs that go into small form factor and embedded devices.
Enmotus MiDrive technology will add to the price of SSDs, but since we're talking about QLC storage, that increase is only relative to the cheapest NVMe SSDs available, and the final sticker prices will still be competitive for consumer SSDs. In return for that, users should get better real-world performance and enough effective write endurance to justify a 5-year warranty. We're looking forward to testing out this technology later this year, even though it will further complicate our benchmarking process. Enmotus is already sampling to interested OEMs.