The main bottleneck in computers today is not the CPU but the mechanical hard disk drive (HDD). When an application mostly waits on the CPU to get things done, it’s called compute bound. When it mostly waits to get data in or out of storage, it’s called I/O bound. Because CPU performance keeps doubling and HDD performance doesn’t, most systems today are I/O bound. HDDs simply cannot keep CPUs fed with data fast enough. While HDD storage capacities have steadily increased, performance has languished, making the HDD the system choke point. It is one of the last remaining devices in a computer that has moving parts. (Long ago, CPUs moved from vacuum tubes to solid-state transistors.) The future of storage is solid-state disks (SSDs).
Here is an exclusive interview with David Flynn, CTO at Fusion-io, about the company’s ioDrive. It aims to close the performance gap between computing and storage while staying within competitive price points.
Tell us about your product.
The Fusion-io ioDrive is based on a technology we call “ioMemory”, which is composed of NAND Flash combined with our controller, firmware, and software technologies. We’re running a massively parallel NAND Flash array on a single ioDrive, currently on the order of 200 NAND Flash parts. This is similar to a SAN of HDDs. When we say we put the performance of a SAN in the palm of your hand, we literally mean it.
NAND Flash has become interesting only in the last few years. It used to be that DRAM was king; now it’s NAND Flash. NAND Flash has a semiconductor cell design that is much simpler and smaller than a DRAM cell. Manufacturers put it on new silicon wafer fabrication processes about a year and a half sooner than DRAM. The net result is that at any point in time, NAND has about 10x the cell density of DRAM. With two bits per cell, that’s 20x the bit density. NAND Flash has been around a long time, but the commoditization of it is new. Now, more NAND Flash ships each year than the sum total of all DRAM ever shipped. And that is doubling every year. NAND Flash modules can be stacked vertically since they don’t use much power or generate much heat.
So, at the end of the day, you can have 100x the capacity per ioMemory module compared to DRAM. We have 320 GB ioDrives today, will have 640 GB later this year, and will soon have terabyte-sized ioDrives.
So what makes your product unique?
We have looked at this commoditization of NAND Flash, but it’s not exactly a normal HDD or RAM. While it can be used in place of either, it has its own strengths and weaknesses. What we have done is integrate it in a way that accentuates its strengths and minimizes its weaknesses. We maximize its benefits by putting it as close as possible to the CPU, attaching it directly to the arteries of the computer using PCI Express (PCIe). We then move data memory-to-memory directly from NAND Flash to host RAM.
It’s a real handicap to put NAND behind a traditional HDD infrastructure, attaching it over a SATA, SCSI, or Fibre Channel bus. Making NAND look like an HDD from a physical connectivity and protocol perspective isn’t really the goal; as long as software and the OS see it as a disk, that’s all that matters. And, of course, they see the ioDrive that way. This pays off big time.
One metric, the time it takes to fetch a 4K piece of data, exemplifies this advantage: it’s just 50 µs (millionths of a second). An HDD takes on the order of 5-10 ms (thousandths of a second). That makes the ioDrive at least 100x faster. With the ioDrive enabled as virtual memory swap space, one can get a VM page in 50 µs and get 100,000 of them per second, vs. an HDD that takes 5 ms and can only do maybe 120 per second. Extending the apparent amount of DRAM using virtual memory swap space is viable again. For all intents and purposes, a server can now appear to have terabytes of DRAM with just a few ioDrives enabled as swap space. On another metric, bandwidth, we get about 700 MB/s sustained per ioDrive. That’s faster than the fastest RAID controllers out there, which is another reason why we go directly to PCIe: no RAID controller could handle it.
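[Editor: As a rough check on the latency figures quoted above, the sketch below works through the arithmetic. It uses only the numbers from the interview (50 µs per 4K read on the ioDrive, 5 to 10 ms on an HDD). The function names are illustrative, and the model of one read outstanding at a time is an assumption, not a description of how the ioDrive schedules requests.]

```python
# Latency figures quoted in the interview, converted to seconds.
IODRIVE_READ_S = 50e-6     # 50 microseconds per 4K read
HDD_READ_BEST_S = 5e-3     # 5 ms
HDD_READ_WORST_S = 10e-3   # 10 ms


def speedup(slow_s, fast_s):
    """Ratio of two latencies: how many times quicker the fast device answers."""
    return slow_s / fast_s


def serial_reads_per_second(latency_s):
    """Reads per second with exactly one request outstanding (an assumption)."""
    return 1.0 / latency_s


def requests_in_flight(reads_per_second, latency_s):
    """Little's law: average concurrency needed to sustain a rate at a latency."""
    return reads_per_second * latency_s


print(f"speedup: {speedup(HDD_READ_BEST_S, IODRIVE_READ_S):.0f}x to "
      f"{speedup(HDD_READ_WORST_S, IODRIVE_READ_S):.0f}x")
print(f"ioDrive, one read at a time: "
      f"{serial_reads_per_second(IODRIVE_READ_S):,.0f} reads/s")
print(f"HDD at 5 ms, one read at a time: "
      f"{serial_reads_per_second(HDD_READ_BEST_S):,.0f} reads/s")
print(f"reads in flight for 100,000 reads/s: "
      f"{requests_in_flight(100_000, IODRIVE_READ_S):.1f}")
```

[Editor: Under this simple model, a single stream of 50 µs reads yields about 20,000 reads per second, so the quoted 100,000 per second implies roughly five reads in flight at once. Likewise, 120 reads per second on an HDD corresponds to a little over 8 ms per read, which sits inside the quoted 5-10 ms range.]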
What types of applications will benefit from using the ioDrive?
- Content caching as a substitute for DRAM. This is useful in areas such as search and media, because a caching server’s utility is determined by the amount of memory it has. When you don’t have enough DRAM to cache all the content, you have to add more servers. Not because there’s not enough CPU horsepower or bandwidth (one can always go to 10GigE), but because without enough memory the server cannot find things fast enough to hand out. We allow you to use the ioDrive as swap space for the virtual memory to get 10x or 100x more caching ability on the server.
- Transaction processing for databases and data warehousing, which is extremely I/O-bound. Commonly, normal hard drives are RAIDed in a SAN costing millions of dollars in an attempt to aggregate enough performance to keep the server fed with data. We can do the equivalent I/O of a thousand HDDs with a single ioDrive. And we can get the capacity of a TB with just a few ioDrives put together.
- Media editing, such as HD video editing, authoring, streaming, and serving. This also works with CAD/CAM for engineers and artists.
What type of RAM is in the ioDrive? Does it use Flash (non-volatile) or SDRAM (volatile)?
The Fusion-io ioDrive is based on a technology called “ioMemory”, which is made of NAND Flash and a very intelligent controller, hardware, and software. We’re running a parallel NAND Flash array in a single drive, on the order of 200 NAND Flash pieces of silicon. This is not dissimilar to a SAN.
[Editor: It uses both, mainly Flash with some SDRAM for metadata].
When do you think the price of solid-state disks (SSDs) will be less than mechanical disks (HDDs)?
From a raw capacity standpoint with no regard to performance, the cross-over point is expected to be 2014 to 2015. But today, from the performance standpoint, NAND Flash (SSDs) can be cheaper than hard disk drives (HDDs) by a long shot.
- HDDs cost tens of cents per GB, but tens of dollars per IOPS.
- ioDrives cost tens of cents per IOPS, but tens of dollars per GB.
The pain point we address, the gap in performance between the CPU and storage, is doubling every 18 months with Moore’s Law. That means our benefit doubles. At the same time, our cost gets cut in half by Moore’s Law as the capacity of NAND doubles. This means that the cost-benefit to the customer is growing by Moore’s Law squared!
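[Editor: The “Moore’s Law squared” claim can be illustrated with a few lines of arithmetic. The sketch below takes the interview’s premises at face value (benefit doubles and cost halves every 18 months) and is not an independent forecast; the function names are made up for illustration.]

```python
DOUBLING_PERIOD_MONTHS = 18  # period quoted in the interview


def relative_benefit(months):
    """Benefit relative to today, assuming it doubles every period."""
    return 2 ** (months / DOUBLING_PERIOD_MONTHS)


def relative_cost(months):
    """Cost relative to today, assuming it halves every period."""
    return 0.5 ** (months / DOUBLING_PERIOD_MONTHS)


def benefit_per_cost(months):
    """Benefit divided by cost: grows as the square of the doubling factor."""
    return relative_benefit(months) / relative_cost(months)


for months in (0, 18, 36, 54):
    print(months, "months:", benefit_per_cost(months))
```

[Editor: Because the benefit is multiplied by 2 while the cost is divided by 2, the ratio is multiplied by 4 every 18 months, which is the doubling factor squared.]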
A question that keeps coming up is the longevity of the ioDrive. What if the ioDrive were set to read and write continuously? Will errors cause the drive to fail? People are concerned about how long the drives will last.
NAND Flash, being silicon, was initially used in ways that presumed it to be bit-perfect, with only single-bit protection, like server DRAM. That’s back when it was called EEPROM. Today, by using storage-like protection, capable of fixing many bad bits, it can be used for a much longer period of time. This works because it doesn’t just “break” when it gets used too much, but rather bits get stuck here or there. The error correction code is capable of guaranteeing that those stuck bits can be corrected.
Additionally, wear is distributed evenly or “leveled” across the entire part, making sure that no one section gets used too much. And when one section gets too many stuck bits for the error correction to safely guarantee that the original data can be retrieved, it is taken out of service and swapped for a spare section. This results in a storage device that lasts on average much longer than an HDD. I say on average because no one can predict when a normal mechanical HDD will die. And that’s really the beauty of silicon-based storage: you can tell when it is getting fatigued and needs to be taken out of service because it starts running low on spares. If spares run out, it just refuses to allow more data to be written. At no point does the data get lost.
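[Editor: The behavior described above (spread writes evenly, retire a section once it has too many stuck bits, swap in a spare, refuse writes when no spare is left) can be sketched as a toy model. This is not Fusion-io’s firmware: the class `ToyFlash`, its methods, and the `retire_threshold` parameter are all hypothetical, and the model tracks only counters, not data.]

```python
class ToyFlash:
    """Toy model of wear leveling with spare sections (illustrative only)."""

    def __init__(self, active_sections, spare_sections, retire_threshold):
        self.wear = [0] * active_sections        # writes absorbed per in-service section
        self.stuck_bits = [0] * active_sections  # stuck bits seen per in-service section
        self.spares = spare_sections             # fresh sections held in reserve
        # Assumed to sit at or below what the error correction can still fix.
        self.retire_threshold = retire_threshold
        self.read_only = False

    def write(self):
        """Send one write to the least-worn section and return its index."""
        if self.read_only:
            raise RuntimeError("no spares left: device refuses new writes")
        index = min(range(len(self.wear)), key=lambda i: self.wear[i])
        self.wear[index] += 1
        return index

    def report_stuck_bit(self, index):
        """Record a stuck bit; retire the section when the threshold is reached."""
        self.stuck_bits[index] += 1
        if self.stuck_bits[index] >= self.retire_threshold:
            self._retire(index)

    def _retire(self, index):
        """Swap a fresh spare into the slot, or go read-only if none remain."""
        if self.spares == 0:
            self.read_only = True  # existing data stays readable in this model
            return
        self.spares -= 1
        self.wear[index] = 0
        self.stuck_bits[index] = 0


flash = ToyFlash(active_sections=4, spare_sections=1, retire_threshold=2)
print([flash.write() for _ in range(8)])  # writes rotate across all sections
flash.report_stuck_bit(0)
flash.report_stuck_bit(0)                 # section 0 is retired, spare swapped in
print("spares left:", flash.spares, "read-only:", flash.read_only)
```

[Editor: The spare count doubles as a health gauge in this sketch, which mirrors the point made above: a falling number of spares signals fatigue well before writes are refused.]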
[Editor: The ioDrive can last years even with continuous reads/writes, 24/7/365! This is much longer than an HDD could last under such extreme conditions.]
What operating systems will the ioDrive support?
Today we support Linux, and in the very near future we will support Windows XP, Vista, 2003 Server, and 2008 Server. We will also support Mac OS X. It is to be determined, based on customer demand, whether we will support other UNIX flavors such as FreeBSD, Solaris, AIX, etc.
Can the ioDrive be set up in RAID configurations?
At Storage and Networking World we demonstrated 3 ioDrives RAIDed together, and we were getting 2 GB/s of bandwidth (yes, that’s gigabytes!). The machine was able to swap the full 4 GB of DRAM in just 2 seconds. That’s the equivalent of transferring an entire DVD movie in 2 seconds!
ioDrives can be RAIDed in any way normal disks are (RAID0, RAID1, RAID5, etc.). This is best done with the RAID capabilities built into the operating system’s logical volume manager (e.g., Sun ZFS).
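[Editor: The demo figures are consistent with simple striping arithmetic. The sketch below assumes the roughly 700 MB/s per-drive figure quoted earlier in the interview and an ideal RAID0-style stripe in which bandwidths simply add; real arrays lose some throughput to the bus, the CPU, and software overhead. The function names are illustrative.]

```python
def striped_bandwidth_mb_s(per_drive_mb_s, drives):
    """Ideal striped bandwidth: an upper bound where per-drive rates add up."""
    return per_drive_mb_s * drives


def transfer_seconds(size_gb, bandwidth_gb_s):
    """Time to move a given amount of data at a given sustained bandwidth."""
    return size_gb / bandwidth_gb_s


ideal_mb_s = striped_bandwidth_mb_s(700, 3)
print("ideal stripe of 3 drives:", ideal_mb_s, "MB/s")
print("4 GB at 2 GB/s:", transfer_seconds(4, 2), "seconds")
```

[Editor: Three drives at about 700 MB/s give an ideal 2,100 MB/s, in line with the 2 GB/s demonstrated, and 4 GB at 2 GB/s takes 2 seconds.]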
How much does it cost?
Our full list price is currently around $30/GB. The prices will change over time. We have ioDrives in several capacities:
- 80 GB @ $2400
- 160 GB @ $4800
- 320 GB @ $8900
- 640 GB by the end of this year
- 1.2 TB next year
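[Editor: The list prices above can be checked against the “around $30/GB” figure with a short calculation. The dictionary below copies the three priced capacities from the list; the names are illustrative.]

```python
# Capacity in GB mapped to list price in US dollars, as quoted above.
LIST_PRICES_USD = {80: 2400, 160: 4800, 320: 8900}


def price_per_gb(capacity_gb, price_usd):
    """List price divided by capacity."""
    return price_usd / capacity_gb


for capacity_gb, price_usd in LIST_PRICES_USD.items():
    print(capacity_gb, "GB:", round(price_per_gb(capacity_gb, price_usd), 1), "USD/GB")
```

[Editor: The two smaller drives come to exactly $30/GB, and the 320 GB drive to a little under $28/GB.]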
Is there anything you’d like to add?
- We like to go green. Nothing beats getting rid of racks and racks of HDDs that draw enormous amounts of electricity. Replacing a 100,000 IOPS HDD SAN with ioDrives is equivalent to taking 6 SUVs off the road. There is a power reduction of 99.9% when comparing an ioDrive to an HDD SAN. ioDrives burn only 6 watts for 100,000 IOPS, where an HDD SAN burns 133,433 kWh per year.
- We also save a lot of data center space.
- Direct Attached Storage (DAS) is back.
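[Editor: The power comparison in the first point above mixes watts and kWh per year, so the sketch below puts both on the same footing. It assumes the ioDrive draws its quoted 6 watts continuously for a full year and takes the 133,433 kWh per year SAN figure as given; the function names are illustrative.]

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh(watts):
    """Energy used in one year at a constant power draw."""
    return watts * HOURS_PER_YEAR / 1000


def reduction_percent(new_kwh, old_kwh):
    """Percentage reduction in energy when moving from old to new."""
    return 100 * (1 - new_kwh / old_kwh)


iodrive_kwh = annual_kwh(6)
san_kwh = 133_433  # figure quoted in the interview
print("ioDrive:", round(iodrive_kwh, 2), "kWh per year")
print("reduction:", round(reduction_percent(iodrive_kwh, san_kwh), 2), "percent")
```

[Editor: Six watts around the clock is about 53 kWh per year, so the quoted 99.9% reduction holds under these assumptions.]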
For more information visit: