Flash Drives Replace Disks at Amazon, Facebook, Dropbox

Rows of customers' server cages at the SV5 Equinix facility in South San Jose. Photo: Peter McCollough

SAN JOSE, CALIFORNIA -- If you drive south from San Jose until the buildings are few and far between, exit the highway, and take a quick left, you'll find a data center occupied by some of the biggest names on the web. Run by a company called Equinix, the facility is a place where the likes of Google, Facebook, and Amazon can plug their machines straight into the big internet service providers.

If you're allowed inside and you walk past the cages of servers and other hardware, you can't see much. In most cages, the lights are off, and even when they're on, there are few ways of knowing what gear belongs to what company. Some companies don't want you to see. Google engineers have been known to wear miner helmets when installing new hardware, determined to keep their specialized gear hidden from the competition.

But if you walk into the right building and down the right aisle, you'll run into a giant Dropbox logo. Clearly, the file-sharing upstart is proud of its data center gear. But at the same time, it doesn't think this hardware is all that different from what the rest of the world is using. And that's about right.

Inside its cage, Dropbox is running servers equipped with solid-state drives, also known as SSDs -- super-fast storage devices that could one day replace traditional hard drives. The company doesn't use SSDs in all its servers, but it's moving in that direction. In other words, Dropbox is like the web as a whole. Such names as Facebook, Amazon, Microsoft, Mozilla, and Wikia are also using solid-state storage in their data centers, and judging from anecdotal evidence, the trend goes even further.

Like a hard drive, an SSD is a device for storing information. But unlike a hard drive, it doesn't have any moving parts. Today's SSDs are built with flash memory -- the same stuff that stores data and applications on your iPhone. These drives have been around for years, but they've been slow to make headway in the real world, in part because they're more expensive than traditional hard drives. A 300GB flash drive sells for around $500, whereas a comparable hard drive is closer to $100. A 3 terabyte hard drive -- which is about ten times larger -- sells for around $350.

But in just the last 12 months, SSDs have turned the corner. They're appearing in high-profile laptops such as Google's Chromebooks and Apple's brand-new MacBook Pros, and in the data center, many companies are realizing that they make economic sense even with their higher price tags.

In 2011, according to Jim Handy, an analyst with research outfit Objective Analysis, businesses purchased an estimated 7.9 million SSDs that connect to servers using the serial-ATA interface -- i.e., the interface that traditional hard drives use. That's a $2.2 billion market, says Handy, and he expects this to grow to 13 million devices and $3.6 billion in 2012.

"I think this is getting pretty common," says Artur Bergman, the founder of Fastly, a San Francisco outfit that uses SSDs exclusively in providing a service that helps other businesses speed their delivery of pages over the net. "Though some people still have a hard time grasping it, these drives save a tremendous amount of money. They look more expensive, but when you need higher performance, you need way less of them."

The Speech

About a year ago, Bergman gave a four-minute speech at a Silicon Valley conference attended almost exclusively by engineers who sit on the cutting edge of web infrastructure. He started by asking if anyone in the audience used SSDs in the data center, and less than 20 percent raised their hands. And when he asked who used only SSDs in their data centers, one person raised his hand -- the head of engineering at Wikia, who had inherited his SSD-happy data center from Bergman, the outfit's previous head of engineering.

Anyone who hadn't raised a hand, Bergman said, was "wasting their life." Yes, wasting their life. "I keep repeating that to every single individual I talk to, and what I get back is: '[SSDs are] too expensive,'" he said. "Actually, they're cheaper." Cost shouldn't be measured by the price tag on an individual SSD, he said, but by how much you spend on drives across the data center in order to juggle the required information with each passing second.

One SSD, he said, can handle about 40,000 reads or writes a second, whereas the average hard drive gives you about 180. And it runs at about one watt as opposed to 15 watts, which means you spend far less on power. "Do the math on how much you can save," he said. In short, you need fewer servers to do the same amount of work. At Wikia, Bergman first installed SSDs on the company's caching servers, used for providing quick access to data that is repeatedly accessed by web surfers. Then, he moved them into the company's database servers, where data is stored more permanently. This provided so much additional speed, Bergman says, that the caching servers were no longer needed.
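Bergman's "do the math" challenge can be sketched in a few lines of Python. This is a rough illustration, not his actual calculation: it uses the per-drive figures quoted in this article (40,000 versus 180 operations a second, one watt versus 15 watts, $500 versus $100), while the 40,000-operations-a-second workload and the `fleet` helper are hypothetical. It also assumes the workload is limited by reads and writes per second rather than by capacity, and it ignores servers, redundancy, and cooling.

```python
import math

# Per-drive figures quoted in the article
SSD = {"iops": 40_000, "watts": 1, "price": 500}   # 300GB flash drive
HDD = {"iops": 180, "watts": 15, "price": 100}     # comparable hard drive


def fleet(target_iops, drive):
    """Drives, purchase cost, and power needed to hit target_iops.

    Hypothetical helper: assumes load spreads evenly across drives
    and that throughput, not capacity, decides how many you buy.
    """
    count = math.ceil(target_iops / drive["iops"])
    return {
        "drives": count,
        "cost": count * drive["price"],
        "watts": count * drive["watts"],
    }


# Hypothetical workload: 40,000 reads or writes a second
TARGET_IOPS = 40_000

ssd_fleet = fleet(TARGET_IOPS, SSD)
hdd_fleet = fleet(TARGET_IOPS, HDD)

print(ssd_fleet)  # {'drives': 1, 'cost': 500, 'watts': 1}
print(hdd_fleet)  # {'drives': 223, 'cost': 22300, 'watts': 3345}
```

Under these assumptions, a single $500 SSD matches the throughput of 223 hard drives that would cost $22,300 and draw more than three kilowatts. A workload limited by capacity rather than speed would tilt the comparison back toward hard drives.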

When he gave the speech, Bergman had been preaching this same message for about two and a half years -- and few listened. But twelve months on, he says, it seems that the web is finally heeding his advice. Companies are constantly emailing him, just to let him know they've embraced SSDs.

Yes, many companies are still holding back, in part because they're waiting for prices to come down even further, in part for other reasons. SSDs are not only more expensive than traditional hard drives, they can accept only so much data before they can't accept any more. In other words, they have a limited lifespan.

But so do hard drives, which are prone to sudden and unexpected death. Bergman doesn't see an SSD's limited life as a big issue. "It's a pretty good failure mode compared to a hard drive, which just takes longer and longer to write data before dying," he says. At Wikia, he says, he replaced the company's first SSDs after two years, and didn't have any write problems before that.

"I don't trust a hard drive after three years," he says. "They don't fail because they run out of write cycles, but they still fail."

A Dropbox in the Bucket

Dropbox is another prime example of a web outfit that has made the leap to solid state drives. When you upload files to Dropbox, it stores them in the proverbial cloud, shuttling them to Amazon's online storage service, S3. But at the Equinix facility in San Jose and at other data centers across the country, the San Francisco-based startup runs its own servers that keep track of where all those files are stored -- and many of these machines are equipped with SSDs.

"They deal with all the file meta-data -- what files are in your Dropbox, who you're sharing files with, what are all the revisions of all your files, stuff like that," says Kevin Modzelewski, a software engineer at the company. "On this meta-data side, we need very high-performance databases."

Some of these machines still use spinning hard disks. When storing larger types of data, Modzelewski says, hard drives still make sense. But when storing smaller chunks of data, the company is using SSDs, and it's looking at using SSDs across the board as it builds new services atop its meta-data operation.

Part of the motivation, Modzelewski says, is that with SSDs, the company's developers can build new services much quicker. Typically, code must be optimized for use on hard disks, but this optimization isn't necessary when you're running on SSDs. "Things are heating up in terms of how fast you have to move as a company. SSDs let you do that. Because they're faster, to get the same user experience, you don't have to hyper-optimize code just to get products written and features built in," he says.

"Hard drives have particular things that they're good at. You have to build your code in such a way that it takes advantage of things they're good at and doesn't use them in ways that they're bad at. SSDs are more all-around-the-board fast. It frees you from having to think about this dimension of how your code works. And this makes the biggest difference at the beginning of a project."

Hard drives are good at providing access to sequentially stored data, for instance, but they're not quite as good at accessing random data. "Typically, you would have to worry about how you put all your data very close together so that it could be accessed very quickly on a hard drive. But with SSDs you don't have to worry about that -- especially at the beginning."

According to Colin Corbett -- who oversaw YouTube's internal network and data centers before taking a similar role at Dropbox -- this is not unusual. "A lot of folks are moving to SSDs in some part of the database tier, while continuing to use hard drives in the rest of their infrastructure." But he does stress that Dropbox still uses hard drives in some servers, and that many outfits are still waiting for prices to come down even further before moving to SSDs entirely.

Facebook Gets Flashy

Like Dropbox, Facebook is using flash storage in its database machines -- but the particulars are a little different. Frank Frankovsky, who oversees hardware design at the social networking giant, confirms that the company is using hardware from Silicon Valley outfit Fusion-io that adds flash to servers using the PCI Express connector, long used to connect other peripherals on servers and other machines.

Facebook provides quick access to oft-used data using a data caching platform called Memcached, but when it pulls additional data from a traditional database, he says, it feels the need to use flash. "When a request gets to the database tier," he says, "we want to be able to serve it up really, really quickly. PCI-based flash gives us really high performance on requests...It's a significant improvement on that overall round-trip time to the user."

Frankovsky says that Facebook is also considering flash as a way of merely booting its servers, so this task doesn't fall on the hard drives. Bergman did something similar at Wikia. "Spinning disks are the highest-failure item in everybody's data center, because they're mechanical. Things that move tend to fail," Frankovsky says. "We want to eliminate some of these failures."

In this case, Facebook may use something akin to a USB stick or use a separate pool of flash drives that connects to all web servers in a rack of machines or row of racks in the data center. "The discussions have really picked up a lot, and that's because it's becoming that much more within reach from a cost per gigabyte perspective," Frankovsky says. "As they become more and more affordable, you'll continue to find new use cases for it. It's really cool technology. It's just a matter of getting the economics right."

Calligo -- a cloud service meant specifically for off-shore companies -- has already set up this sort of collective flash storage pool inside its new data centers in the Channel Islands, using hardware from a Colorado startup called SolidFire. "I can give oodles of performance to hundreds of thousands of virtual machines, without having to worry about the physical limitations of other storage," says Calligo founder and CEO Julian Box.

According to research analyst Jim Handy, the market for this sort of storage device is also on the rise. Separate from the market for serial ATA drives, sales of devices that connect to servers in other ways topped 38,000 units in 2011 and $581 million in revenues. He expects this will grow to 77,000 units and $902 million this year.

Flash in the Heavens

In using a cloud service like Calligo, you can benefit from SSDs without actually installing your own. And Calligo isn't your only option. Amazon -- whose cloud service is now running an estimated one percent of the internet -- is using SSDs to underpin its new DynamoDB online database, and Microsoft is using them in the latest incarnation of Microsoft Azure, unveiled just last week.

Amazon says it's using only SSDs with DynamoDB, but hard drives still underpin its other services. And in similar fashion, Microsoft tells us that it's using both SSDs and spinning disks inside its Azure data centers. In other words, they're doing what a lot of people are doing.

Artur Bergman laments that SSDs aren't more widely used in the proverbial cloud. Amazon's primary storage service, S3, for instance, still uses spinning hard disks, and part of Fastly's business involves helping companies speed their access to S3. The best thing to do, he repeats time and again, is to install your own flash.

Like Facebook and Microsoft and Amazon and Dropbox, he says, you don't have to start big. You can begin with just a few drives. "Start small," he said in his now-iconic speech. "Don't get fancy SSD card for tens of thousands of dollars. You don't need to drive a fucking Formula One car. You're currently on a bicycle."

Additional reporting by Robert McMillan

Correction: Due to a typo, this article originally said that a 300 terabyte hard drive sells for about $350. It has now been corrected to say a 3 terabyte drive sells for around $350.