Microsoft Munches Own Open Source Dog Food

Photo: James Merithew/Wired.com

Microsoft is on a mission to make nice with open source software. That's no secret.

Open source software -- software that's freely shared with the world at large -- is in many ways a threat to Microsoft. After all, the company makes its money selling proprietary software tools to the masses. But Microsoft realized years ago that open source tools had become so popular and so influential, it couldn't thrive in the modern world unless it embraced the open source community -- at least in part.

The question is how successful this mission will be. In recent months, Microsoft has made some enormous strides in this area, letting developers run the open source Linux operating system atop its Azure cloud service and working to build a new version of the open source Hadoop number-crunching system that runs on its own Windows operating system. But do businesses and developers really want to run Linux on Azure or Hadoop on Windows? Do these two worlds really mix?

Well, we can at least say that Microsoft itself is now running Hadoop -- and that's a big step for the company. Microsoft technical fellow Dave Campbell tells us that a growing number of internal company projects are taking advantage of Hadoop and HDInsight, Microsoft's version of the platform. He declines to name specific projects within Microsoft proper, but 343 Industries -- the Microsoft game studio that develops the Halo franchise -- is now using HDInsight on Azure.

Hadoop is a way of crunching massive amounts of data across a sea of ordinary computer servers. It was bootstrapped at Yahoo and Facebook about six years ago and soon spread across the web, becoming an essential part of running a big online operation.
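
To make that idea concrete, here is a minimal sketch of the map-and-reduce pattern that Hadoop is known for, written in plain Python. It is only an illustration and does not use Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample records are hypothetical, and the "servers" are simulated as ordinary lists in a single process.

```python
from collections import defaultdict


def map_phase(record):
    # Emit one (word, 1) pair for every word in a record.
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    # Group all emitted values by key, as a framework would do between phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Collapse the values for one key into a single total.
    return key, sum(values)


def run_job(chunks):
    # Each chunk stands in for the slice of data held by one server (assumption).
    pairs = []
    for chunk in chunks:
        for record in chunk:
            pairs.extend(map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample_chunks = [["halo map halo"], ["map weapon"]]
    print(run_job(sample_chunks))
```

In a real cluster the map and reduce steps would run in parallel on many machines; the sketch runs them one after another only to show how the work is split and recombined.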

As Hadoop spread, Microsoft built a similar tool known as Dryad, which ran atop another proprietary Microsoft platform called Cosmos. This helped drive Microsoft's Bing search engine, and in 2009, the company released a preview version of a Dryad platform that could be used outside of Microsoft and Cosmos.

But in late 2011, just a few months after the beta release of the tool, Microsoft pulled the plug, saying it was doubling down on Hadoop. The company had already announced that it was helping to port the open source platform Hadoop to Windows, but the move was surprising -- and it was yet another sign that Microsoft was dead serious about embracing open source software.

In October, its Hadoop-based tools were rechristened as HDInsight, and the company released two technical previews: one that ran atop its Windows Azure cloud service and one that you could download and install on your own Windows servers.

Today, Microsoft still uses Cosmos. But as Dave Campbell says, Hadoop is gathering steam inside the company.

Microsoft subsidiary 343 Industries uses HDInsight to glean information about the behavior of Halo players. Jerry Hook, an executive producer of Halo 4, explains that the game designers wanted to update the game each and every week, and they wanted to tailor new content to the preferences of users -- so they turned to Hadoop.

Using HDInsight, the 343 Industries team was able to collect data on how players actually interacted with the game, which modes they preferred, which weapons they used, and which maps they spent time in and which ones they dropped out of quickly. They were then able to use that data to tweak the weekly releases. For example, Hook says they discovered that players actually preferred smaller maps to larger maps that required the use of vehicles.

More dramatically, they learned that players had found a way to corrupt profiles to make parts of the character avatars invisible. Once the team was able to see the pattern, they were able to put measures in place to discourage players from doing this, such as booting them or making their character avatars larger.

Non-technical staff at 343 Industries are able to use HDInsight via an Excel plugin called Microsoft Data Explorer, which lets the marketing team research questions such as how best to encourage new players to start using the multiplayer mode.

So Microsoft has finally learned to eat its own open source dog food. But will others embrace HDInsight in the same way? Campbell says a few outside companies are using it, including the social media monitoring company Klout. Klout has close ties to Microsoft, but this is still progress. Klout isn't using Dryad. It's using a Microsoft tool based on open source software.