Microsoft Bing Battles Bot Queries by the Billions

In October 2011, Microsoft's Bing search engine handled 2.7 billion search queries, enough to qualify as the number-three search engine in the U.S. But according to new research done by Microsoft and others, that number was dwarfed by the billions of queries that arrived from botnets running on hacked computers across the world.

Looking at Bing query data covering a 16-day period that October, researchers at Microsoft, Wright State University, and the Georgia Institute of Technology counted nearly 3.2 billion queries made by some sort of automated software -- and a vast majority of these automated search requests came from botnets. That's 500 million more than the number of legitimate queries that Bing saw all month. During another 16-day period in May, they counted just over 3 billion auto-queries.

"That's a lot of queries," says Danny Sullivan, the founding editor of Search Engine Land, a site that has long followed the progress of Google, Bing, and other search engines. The research underlines the scope of an ongoing battle between search companies and the scammers who want to exploit their websites to hack into computers or make buckets of money from online referral fees.

What were those botnets doing? According to Junjie Zhang, an associate professor with Wright State University in Dayton, Ohio, a big chunk of them were trying to find websites to hack or email addresses to harvest for spam campaigns.

"The most popular queries are corresponding to vulnerability discovery," he says. About a third of these 6.2 billion automated search queries were looking for search terms such as "powered by PHP register" or "WordPress forum plugin by Fredrik Fahlstad."

Search results for queries like these can pave the way to web attacks by criminals, giving them a list of websites that might be vulnerable to known software bugs. Take the Fredrik Fahlstad query. The results from that query provide a list of websites that could be vulnerable to a widely used type of web attack called SQL injection, in which the bad guys use web-based forms as a kind of back door into the victim's database server.
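For readers who have not seen one, here is a minimal sketch of how a SQL injection works. It is an illustration only: the table, the function names, and the login form are hypothetical and are not taken from the research paper or from any real plugin. The first function pastes form input straight into the SQL text, which is the kind of flaw attackers look for; the second passes the input as bound parameters, so the database treats it as data rather than as SQL.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "s3cret"), ("bob", "hunter2")],
    )
    return conn


def login_vulnerable(conn, name, password):
    """Builds the query by string formatting, so form input can rewrite it."""
    query = (
        f"SELECT name FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchall()


def login_safe(conn, name, password):
    """Uses bound parameters, so form input is never parsed as SQL."""
    query = "SELECT name FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchall()


if __name__ == "__main__":
    conn = build_db()
    # The attacker types this into the password field of the web form.
    injected = "' OR '1'='1"
    print("vulnerable:", login_vulnerable(conn, "alice", injected))
    print("safe:      ", login_safe(conn, "alice", injected))
```

With the injected password, the vulnerable query's WHERE clause ends in `OR '1'='1'`, which is true for every row, so it matches every user in the table without a valid password. The parameterized version matches none.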

The botnets also spend a fair bit of bandwidth -- amounting to about 3.6 percent of the total automated queries, according to Zhang -- looking for movie trailers or online coupons to download.

Interestingly, the researchers discovered that not all of this automated search junk is coming from hacked computers. In fact, a decent-sized chunk of it -- about 300 million queries over the 32 days of the study -- seems to come from computers that have been set up in data centers for the express purpose of querying Bing.
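The figures quoted so far fit together with some simple arithmetic, sketched below. The inputs are the rounded numbers from this article ("nearly 3.2 billion," "just over 3 billion," "about a third"), so the outputs are approximations rather than the paper's exact counts. The variable names are made up for illustration, and the sketch assumes the 3.6 percent figure applies to the combined 6.2 billion total.

```python
# Rounded figures as quoted in the article (approximations, not exact counts).
OCTOBER_AUTOMATED = 3_200_000_000   # 16 days in October
MAY_AUTOMATED = 3_000_000_000       # 16 days in May
OCTOBER_LEGITIMATE = 2_700_000_000  # legitimate queries, all of October
DATA_CENTER_QUERIES = 300_000_000   # over the 32 days studied


def summarize():
    """Derive the article's secondary figures from its headline numbers."""
    total = OCTOBER_AUTOMATED + MAY_AUTOMATED
    return {
        "total_automated": total,
        "october_excess": OCTOBER_AUTOMATED - OCTOBER_LEGITIMATE,
        # "About a third" of automated queries probed for vulnerabilities.
        "vulnerability_queries": total // 3,
        # 3.6 percent looked for movie trailers or coupons (assumed of total).
        "media_queries": total * 36 // 1000,
        "data_center_share": DATA_CENTER_QUERIES / total,
    }


if __name__ == "__main__":
    for label, value in summarize().items():
        print(f"{label}: {value}")
```

On these rounded inputs, the data center clusters account for a little under 5 percent of the automated traffic.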

Zhang isn't sure exactly what these computers -- he calls them data center clusters -- are doing, but in the paper, the researchers speculate that some of it may be coming from cloud computing services. According to the paper: "The existence of malicious activities from data centers may indicate a new trend, where attackers have started exploiting cloud-computing or other well-maintained infrastructures for launching attacks."

It's also possible that these data center clusters are run by semi-legitimate companies that use search results to set up marketing pages. These companies then earn a referral fee -- sometimes just pennies, other times several dollars -- whenever there's a sale. "It seems that they are searching for some information relating to commercial products," Zhang says. "Perhaps they are going to summarize this information and offer a more accessible database."

They may also come from scammers who are trying to game the Bing search engine, says Sullivan. One way of doing this: Get your bot to conduct a search for a popular term like "replica handbags" and then have it click on your own website every time Bing serves up search results. Eventually, Bing might decide that your website deserves higher placement.

Of course, the search companies are aware of this behavior. They've been playing a cat-and-mouse game with scammers since the late 1990s, trying to stop it.

Over at Google, that's the job of the company's Safe Browsing team, a 20-person operation run by Niels Provos that tries to keep dangerous and scamming sites out of its search index. "Google makes a lot of money on the internet," he says. "This is only going to work if the users trust it. So if we make the web as safe for users as possible, ultimately, they'll also be going to Google properties."

The research paper, Intention and Origination: An Inside Look at Large-Scale Bot Queries, is set to be presented later this month at the Network and Distributed System Security Symposium.