How Baidu Will Win China’s AI Race—and, Maybe, the World’s

In an exclusive interview, COO Qi Lu explains why the Chinese search giant will be smarter than Alexa and drive better than Google.
Bloomberg/Getty Images

A company can have the best technology in the world. It can have the strongest talent. It can have the coolest product ideas. But to train the algorithms that will deliver the intelligence to transform our cities, it needs data. To wit: The company with the most data wins.

That’s why earlier this year, after leaving Microsoft the previous fall, legendary engineer Qi Lu headed to Beijing to become Baidu's chief operating officer. At his former job, he was, among other things, CEO Satya Nadella’s top deputy in helping to lead the company’s AI strategy. Clearly, he saw more opportunity across the Pacific: In China, 731 million people, more than twice the entire population of the United States, are online. Says Lu: “China has the structural advantage.”

On July 26, while Lu was visiting Silicon Valley, we sat down for an exclusive interview. Lu offered up an eye-opening explanation of how Baidu stands to dominate AI in China. And most places in the world, Lu notes, have much more in common with the tiny homes of the Chinese than the sprawling North American McMansions. He believes that could be China’s biggest advantage in rolling out AI to global markets. Sure, America’s tech giants may have the lead in talent—for now—but Lu believes that Baidu has what it will take to conquer the world.

Jessi Hempel: In the time since you’ve arrived at Baidu, there’s been a reorganization. As COO, what’s your role at the company?

I work very, very closely with Robin [Li, Baidu CEO]. We make sure he and I are fully in sync. I run R&D, sales, and marketing, because I want to make sure that our overall strategy is fully, fully in sync. That’s number one. Number two, I feel that we’re now much more clear and focused, in terms of strategy. It’s really two battles. One is strengthening our mobile foundations. The other is leading the AI era.

How do you describe your AI strategy?

We believe the best way to commercialize AI technology is to build ecosystems. Essentially, to enable our partners to better accelerate their pace of innovation, using healthy, stable economic models to build strong, long-term win-wins for our developers and partners. The baseline is Baidu Brain [the term Baidu uses for all of its AI assets]. It’s broader and more extensive than what Microsoft and Google offer today in the United States, because it’s a platform. We have 60 different types of AI services in our suite we call Baidu Brain.

And we’re the first major company to clearly separate the perceptual and the cognitive layer. Perceptive capability and the cognitive are related, but they are quite different. Most of the [other] AI platforms bundle them together.

What is Baidu’s equivalent of Siri or Cortana?

We are focusing on two platforms to bring our customers and partners together. The first platform we call DuerOS. DuerOS is a natural language-based, conversation-based, human computing platform. Very much like Alexa, Google Now, Siri, or Cortana in the United States. The only difference is DuerOS is so far ahead of anybody else. DuerOS in China has accumulated more conversation-based skill sets than anybody else. We have 10 major domains [and] over 100 sub-domains of conversational skills that we developed. We’re also building up an emerging partner ecosystem. So our partners are building more and more skill sets. Amazon, perhaps, has more than Baidu right now, because they have a larger partner ecosystem in the United States. But compared to most companies, in China, we’re clearly leading.

Number two, we are also clear leaders in partners. DuerOS today is in over 100 brands of private home appliances, whether it’s refrigerators, air conditioners, TVs, storytelling machines, or speakers.

How does the US market for voice technology compare to the Chinese market?

The home environment is very different. Because we’re talking about voice interactions. The acoustic environment, the pattern of noises, will be very different. Alexa, Echo, and Cortana are optimized for American homes. In my view, this only works in North America and maybe a portion of Europe. Essentially, the assumption is that you have spacious homes; you have several rooms. In China, that’s not the case at all. For our target, even for the young generation with high incomes, typically they have 60 square meters [645 square feet], sometimes 90 square meters [970 square feet].

We have better opportunities to globalize DuerOS, because guess what? A home in Japan, a home in India, or a home in Brazil, is a lot closer to a home in China than a home in North America.

So, that’s different. What’s similar?

The similarity part is the technology. The core technology is still speech recognition, signal processing, natural language understanding, and the platform. Our platform architecture, in many ways, is very similar to Amazon. In my view, Amazon is doing a very great job. Even though I worked at Microsoft. I’m always gonna be rooting for Microsoft. But honestly, Amazon is leading.

But don’t you think that Amazon’s handicap is on its back end, in that it can’t keep up on the technology side with Google and Microsoft?

I worked on Cortana four and a half years ago. At the time we all were like, “Amazon, yeah, that technology is so far behind.” But one thing I learned is that in this race to AI, it’s actually more about having the right application scenarios and the right ecosystems. Google and Microsoft, technologically, were ahead of Amazon by a wide margin. But look at the AI race today. The Amazon Alexa ecosystem is far ahead of anybody else in the United States. It’s because they got the scenario right. They got the device right. Essentially, Alexa is an AI-first device.

Microsoft and Google made the same mistake. We focused on Cortana on the phone and PC, particularly the phone. The phone, in my view, is going to be, for the foreseeable future, a finger-first, mobile-first device. You need an AI-first device to solidify an emerging base of ecosystems.

It’s become so much clearer, living in China, what AI-first really means. It means you interact with the technology differently from the start. It has to be voice or image recognition, facial recognition, in the first interactions. You can use a screen or touch, but that’s secondary.

At Baidu [headquarters], it’s all face recognition-based. At the vending machine at Baidu, you can buy stuff with voice and a face. And we’re also working on a cafeteria project. Our goal is, when you go to a cafeteria, you walk away with food.

Technically, that’s possible now in a lot of places, but that doesn’t mean people are receptive to it.

It’s not all technology. It’s about the structure of the environment—the culture, the policy regime. This is why AI plus China, to me, is such an interesting opportunity. It’s just different cultures, different policy regimes, and a different environment.

So how about the ethical consequences of the tools that we’re creating? Do people have the same types of conversations at Baidu as they do at Microsoft?

Similar. Protection of privacy is of paramount importance to us. Ultimately, our users trust in our technology. So, this is something we talk quite a bit about. And we are going to continue to seriously invest in capabilities to make sure that you can trust our services, in terms of privacy. For example, we talked about voice interactions. We’re working on technologies that would prevent the unintended activation of smartphones. It’s because we know that people don’t want their conversations to be shipped to the Cloud. I may have very private conversations in my living room. [But sometimes] the speakers think you are trying to wake them up, and then send those bits to the Cloud.

Do you think that Chinese consumers care as much? Do you think that they expect something different, by virtue of the fact that they live under a different political environment?

Our assumption is that people will care about this. Ultimately, we believe people are rational. If there’s a compelling benefit, people will weigh the consequences and then make those choices. I think this is global.

Baidu announced an ambitious self-driving initiative called Apollo this spring, and you’ve announced 50 partnerships so far. Why are you doubling down on autos?

If you want to truly build digital intelligence to be able to acquire knowledge, make decisions, and adapt to the environment, you need to build autonomous systems. In autonomous systems, the car is the first major commercial application that is going to land.

It’s just like the phone ecosystem today. The phone ecosystem is the largest silicon software ecosystem. I believe the same thing will happen for the autonomous system. The car is going to build a larger ecosystem. And the same set of capabilities—hardware, sensors, chip sets, software—will be used to build industry robots, home robots. We want to have hundreds of companies and universities all at work on this, building a very large ecosystem. Then we can build robots, build drones, and build all those autonomous systems. So, to me, autonomy is a key.

You were instrumental in developing Apollo, right?

I am the COO of the company, but I run that business directly. For the last three-plus months, I probably spent about 40 percent of my time on the autonomous driving technology product, talking to customers and talking to partners. Essentially, from where things are today, toward the future of being able to be fully autonomous, the fundamental technological path for the self-driving technology is the speed of iterations.

What does that speed depend on?

Essentially, how much data you can get. Because to be able to drive on the road, you have to drive different kinds of roads in different kinds of conditions—lighting, weather, whether it’s wet, how much physical pressure is on your tires. And with Apollo, we will be able to pull together all the resources, particularly the data resources, in a way that enables everybody to be better off.

We wrote a manifesto of Apollo. Essentially, there are four principles. Each is important. One is open capability. At Baidu, we open up our capability—in code, in services, in data—to all partners. This works particularly well in China, because China is highly, highly fragmented. There’s more than 250 car OEMs [original equipment manufacturers], unlike the United States, which is a heavily concentrated industry. None of the OEMs will have the full capabilities to build out deep R&Ds. With our code base that we released on July 5, [we will make it possible for] one person to assemble a vehicle in three days that can do autonomous driving in limited forms and start on R&Ds.

The second is shared resources. Essentially, with the Apollo design, there are two tiers. In the first, you are able to use the Apollo code and capability, and some data sets, with no strings attached. The second tier enables you to use all the data that Baidu provides (HD maps, the training data), but we ask you to contribute your data. However, there’s a key principle. The more you contribute, the more you should be able to get back.

The third principle is the accelerating pace of innovation. Essentially, because we’re able to put together more data, we are able to achieve more capability in our simulation engines. We enable everybody, collectively, to innovate at a much faster pace.

And the fourth principle is sustained win-win. Baidu is the biggest model. It’s going to focus on delivering high-end services, high-value services, HD maps, [and] security services. We’re competing against nobody. We enable each OEM, whether it’s Bosch, Continental, or Nvidia, to be able to do more.

This is the reason I created a subsidiary in the United States, Apollo US. And, also Apollo Singapore. The Singapore government essentially was like, “Wow, this is…Just come to Singapore. I’m ready to invest.”

What needs to happen to enable fully autonomous vehicles in China?

Technology alone is not going to enable self-driving cars for a long time. I’ll give you just a simple example. Let’s say there’s some kind of a road incident in a city, and the police come, and there are no signs. Say, he or she just handwrites a sign on a piece of paper to say, “Please slow down to less than five miles an hour. Watch carefully as you proceed.” And they hold it up. You need the technology to be able to read handwriting and understand the human language to be able to do that. That’s going to take a long, long time.

To enable full autonomy, you need new rules, new laws. That’s number one. Number two, as part of Apollo, working with all our partners, we actually found out there’s so much more commercialization, much, much earlier than full autonomy. The Audi A8 is a great example. Essentially, the car automatically follows the flow in heavily jammed traffic. And that’s common to Beijing, Shanghai, and the Bay Area. Now you just let the car drive, and you can read something and do something else. And besides following a car, there are so many other scenarios.

When we first met, you were at Microsoft. You left several months before you arrived at Baidu. Why?

I broke my leg in October 2016. I needed two surgeries. Bill, Satya, and I are still super close, so when I go to Seattle usually I go see Satya at his house. I visit with Bill. I promised to be their personal advisor.

It seems 2017 is a bellwether year for AI development in China. What’s significant about this year?

It’s a combination of the readiness of technology and the number of industry verticals AI can commercialize. And at the global scale, I do feel that there’s opportunities for China and the United States to collectively drive the world forward. I’m probably influenced by Bill Gates quite a bit. He always talked about how the world economy right now, for practical purposes, is a single engine economy. The United States has five percent of the world’s population, but produces about 24 percent of economic output and 60 percent of innovation. It’s just not going to be able to sustain the pace of growth, because the world has seven billion people. Maybe three-plus billion people are living a modern life. We have transportation; we eat processed food; we have refrigerators. But then there’s a sharp drop off. The other populations are living in completely different living conditions. Our job is to elevate everybody to living a modern life. How do you do that? By more innovations, better growth. Really, China should become the second innovation engine, and [Gates] genuinely believes a more innovative China, and a more developed China, is a great thing for the world. I believe that, too.

When you began beefing up your AI resources several years ago, you focused on building a lab in Silicon Valley. When American researcher Andrew Ng left Baidu last spring, his replacement to head Baidu’s AI labs was in China. Has AI’s talent in China caught up to the US?

The United States is still overall stronger, no question. But the gap between China and the United States is rapidly closing. There’s no doubt about that. And since I have lived in China for over six months now, honestly, I read more papers, I talk to more AI developers, and you can feel the strength of the talent base.

Baidu will do more and more AI work in China, for sure. But at the same time, we’re continuing to invest in the United States, in the Bay Area and also Seattle. We just opened a Seattle campus, because we acquired a company called Kitt.ai. For the very top echelon of the talent, the United States is still better, and we want to fully leverage that.