Microsoft Director of Search Stefan Weitz explains that the future of machine learning consists of teaching artificial intelligence to identify patterns. This will allow, for instance, a search engine to critically analyze your search queries rather than simply scouring the web’s index of results.
Stefan Weitz: So machine learning. What is machine learning? Machine learning really is teaching machines how to find patterns in large amounts of data. The way it works is you’ve got a black box. Think of this as just this set of algorithms in the center that can turn a mass of unstructured data or a mass of confusing data into something which is less confusing and more structured. So what happens is you basically tell the machine, "I’m going to give you all this input on this side and I’m going to tell you what the input should look like post-processing on this side." So you kind of give it the hint. And what it does is the machine says, "Okay, well, how do I get from point A to point B?" And it builds, in essence, a pattern to say, "Oh, okay, when I see all this data, to get to this structured set of data I have to do all these computations in the middle to move it from unstructured or messy to structured and beautiful." And that can apply not just to data. It can apply to anything. It can apply to faces. It can apply to types of cats. Whatever it might be, you’re basically saying, "Hey machine, this is a cat."
And it says, "Okay, when I see two eyes and a little pink nose and some whiskers (it doesn’t actually say this, but that’s what it’s thinking), then that is a cat." So you teach machines, in essence, to recognize patterns in data, in pictures and whatever it might be. So that’s machine learning basically. You’re in essence helping machines find patterns in massive amounts of data. How does it apply to things like natural language? Well, the beauty of machine learning, the beauty of things called deep neural networks, is that they allow machines, in essence, not to think like humans (that’s too much of a stretch) but certainly to operate in the same way that we operate. The same way that, for example, when you’re a child you might see a ball on the floor. You don’t know what it’s called. You don’t know how it’s constructed or anything else, but over time, as you’re walking around the house, your mom or your dad will say, "Look at that ball," or, "Go get the ball." And so what’s happening is that over time you’re getting reinforced that when you see an object on the floor that is stationary and has a certain circumference and looks a certain way, you begin to understand, ah, that’s a ball, because you’ve heard it over and over again. And machine learning and natural language processing operate much the same way, except instead of having your mom or dad point at the thing and say, "That’s a ball," three or four times, machines now have trillions of observations about the real world, so they can learn these things much, much faster. So for NLP, it’s critical because our ability to interact with search really is predicated on the system’s understanding of what it is we are asking.
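The idea of giving the machine inputs together with the expected outputs, and letting it work out the pattern in between, can be sketched in a few lines of Python. This is only a toy illustration, not anything described in the interview: the feature names (`has_whiskers`, `has_pink_nose`, `is_round`), the four training examples, and the nearest-centroid method are all assumptions chosen to keep the example small.

```python
from collections import defaultdict


def train(examples):
    """Learn one average feature vector (centroid) per label from labeled examples."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: distance(model[label]))


# Hypothetical toy data: (has_whiskers, has_pink_nose, is_round) paired with a label.
# The label is the "hint" that tells the machine what the output should be.
TRAINING_DATA = [
    ((1, 1, 0), "cat"),
    ((1, 0, 0), "cat"),
    ((0, 0, 1), "ball"),
    ((0, 0, 1), "ball"),
]

model = train(TRAINING_DATA)
print(predict(model, (1, 1, 0)))  # whiskers and a pink nose
print(predict(model, (0, 0, 1)))  # a round object
```

With only four examples the "pattern" is trivial, but the shape is the same as in the description above: labeled inputs go in, and the program builds a rule it can apply to inputs it has not been told the answer for.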
Traditionally, again, machines will return results or web pages based on the keywords that we put into the box. But if I were to ask a search engine, "Why is there no jaguar in this room?" today we would get back five and a half million results for that question, none of which make any sense, of course. With natural language, suddenly the search systems understand what a jaguar is: it’s a car; it’s a sports team, I think; it’s an animal; it’s a mammal of some sort. It understands what a room is. It understands the construct of that sentence. When you ask a search engine in the very near future, "Why is there no jaguar in this room?" the response will not be 5.6 million results. It’ll be a question back to you saying, "What do you mean? Are you asking why there’s no Jaguar car in this room? Are you asking why there’s no jaguar animal in this room?" And then you say, "Yes, I meant the second one. Why are there no jaguar animals in this room?" And because, again, the search engine understands what a jaguar is, what an animal is, and where they usually live, it says, "Well, because you’re not in a jungle and jaguars generally don’t live in conference rooms like this." And so NLP becomes very exciting because it’s modeled on the notion that the search system understands the physical world, everything in it, and how all those things relate to each other. And so suddenly now you begin to be able to engage in dialogue just as you and I would do as humans. When we have a question on meaning or intent or context, we engage in clarifying questions. That’s what will happen with NLP and search, and is already happening today in certain cases.
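The clarifying-question exchange described above can be sketched as a small Python program. This is a hedged toy, not how any real search engine works: the `SENSES` table, the function names, and the wording of the replies are all hypothetical, and a real system would learn word senses from data rather than read them from a hand-written dictionary.

```python
# Hypothetical word senses, hand-written for illustration only.
SENSES = {
    "jaguar": ["car", "animal", "sports team"],
}


def ambiguous_terms(query):
    """Return the words in the query that have more than one known sense."""
    words = [w.strip("?.,!\"'").lower() for w in query.split()]
    return [w for w in words if len(SENSES.get(w, [])) > 1]


def respond(query, chosen_sense=None):
    """Ask a clarifying question if the query is ambiguous and no sense was chosen."""
    terms = ambiguous_terms(query)
    if terms and chosen_sense is None:
        term = terms[0]
        options = ", ".join(SENSES[term])
        return f"What do you mean by '{term}'? Options: {options}"
    if terms:
        return f"Interpreting '{terms[0]}' as: {chosen_sense}"
    return "Searching for: " + query


question = "Why is there no jaguar in this room?"
print(respond(question))                         # first turn: the system asks back
print(respond(question, chosen_sense="animal"))  # second turn: the user has clarified
```

The design choice worth noticing is that the ambiguous query does not go straight to a keyword lookup. The program first checks whether it knows more than one meaning for a term, and only proceeds once the user has picked one.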