In Terminator 2, the young John Connor asks Arnold's Terminator whether it can learn, and it replies, "My CPU is a neural net processor, a learning computer." Every neural network chip ever made by mankind was programmed with my software, BrainMaker. So, I claim to have a valid opinion on artificial intelligence.
Imagine a suburban street with picket fences and mailboxes at the end of every driveway. You can ask, "What's in the mailbox for 219 Elm Street?" You can open the box and see. This is how the memory in your computer works: you put in an address or a filename, and the memory at that address tells you what it's holding. But we can imagine a different sort of memory. If your wife asks you, "Who did we bump into at the restaurant last week?" you find your memory of your last few restaurant events, compare them to what you think your wife is looking for, and respond, "Oh, that was George and Martha." Back to your suburban street, imagine you can say to the street, "Who has a letter from Berkshire Hathaway?" and one of the mailboxes waves its flag to show that it's holding such a letter. We call this content addressable memory. Instead of naming a mailbox and seeing what it's holding, we name some contents and the appropriate mailbox responds. This is how human memory works - your wife asks you about the restaurant last week, you retrieve your memory of the event, and then fill in the missing detail for her. Your computer has a poor, rudimentary version of this: you can tell it you're looking for one of your files containing some phrase, perhaps "restaurant last week," and your computer will spend the next couple of minutes searching all your files to see if one contains that exact phrase.
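The contrast between the two kinds of memory can be sketched in a few lines of code. This is a toy illustration only: the street, the addresses, and the mail are all invented, and real content addressable memory is done in hardware or with indexes, not a linear scan.

```python
# Toy contrast: address-based lookup vs. content-addressable lookup.
# All addresses and mail contents below are invented for illustration.

mailboxes = {
    "219 Elm Street": ["letter from Berkshire Hathaway", "utility bill"],
    "221 Elm Street": ["postcard from Paris"],
    "223 Elm Street": ["tax notice"],
}

def lookup_by_address(address):
    """Ordinary memory: name the address, see what it holds."""
    return mailboxes.get(address, [])

def lookup_by_content(phrase):
    """Content-addressable memory: name some contents, and the
    matching mailbox 'waves its flag' (returns its address)."""
    return [addr for addr, mail in mailboxes.items()
            if any(phrase in item for item in mail)]

print(lookup_by_address("219 Elm Street"))
print(lookup_by_content("Berkshire Hathaway"))  # -> ['219 Elm Street']
```

The file-search example at the end of the paragraph is exactly `lookup_by_content` done slowly: the computer has no flag-waving hardware, so it opens every mailbox in turn.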
Google owns a lot of computers - so many that when they had their initial public offering of stock, the SEC asked them to disclose the number of computers they owned, and a few weeks later Google told the SEC, "Sorry, we tried really hard, and we just don't know how to count them." A good guess would be that Google has a few hundred thousand computers. These computers are storing multiple copies of the Internet, plus an enormous number of books, images, sounds, and maps. Google is not just passively collecting information from other sources: in one example, they have several hundred cars driving all over the world photographing streets everywhere. Google has produced software for their servers that turns them into content addressable memory - they can put something out on the net and ask, "Who knows what this is?" Some server will respond, "I've got that." Google is using their content addressable memory to solve problems that previously could not be solved by normal programming methods on normal computers.
Several decades ago people started working on translation software. A famous story in computer science is that the NSA produced one of the first such programs, and for a demonstration a line was typed in English and automatically translated to Russian. A Russian speaker was nearby and was asked what the translation said. He stumbled a bit, saying the language was archaic and the grammar unusual, but eventually he said the printout said "The ghost is ready, but the steak is shaky." The original English had been "The spirit is willing but the flesh is weak." Google has been doing on-line translations for several years now. They do it by scanning in hundreds of thousands of books in dozens of languages and comparing them for examples of translations. When you type a phrase into translate.google.com, they put the phrase over their language network and ask "Who recognizes this?" Some computer will come back with "I've got it, that's Portuguese." Then they put the phrase over their translate network saying "This is Portuguese, who can translate this to Hindi?" The computers each look at their stored Portuguese for similar phrases, and see if any of them has something close along with the same document available in Hindi, or, failing that, whether they can translate it to a central language like English or French. Then perhaps some other computer can take it from English to Hindi. All you know is a half second later you have some Hindi, and generally the translation is better than anything the NSA ever managed to accomplish with their software.
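The routing described above - translate directly if a phrase pair exists, otherwise hop through a central language - can be sketched as a lookup with a pivot fallback. The tiny phrase tables and language codes here are invented examples, not Google's actual data or API.

```python
# Illustrative pivot translation: try a direct (source, target) phrase
# table first; failing that, route through a pivot language like English.
# The phrase tables below are invented for illustration.

direct = {
    ("pt", "en"): {"obrigado": "thank you"},
    ("en", "hi"): {"thank you": "dhanyavaad"},
}

def translate(phrase, src, dst, pivot="en"):
    # Direct translation, if some server holds this language pair.
    table = direct.get((src, dst), {})
    if phrase in table:
        return table[phrase]
    # Otherwise hop through the pivot language in two steps.
    via = direct.get((src, pivot), {}).get(phrase)
    if via is not None:
        return direct.get((pivot, dst), {}).get(via)
    return None

print(translate("obrigado", "pt", "hi"))  # -> 'dhanyavaad'
```

Real statistical translation matches whole phrases probabilistically rather than exact strings, but the pivot-language fallback works the same way: Portuguese to English, then English to Hindi.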
Google has the best speech recognition engine. You can dial 1-800-GOOG-411 from any phone, and a Google voice robot will ask you for a city and a name. They'll find you a phone listing for what you want and connect you. Accents don't bother them much. This is done in two stages: first they put a recording of your voice out over their speech net and ask, "Who can understand this?" The individual computers compare to known words and phrases and respond if they have something. Once the computers know what you're saying, then they ask their phone directory servers "Who knows something about pizza in West Sacramento?" and again some computer responds. Thousands of computers are involved in these searches, but all you know is a half second later the voice lists a few choices for you and offers to tell you more or connect you to one. Google is doing speech recognition, not so much by having a few smart programmers figure out a bunch of things about phonetics, grammar, dictionaries, Fourier and wavelet transforms, and filtering, but rather by just throwing several thousand computers with huge databases at the problem.
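The two-stage flow is just two lookups composed: audio to words, then words to listings. A minimal sketch, with both stages collapsed into invented lookup tables (real speech recognition matches against acoustic models, not a dictionary of clips):

```python
# Two-stage directory call, as described above. Stage 1: speech servers
# answer "who can understand this?" Stage 2: directory servers answer
# "who knows something about <words>?" All data below is invented.

recognized_phrases = {
    "audio-clip-1": "pizza in West Sacramento",
}

directory = {
    "pizza in West Sacramento": ["Tony's Pizza, 555-0142",
                                 "Slice House, 555-0193"],
}

def handle_call(audio_clip):
    # Stage 1: turn the caller's voice into words.
    words = recognized_phrases.get(audio_clip)
    if words is None:
        return []  # no server understood the audio
    # Stage 2: look the words up in the phone directory.
    return directory.get(words, [])

print(handle_call("audio-clip-1"))
```

The point of the structure is that neither stage needs to be clever on its own: each is a content-addressable lookup spread across many machines.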
Google has a new app, Google Goggles, for Android phones. You take a picture of pretty much anything, and Goggles will tell you what you're looking at: a painting, a building, a city street, a product for sale. Again, this is done not so much with incredibly clever programming, but by putting the picture out on their visual servers and asking "Who knows anything about this?" Some server will respond, "Oh, that's the Mona Lisa." Then Google asks a different set of servers, "Who knows anything about the Mona Lisa?" All you know is a half second later your phone tells you it's the Mona Lisa by da Vinci, it lives in the Louvre, and you can buy hand-painted replicas on eBay for $35. Google is doing visual recognition, again with the brute force of thousands of computers working on the problem all at once.
Currently, Google's most impressive demonstration of their technology is a car that drives itself. Their car has already traveled 140,000 miles on Google Autopilot, depending on cameras and a scanning laser to do the driving. You just tell it your destination and it plots a route for you, taking into consideration speed limits and traffic patterns. Then it drives the route, scanning ahead with cameras and using the Google server network to recognize other cars, curbs, road signs, pedestrians, bicycles, and traffic lights, and using that information to control the accelerator, brakes, and steering. You just sit in the passenger seat and read a book (retrieved from the Google Electronic Book Database, downloaded to your iPad). How does this work? Speech recognition networks decode your voice instructions. Location servers figure out where you are and where you want to go. Mapping servers find a good route for you. Visual servers look at images from the car's cameras and detect traffic, obstacles, lights, potential dangers. Driving servers make decisions about swerving, braking, or continuing. Probably a hundred thousand computers have a piece of this action. All you know is the car is taking you to your lunch appointment while you read your on-line copy of The Rise and Fall of the Great Powers.
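The division of labor among those server groups is a pipeline: each stage answers one question and hands its answer to the next. A toy sketch, with each "server" reduced to a function and all the detection logic invented for illustration:

```python
# Toy pipeline version of the staged architecture described above.
# Each stage is a placeholder function; real perception and control
# are vastly more complex. All object names here are invented.

def visual_servers(camera_frame):
    """Visual servers: pick out the dangerous objects in a frame."""
    dangers = ("pedestrian", "red light", "stopped car")
    return [obj for obj in camera_frame if obj in dangers]

def driving_servers(obstacles):
    """Driving servers: decide what to do about what was seen."""
    return "brake" if obstacles else "continue"

def drive_step(camera_frame):
    """One tick of the loop: perceive, then decide."""
    return driving_servers(visual_servers(camera_frame))

print(drive_step(["tree", "pedestrian"]))  # -> 'brake'
print(drive_step(["tree", "mailbox"]))     # -> 'continue'
```

Each stage can live on a different set of machines, which is the point: no single computer understands driving, but the composed pipeline does.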
About 15% of your brain is dedicated to speech recognition, and another 40% or so to visual recognition. Google is now emulating these capabilities with their networks. This emulation is not a simple demo system: Google Translate, Google Information, Google Goggles are running now and working quite well. In my estimation, they're more than halfway to true artificial intelligence. To get the rest of the way I would say their single largest remaining problem is emotions. Absent emotions, all a computer does is look up information and answer questions, then wait passively for the next question. The true hallmark of life is desire. Even bacteria show a clear preference for warm places that have available food. They don't just wait around passively for someone to ask them "Do you have food?" or tell them "There's food over that way, go get it." When Google masters an affective engine, they will have produced something new, some kind of distributed network that contains much of the accumulated knowledge of mankind, can access this knowledge to find analogies and draw conclusions, and has its own goals and desires. We have a name for such a computer network: Skynet, the system from the movie The Terminator that tried to exterminate mankind. As noted above, Google doesn't even know how many computers they own, much less where they each are and what they're each doing. If there is a sudden need to pull the plug on some of these computers, there will be a serious problem: which plugs? And this presumes the computers doing a particular task keep doing that task - if the network decides for its own reasons to reassign tasks and computers, we'll have no clue at all which plugs.