More AI History
Dateline: August 3, 1997
In a previous article, we looked at some of the pre-history of AI. For the present series of articles, we will examine the more recent history of AI from its inception as a discipline in the early 1950s.
In 1943, mathematicians Warren McCulloch and Walter Pitts showed how it was possible for a neural network to compute. Six years later, Donald Hebb showed how a neural net could learn. If there is a "core" to AI, today it is probably the connectionist school -- neural nets. In that sense, McCulloch, Pitts and Hebb can be considered founding fathers of AI.
But connectionism was not always central to AI. In the early decades, most research focused on symbolic, rule-based reasoning, also known as expert systems. This article draws mainly upon Daniel Crevier's excellent 1993 book, AI: The Tumultuous History of the Search for Artificial Intelligence, for a review of AI up to the maturity of expert systems into real-world problem solvers.
Herbert Simon was a political scientist and expert in bureaucratic organization and, to some extent, economics. He proposed the theory of "satisficing" -- that we in fact make decisions without bothering to go to all the trouble of gathering information about all the options -- which led to the notion of heuristics or "rules of thumb" developed by George Polya in 1945. Heuristics have since been shown to be an essential element in both human and artificial intelligence.
Simon also came up with an economics/organizational theory that said parts of an organization deal only with limited subgoals, while only the top part sets the main goals and coordinates the activities of the departments. He recognized that this was one way of describing the working and organization of intelligence generally -- whether it be a corporation, an ant colony, an ant, or an ant brain.
Simon's colleague Allen Newell, a physics graduate, had worked at the RAND Corporation, where he became acquainted with the Pandemonium project when Oliver Selfridge visited RAND and described the concept of pattern recognition. We met Pandemonium in a previous feature.
In 1956, Newell and Simon, plus RAND programmer J.C. Shaw, produced what is considered the first AI program, "Logic Theorist," basically a decision tree system for finding proofs for mathematical theorems. The program not only proved 38 of the first 52 theorems in chapter 2 of Russell and Whitehead's Principia Mathematica, and not only proved one of them in a manner more elegant than Russell and Whitehead had achieved, but it also found this proof without being asked to do so.
AI as a discipline was founded that same year -- 1956 -- at what has become known as the Dartmouth conference. The general belief at that time was that intelligence could be simulated in a machine. "This belief," says Crevier, "has remained the cornerstone of most AI work until today. It later became known as the physical symbol system hypothesis. The basic idea: our minds do not have direct access to the world. We can operate only on an internal representation of it, which corresponds to a collection of symbol structures. These structures can take the form of any physical pattern. They can consist of arrays of electronic switches inside a digital computer, or meshes of firing neurons in a biological brain. An intelligent system (brain or computer) can operate on the structures to transform them into other constructions. Thought consists of expanding symbol structures, breaking them up and reforming them, destroying some and creating new ones. Intelligence is thus nothing but the ability to process symbols. It exists in a realm different from the hardware that supports it, transcends it, and can take different physical forms."
The main centers of AI in its infancy (from 1956 to 1963) were Carnegie Mellon University and the Massachusetts Institute of Technology, followed by Stanford University and IBM. The main research themes were improved heuristics and machine learning.
Newell and Simon followed up their success with Logic Theorist in 1957 by creating a program which, unlike Logic Theorist, was not pre-programmed for a specific task (in Logic Theorist's case, the task was proving math theorems). The new program was called General Problem Solver (GPS) for that reason. GPS made more use of feedback -- cybernetics -- to refine a solution by an iteration process. GPS, Crevier tells us, "learned to solve various puzzles, performed symbolic integration [whatever that is], and broke secret codes."
In the late 1950s/early 1960s, Robert K. Lindsay created "Sad Sam," a program that could read English sentences and draw conclusions from them, but only in a very limited domain. Given the sentences: "Jim is John's brother" and "Jim's mother is Mary," Sad Sam was smart enough to understand that Mary must therefore be John's Ma. As Crevier points out, this was a step in the right direction. As humans, "to understand is to be able to relate a new piece of information to other facts we already know, and . . . draw [our own] conclusions."
Meanwhile, at IBM, Herbert Gelernter was working on GTP (Geometry Theorem Prover), which incorporated a representation of shapes -- a first for AI. Arthur Samuel was creating a program that could play checkers and Alex Bernstein was creating Deep Blue's chess playing granddaddy.
John McCarthy, creator of the LISP programming language, which greatly facilitated symbolic programming, correctly foresaw the difficulties of getting computers to display common sense and understand language, but nevertheless persisted in pursuing formal logic as a means of acquiring them. Marvin Minsky was less pessimistic about the difficulties, and pursued an "anti-logic" approach.
In 1961, James Slagle wrote SAINT (Symbolic Automatic INTegrator), which worked like Logic Theorist but on problems of calculus -- symbolic integration -- rather than logic.
LT, GPS, GTP, Sad Sam, and SAINT were kept from true learning and understanding by the paucity of their general knowledge about the world and about the relationships among things -- facts, ideas, and so on.
Minsky student Tom Evans wrote ANALOGY in 1963. It was able to detect similarities among geometric objects. Daniel Bobrow produced STUDENT in 1964, to solve algebra problems posed in verbal format. And Bertram Raphael wrote SIR (Semantic Information Retrieval), which understood simple English sentences. Here is an example of a session with SIR quoted by Crevier (SIR's responses are in capitals):
1. Every boy is a person.
I UNDERSTAND
2. A finger is part of a hand.
I UNDERSTAND
3. Each person has two hands.
THE ABOVE SENTENCE IS AMBIGUOUS. PLEASE RE-PHRASE IT.
4. There are two hands on each person.
I UNDERSTAND
5. How many fingers does John have?
THE ABOVE SENTENCE IS AMBIGUOUS, BUT I ASSUME (HAS) MEANS (HAS AS PARTS). I DON'T KNOW WHETHER FINGER IS PART OF JOHN.
6. John is a boy.
I UNDERSTAND
7. How many fingers does John have?
THE ABOVE SENTENCE IS AMBIGUOUS, BUT I ASSUME (HAS) MEANS (HAS AS PARTS). HOW MANY FINGERS PER HAND?
8. Every hand has five fingers.
I UNDERSTAND
9. How many fingers does John have?
THE ANSWER IS 10.
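The chaining behind that final answer can be sketched in a few lines of Python. This is only a toy under stated assumptions: the dictionaries and the names is_a, parts, ancestors and count_parts are hypothetical, invented here for illustration, and are not Raphael's actual data structures or method. The sketch follows "is a" links upward from John and multiplies counts along "part of" links.

```python
# Toy knowledge base in the spirit of the SIR session above.
# All names here are hypothetical illustrations, not Raphael's design.

# "John is a boy." / "Every boy is a person."
is_a = {"john": "boy", "boy": "person"}

# "There are two hands on each person." / "Every hand has five fingers."
# Maps a whole to {part: how many of that part each whole has}.
parts = {"person": {"hand": 2}, "hand": {"finger": 5}}


def ancestors(thing):
    """Yield thing, then every class it belongs to via is-a links."""
    while thing is not None:
        yield thing
        thing = is_a.get(thing)


def count_parts(whole, part):
    """Return how many `part`s a `whole` has, or None if unknown."""
    for cls in ancestors(whole):
        for sub, n in parts.get(cls, {}).items():
            if sub == part:
                return n
            below = count_parts(sub, part)  # look inside the sub-part
            if below is not None:
                return n * below
    return None


print(count_parts("john", "finger"))  # -> 10
```

Delete the "john" entry from is_a and the function returns None, which mirrors SIR's complaint in step 5 that it does not know whether a finger is part of John.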
Work at MIT in the mid-to-late 1960s focused on getting computers to manipulate blocks, which meant they had to understand three-dimensional geometry and certain aspects of physics. And they had to be able to see.
The problem of how to make a computer not only see, through video cameras, but more importantly and problematically how to make it make sense of what it sees, was tackled by a variety of researchers at MIT including Larry Roberts, Gerald Sussman, Adolfo Guzman, Max Clowes and David Huffman, David Waltz, Patrick Winston, and Berthold Horn. The end result of their efforts was "micro-blocks world," where a robot was able to see the set of blocks on the table and move and stack them.
Similar work at Stanford University eventually led to a robot that could construct an automobile water pump from randomly scattered parts, and then in 1969 to "Shakey," a wobbly robot on wheels that was able to move around rooms picking up and stacking boxes.
Numerous refinements to the AI control programs were made over the years. Each tiny improvement took a lot of effort. A program called STRIPS took the lead over GPS for a while, then along came Terry Winograd's SHRDLU (a nonsense name -- it has no meaning).
SHRDLU was more than an incremental advance -- it was a considerable advance. It let humans interrogate the robot in a blocks world. Here is an example, again taken from Crevier (SHRDLU's responses in capitals):
1. Pick up a red block.
OKAY
2. Find a block which is taller than the one you are holding and put it into the box.
BY "IT," I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING. OKAY
3. What does the box contain?
THE BLUE PYRAMID AND THE BLUE BLOCK
4. What is the pyramid supported by?
THE BOX
5. How many blocks are not in the box?
FOUR OF THEM
6. Is at least one of them narrower than the one which I told you to pick up?
YES, THE RED CUBE
7. Is it supported?
YES, BY THE TABLE
8. Can the table pick up blocks?
NO
9. The blue pyramid is mine
I UNDERSTAND
10. I own all blocks which are not red, but I don't own anything which supports a pyramid
I UNDERSTAND
11. Do I own the box?
NO
12. Do I own anything in the box?
YES, TWO THINGS: THE BLUE BLOCK AND THE BLUE PYRAMID
13. Will you please stack up both of the red blocks and either a green cube or pyramid?
OKAY
14. Is there a large block behind a pyramid?
YES, THREE OF THEM: A LARGE RED ONE, A LARGE GREEN CUBE AND THE BLUE ONE
15. Put a small one onto the green cube which supports a pyramid
OKAY
16. Put the littlest pyramid on top of it
OKAY
17. Had you touched any pyramid before you put the green one on the little cube?
YES, THE GREEN ONE
18. When did you pick it up?
WHILE I WAS STACKING UP THE RED CUBE, A LARGE RED BLOCK AND A LARGE GREEN CUBE
19. Why?
TO GET RID OF IT
20. Why did you do that?
TO CLEAR OFF THE RED CUBE
21. Why did you clear off that cube?
TO PUT IT ON LARGE GREEN CUBE
22. Why did you do that?
TO STACK UP THE RED CUBE, A LARGE RED BLOCK AND A LARGE GREEN CUBE
23. Why did you do that?
BECAUSE YOU ASKED ME TO
Impressive as SHRDLU was, you have to remember that it only worked in the very, very limited micro-world consisting of a few blocks. The real world has billions of objects with trillions of relationships.
In the late 1950s/early 1960s, while Minsky was pursuing symbolic AI at MIT, Frank Rosenblatt of Cornell University was taking an entirely different tack from his old high school classmate Minsky. With a background in psychology, he approached AI as an exercise in modeling the human brain -- a neural net. His first effort was called the Perceptron, which received a lot of publicity for its potential but couldn't really do very much. The press of 1958 reported the Perceptron could distinguish between a cat and a dog, but it could only do so in principle, not in fact.
The Perceptron was essentially a blending of the Pandemonium concept of demons with the McCulloch-Pitts neurons. The demons were photocells which informed the neurons how much light they were sensing. The neurons weighed the relative strengths of the light input and if the weight was higher than a predetermined threshold then the neurons would "fire," that is, they would tell the next level of neurons about the light intensity. Eventually a pattern -- such as a letter of the alphabet -- would emerge.
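The weigh-and-fire step described above can be sketched as a single threshold unit in Python. Treat this as a minimal illustration rather than Rosenblatt's actual machine: the function names fires and train, the threshold of 0.5, the learning rate of 0.2 and the two-photocell "is any light on?" task are all assumptions made here. The training loop uses the textbook perceptron update rule, nudging each weight after a wrong answer.

```python
def fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of light readings exceeds the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0


def train(samples, n_inputs, threshold=0.5, rate=0.2, epochs=20):
    """Adjust the weights a little after every wrong answer."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - fires(inputs, weights, threshold)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights


# Two hypothetical photocells; the unit should fire if either one sees light.
samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights = train(samples, n_inputs=2)

print([fires(x, weights, 0.5) for x, _ in samples])  # -> [0, 1, 1, 1]
```

A single unit like this can only learn patterns that one weighted sum can separate, which is why the real Perceptron fed its results on to further levels of neurons.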
Note that such technology is now applied widely in scanners, for example. AI, like the space program, does not have to succeed in the equivalent of landing a human on Mars for it to produce many valuable spinoffs.
Rosenblatt's principle, however, that neural nets could be trained to respond correctly to stimuli, was the foundation on which neural network technology was built, and many researchers took to it during the 1960s, until Minsky and Seymour Papert published an influential book criticizing the Perceptron. There followed what Crevier called a "research vacuum" in neural network development that lasted until the 1980s, and Minsky and Papert have been held responsible by the neural network community for the delay in progress of what was to turn out to be a very promising approach.
Herbert Simon, with support from Newell and Shaw, also predicted that by 1967 a computer would be world chess champion, would discover and prove a major new math theorem, and would be psychologically like humans. Simon was to win the Nobel Prize, but not for these predictions. Crevier thinks that "true artificial intelligences still lie far in the future."
Among the anticipated achievements that failed to materialize was automatic language translation -- a capability that would bestow enormous benefits if it were achieved.
It was also recognized that programs like GPS and STRIPS barely scrape the surface of true intelligence. Among the many difficulties was what I call the knowledge context issue, and Crevier calls the commonsense knowledge issue: the myriad facts and ideas, and even more myriad relations among them, many of which we know without conscious learning or logic.
Joseph Weizenbaum (MIT) objected to AI on moral grounds. He created Eliza to show that AI was a trick -- Eliza appeared to understand a conversation, but in reality it did not. With acknowledgements again to Crevier, here's a sample session with Eliza (Eliza's responses in capitals):
Men are all alike.
IN WHAT WAY
They're always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says I'm depressed most of the time.
I AM SORRY TO HEAR YOU ARE DEPRESSED
It's true. I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP
Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY
My mother takes care of me.
WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU
My father.
YOUR FATHER
You're like my father in some ways.
WHAT RESEMBLANCE DO YOU SEE
You are not very aggressive but I think you don't want me to notice that.
WHAT MAKES YOU THINK I'M NOT VERY AGGRESSIVE
You don't argue with me.
WHY DO YOU THINK I DON'T ARGUE WITH YOU
You are afraid of me.
DOES IT PLEASE YOU TO BELIEVE I AM AFRAID OF YOU
My father's afraid of everybody.
WHAT ELSE COMES TO MIND WHEN YOU THINK OF YOUR FATHER
Bullies.
DOES THAT HAVE ANYTHING TO DO WITH THE FACT THAT YOUR BOYFRIEND MADE YOU COME HERE
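To see how a program can appear to follow a conversation without understanding it, here is a minimal Python sketch of keyword matching with canned reply templates. The rules, the pronoun table and the names RULES, REFLECT, reflect and respond are hypothetical, made up for this illustration; they are not Weizenbaum's original script, which was considerably richer.

```python
import re

# Hypothetical rules for illustration only; not Weizenbaum's script.
# Each rule pairs a keyword pattern with a canned reply template.
RULES = [
    (re.compile(r"\bmy (mother|father)\b", re.I), "TELL ME MORE ABOUT YOUR FAMILY"),
    (re.compile(r"\bi am (.+)", re.I), "WHY DO YOU SAY YOU ARE {0}"),
    (re.compile(r"\byou are (.+)", re.I), "WHAT MAKES YOU THINK I AM {0}"),
]
DEFAULT = "PLEASE GO ON"

# Swap first and second person so the echoed fragment reads naturally.
REFLECT = {"me": "you", "my": "your", "i": "you", "you": "me", "your": "my"}


def reflect(fragment):
    """Swap pronouns word by word in the captured fragment."""
    return " ".join(REFLECT.get(word.lower(), word) for word in fragment.split())


def respond(sentence):
    """Return the reply for the first rule whose keyword appears."""
    text = sentence.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            echoed = [reflect(group).upper() for group in match.groups()]
            return template.format(*echoed)
    return DEFAULT


print(respond("You are afraid of me."))  # -> WHAT MAKES YOU THINK I AM AFRAID OF YOU
```

Nothing in the sketch knows what "afraid" means; it simply finds a keyword and hands part of the sentence back, which is the sense in which the conversation is a trick.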
Next week, we will continue on the theme of expert systems with a look at two of its most famous early examples: DENDRAL and MYCIN.
Until next week,

NEXT WEEK: Even More AI history. Second in a series of articles about the history of AI from its inception as a discipline in the early 1950s.
FOOTNOTE: In last week's article, a review of Dragon NaturallySpeaking automatic speech recognition software from Dragon Systems, I said I would give the program another week of use to see if it improved enough to make it usable. I did; it did not. Long-distance calls to Dragon Systems' help desk (no toll-free number) met with a recorded announcement of average delays of 25 minutes and encouraging email instead. So I sent email. It's been over a week and I've had no response to my request for a recommendation relating to sound cards -- I was willing to spend a further $200 on a top of the line Sound Blaster. I won't dilly-dally any more -- the program is being returned and I'll spend the $300 on IBM's ViaVoice, a competing product due out this month. I'll let you know how it checks out in a month or so.