Chapter 1 (final)

Artificial Intelligence: An Introduction; The AI Problems; The Underlying Assumption; AI Techniques; Games; Theorem Proving; Natural Language Processing; Vision Processing; Speech Processing; Robotics; Expert System; Search Knowledge; Abstraction

Unit 1: What is Artificial Intelligence

Learning Objectives

After reading this unit you should appreciate the following:

Artificial Intelligence: An Introduction

AI Problems

AI Techniques

Games

Theorem Proving

Natural Language Processing

Vision and Speech Processing

Expert System

Search Knowledge

Abstraction


Artificial Intelligence: An Introduction

Artificial intelligence (AI) is the study of how to make computers do things that, at the moment, people do better. This definition is, of course, somewhat ephemeral because of its reference to the current state of computer science, but the fact remains that most attempts to define complex and widely used terms precisely are exercises in futility. The definition proposed above is by no means universally accepted. It also fails to include some areas of potentially very large impact, namely problems that cannot now be solved well by either computers or people. But it provides a good outline of what constitutes artificial intelligence, and it avoids the philosophical issues that dominate attempts to define the meaning of artificial intelligence. Interestingly, though, it suggests a similarity with philosophy at the same time it is avoiding it.

AI has embraced the larger scientific goal of constructing an information-processing theory of intelligence. If such a science of intelligence could be developed, it could guide the design of intelligent machines as well as explicate intelligent behaviour as it occurs in humans and other animals.


The AI Problems

Much of the early work in the field focused on formal tasks, such as game playing and theorem proving. Samuel wrote a checkers-playing program, and chess also received a good deal of attention. The Logic Theorist was an early attempt to prove mathematical theorems. Gelernter's theorem prover explored another area of mathematics: geometry. Game playing and theorem proving share the property that people who do them well are considered to be displaying intelligence. Despite this, it appeared initially that computers could perform well at those tasks simply by being fast at exploring a large number of solution paths and then selecting the best one. It was thought that this process required very little knowledge and could therefore be programmed easily. As we will see later, this assumption turned out to be false, since no computer is fast enough to overcome the combinatorial explosion generated by most problems.

Another early focus of AI was the kind of decision-making we do in day-to-day life, often called commonsense reasoning. It includes reasoning about physical objects and their relationships to each other (e.g., an object can be in only one place at a time), as well as reasoning about actions and their consequences (e.g., if you let go of something, it will fall to the floor and maybe break).

As AI research progressed and techniques for handling larger amounts of world knowledge were developed, some progress was made on the tasks just described and new tasks could reasonably be attempted. These include perception (vision and speech), natural language understanding, and problem solving in specialized domains such as medical diagnosis and chemical analysis.

Perception of the world around us is crucial to our survival. Animals with much less intelligence than people are capable of more sophisticated visual perception than are current machines. Perceptual tasks are difficult because they involve analog (rather than digital) signals; the signals are typically very noisy and usually a large number of things (some of which may be partially obscuring others) must be perceived at once.

The ability to use language to communicate a wide variety of ideas is perhaps the most important thing that separates humans from the other animals. Understanding language, a problem usually referred to as natural language understanding, is still extremely difficult. In order to understand sentences about a topic, it is necessary to know not only a lot about the language itself but also a good deal about the topic so that unstated assumptions can be recognized.

Apart from these mundane tasks, many people can also perform one or maybe more specialized tasks in which carefully acquired expertise is necessary. Examples of such tasks include engineering design, scientific discovery, medical diagnosis, and financial planning. Programs that can solve problems in these domains also fall under the aegis of artificial intelligence. Figure 1.1 lists some of the tasks that are the targets of work in AI.

First, perceptual, linguistic, and commonsense skills are learned. Later (and of course for some people, never) expert skills such as engineering, medicine, or finance are acquired. It might seem to make sense, then, that the earlier skills are easier and thus more amenable to computerized duplication than are the later, more specialized ones. For this reason, much of the initial AI work was concentrated in those early areas. But it turns out that this naive assumption is not right. Although expert skills require knowledge that many of us do not have, they often require much less knowledge than do the more mundane skills, and that knowledge is usually easier to represent and deal with inside programs.

Figure 1.1: Some of the Task Domains of AI


Student Activity 1.1


As a result, the problem areas where AI is now flourishing most as a practical discipline (as opposed to a purely research one) are primarily the domains that require only specialized expertise without the assistance of commonsense knowledge. There are now thousands of programs called expert systems in day-to-day operation throughout all areas of industry and government. Each of these systems attempts to solve part, or perhaps all, of a practical, significant problem that previously required scarce human expertise.

Before embarking on a study of specific AI problems and solution techniques, it is important at least to discuss, if not to answer, the following four questions:

1. What are our underlying assumptions about intelligence?

2. What kinds of techniques will be useful for solving AI problems?

3. At what level of detail, if at all, are we trying to model human intelligence?

4. How will we know when we have succeeded in building an intelligent program?

Before reading the next section, answer the following questions.

1. Discuss different AI task domains in detail.

2. What do you mean by an Expert System?

3. What problems do we face in natural language understanding?

If your answers are correct, then proceed to the next section.


The Underlying Assumption

At the core of research in artificial intelligence lies what Newell and Simon call the physical symbol system hypothesis. They define a physical symbol system as follows:

A physical symbol system consists of a set of entities, called symbols, which are physical patterns that can occur as components of another type of entity called an expression (or symbol structure). Thus, a symbol structure is composed of a number of instances (or tokens) of symbols related in some physical way (such as one token being next to another). At any instant of time the system will contain a collection of these symbol structures. Besides these structures, the system also contains a collection of processes that operate on expressions to produce other expressions: processes of creation, modification, reproduction and destruction. A physical symbol system is a machine that produces through time an evolving collection of symbol structures. Such a system exists in a world of objects wider than just these symbolic expressions themselves.

They then state the hypothesis as

The Physical Symbol System Hypothesis: A physical symbol system has the necessary and sufficient means for general intelligent action.




Because it is only a hypothesis, there appears to be no way to prove or disprove it on logical grounds. So it must be subjected to empirical validation. We may find that it is false. We may find that the bulk of the evidence says that it is true. But the only way to determine its truth is by experimentation.

Computers provide the perfect medium for this experimentation since they can be programmed to simulate any physical symbol system we like. This ability of computers to serve as arbitrary symbol manipulators was noticed very early in the history of computing.

As it has become increasingly easy to build computing machines, so it has become increasingly possible to conduct empirical investigations of the physical symbol system hypothesis. In each such investigation, a particular task that might be regarded as requiring intelligence is selected. A program to perform the task is proposed and then tested. Although we have not been completely successful at creating programs that perform all the selected tasks, most scientists believe that many of the problems that have been encountered will ultimately prove to be surmountable by more sophisticated programs than we have yet produced.

Evidence in support of the physical symbol system hypothesis has come not only from areas such as game playing, where one might most expect to find it, but also from areas such as visual perception, where it is more tempting to suspect the influence of subsymbolic processes. However, subsymbolic models (for example, neural networks) are beginning to challenge symbolic ones at such low-level tasks. Whether certain subsymbolic models conflict with the physical symbol system hypothesis is a topic still under debate. And it is important to note that even the success of subsymbolic systems is not necessarily evidence against the hypothesis. It is often possible to accomplish a task in more than one way.

The importance of the physical symbol system hypothesis is twofold. It is a significant theory of the nature of human intelligence and so is of great interest to psychologists. It also forms the basis of the belief that it is possible to build programs that can perform intelligent tasks now performed by people. Our major concern here is with the latter of these implications, although as we will soon see, the two issues are not unrelated.


1. What is the physical symbol system hypothesis?

2. Discuss the advantage of the physical symbol system hypothesis in AI.



AI Techniques

The problems of artificial intelligence appear to have very little in common except that they are hard. Fortunately, there are techniques that are appropriate for solving a variety of these problems. What, then, if anything, can we say about those techniques besides the fact that they manipulate symbols? How could we tell if those techniques might be useful in solving other problems, perhaps ones not traditionally regarded as AI tasks? The rest of this book is an attempt to answer those questions in detail. But before we begin examining closely the individual techniques, it is enlightening to take a broad look at them to see what properties they ought to possess.

Intelligence requires knowledge. To compensate for its one overpowering asset, indispensability, knowledge possesses some less desirable properties, including:

It is voluminous.

It is hard to characterise accurately.

It is constantly changing.

It differs from data by being organized in a way that corresponds to the ways it will be used.

We are forced to conclude that an AI technique is a method that exploits knowledge that should be represented in such a way that:

The knowledge captures generalizations. In other words, it is not necessary to represent separately each individual situation. Instead, situations that share important properties are grouped together. If knowledge does not have this property, inordinate amounts of memory and updating will be required. So we usually call something without this property "data" rather than knowledge.

It can be understood by the people who must provide it. Although for many programs the bulk of the data can be acquired automatically (for example, by taking readings from a variety of instruments), in many AI domains most of the knowledge a program has must ultimately be provided by people in terms they understand.

It can easily be modified to correct errors and to reflect changes in the world and in our worldview.

It can be used in a great many situations even if it is not totally accurate or complete.

It can be used to help overcome its own sheer bulk by helping to narrow the range of possibilities that must usually be considered.

It is possible to solve AI problems without using AI techniques. And it is possible to apply AI techniques to the solution of non-AI problems. This is likely to be a good thing to do for problems that possess many of the same characteristics as do AI problems. In order to try to characterize AI techniques in as problem-independent a way as possible, let's look at two very different problems and a series of approaches for solving each of them.


Games

Game playing shares the property that people who do it well are considered to be displaying intelligence. Despite this, it appeared initially that computers could perform well at such tasks simply by being fast at exploring a large number of solution paths and selecting the best one. Applied to day-to-day life, this is a basic rule of problem solving: in almost every situation a problem has various possible solutions, and to solve the problem correctly we have to choose the right path among them. We adopt the same strategy in game playing: to win, we have to select the right option among the various possible options. By adopting this approach we can design the best possible AI-based game player, but it may not win all the time. We can see this in a real-life example: Deep Blue (an AI-based chess computer) was at first defeated by the world champion Garry Kasparov, but in a later match Deep Blue was able to defeat him. We can understand the approach through the following examples:

Tic-Tac-Toe

In this section, we present a series of four programs to play tic-tac-toe. The programs in this series increase in:

Their complexity.

Their use of generalizations.

The clarity of their knowledge.

The extensibility of their approach.

Thus they move toward being representations of what we call AI techniques.

Program 1

Data Structures

Board: A nine-element vector representing the board, where the elements of the vector correspond to the board positions as follows:

1 2 3

4 5 6

7 8 9

An element contains the value 0 if the corresponding square is blank, 1 if it is filled with an X, or 2 if it is filled with an O.

Movetable: A large vector of 19,683 (3^9) elements, each element of which is a nine-element vector. The contents of this vector are chosen specifically to allow the algorithm to work.

The Algorithm

To make a move, do the following:

1. View the vector Board as a ternary (base three) number. Convert it to a decimal number.

2. Use the number computed in step 1 as an index into movetable and access the vector stored there.



3. The vector selected in step 2 represents the way the board will look after the move that should be made. So set Board equal to that vector.
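The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `board_to_index` and `make_move` are hypothetical, square 1 is treated as the most significant ternary digit (the text does not fix the digit order), and the movetable is a dictionary holding a single hand-written entry rather than all 19,683.

```python
def board_to_index(board):
    """Step 1: read the nine-element board (0 blank, 1 X, 2 O) as a base-3 number."""
    index = 0
    for value in board:              # assumption: square 1 is the most significant digit
        index = index * 3 + value
    return index


def make_move(board, movetable):
    """Steps 2 and 3: look up the stored vector and use it as the new board."""
    return list(movetable[board_to_index(board)])


# Toy movetable with one entry: from the empty board, X takes the centre square.
empty = [0] * 9
movetable = {board_to_index(empty): [0, 0, 0, 0, 1, 0, 0, 0, 0]}

new_board = make_move(empty, movetable)
```

The lookup itself is a single indexing operation, which is why the program is fast. All of the playing skill sits in the table contents.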

Comments

This program is very efficient in terms of time. And, in theory, it could play an optimal game of tic-tac-toe. But it has several disadvantages:

It takes a lot of space to store the table that specifies the correct move to make from each board position.

Someone will have to do a lot of work specifying all the entries in the movetable.

It is very unlikely that all the required movetable entries can be determined and entered without any errors.

If we want to extend the game, say to three dimensions, we would have to start from scratch, and in fact this technique would no longer work at all, since 3^27 (about 7.6 trillion) board positions would have to be stored, thus overwhelming present computer memories.

The technique embodied in this program does not appear to meet any of our requirements for a good AI technique. Let’s see if we can do better.

Program 2

Data Structures

Board: A nine-element vector representing the board, as described for Program 1. But instead of using the numbers 0, 1, or 2 in each element, we store 2 (indicating blank), 3 (indicating X), or 5 (indicating O).

Turn: An integer indicating which move of the game is about to be played; 1 indicates the first move, 9 the last.

The Algorithm

The main algorithm uses three subprocedures:

Make2: Returns 5 if the center square of the board is blank, that is, if Board[5] = 2. Otherwise, this function returns any blank noncorner square (2,4,6,or 8).

Posswin(p): Returns 0 if player p cannot win on his next move; otherwise, it returns the number of the square that constitutes a winning move. This function will enable the program both to win and to block the opponent's win. Posswin operates by checking, one at a time, each of the rows, columns, and diagonals. Because of the way values are numbered, it can test an entire row (column or diagonal) to see if it is a possible win by multiplying the values of its squares together. If the product is 18 (3 x 3 x 2), then X can win. If the product is 50 (5 x 5 x 2), then O can win. If we find a winning row, we determine which element is blank, and return the number of that square.

Go(n): Makes a move in square n. This procedure sets Board[n] to 3 if Turn is odd, or 5 if Turn is even. It also increments Turn by one.



The algorithm has a built-in strategy for each move it may have to make. It makes the odd-numbered moves if it is playing X, the even-numbered moves if it is playing O. The strategy for each turn is as follows:

Turn=1 Go(1) (upper left corner).

Turn=2 If Board[5] is blank, Go(5), else Go(1).

Turn=3 If Board[9] is blank, Go(9), else Go(3).

Turn=4 If Posswin(X) is not 0, then Go(Posswin(X)) [i.e. block opponent's win], else Go(Make2).

Turn=5 If Posswin(X) is not 0 then Go(Posswin(X)) [i.e., win] else if Posswin(O) is not 0, then Go(Posswin(O)) [i.e., block win], else if Board[7] is blank, then Go(7), else Go(3). [Here the program is trying to make a fork.]

Turn=6 If Posswin(O) is not 0 then Go(Posswin(O)), else if Posswin(X) is not 0, then Go(Posswin(X)), else Go(Make2).

Turn=7 If Posswin(X) is not 0 then Go(Posswin(X)), else if Posswin(O) is not 0, then Go(Posswin(O)), else go anywhere that is blank.

Turn=8 If Posswin(O) is not 0 then Go(Posswin(O)), else if Posswin(X) is not 0, then Go(Posswin(X)), else go anywhere that is blank.

Turn=9 Same as Turn=7.
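The subprocedures and the turn-by-turn strategy above can be put together in a short Python sketch. The function names follow the text in lower case. `choose_square` and `any_blank` are hypothetical helpers added for illustration, Turn is passed explicitly rather than kept as a global, and returning 0 from `make2` when no non-corner square is blank is an assumption, since the text does not cover that case.

```python
BLANK, X, O = 2, 3, 5

# The eight winning lines, using the square numbers 1-9 from the text.
LINES = [(1, 2, 3), (4, 5, 6), (7, 8, 9),
         (1, 4, 7), (2, 5, 8), (3, 6, 9),
         (1, 5, 9), (3, 5, 7)]


def posswin(board, p):
    """Return the square on which player p (X or O) can win next move, else 0."""
    target = p * p * BLANK                    # 18 for X, 50 for O
    for line in LINES:
        a, b, c = line
        if board[a] * board[b] * board[c] == target:
            return next(s for s in line if board[s] == BLANK)
    return 0


def make2(board):
    """Return 5 if the centre is blank, otherwise a blank non-corner square."""
    if board[5] == BLANK:
        return 5
    # Assumption: return 0 if no non-corner square is blank.
    return next((s for s in (2, 4, 6, 8) if board[s] == BLANK), 0)


def any_blank(board):
    """Return some blank square (here simply the lowest-numbered one)."""
    return next(s for s in range(1, 10) if board[s] == BLANK)


def choose_square(board, turn):
    """Return the square selected by the per-turn strategy table."""
    win_x, win_o = posswin(board, X), posswin(board, O)
    if turn == 1:
        return 1
    if turn == 2:
        return 5 if board[5] == BLANK else 1
    if turn == 3:
        return 9 if board[9] == BLANK else 3
    if turn == 4:
        return win_x or make2(board)
    if turn == 5:
        return win_x or win_o or (7 if board[7] == BLANK else 3)
    if turn == 6:
        return win_o or win_x or make2(board)
    if turn in (7, 9):
        return win_x or win_o or any_blank(board)
    return win_o or win_x or any_blank(board)     # turn 8


def go(board, turn, n):
    """Mark square n for the player to move and return the next turn number."""
    board[n] = X if turn % 2 == 1 else O
    return turn + 1


# board[0] is unused padding so that list indices match square numbers 1-9.
board = [None] + [BLANK] * 9
turn = go(board, 1, choose_square(board, 1))         # program (X) opens in square 1
turn = go(board, turn, 5)                            # opponent (O) takes the centre
turn = go(board, turn, choose_square(board, turn))   # program answers in square 9
turn = go(board, turn, 2)                            # opponent now threatens 2-5-8
block = choose_square(board, turn)                   # program blocks in square 8
```

Because blank, X, and O are coded as 2, 3, and 5, a product of 18 or 50 can arise in only one way, so a single multiplication per line is enough to detect a possible win.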

Comments

This program is not quite as efficient in terms of time as the first one since it has to check several conditions before making each move. But it is a lot more efficient in terms of space. It is also a lot easier to understand the program's strategy or to change the strategy if desired. But the total strategy has still been figured out in advance by the programmer. Any bugs in the programmer's tic-tac-toe playing skill will show up in the program's play. And we still cannot generalize any of the program's knowledge to a different domain, such as three-dimensional tic-tac-toe.

Program 3

This program is identical to Program 2 except for one change in the representation of the board. We again represent the board as a nine-element vector, but this time we assign board positions to vector elements as follows:

8 3 4

1 5 9

6 7 2

Notice that this numbering of the board produces a magic square: all the rows, columns, and diagonals sum to 15. This means that we can simplify the process of checking for a possible win. In addition to marking the board as moves are made, we keep a list, for each player, of the squares in which he or she has played. To check for a possible win for one player, we consider each pair of squares owned by that player and compute the difference between 15 and the sum of the two squares. If this difference is not positive or if it is greater than 9, then the original two squares were not collinear and so can be ignored. Otherwise, if the square representing the difference is blank, a move there will produce a win. Since no player can have more than four squares at a time, there will be many fewer squares examined using this scheme than there were using the more straightforward approach of Program 2. This shows how the choice of representation can have a major impact on the efficiency of a problem-solving program.
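The pair-checking procedure just described can be sketched as follows. The function name `possible_wins` and the example position are hypothetical. Squares are referred to by their magic-square numbers, with the layout 8 3 4 / 1 5 9 / 6 7 2.

```python
from itertools import combinations

# Magic-square numbering of the board, row by row.
MAGIC = [8, 3, 4,
         1, 5, 9,
         6, 7, 2]


def possible_wins(own, blank):
    """Return the squares where a player holding the squares in `own` can win."""
    wins = set()
    for a, b in combinations(own, 2):
        difference = 15 - (a + b)
        if difference <= 0 or difference > 9:
            continue                  # the two squares are not collinear
        if difference in blank:       # a move on a blank square completes the line
            wins.add(difference)
    return wins


# Example position: X holds 8, 5 and 1; O holds 3 and 6.
x_squares = [8, 5, 1]
o_squares = [3, 6]
blank = set(MAGIC) - set(x_squares) - set(o_squares)

x_wins = possible_wins(x_squares, blank)   # 8 + 5 needs 2, 5 + 1 needs 9
o_wins = possible_wins(o_squares, blank)   # 3 + 6 needs 6, which O already holds
```

A pair such as 7 and 1 gives a difference of 7, a square the player already owns. The test for a blank square rejects such cases without any extra check.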

Comments

This comparison raises an interesting question about the relationship between the way people solve problems and the way computers do. Why do people find the row-scan approach easier while the number-counting approach is more efficient for a computer? We do not know enough about how people work to answer that question completely. One part of the answer is that people are parallel processors and can look at several parts of the board at once, whereas the conventional computer must look at the squares one at a time. Sometimes an investigation of how people solve problems sheds great light on how computers should do so. At other times, the differences in the hardware of the two seem so great that different strategies seem best. As we learn more about problem solving both by people and by machines, we may know better whether the same representations and algorithms are best for both people and machines.

Program 4

Data Structures

BoardPosition: A structure containing a nine-element vector representing the board, a list of board positions that could result from the next move, and a number representing an estimate of how likely the board position is to lead to an ultimate win for the player to move.

The Algorithm

To decide on the next move, look ahead at the board positions that result from each possible move. Decide which position is best (as described below), make the move that leads to that position, and assign the rating of that best move to the current position.

To decide which of a set of board positions is best, do the following for each of them:

See if it is a win. If so, call it the best by giving it the highest possible rating.

Otherwise, consider all the moves the opponent could make next. See which of them is worst for us (by recursively calling this procedure). Assume the opponent will make that move. Whatever rating that move has, assign it to the node we are considering.

The best node is then the one with the highest rating.




This algorithm will look ahead at various sequences of moves in order to find a sequence that leads to a win. It attempts to maximize the likelihood of winning, while assuming that the opponent will try to minimize that likelihood.
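A minimal sketch of this look-ahead procedure is given below. It makes several assumptions that are not in the text: the board is a nine-character string indexed 0 to 8, ratings are simply +1 (win), 0 (draw), and -1 (loss) rather than a finer likelihood estimate, and the names `rate` and `best_move` are hypothetical. The search runs to the end of the game, which is feasible only because tic-tac-toe is so small.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def other(player):
    """Return the opponent of `player`."""
    return "O" if player == "X" else "X"


def rate(board, player, to_move):
    """Rate `board` for `player` (+1 win, 0 draw, -1 loss) with `to_move` to play."""
    won = winner(board)
    if won is not None:
        return 1 if won == player else -1
    blanks = [i for i, v in enumerate(board) if v == " "]
    if not blanks:
        return 0                                  # board full, nobody won
    ratings = [rate(board[:i] + to_move + board[i + 1:], player, other(to_move))
               for i in blanks]
    # Our own move: take the best rating. Opponent's move: assume the worst for us.
    return max(ratings) if to_move == player else min(ratings)


def best_move(board, player):
    """Return the index (0-8) of the blank square whose resulting position rates highest."""
    blanks = [i for i, v in enumerate(board) if v == " "]
    return max(blanks,
               key=lambda i: rate(board[:i] + player + board[i + 1:],
                                  player, other(player)))


# X to move. O holds indices 1 and 4 and threatens to complete the middle column at 7.
position = "XO  O    "
move = best_move(position, "X")
```

In this position every move except the block at index 7 lets O win at once, so the search rates the blocking move highest.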

Comments

This program will require much more time than any of the others since it must search a tree representing all possible move sequences before making each move. But it is superior to the other programs in one very big way: It could be extended to handle games more complicated than tic-tac-toe, for which the exhaustive enumeration approach of the other programs would completely fall apart. It can also be augmented by a variety of specific kinds of knowledge about games and how to play them. For example, instead of considering all possible next moves, it might consider only a subset of them that are determined, by some simple algorithm, to be reasonable. And, instead of following each series of moves until one player wins, it could search for a limited time and evaluate the merit of each resulting board position using some static function.

Program 4 is an example of the use of an AI technique. For very small problems, it is less efficient than a variety of more direct methods. However, it can be used in situations where those methods would fail.


1. What is an AI technique?

2. Give an appropriate algorithm for solving the Tic-Tac-Toe problem.

3. Explain the spectrum from static to AI-based techniques for a problem other than the two discussed in this unit. Think of your own problem or use one of the following:

a. Translate an English sentence into Hindi.

b. Teach a child to subtract integers.



Theorem Proving

Theorem proving has the property that people who do it well are considered to be displaying intelligence. The Logic Theorist was an early attempt to prove mathematical theorems. It was able to prove several theorems from Whitehead and Russell's Principia Mathematica. Gelernter's theorem prover explored another area of mathematics: geometry. There are three types of problems in AI: ignorable problems, in which solution steps can be ignored; recoverable problems, in which solution steps can be undone; and irrecoverable problems, in which solution steps cannot be undone. Theorem proving falls into the first category, i.e., it is ignorable. Suppose we are trying to prove a theorem. We proceed by first proving a lemma that we think will be useful. Eventually we realize that the lemma is no help at all. In this case we can simply ignore that lemma and start again from the beginning.


Natural Language Processing

Perception and communication are essential components of intelligent behaviour. They provide the ability to effectively interact with our environment. Humans perceive and communicate through their five basic senses of sight, hearing, touch, smell, and taste, and their ability to generate meaningful utterances. Two of the senses, sight and hearing are especially complex and require conscious inferencing. Developing programs that understand natural language and that comprehend visual scenes are two of the most difficult tasks facing AI researchers.

Developing programs that understand a natural language is a difficult problem. Natural languages are large. They contain an infinity of different sentences. No matter how many sentences a person has heard or seen, new ones can always be produced. Also, there is much ambiguity in a natural language. Many words, such as can, bear, fly, and orange, have several meanings, and sentences can have different meanings in different contexts. This makes the creation of programs that “understand” a natural language one of the most challenging tasks in AI. It requires that a program transform sentences occurring as part of a dialog into data structures which convey the intended meaning of the sentences to a reasoning program. In general, this means that the reasoning program must know a lot about the structure of the language, the possible semantics, the beliefs and goals of the user, and a great deal of general world knowledge.

Developing programs to understand natural language is important in AI because a natural form of communication with systems is essential for user acceptance. Furthermore, one of the most critical tests for intelligent behaviour is the ability to communicate effectively. AI programs must be able to communicate with their human counterparts in a natural way, and natural language is one of the most important mediums for that purpose.

Before proceeding further, a definition of understanding as used here should be given. We say a program understands a natural language if it behaves by taking a (predictably) correct or acceptable action in response to the input. For example, we say a child demonstrates understanding if it responds with the correct answer to a question. The action taken need not be an external response. It may simply be the creation of some internal data structures as would occur in learning some new facts. But in any case, the structures created should be meaningful and correctly interact with the world model representation held by the program. In this chapter we explore many of the important issues related to natural language understanding and language generation.


Vision Processing

Accurate machine vision opens up a new realm of computer applications. These applications include mobile robot navigation, complex manufacturing tasks, analysis of satellite images, and medical image processing. In this section, we investigate how we can transform raw camera images into useful information about the world.



A video camera provides a computer with an image represented as a two-dimensional grid of intensity levels. Each grid element, or pixel, may store a single bit of information (that is, black/white) or many bits (perhaps a real-valued intensity measure and colour information). A visual image is composed of thousands of pixels. What kinds of things might we want to do with such an image? Here are four operations, in order of increasing complexity:

1. Signal Processing: Enhancing the image, either for human consumption or as input to another program.

2. Measurement Analysis: For images containing a single object, determining the two-dimensional extent of the object depicted.

3. Pattern Recognition: For single-object images, classifying the object into a category drawn from a finite set of possibilities.

4. Image Understanding: For images containing many objects, locating the objects in the image, classifying them, and building a three-dimensional model of the scene.
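To make the first two operations concrete, the following sketch treats an image as a grid of intensity levels, converts it to black/white, and measures the two-dimensional extent of the single object it contains. The function names, the threshold value, and the tiny example image are all hypothetical. Real systems use far larger images and more robust methods.

```python
def threshold(image, level):
    """Simple signal processing: turn intensity levels into a black/white (0/1) image."""
    return [[1 if pixel >= level else 0 for pixel in row] for row in image]


def extent(binary):
    """Measurement analysis: bounding box (top, left, bottom, right) and pixel count."""
    coords = [(r, c)
              for r, row in enumerate(binary)
              for c, value in enumerate(row) if value]
    if not coords:
        return None                               # no object in the image
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols)), len(coords)


# A 4 x 5 grid of intensity levels containing one bright object on a dark background.
image = [
    [10,  12,  11,  10,  9],
    [11, 200, 210,  12, 10],
    [10, 205, 220, 215, 11],
    [12,  11,  10,  12, 10],
]

binary = threshold(image, 128)
box, area = extent(binary)
```

Pattern recognition and image understanding build on measurements like these, but they also need the additional knowledge discussed in the rest of this section.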

Image understanding is the most difficult visual task, and it has been the subject of the most study in AI. While some aspects of image understanding reduce to measurement analysis and pattern recognition, the entire problem remains unsolved, because of difficulties that include the following:

An image is two-dimensional, while the world is three-dimensional. Some information is necessarily lost when an image is created.

Figure 1.2: An Ambiguous Image

One image may contain several objects, and some objects may partially occlude others.

The value of a single pixel is affected by many different phenomena, including the colour of the object, the source of the light, the angle and distance of the camera, the pollution in the air, etc. It is hard to disentangle these effects.

As a result, 2-D images are highly ambiguous. Given a single image, we could construct any number of 3-D worlds that would give rise to the image. For example, consider the ambiguous image of Figure 1.2. It is impossible to decide what 3-D solid it portrays. In order to determine the most likely interpretation of a scene, we have to apply several types of knowledge.

For example, we may invoke knowledge about low-level image features, such as shadows and textures. Figure 1.3 shows how such knowledge can help to disambiguate the image. Having multiple images of the same object can also be useful for recovering 3-D structure. The use of two or more cameras to acquire multiple simultaneous views of an object is called stereo vision. Moving objects (or moving cameras) also supply multiple views. Of course, we must also possess knowledge about how motion affects the images that get produced. Still more information can be gathered with a laser rangefinder, a device that returns an array of distance measures much like sonar does. While rangefinders are still somewhat expensive, integration of visual and range data will soon become commonplace. Integrating different sense modalities is called sensor fusion. Other image factors we might want to consider include shading, colour, and reflectance.

High-level knowledge is also important for interpreting visual data. For example, consider the ambiguous object at the center of Figure 1.4(a). While no low-level image features can tell us what the object is, the object’s surroundings provide us with top-down expectations. Expectations are critical for interpreting visual scenes. The preferred interpretations of egg, bacon, and plate reinforce each other mutually, providing the necessary expectations (Figure 1.4).

Figure 1.3: Using Low-Level Knowledge to Interpret an Image

Figure 1.4: Using High-Level Knowledge to Interpret an Image


Speech Processing

Natural language understanding systems usually accept typed input, but for a number of applications this is not acceptable. Spoken language is a more natural form of communication in many human-computer interfaces. Speech recognition systems have been available for some time, but their limitations have prevented widespread use. Below are five major design issues in speech systems. These issues also provide dimensions along which systems can be compared with one another.

Speaker Dependence versus Speaker Independence: A speaker-independent system can listen to any speaker and translate the sounds into written text. Speaker independence is hard to achieve because of the wide variations in pitch and accent. It is easier to build a speaker-dependent system, which can be trained on the voice patterns of a single speaker. The system will only work for that one speaker. It can be retrained on another voice, but then it will no longer work for the original speaker.

Continuous versus Isolated-Word Speech: Interpreting isolated-word speech, in which the speaker pauses between words, is easier than interpreting continuous speech. This is because boundary effects cause words to be pronounced differently in different contexts. For example, the spoken phrase “could you” contains a j sound, and although it consists of two words, there is no empty space between them in the speech wave. The ability to recognize continuous speech is very important, however, since humans have difficulty speaking in isolated words.

Real Time versus Offline Processing: Highly interactive applications require that a sentence be translated into text as it is being spoken, while in other situations, it is permissible to spend minutes in computation. Real-time speeds are hard to achieve, especially when higher-level knowledge is involved.

Large versus Small Vocabulary: Recognizing utterances that are confined to small vocabularies (e.g., 20 words) is easier than working with large vocabularies (e.g., 20,000 words). A small vocabulary helps to limit the number of word candidates for a given speech segment.

Broad versus Narrow Grammar: An example of a narrow grammar is the one for phone numbers: S → XXX-XXXX, where X is any digit between zero and nine.
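The phone-number grammar is narrow enough to be written as a single pattern. The sketch below is only an illustration of how such a grammar can prune a recognizer's candidate transcriptions. The function names and the sample hypotheses are assumptions, and a real speech system would apply the constraint during recognition rather than to finished strings.

```python
import re

# Narrow grammar from the text: three digits, a hyphen, four digits.
PHONE_GRAMMAR = re.compile(r"[0-9]{3}-[0-9]{4}")


def accepts(utterance):
    """Return True if the transcribed utterance is a sentence of the grammar."""
    return PHONE_GRAMMAR.fullmatch(utterance) is not None


def filter_hypotheses(hypotheses):
    """Keep only the candidate transcriptions that the grammar allows."""
    return [h for h in hypotheses if accepts(h)]


# Hypothetical recognizer output for one utterance.
candidates = ["555-1234", "five-1234", "5551234"]
legal = filter_hypotheses(candidates)
```

Only the first candidate survives, which is the sense in which a narrow grammar limits the hypotheses a speech system has to consider.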

Still, no speech system is 100 per cent accurate. There has recently been renewed interest in integrating speech recognition and natural language processing in order to overcome the final hurdle. For example, ATNs and unification-based grammars can be used to constrain the hypotheses made by a speech system. Thus far, integration has proved difficult, because natural language grammars do not offer much in the way of constraints.

In the speech recognition literature, there is a quantitative measure of grammar, called perplexity. Perplexity measures the number of words that can legally appear next in the input (on average). The telephone number recognition task has a perplexity of 10, because at any decision point, there are ten alternatives. On a sample 1000-word English task, a word-pair grammar may reduce the perplexity from 1000 down to 60. A bigram grammar may reduce it further, perhaps to 20 (Lee and Hon, 1988).
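The figure of 10 for telephone numbers can be checked with a short calculation. The sketch below uses the usual information-theoretic definition, perplexity = 2 to the power of the entropy of the next-word distribution, which reduces to a simple count of alternatives when all of them are equally likely. The function name and the skewed example distribution are assumptions made for illustration.

```python
import math


def perplexity(next_word_probs):
    """Perplexity (2 ** entropy) of one next-word probability distribution."""
    entropy = -sum(p * math.log2(p) for p in next_word_probs if p > 0)
    return 2 ** entropy


# Telephone-number task: ten equally likely digits at every decision point.
uniform_digits = [1 / 10] * 10
digit_perplexity = perplexity(uniform_digits)

# Hypothetical skewed case: one digit is far more likely than the rest,
# so the recognizer is effectively choosing among fewer alternatives.
skewed_digits = [0.91] + [0.01] * 9
skewed_perplexity = perplexity(skewed_digits)
```

With ten equally likely digits the result is 10 (up to floating-point rounding), and any non-uniform distribution over the same ten digits gives a smaller value.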

While natural language grammars accurately predict word categories (such as noun and verb), they say nothing about which words within a category are likely to show up in the input. For example, given the word “the,” a grammar might hypothesize that the next word is either an adjective or a noun. But this knowledge does us little good when there are thousands of possible adjectives and nouns to choose from. Thus, it is natural to turn to statistical, or collocational, facts about language. For example, if the word “doctor” is recognized, then one might expect to hear the word “nurse” later in the input, but not “horse”. Collocational data, unlike more complex syntactic and semantic structures, can be extracted automatically from large on-line bodies of text. Ultimately, we want to substitute semantic and discourse information for statistical data. If we know the conversation is about doctors, and if we know that doctors and nurses typically work together, then we should be able to generate the proper expectations. Such a strategy will require large knowledge bases and a deeper understanding of semantics and discourse.
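As a rough illustration of how collocational data can be extracted automatically from text, the sketch below counts how often two words occur in the same sentence. The three-sentence corpus and the function names are invented for the example. A practical system would use a far larger corpus and a proper statistical measure rather than raw counts.

```python
from collections import Counter
from itertools import combinations


def collocation_counts(sentences):
    """Count unordered word pairs that occur together in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


def together(counts, first, second):
    """Look up a pair regardless of the order in which the words are given."""
    return counts[tuple(sorted((first, second)))]


# Tiny hypothetical corpus.
corpus = [
    "the doctor called the nurse",
    "a nurse helped the doctor",
    "the horse ate the hay",
]
counts = collocation_counts(corpus)
doctor_nurse = together(counts, "doctor", "nurse")
doctor_horse = together(counts, "doctor", "horse")
```

In this toy corpus "doctor" and "nurse" co-occur while "doctor" and "horse" never do, so hearing "doctor" would raise the expectation of "nurse" but not of "horse".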

Robotics

Robots have found numerous applications in industrial settings. Robot manipulators are able to perform simple repetitive tasks, such as bolting and fitting automobile parts, but these robots are highly task-specific. It is a long-standing goal in robotics to build robots that can be programmed to carry out a wide variety of tasks.

A manipulator is composed of a series of links and joints, usually terminating in an end-effector, which can take the form of a two-pronged gripper, a humanlike hand, or any of a variety of tools. One general manipulation problem is called pick-and-place, in which a robot must grasp an object and move it to a specific location. For example, consider Figure 1.5, where the goal is to place a peg in a hole.

Figure 1.5: A Pick-and-Place Task

There are two main subtasks here. The first is to design a robot motion that ends with the object stably grasped between the two fingers of the robot. Clearly some form of path planning, as discussed above, can be used to move the arm toward the object, but we need to modify the technique when it comes to the fine motion involved in the grasp itself. Here, uncertainty is a critical problem. A robot can never be sure of the precise location of the peg or the arm. Therefore, it would be a mistake to plan a grasp motion in which the gripper is spread only wide enough to permit the peg to pass, as in Figure 1.6(a). A better strategy is to open the gripper wide, then close gradually as the gripper gets near the peg, as in Figure 1.6(b). That way, if the peg turns out to be located some small distance away from where we thought it was, the grasp will still succeed. Although this strategy depends less on precise vision, it requires some tactile sensitivity in order to terminate the grasp. Unless we take special care in designing grasping motions, uncertainty can lead to disasters. For example, should the left side of the gripper touch the peg one second before the right side does, the peg may fall, thus foiling the grasp. Brost (1988) and Mason et al. (1988) give robust algorithms for grasping a wide variety of objects.
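The difference between the two grasping strategies can be seen in a deliberately simplified, one-dimensional model. This is a sketch under stated assumptions, not one of the cited algorithms: the gripper is centred on the estimated peg position, the real peg is offset by some error, and the grasp can start only if the whole peg lies between the open fingers. All names and dimensions are invented for illustration.

```python
def grasp_can_start(opening_mm, peg_width_mm, position_error_mm):
    """True if the whole peg lies between the fingers before they close.

    The gripper is centred on the estimated peg position; the real peg is
    displaced from that estimate by position_error_mm.
    """
    return abs(position_error_mm) + peg_width_mm / 2 <= opening_mm / 2


PEG_WIDTH = 10.0        # hypothetical peg width in millimetres
NARROW_OPENING = 11.0   # spread only just wide enough for the peg
WIDE_OPENING = 30.0     # opened wide, to be closed gradually

error = 2.0             # hypothetical error in the estimated peg position
narrow_ok = grasp_can_start(NARROW_OPENING, PEG_WIDTH, error)
wide_ok = grasp_can_start(WIDE_OPENING, PEG_WIDTH, error)
```

With a two-millimetre error the narrow opening collides with the peg while the wide opening still encloses it, which is the reason for opening wide and relying on touch to finish the grasp.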

After the peg is stably grasped, the robot must place it in the hole. This subtask resembles the path-planning problem, although it is complicated by the fact that moving the peg through 3-D space requires careful orchestration of the arm’s joints. Also, we must seriously consider the problems introduced by uncertainty. If we simply try to lower the peg straight into the hole, failure will result from even a slight positioning error, because the peg will jam flatly on the outer surface. A better strategy is to slide the peg along the surface, applying downward pressure so that the peg enters the hole at an angle. After this happens, we straighten the peg gradually and push it down into the hole.

This type of motion, which reacts to forces generated by the world, is called compliant motion. Compliant motion is very robust in the face of uncertainty. Humans employ compliant motion in a wide variety of activities, such as writing on chalkboards.

Figure 1.6: Naïve and Clever Strategies for Grasping

So given a pick-and-place problem, how can we automatically generate a sequence of compliant motions? One approach (Lozano-Perez et al., 1984) is to use the familiar problem-solving process of backward chaining. Our initial and goal states for the peg-in-hole problem are represented as points in configuration space. First, we compute the set of points in 2-space from which we are guaranteed to reach the goal state in a single compliant motion, assuming a certain degree of uncertainty in initial position and direction of movement and certain facts about relative friction. This set of points is called the goal state’s strong pre-image. Now we use backward chaining to design a set of motions that is guaranteed to get us from the initial state to some point in the goal state’s strong pre-image. Recursively applying this procedure will eventually yield a set of motions that, while individually uncertain, combine to form a guaranteed plan.
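The backward-chaining idea can be sketched on a toy problem. The code below is an illustrative simplification, not the algorithm of Lozano-Perez et al.: configuration space is reduced to seven positions on a line, the hole is at position 3, and each sliding motion moves the peg one or two positions (the uncertainty) but stops as soon as the peg drops into the hole (the compliance). All names, the motion model, and the numbers are assumptions.

```python
HOLE, LAST = 3, 6  # hypothetical 1-D world: positions 0..6, hole at 3


def outcomes(position, motion):
    """All positions where an uncertain sliding motion might end."""
    step = 1 if motion == "slide_right" else -1
    results = set()
    for distance in (1, 2):           # uncertainty: slides 1 or 2 positions
        y = position
        for _ in range(distance):
            if y == HOLE:             # compliance: the peg has dropped in
                break
            y = min(max(y + step, 0), LAST)
        results.add(y)
    return results


def guaranteed_preimage(target, motion, states):
    """States from which `motion` is certain to end inside `target`."""
    return {s for s in states if outcomes(s, motion) <= target}


def backward_chain(initial, goal, motions, states):
    """Grow the guaranteed region backward from the goal.

    Returns a dict mapping each covered state to the motion to use there,
    or None if no guaranteed plan reaches the goal from `initial`.
    """
    covered, plan = set(goal), {}
    changed = True
    while changed and initial not in covered:
        changed = False
        for motion in motions:
            new = guaranteed_preimage(covered, motion, states) - covered
            for s in new:
                plan[s] = motion
            if new:
                covered |= new
                changed = True
    return plan if initial in covered else None


plan = backward_chain(0, {HOLE}, ["slide_right", "slide_left"], range(LAST + 1))
```

Each pass extends the set of positions from which success is certain, so a motion chosen for a newly covered position always ends in a position that already has a guaranteed plan.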



1. Describe scenarios in which the following features are critical:

a. Reactivity: The robot must react quickly to a changing environment.

b. Robustness: The robot must act appropriately, in spite of incomplete or inexact sensory data.

c. Recoverability: When a plan fails to bring about expected results, the robot must find another way to achieve its goal.

2. Describe three different ways of combining speech recognition with a natural language understanding system. Compare and contrast them in terms of expected performance and ease of implementation.

3. Say each of the following phrases very slowly, and write down the sounds you use. Then gradually speed up, and continue to write down the sounds. Finally, say them the way you would in ordinary speech. How do the sounds change as you move through each series? What are the implications of these changes for continuous speech recognition?

a. could you

b. boy’s school

c. the store, the elevator

d. sharp point

e. stop it

f. want to go


Expert System

Expert systems are a recent product of Artificial Intelligence. They began to emerge as university research systems during the early 1970s. They have now become one of the more important innovations of AI, since they have been shown to be successful commercial products as well as interesting research tools.

Application domains of expert systems include planning, chemistry, biology, engineering, manufacturing, aerospace, military operations, finance, banking, meteorology, geology, geophysics and more.

“An expert system is a set of programs that manipulate encoded knowledge to solve problems in a specialized domain that normally requires human expertise. An expert system’s knowledge is obtained from expert sources and coded in a form suitable for the system to use in its inference or reasoning processes. The expert knowledge must be obtained from specialists or other sources of expertise, such as texts, journal articles and databases.”

We will discuss expert systems in detail in Unit 8.

Search Knowledge

In order to solve the complex problems encountered in artificial intelligence, one needs both a large amount of knowledge and some mechanisms for manipulating that knowledge to create solutions to new problems. That is, even if we have knowledge that is sufficient to solve a problem, we still have to search that knowledge for our goal. To search a knowledge base efficiently, the knowledge must be represented in a systematic way. Knowledge searching is a basic problem in Artificial Intelligence. The knowledge can be represented either in the form of facts or in some formalism. A major concept is that while intelligent programs require search, search is computationally intractable unless it is constrained by knowledge about the world. In large knowledge bases that contain thousands of rules, the intractability of search is an overriding concern. When there are many possible paths of reasoning, it is clear that fruitless ones should not be pursued. Knowledge about which paths are most likely to lead quickly to a goal state is often called search control knowledge.
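The effect of search control knowledge can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not a method from this unit: the problem is to cross a 10 by 10 grid from one corner to the opposite corner, and the same search routine is run twice, once with no guidance (every state gets the same priority, so states are expanded in the order they were found) and once guided by the straight-line grid distance to the goal. All names are invented for the example.

```python
import heapq
from itertools import count


def best_first(start, goal, neighbors, priority):
    """Return how many states were expanded before the goal was reached."""
    tie = count()  # breaks ties in first-in, first-out order
    frontier = [(priority(start), next(tie), start)]
    seen = {start}
    expanded = 0
    while frontier:
        _, _, state = heapq.heappop(frontier)
        expanded += 1
        if state == goal:
            return expanded
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (priority(nxt), next(tie), nxt))
    return None  # goal unreachable


SIZE = 10
GOAL = (SIZE - 1, SIZE - 1)


def grid_neighbors(state):
    """Moves up, down, left, and right that stay on the grid."""
    x, y = state
    steps = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in steps if 0 <= a < SIZE and 0 <= b < SIZE]


def distance_to_goal(state):
    """Search control knowledge: how far this state is from the goal."""
    return abs(GOAL[0] - state[0]) + abs(GOAL[1] - state[1])


blind = best_first((0, 0), GOAL, grid_neighbors, lambda state: 0)
guided = best_first((0, 0), GOAL, grid_neighbors, distance_to_goal)
```

The guided run expands far fewer states than the blind run because it never pursues paths that lead away from the goal. On harder problems with obstacles, such simple guidance can mislead the search, so this shows the benefit of control knowledge rather than a general-purpose strategy.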

Abstraction

In order to solve hard problems, a problem solver may have to generate long plans. In order to do that efficiently, it is important to be able to eliminate some of the details of the problem until a solution that addresses the main issue is found. Then an attempt can be made to fill in the appropriate details. Abstraction means hiding the details of something. For example, if we want to compute the square root of a number, we simply call the function sqrt in C. We do not need to know the implementation details of this function. Early attempts at abstraction in planning involved the use of macro-operators, in which large operators were built from smaller ones. But in this approach, no details were eliminated from the actual description of the operators. A better approach was developed in the ABSTRIPS system, which actually planned in a hierarchy of abstraction spaces, in each of which preconditions at a lower level of abstraction were ignored.
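The idea of ignoring low-level preconditions can be sketched in a few lines. The code below is only an illustration in the spirit of ABSTRIPS, which attaches a criticality value to each precondition. The operator, its preconditions, and the criticality numbers are invented, and no actual planning is performed.

```python
def abstract_preconditions(preconditions, level):
    """Keep only the preconditions that matter at this abstraction level.

    preconditions: dict mapping a precondition name to its criticality
    level:         preconditions with criticality below this are ignored
    """
    return {name for name, criticality in preconditions.items()
            if criticality >= level}


# Hypothetical operator: higher numbers mark more important preconditions.
PUSH_BOX_THROUGH_DOOR = {
    "box_is_pushable": 3,
    "robot_in_same_room_as_box": 2,
    "door_is_open": 1,
}

most_abstract = abstract_preconditions(PUSH_BOX_THROUGH_DOOR, 3)
intermediate = abstract_preconditions(PUSH_BOX_THROUGH_DOOR, 2)
fully_detailed = abstract_preconditions(PUSH_BOX_THROUGH_DOOR, 1)
```

A planner working this way would first find a plan using only the most critical preconditions, then lower the level step by step and repair the plan as the ignored details reappear.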


Answer the following questions.

1. What is the difference between the vision and speech problems in AI?

2. Describe the importance of Expert System.

Summary

Artificial intelligence (AI) is the study of how to make computers do things that, at the moment, people do better.

Perceptual tasks are difficult because they involve analog (rather than digital) signals; the signals are typically very noisy and usually a large number of things (some of which may be partially obscuring others) must be perceived at once.

The physical symbol system hypothesis is a significant theory of the nature of human intelligence and forms the basis of the belief that it is possible to build programs that can perform intelligent tasks now performed by people.

Artificial intelligence problems appear to have very little in common except that they are hard.

Knowledge possesses some less desirable properties - it is voluminous, it is hard to characterise accurately, it is constantly changing and it differs from data by being organized in a way that corresponds to the ways it will be used.

An AI technique is a method that exploits knowledge.

A program is said to understand a natural language if it behaves by taking a (predictably) correct or acceptable action in response to the input.

Some of the popular application areas of AI include robotics, natural language processing, theorem proving, and pattern recognition.

Self-assessment Questions

Fill in the blanks (Solved)

1. ____________ is the study of how to make computers do things which, at the moment, people do better.

2. The core research in artificial intelligence lies in ________ hypothesis.

Answers

1. Artificial Intelligence

2. physical symbol system

True or False (Solved)

1. Robotics is an artificial intelligence application area.

2. Natural languages follow strict grammar rules.

Answers

1. True

2. False

Fill in the blanks (Unsolved)

1. __________ and ___________ are essential components of intelligent behaviour.

2. Spoken language is a more natural form of ___________________ in many human-computer interfaces.

3. ____________ recognition classifies the object into a category drawn from a finite set of possibilities.

True or False (Unsolved)

1. It is very easy to characterize knowledge.

2. AI techniques minimize the use of knowledge in solving a problem.

3. AI problems may be solved without using AI techniques.

Detailed Questions

1. Pick a specific topic within the scope of AI and use the sources described in this chapter to do a preliminary literature search to determine what the current state of understanding of that topic is. If you cannot think of a more novel topic, try one of the following: expert systems for some specific domain (e.g., cancer therapy, computer design, financial planning), recognizing motion in images, using natural (i.e., humanlike) methods for proving mathematical theorems, resolving pronominal references in natural language texts, representing sequences of events in time, or designing a memory organization scheme for knowledge in a computer system based on our knowledge of human memory organization.

2. Explore the spectrum from static to AI-based techniques for a problem other than the two discussed in this chapter. Think of your own problem or use one of the following:

Translating an English sentence into Japanese.

Teaching a child to subtract integers.

Discovering patterns in empirical data taken from scientific experiments, and suggesting further experiments to find more patterns.

3. Write short notes on the following:

a. Robotics

b. Theorem Proving

c. Search Knowledge
