Advanced Topics in Robotics CS493/790 (X) Lecture 2 Instructor: Monica Nicolescu.

Advanced Topics in Robotics

CS493/790 (X)

Lecture 2

Instructor: Monica Nicolescu

CS 493/790(X) - Lecture 2 2

Robot Components

• Sensors

• Effectors and actuators

– Used for locomotion and manipulation

• Controllers for the above systems

– Coordinating information from sensors

with commands for the robot’s actuators

• Robot = an autonomous system which exists in the

physical world, can sense its environment and can

act on it to achieve some goals

CS 493/790(X) - Lecture 2 3

Sensors

• Sensor = physical device that provides

information about the world

– Process is called sensing or perception

• Robot sensing depends on the task

• Sensor (perceptual) space

– All possible values of sensor readings

– One needs to “see” the world through the robot’s “eyes”

– Grows quickly as you add more sensors

CS 493/790(X) - Lecture 2 4

State Space

• State: A description of the robot (of a system in general)

• State space: All possible states a robot could be in

– E.g.: light switch has two states, ON, OFF; light switch with

dimmer has continuous state (possibly infinitely many states)

• Different than the sensor/perceptual space!!

– Internal state may be used to store information about the world

(maps, location of “food”, etc.)

• How intelligent a robot appears is strongly dependent on how

much and how fast it can sense its environment and about

itself

CS 493/790(X) - Lecture 2 5

Representation

• Internal state that stores information about the world

is called a representation or internal model

– Self: stored proprioception, goals, intentions, plans

– Environment: maps

– Objects, people, other robots

– Task: what needs to be done, when, in what order

• Representations and models influence determine

the complexity of a robot’s “brain”

CS 493/790(X) - Lecture 2 6

Action

• Effectors: devices of the robot that have impact on

the environment (legs, wings robotic legs,

propeller)

• Actuators: mechanisms that allow the effectors to

do their work (muscles motors)

• Robotic actuators are used for

– locomotion (moving around, going places)

– manipulation (handling objects)

CS 493/790(X) - Lecture 2 7

Autonomy

• Autonomy is the ability to make one’s own decisions

and act on them.

– For robots: take the appropriate action on a given situation

• Autonomy can be complete (R2D2) or partial

(teleoperated robots)

CS 493/790(X) - Lecture 2 8

Control Architectures

• Robot control is the means by which the sensing and

action of a robot are coordinated

• Controllers enable robots to be autonomous

– Play the role of the “brain” and nervous system in animals

• Controllers need not (should not) be a single program

– Typically more than one controller, each process information

from sensors and decide what actions to take

– Should control modules be centralized?

• Challenge: how do all these controllers coordinate

with each other?

CS 493/790(X) - Lecture 2 9

Spectrum of robot control

From “Behavior-Based Robotics” by R. Arkin, MIT Press, 1998

CS 493/790(X) - Lecture 2 10

Robot control approaches

• Reactive Control

– Don’t think, (re)act.

• Deliberative (Planner-based) Control

– Think hard, act later.

• Hybrid Control

– Think and act separately & concurrently.

• Behavior-Based Control (BBC)

– Think the way you act.

CS 493/790(X) - Lecture 2 11

Thinking vs. Acting

• Thinking/Deliberating– slow, speed decreases with complexity

– involves planning (looking into the future) to avoid bad solutions

– thinking too long may be dangerous

– requires (a lot of) accurate information

– flexible for increasing complexity

• Acting/Reaction – fast, regardless of complexity

– innate/built-in or learned (from looking into the past)

– limited flexibility for increasing complexity

CS 493/790(X) - Lecture 2 12

Reactive Control: Don’t think, react!

• Technique for tightly coupling perception and action to provide

fast responses to changing, unstructured environments

• Collection of stimulus-response rules

• Limitations

– No/minimal state

– No memory

– No internal representations

of the world

– Unable to plan ahead

– Unable to learn

• Advantages

– Very fast and reactive

– Powerful method: animals

are largely reactive

CS 493/790(X) - Lecture 2 13

Deliberative Control: Think hard, then act!

• In DC the robot uses all the available sensory information and

stored internal knowledge to create a plan of action: sense

plan act (SPA) paradigm

• Limitations

– Planning requires search through potentially all possible plans

these take a long time

– Requires a world model, which may become outdated

– Too slow for real-time response

• Advantages

– Capable of learning and prediction

– Finds strategic solutions

CS 493/790(X) - Lecture 2 14

Hybrid Control: Think and act independently & concurrently!

• Combination of reactive and deliberative control

– Reactive layer (bottom): deals with immediate reaction

– Deliberative layer (top): creates plans

– Middle layer: connects the two layers

• Usually called “three-layer systems”

• Major challenge: design of the middle layer

– Reactive and deliberative layers operate on very different

time-scales and representations (signals vs. symbols)

– These layers must operate concurrently

• Currently one of the two dominant control paradigms

in robotics

CS 493/790(X) - Lecture 2 15

Behavior-Based Control: Think the way you act!

• Behaviors: concurrent processes that take inputs from

sensors and other behaviors and send outputs to a

robot’s actuators or other behaviors to achieve some

goals

• An alternative to hybrid control, inspired from biology

• Has the same capabilities as hybrid control:

– Act reactively and deliberatively

• Also built from layers

– However, there is no intermediate layer

– Components have a uniform representation and time-scale

CS 493/790(X) - Lecture 2 16

Fundamental Differences of Control

• Time-scale: How fast do things happen?

– how quickly the robot has to respond to the environment,

compared to how quickly it can sense and think

• Modularity: What are the components of the control

system?

– Refers to the way the control system is broken up into

modules and how they interact with each other

• Representation: What does the robot keep in its brain?

– The form in which information is stored or encoded in the

robot

CS 493/790(X) - Lecture 2 17

Behavior Coordination

• Behavior-based systems require consistent

coordination between the component behaviors for

conflict resolution

• Coordination of behaviors can be:

– Competitive: one behavior’s output is selected from

multiple candidates

– Cooperative: blend the output of multiple behaviors

– Combination of the above two

CS 493/790(X) - Lecture 2 18

Competitive Coordination

• Arbitration: winner-take-all strategy only one response chosen

• Behavioral prioritization

– Subsumption Architecture

• Action selection/activation spreading (Pattie Maes)

– Behaviors actively compete with each other

– Each behavior has an activation level driven by the robot’s

goals and sensory information

• Voting strategies

– Behaviors cast votes on potential responses

CS 493/790(X) - Lecture 2 19

Cooperative Coordination

• Fusion: concurrently use the output of multiple behaviors

• Major difficulty in finding a uniform command

representation amenable to fusion

• Fuzzy methods

• Formal methods

– Potential fields

– Motor schemas

– Dynamical systems

CS 493/790(X) - Lecture 2 20

Fusion: flocking (formations)

Example of Behavior Coordination

Arbitration: foraging (search, coverage)

CS 493/790(X) - Lecture 2 22

Learning & Adaptive Behavior

• Learning produces changes within an agent that

over time enable it to perform more effectively within

its environment

• Adaptation refers to an agent’s learning by making

adjustments in order to be more attuned to its

environment

– Phenotypic (within an individual agent) or genotypic

(evolutionary)

– Acclimatization (slow) or homeostasis (rapid)

CS 493/790(X) - Lecture 2 23

Learning

Learning can improve performance in additional ways:

• Introduce new knowledge (facts, behaviors, rules)

• Generalize concepts

• Specialize concepts for specific situations

• Reorganize information

• Create or discover new concepts

• Create explanations

• Reuse past experiences

CS 493/790(X) - Lecture 2 24

Learning Methods

• Reinforcement learning

• Neural network (connectionist) learning

• Evolutionary learning

• Learning from experience

– Memory-based

– Case-based

• Learning from demonstration

• Inductive learning

• Explanation-based learning

• Multistrategy learning

CS 493/790(X) - Lecture 2 25

Reinforcement Learning (RL)

• Motivated by psychology (the Law of Effect, Thorndike

1991):

Applying a reward immediately after the occurrence of a response increases its probability of reoccurring, while providing punishment after the response will decrease the probability

• One of the most widely used methods for adaptation in

robotics

CS 493/790(X) - Lecture 2 26

Reinforcement Learning

• Goal: learn an optimal policy that chooses the

best action for every set of possible inputs

• Policy: state/action mapping that determines

which actions to take

• Desirable outcomes are strengthened and undesirable

outcomes are weakened

• Critic: evaluates the system’s response and applies

reinforcement

– external: the user provides the reinforcement

– internal: the system itself provides the reinforcement (reward

function)

CS 493/790(X) - Lecture 2 27

Learning to Walk

• Maes, Brooks (1990)

• Genghis: hexapod robot

• Learned stable tripod

stance and tripod gait

• Rule-based subsumption

controller

• Two sensor modalities for feedback:

– Two touch sensors to detect hitting the floor: - feedback

– Trailing wheel to measure progress: + feedback

CS 493/790(X) - Lecture 2 28

Learning to Walk

• Nate Kohl & Peter Stone (2004)

CS 493/790(X) - Lecture 2 29

Supervised Learning

• Supervised learning requires the user to give the

exact solution to the robot in the form of the error

direction and magnitude

• The user must know the exact desired behavior for

each situation

• Supervised learning involves training, which can be

very slow; the user must supervise the system with

numerous examples

CS 493/790(X) - Lecture 2 30

Neural Networks

• One of the most used supervised learning methods

• Used for approximating real-valued and vector-

valued target functions

• Inspired from biology: learning systems are built

from complex networks of interconnecting neurons

• The goal is to minimize the error between the

network output and the desired output

– This is achieved by adjusting the weights on the network

connections

CS 493/790(X) - Lecture 2 31

ALVINN• ALVINN (Autonomous Land

Vehicle in a Neural Network)

• Dean Pomerleau (1991)

• Pittsburg to San Diego: 98.2%

autonomous

CS 493/790(X) - Lecture 2 32

Learning from Demonstration & RL

• S. Schaal (’97) • Pole balancing, pendulum-swing-up

CS 493/790(X) - Lecture 2 33

Learning from Demonstration

Inspiration:

• Human-like teaching by demonstration

Demonstration Robot performance

CS 493/790(X) - Lecture 2 34

Learning from Robot Teachers

• Transfer of task knowledge from humans to robots

Human demonstration Robot performance

CS 493/790(X) - Lecture 2 35

Classical Conditioning

• Pavlov 1927

• Assumes that unconditioned stimuli (e.g. food)

automatically generate an unconditioned

response (e.g., salivation)

• Conditioned stimulus (e.g., ringing a bell) can,

over time, become associated with the

unconditioned response

CS 493/790(X) - Lecture 2 36

Darvin’s Perceptual Categorization

• Two types of stimulus blocks– 6cm metallic cubes– Blobs: low conductivity (“bad taste”)– Stripes: high conductivity (“good taste”)

• Instead of hard-wiring stimulus-response rules, develop these associations over time

Early training After the 10th stimulus

CS 493/790(X) - Lecture 2 37

Genetic Algorithms

• Inspired from evolutionary biology

• Individuals in a populations have a particular fitness

with respect to a task

• Individuals with the highest fitness are kept as

survivors

• Individuals with poor performance are discarded: the

process of natural selection

• Evolutionary process: search through the space

of solutions to find the one with the highest fitness

CS 493/790(X) - Lecture 2 38

Genetic Operators

• Knowledge is encoded as bit strings: chromozome– Each bit represents a “gene”

• Biologically inspired operators are applied to yield

better generations

CS 493/790(X) - Lecture 2 39

Evolving Structure and Control

• Karl Sims 1994

• Evolved morphology and control

for virtual creatures performing

swimming, walking, jumping,

and following

• Genotypes encoded as directed graphs are used to produce

3D kinematic structures

• Genotype encode points of attachment

• Sensors used: contact, joint angle and photosensors

CS 493/790(X) - Lecture 2 40

Evolving Structure and Control

• Jordan Pollak

– Real structures

CS 493/790(X) - Lecture 2 41

Readings

• F. Martin: Sections 1.1, 1.2.3, 5

• M. Matarić: Chapters 1, 3, 10

Advanced Topics in Robotics CS493/790 (X) Lecture 2 Instructor: Monica Nicolescu.

Documents

Transcript of Advanced Topics in Robotics CS493/790 (X) Lecture 2 Instructor: Monica Nicolescu.