COMP 4180: Intelligent Mobile Robotics

Reinforcement Learning

Jacky Baltes

Department of Computer Science

University of Manitoba

Email: jacky@cs.umanitoba.ca

http://www4.cs.umanitoba.ca/~jacky/...Teaching/Courses/COMP_4180-IntelligentMobileRobotics/current/index.php

Outline

● Reinforcement Learning Problem
– Dynamic Programming
– Control learning
– Control policies that choose optimal actions
– Q Learning
– Convergence
● Monte-Carlo Methods
● Temporal Difference Learning

Control Learning

Example: TD-Gammon

Reinforcement Learning Problem

Markov Decision Processes

Agent's Learning Task

State Value Function

Bellman Equation (Deterministic Case)
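The slide body did not survive extraction. As a reminder, the deterministic Bellman equation in its standard textbook form is given below. The symbols are assumptions chosen to match the δ : S × A → S notation used later in the deck: r(s, a) is the immediate reward, δ(s, a) the deterministic successor state, γ the discount factor, and π the policy.

```latex
% Value of following policy \pi from state s (deterministic transitions)
V^{\pi}(s) = r\bigl(s, \pi(s)\bigr) + \gamma \, V^{\pi}\bigl(\delta(s, \pi(s))\bigr)

% Optimal value function: take the best action instead of the policy's action
V^{*}(s) = \max_{a} \Bigl[ r(s, a) + \gamma \, V^{*}\bigl(\delta(s, a)\bigr) \Bigr]
```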

Example

Iterative Policy Evaluation
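A minimal sketch of iterative policy evaluation for a deterministic MDP, assuming the Bellman backup V(s) ← r(s, π(s)) + γ V(δ(s, π(s))). The three-state corridor, the function names, and the convention that `None` marks the end of an episode are all hypothetical and not taken from the slides.

```python
def evaluate_policy(states, policy, delta, reward, gamma=0.9, tol=1e-8):
    """Sweep over all states until the value estimates stop changing.

    delta(s, a)  -> next state, or None when the episode ends (assumption)
    reward(s, a) -> immediate reward
    """
    V = {s: 0.0 for s in states}
    while True:
        biggest_change = 0.0
        for s in states:
            a = policy[s]
            s_next = delta(s, a)
            future = gamma * V[s_next] if s_next is not None else 0.0
            backup = reward(s, a) + future
            biggest_change = max(biggest_change, abs(backup - V[s]))
            V[s] = backup  # in-place update, reused later in the same sweep
        if biggest_change < tol:
            return V


# Hypothetical corridor: always move right; leaving state 2 pays 1.
states = [0, 1, 2]
policy = {s: "right" for s in states}
delta = lambda s, a: s + 1 if s < 2 else None
reward = lambda s, a: 1.0 if s == 2 else 0.0

V = evaluate_policy(states, policy, delta, reward, gamma=0.9)
```

With γ = 0.9 the values settle at 1.0, 0.9 and 0.81 for states 2, 1 and 0, i.e. the reward discounted once per step of distance.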

What to learn?

Q (Action-Value) Function

Bellman Equation (Deterministic Case)

Optimal Value Functions

Policy Improvement

Example

Generalized Policy Iteration

Value Iteration: Q-Learning

Non-deterministic Case

Bellman Equations (Non-deterministic Case)

Value Iteration: Q-Learning

Example

Reinforcement Learning

Monte-Carlo Methods: Policy Evaluation

Monte Carlo Method: Policy Evaluation

Temporal Difference (TD) Learning

TD(0): Policy Evaluation
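A hedged sketch of tabular TD(0) policy evaluation, using the standard update V(s) ← V(s) + α [r + γ V(s') − V(s)]. The corridor environment, the `step` signature, and the use of `None` as the terminal marker are illustrative assumptions.

```python
def td0_evaluate(policy, step, start_state, episodes=2000, alpha=0.1, gamma=0.9):
    """Estimate V for a fixed policy from sampled transitions.

    step(s, a) -> (next_state, reward); next_state is None at episode end.
    """
    V = {}
    for _ in range(episodes):
        s = start_state
        while s is not None:
            s_next, r = step(s, policy(s))
            v_next = V.get(s_next, 0.0) if s_next is not None else 0.0
            td_error = r + gamma * v_next - V.get(s, 0.0)
            V[s] = V.get(s, 0.0) + alpha * td_error  # bootstrap from V(s')
            s = s_next
    return V


# Hypothetical 3-state corridor: leaving state 2 ends the episode with reward 1.
def corridor_step(s, a):
    return (s + 1, 0.0) if s < 2 else (None, 1.0)

V = td0_evaluate(lambda s: "right", corridor_step, start_state=0)
```

Unlike a full sweep over the state space, each update here touches only the state just visited, so the estimate can be improved while the robot is still acting.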

ε-Greedy Policy
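One common way to write an ε-greedy action selector is sketched below: with probability ε pick a random action, otherwise pick the action with the highest current Q estimate. The dictionary layout `Q[(state, action)]` and the function name are assumptions made for illustration.

```python
import random

def epsilon_greedy(Q, state, actions, epsilon=0.1, rng=random):
    """Return a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    # exploit: unseen (state, action) pairs default to a value of 0
    return max(actions, key=lambda a: Q.get((state, a), 0.0))


# Usage with a hypothetical two-action table
Q = {("s0", "left"): 0.2, ("s0", "right"): 0.7}
choice = epsilon_greedy(Q, "s0", ["left", "right"], epsilon=0.0)
```

With `epsilon=0.0` the selector is purely greedy, so `choice` is `"right"` here.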

SARSA Policy Iteration
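A minimal SARSA sketch under stated assumptions: a hypothetical five-state corridor with a cost of −1 per move, an ε-greedy behaviour policy, and the on-policy update Q(s,a) ← Q(s,a) + α [r + γ Q(s',a') − Q(s,a)]. None of the names or parameter values come from the slides.

```python
import random

ACTIONS = ["left", "right"]
GOAL = 4  # hypothetical corridor: states 0..4, episode ends at state 4

def step(s, a):
    """Deterministic corridor dynamics with a cost of -1 per move."""
    s_next = min(s + 1, GOAL) if a == "right" else max(s - 1, 0)
    return s_next, -1.0

def epsilon_greedy(Q, s, epsilon, rng):
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def sarsa(episodes=3000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(Q, s, epsilon, rng)
        while s != GOAL:
            s_next, r = step(s, a)
            a_next = epsilon_greedy(Q, s_next, epsilon, rng)
            # On-policy target: uses the action the agent will actually take.
            # Q at the goal is never updated, so it stays 0 (terminal value).
            target = r + gamma * Q[(s_next, a_next)]
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s, a = s_next, a_next
    return Q

Q = sarsa()
greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
```

Because the target uses the action actually chosen next, the learned values include the cost of the occasional exploratory move.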

SARSA Example

SARSA Example V(s)

SARSA Example Q(s,a)

Rotational Inverted Pendulum

Rotational Inverted Pendulum Stabilization Demo, Tor Aamodt
http://www.eecg.utoronto.ca/~aamodt/BAScThesis/RLsim.htm

Q-Learning (Off-Policy TD)
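A sketch of tabular Q-learning, assuming the update Q(s,a) ← Q(s,a) + α [r + γ max_a' Q(s',a') − Q(s,a)]. To make the off-policy nature visible, the behaviour policy here is uniformly random while the learned values describe the greedy policy. The corridor, reward of 1 on reaching the goal, and all names are hypothetical.

```python
import random

ACTIONS = ["left", "right"]
GOAL = 4  # hypothetical corridor: states 0..4, episode ends at state 4

def step(s, a):
    """Deterministic corridor; reward 1 only on the move that reaches the goal."""
    s_next = min(s + 1, GOAL) if a == "right" else max(s - 1, 0)
    return s_next, (1.0 if s_next == GOAL else 0.0)

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            a = rng.choice(ACTIONS)  # behaviour policy: uniformly random
            s_next, r = step(s, a)
            # Off-policy target: best next action, not the one taken next.
            best_next = max(Q[(s_next, b)] for b in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q

Q = q_learning()
greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
```

In this deterministic setting the estimates for moving right should approach γ raised to the number of remaining steps minus one (1, 0.9, 0.81, 0.729 from state 3 down to state 0), even though the agent never acts greedily while learning.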

Q-Learning (Off-Policy Iteration)

TD vs Monte Carlo

Temporal Difference Learning

Monte Carlo Method

N-Step return
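The slide body is missing from the transcript. For reference, the n-step return as defined in Sutton and Barto (1998) is reproduced below. The notation (r for rewards, V_t for the current value estimate, γ for the discount factor) follows that book and is assumed, not confirmed, to match the slides.

```latex
% n-step return: n real rewards, then bootstrap from the current estimate
R_t^{(n)} = r_{t+1} + \gamma \, r_{t+2} + \dots + \gamma^{n-1} r_{t+n}
            + \gamma^{n} \, V_t(s_{t+n})

% n = 1 gives the TD(0) target; n \to \infty gives the Monte Carlo return.
% The \lambda-return used by TD(\lambda) averages all n-step returns:
R_t^{\lambda} = (1 - \lambda) \sum_{n=1}^{\infty} \lambda^{\,n-1} R_t^{(n)}
```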

TD(λ) Learning

Eligibility Traces

On-line TD(λ)
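A hedged sketch of on-line tabular TD(λ) policy evaluation with accumulating eligibility traces. Every visited state keeps a trace that decays by γλ per step, and each TD error updates all states in proportion to their trace. The corridor environment, the `step` signature, and the parameter values are illustrative assumptions.

```python
def td_lambda_evaluate(policy, step, start_state, episodes=2000,
                       alpha=0.1, gamma=0.9, lam=0.8):
    """step(s, a) -> (next_state, reward); next_state is None at episode end."""
    V = {}
    for _ in range(episodes):
        e = {}  # eligibility traces, reset at the start of each episode
        s = start_state
        while s is not None:
            s_next, r = step(s, policy(s))
            v_next = V.get(s_next, 0.0) if s_next is not None else 0.0
            td_error = r + gamma * v_next - V.get(s, 0.0)
            e[s] = e.get(s, 0.0) + 1.0  # accumulating trace for current state
            for x in e:
                # Credit every recently visited state, then decay its trace.
                V[x] = V.get(x, 0.0) + alpha * td_error * e[x]
                e[x] *= gamma * lam
            s = s_next
    return V


# Hypothetical 3-state corridor: leaving state 2 ends the episode with reward 1.
def corridor_step(s, a):
    return (s + 1, 0.0) if s < 2 else (None, 1.0)

V = td_lambda_evaluate(lambda s: "right", corridor_step, start_state=0)
```

Setting `lam=0.0` reduces this to TD(0), since only the current state then carries a non-zero trace at update time.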

Function Approximation

Stochastic Gradient Descent

Convergence

Subtleties and Ongoing Research

● Replace Q̂ table with neural net or other generalizer
● Handle cases where the state is only partially observable
● Design optimal exploration strategies
● Extend to continuous action, state
● Learn and use δ̂ : S × A → S
● Relationship to dynamic programming

References

● Reinforcement Learning: An Introduction. Richard S. Sutton, Andrew G. Barto. MIT Press 1998. http://www-anw.cs.umass.edu/~rich/book/the-book.html

● Neuro-Dynamic Programming, Dimitri Bertsekas, John Tsitsiklis, Athena Scientific, 1996.

● Reinforcement Learning: A Tutorial. M. Harmon, S. Harmon.

● Reinforcement Learning: A Survey. L. Kaelbling et al. Journal of Artificial Intelligence Research, Vol. 4, pp. 237-285.

● How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning. S. Singh, P. Norvig, D. Cohn.

● Reinforcement Learning Software:
– http://www-anw.cs.umass.edu/~rich/software.html
– http://www.cse.msu.edu/rlr/domains.html

● Reinforcement Learning for Humanoid Robots

● Frank Hoffman. http://www.nada.kth.se/kurser/kth/2D1431/02/index.html
