Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.
![Page 1: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/1.jpg)
Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems
![Page 2: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/2.jpg)
Reinforcement Learning
• Involves an agent interacting with an environment
• The agent can be in one of various states in the environment
• The agent is not told which action is correct; instead it receives a reward that measures how good an action was in a given state
• Over time the agent learns a policy: a mapping from states to the actions it should take
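As a rough illustration of the loop described above, here is a minimal sketch of tabular Q-learning, one common reinforcement learning method (the slides do not say which algorithm is used). The corridor environment, the function names, and all parameter values are assumptions made up for this example.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor; reward 1 at the right end."""
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 = step left, action 1 = step right
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:
            # Explore occasionally; otherwise act greedily (ties broken at random).
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + moves[action], 0), n_states - 1)
            # The reward is the only feedback: it rates the action, it does not name the right one.
            reward = 1.0 if next_state == n_states - 1 else 0.0
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """The learned policy: the highest-valued action in each state."""
    return [0 if left > right else 1 for left, right in q]

q_table = q_learning()
policy = greedy_policy(q_table)
print(policy[:-1])  # actions for the non-terminal states
```

In this toy setting the agent is never told that "right" is correct; the preference emerges from the rewards alone.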
![Page 3: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/3.jpg)
The Curse of Dimensionality
• As the number of state variables in the environment grows, the state space grows exponentially
• We can try to reduce the state space by exploiting the structure of the problem
• One way to do this is hierarchical reinforcement learning
![Page 4: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/4.jpg)
Hierarchical Reinforcement Learning
• A complex problem can often be broken up into multiple conflicting sub-problems
• Hierarchical reinforcement learning can handle this
• Deals with each sub-problem separately using reinforcement learning
• Decides which sub-problem to attempt next using reinforcement learning
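The two levels described above can be sketched as follows. This is only one possible arrangement, not necessarily the design used in the project: the class names, the placeholder sub-problem names, and the choice to update the selector after every primitive step are all assumptions for illustration.

```python
import random
from collections import defaultdict

class QTable:
    """Tabular Q-learner over whatever hashable states it is given."""
    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def choose(self, state):
        # Epsilon-greedy choice, breaking ties between equal values at random.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        best = max(self.q[(state, a)] for a in self.actions)
        return self.rng.choice([a for a in self.actions if self.q[(state, a)] == best])

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * max(self.q[(next_state, a)] for a in self.actions)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

class HierarchicalAgent:
    """Hypothetical two-level agent: a selector over sub-problems plus one learner each."""
    def __init__(self, sub_problems, primitive_actions):
        # Upper level: its "actions" are the names of the sub-problems.
        self.selector = QTable(sub_problems, seed=1)
        # Lower level: an independent learner per sub-problem, each seeing only its own state.
        self.learners = {name: QTable(primitive_actions, seed=i + 2)
                         for i, name in enumerate(sub_problems)}

    def act(self, top_state, sub_states):
        """Pick a sub-problem, then let that sub-problem's learner pick the primitive action."""
        chosen = self.selector.choose(top_state)
        return chosen, self.learners[chosen].choose(sub_states[chosen])

    def learn(self, chosen, top_state, next_top_state, overall_reward,
              sub_state, action, sub_reward, next_sub_state):
        # Simplification (assumed): both levels are updated after every primitive step.
        self.learners[chosen].update(sub_state, action, sub_reward, next_sub_state)
        self.selector.update(top_state, chosen, overall_reward, next_top_state)

agent = HierarchicalAgent(["sub_a", "sub_b"], ["left", "right"])
chosen, action = agent.act("s0", {"sub_a": 0, "sub_b": 0})
```

Each lower-level learner is rewarded for progress on its own sub-problem, while the selector is rewarded on the overall goal, which is how conflicts between sub-problems get arbitrated in this sketch.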
![Page 5: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/5.jpg)
A Practical Example: The Mars Lander
Perform Various Conflicting Tasks:
• Explore the terrain
• Collect soil samples
• Return to base for refuelling
![Page 6: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/6.jpg)
My Project
• Apply hierarchical reinforcement learning to a complex problem
• The problem will consist of an agent in an environment where it has to achieve an overall goal
• Agent will be a primitive creature trying to survive in the wilderness
![Page 7: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/7.jpg)
My Project
• The overall goal will be for the creature to remain happy or comfortable in the wilderness
• Overall goal can be divided into sub-goals
• These sub-goals will be:
– Eating food
– Drinking water
– Resting under a shelter
– Repairing the shelter
– Avoiding hazards
![Page 8: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/8.jpg)
The Gridworld
![Page 9: Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems.](https://reader036.fdocuments.net/reader036/viewer/2022082709/56649d205503460f949f50cd/html5/thumbnails/9.jpg)
Motivation for this approach
• State variables: X pos, Y pos, Hunger, Thirst, Fatigue, Shelter Condition
• 13 x 13 x 10 x 10 x 10 x 10 = 1,690,000 possible states
• Sub-goals separated out:
– (Xpos, Ypos, Hunger)
– (Xpos, Ypos, Thirst)
– (Xpos, Ypos, Fatigue)
– (Xpos, Ypos, Shelter Condition)
• (13 x 13 x 10) x 4 = 1690 x 4 = 6760 possible states
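The state counts on this slide can be checked with a few lines of arithmetic. The variable names below are made up for the example; the grid size and the ten levels per variable are taken from the slide.

```python
from math import prod

GRID_WIDTH, GRID_HEIGHT = 13, 13
# Number of discrete levels for each internal variable, as given on the slide.
LEVELS = {"hunger": 10, "thirst": 10, "fatigue": 10, "shelter_condition": 10}

positions = GRID_WIDTH * GRID_HEIGHT

# Flat formulation: every combination of position and all four variables is a state.
flat_states = positions * prod(LEVELS.values())

# Separated formulation: each sub-goal sees only position plus its own variable.
separated_states = sum(positions * n for n in LEVELS.values())

print(flat_states)                      # 1690000
print(separated_states)                 # 6760
print(flat_states // separated_states)  # 250
```

Separating the sub-goals shrinks the number of states to learn over by a factor of 250, at the cost of needing some mechanism to decide which sub-goal to pursue at each moment.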