SWAN 2006, ARRI

Control of Distributed-Information

Nonlinear Stochastic Systems

Prof. Thomas Parisini

University of Trieste

Summary

Examples

Problem Formulation

Approximate NN Solution

Two Significant Applications

Example: Water Distribution Networks

Objective: control the spatio-temporal distribution of drinking-water disinfectant throughout the network by injecting appropriate amounts of disinfectant at suitably chosen actuator locations.

Example: Robot Soccer

Objective: By the year 2050, develop a team of fully autonomous humanoid robots that can win against the human world soccer champions.

Robot soccer incorporates various technologies including: design principles of autonomous agents, multi-agent cooperation, strategy acquisition, real-time reasoning, robotics, and sensor-fusion.

Typical Applications

Traffic control systems

Coordination of teams of UAVs

Geographically distributed systems

Dynamic routing in communications networks

Large-scale process-control systems

etc.

Distributed Decision and Control

How did this happen?

Outer-loop control: as we move from low-level control to higher-level control, we face the need to take into consideration other feedback systems that influence the environment.

Complexity: as systems have grown more complex, the controller has had to be broken down into smaller controllers.

Communications and Networks: data networks and wireless communications have facilitated the design of feedback systems that require distributed decision and control techniques.


What does it imply?

Need for different types of control problem formulations.

Need to handle competition from other controllers (agents).

Need to handle cooperation with other controllers.

Need to handle inter-controller communication issues.

Need for suitable individual and team evaluation methods.


How does Learning come in?

Learning the environment

Learning the strategy of adversarial agents

Predicting the actions of collaborating agents

Learning is crucial because the environment is highly time-varying (evolving).

Structure of a “Team” of Agents

Problem formulation

Definitions

Unpredictable variables:

Random vector representing all uncertainties in the external world with known p.d.f.

Information function:

Decision function:

Cost functional:
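The symbols on these definition slides did not survive extraction. As a hedged sketch, the four definitions can be written in a notation commonly used in team theory. Every symbol below ($\xi$, $p$, $y_i$, $g_i$, $u_i$, $\gamma_i$, $M$, $\mathcal{J}$, $J$) is an assumption rather than the slides' own notation, and the causal ordering of the decision makers (DMs) is assumed as well.

```latex
\begin{align*}
  &\xi \sim p(\xi)
    && \text{unpredictable variables (known p.d.f.)} \\
  &y_i = g_i(\xi, u_1, \dots, u_{i-1}), \quad i = 1, \dots, M
    && \text{information function of DM } i \\
  &u_i = \gamma_i(y_i), \quad i = 1, \dots, M
    && \text{decision function (strategy) of DM } i \\
  &J(\gamma_1, \dots, \gamma_M)
     = \mathrm{E}_{\xi}\bigl\{ \mathcal{J}(u_1, \dots, u_M, \xi) \bigr\}
    && \text{cost functional}
\end{align*}
```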

Problem T

Given

and

Find the optimal strategies that minimize the cost functional.

Solving Problem T analytically is in general impossible.

About Problem T

Further Definitions and Concepts

Predecessor:

The control actions generated by affect the information set of for any possible

Information set inclusion:

The information set of one DM is a function of the information set of another DM (the former is “nested” in the latter)

Information network:

Information Structures

Static: the information of each DM is not influenced by the decisions of the other DMs

Dynamic: otherwise

Sufficient conditions for analytic solution

LQG Static Teams

Linear optimal control strategy

“Partially Nested” LQG Dynamic Teams

Any DM can reconstruct the information of the DMs influencing its own information

Linear optimal control strategy

Existence of a sequential partition

Optimal control strategy by DP

Approximate Solution of Problem T

A Simple Methodology

Assumption: no loops in the information network

Given parametric structure

Vector of “free” parameters to be optimized

Formulation of the Approximate Problem

Problem T’

Substitute and into the cost function

Given

and

Find the optimal vector that minimizes the cost function.

Given NN structures, the Functional Optimization Problem T reduces to the Nonlinear Programming Problem T’.
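A hedged way to write the passage from Problem T to Problem T’: each strategy is constrained to a fixed parametric (NN) structure, so the unknowns become a finite-dimensional vector instead of a set of functions. The symbols $\hat{\gamma}_i$, $w_i$, $w$, $y_i$, $u_i$, $\xi$ and $\mathcal{J}$ are assumed notation, not taken from the slides.

```latex
\begin{gather*}
  u_i = \hat{\gamma}_i(y_i, w_i), \quad i = 1, \dots, M,
  \qquad w = \mathrm{col}(w_1, \dots, w_M) \\[4pt]
  \text{Problem T:} \quad
  \min_{\gamma_1, \dots, \gamma_M} \;
  \mathrm{E}_{\xi}\bigl\{ \mathcal{J}(u_1, \dots, u_M, \xi) \bigr\}
  \quad \text{(search over functions)} \\[4pt]
  \text{Problem T':} \quad
  \min_{w} \;
  \mathrm{E}_{\xi}\bigl\{ \mathcal{J}\bigl(\hat{\gamma}_1(y_1, w_1), \dots,
  \hat{\gamma}_M(y_M, w_M), \xi\bigr) \bigr\}
  \quad \text{(search over a vector)}
\end{gather*}
```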

NN Learning Algorithm

Gradient Method

The gradient of the expected cost cannot be written in explicit form

However:

Stochastic Approximation

Compute the “realization”

randomly generated according to the (known) p.d.f. of the unpredictable variables

The step is chosen so as

Example:
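A minimal sketch of stochastic approximation on a toy scalar problem (not the team problem itself): minimize E{(w − ξ)²}, whose minimizer is the mean of ξ. At each iteration one sample of ξ is drawn from its known p.d.f., and the gradient of that cost realization stands in for the gradient of the expected cost. The step rule c1 / (c2 + k), its constants, and the function name are assumptions, chosen so that the steps sum to infinity while their squares sum to a finite value.

```python
import numpy as np


def stochastic_approximation(mu=3.0, n_iter=20_000, c1=0.5, c2=1.0, seed=0):
    """Minimize E[(w - xi)^2], xi ~ N(mu, 1), using one sample per iteration."""
    rng = np.random.default_rng(seed)
    w = 0.0
    for k in range(n_iter):
        xi = rng.normal(mu, 1.0)      # sample drawn from the known p.d.f.
        grad = 2.0 * (w - xi)         # gradient of the cost realization
        step = c1 / (c2 + k)          # decreasing step size (assumed rule)
        w -= step * grad
    return w


w_final = stochastic_approximation()
print(w_final)
```

With c1 = 0.5 and c2 = 1 the update reduces to the running sample mean of the draws, which makes the convergence to the mean easy to check by hand.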

Important Remark

Gradient method + Stochastic Approximation

Distributed Learning: each DM is able to compute “locally” its control function by “exchanging messages” with cooperating DMs according to the Information Structure

Methodology: Conceptual Steps

Problem T: minimize the cost functional (exact optimal solutions)

Replace the strategies with the NN structure

Problem T’: minimize the cost function over the vector of free parameters

Stoc. Appr. to solve Problem T’

Approximate optimal solutions
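The conceptual steps can be sketched on a hypothetical two-DM static team. Everything here is an assumption made for illustration: the toy cost (u1 + u2 − ξ1 − ξ2)², the information y_i = ξ_i, the class name `NeuralStrategy`, the one-hidden-layer tanh structure, and the step rule c1 / (c2 + k). Each DM updates only its own weights, using its own observation and one scalar message (the derivative of the cost realization with respect to its control).

```python
import numpy as np


class NeuralStrategy:
    """One-hidden-layer network u = c . tanh(a * y + b) + d for scalar y."""

    def __init__(self, n_hidden, rng):
        self.a = rng.standard_normal(n_hidden)
        self.b = rng.standard_normal(n_hidden)
        self.c = 0.1 * rng.standard_normal(n_hidden)
        self.d = 0.0

    def control(self, y):
        y = np.asarray(y, dtype=float)
        h = np.tanh(y[..., None] * self.a + self.b)  # hidden activations
        return h @ self.c + self.d, h

    def local_update(self, y, h, dJ_du, step):
        g = self.c * (1.0 - h**2)     # du/d(pre-activation), before c changes
        self.a -= step * dJ_du * g * y
        self.b -= step * dJ_du * g
        self.c -= step * dJ_du * h
        self.d -= step * dJ_du


def expected_cost(dm1, dm2, n=20_000, seed=1):
    """Monte Carlo estimate of the toy team cost for the current weights."""
    xi = np.random.default_rng(seed).standard_normal((n, 2))
    u1, _ = dm1.control(xi[:, 0])
    u2, _ = dm2.control(xi[:, 1])
    return float(np.mean((u1 + u2 - xi.sum(axis=1)) ** 2))


rng = np.random.default_rng(0)
dm1, dm2 = NeuralStrategy(8, rng), NeuralStrategy(8, rng)
cost_before = expected_cost(dm1, dm2)

c1, c2 = 20.0, 1000.0                 # assumed step-size constants
for k in range(20_000):
    xi = rng.standard_normal(2)       # realization of the random vector
    u1, h1 = dm1.control(xi[0])       # DM 1 sees only xi[0]
    u2, h2 = dm2.control(xi[1])       # DM 2 sees only xi[1]
    dJ_du = 2.0 * (u1 + u2 - xi.sum())  # message shared by both DMs
    step = c1 / (c2 + k)
    dm1.local_update(xi[0], h1, dJ_du, step)
    dm2.local_update(xi[1], h2, dJ_du, step)

cost_after = expected_cost(dm1, dm2)
print(cost_before, cost_after)
```

In this toy problem the exact optimum is u_i = y_i with zero cost, so the estimated cost after training gives a direct measure of the approximation error of the chosen structure.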

The Witsenhausen Counterexample

Problem W

Given:

and independent

information functions:

cost function:

Find the optimal strategies that minimize the cost function.
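For reference, the counterexample in its commonly cited form. The symbols ($x_0$, $x_1$, $v$, $\sigma$, $k$, $y_1$, $y_2$, $u_1$, $u_2$, $\gamma_1$, $\gamma_2$) are assumed notation and may differ from those originally shown on the slide.

```latex
\begin{gather*}
  x_0 \sim N(0, \sigma^2), \qquad v \sim N(0, 1), \qquad
  x_0 \text{ and } v \text{ independent} \\
  y_1 = x_0, \qquad u_1 = \gamma_1(y_1), \qquad x_1 = x_0 + u_1 \\
  y_2 = x_1 + v, \qquad u_2 = \gamma_2(y_2) \\
  J(\gamma_1, \gamma_2) = \mathrm{E}\bigl\{ k^2 u_1^2 + (x_1 - u_2)^2 \bigr\}
\end{gather*}
```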

Remarks on Problem W

LQG hypotheses hold

Information structure not partially nested

An optimal solution does exist

But: the optimal strategies are not affine functions of the available information

Best affine solution:

Wit. solution:

For and
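A hedged Monte Carlo sketch comparing the two kinds of solution named above. The parameter values k = 0.2 and σ = 5 are an assumption (the commonly used benchmark), since the values on the slide were lost. The strategies are written in their commonly cited forms: a linear pair x1 = λ·x0 with the matching linear estimator, and the two-point pair x1 = σ·sgn(x0), u2 = σ·tanh(σ·y2). The function names are illustrative.

```python
import numpy as np


def team_cost(gamma1, gamma2, k, sigma, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[k^2 u1^2 + (x1 - u2)^2]."""
    rng = np.random.default_rng(seed)
    x0 = sigma * rng.standard_normal(n_samples)   # x0 ~ N(0, sigma^2)
    v = rng.standard_normal(n_samples)            # v ~ N(0, 1)
    u1 = gamma1(x0)
    x1 = x0 + u1
    u2 = gamma2(x1 + v)                           # second DM sees y2 = x1 + v
    return float(np.mean(k**2 * u1**2 + (x1 - u2) ** 2))


def linear_pair(lam, sigma):
    """Linear strategies giving x1 = lam * x0, with the linear MMSE estimate."""
    gain = (lam * sigma) ** 2 / (1.0 + (lam * sigma) ** 2)
    return (lambda x0: (lam - 1.0) * x0), (lambda y: gain * y)


k, sigma = 0.2, 5.0                               # assumed benchmark values

# Closed-form cost of the linear family, used to pick the best lam on a grid.
lams = np.linspace(0.0, 1.2, 121)
var1 = (lams * sigma) ** 2
linear_costs = k**2 * sigma**2 * (1.0 - lams) ** 2 + var1 / (1.0 + var1)
best_lam = float(lams[np.argmin(linear_costs)])

lin_cost = team_cost(*linear_pair(best_lam, sigma), k, sigma)
wit_cost = team_cost(
    lambda x0: sigma * np.sign(x0) - x0,          # forces x1 = sigma * sgn(x0)
    lambda y: sigma * np.tanh(sigma * y),
    k,
    sigma,
)
print(lin_cost, wit_cost)
```

With these assumed values the two-point strategy costs roughly 0.40, against roughly 0.96 for the best linear pair, which is the gap that makes the counterexample a standard benchmark.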

Optimized Wit. solution:

For and

given the structures



Opt. Wit. solution outperforms the best linear solutions:

[Figure: plot with horizontal axis from 0 to 0.4 and vertical axis from 0 to 45]

Approximate NN Solution of Problem W

Given parametric structures

Vector of “free” parameters to be optimized

Choice of the parametric structures

Problem W’

Substitute into the cost functional

Given

and independent

information functions:

cost function:

Find the optimal NN weights that minimize the cost function.

Conceptual Steps to Approximately Solve Problem W

Problem W: minimize the cost functional (exact optimal solutions)

Replace the strategies with the NNs

Problem W’: minimize the cost functional over the NN weights

Stoc. Appr. to solve Problem W’

Approximate optimal solutions

Results and Comparisons

[Figures: three slides of paired plots comparing the Best Linear, NN, and Opt. W. solutions; both axes run from -15 to 15]

[Figure: grid of cells labelled 3-, 3+, 5-, 5+ and L+, horizontal axis from 0 to 0.7 and vertical axis from 0 to 45, with three marked regions: the “3-step” area, the “linear” area, and the “5-step” area]

Costs of the Neural, Optimized Witsenhausen, and Best Linear Solutions

Concluding Remarks

Decision makers act as cooperating members of a team

General approximate methodology for the solution of distributed-information control problems:

Team functional optimization problem reduced to a nonlinear programming one

Distributed learning scheme: each DM can compute (or adapt) its “personal” control function “locally”

Straightforward extension to the infinite-horizon case (receding-horizon paradigm)

Acknowledgments

Riccardo Zoppoli, Marios Polycarpou, Marco Baglietto, Angelo Alessandri, Alessandro Astolfi, Daniele Casagrande, Riccardo Ferrari, Elisa Franco, Frank Lewis, R. Selmic, Jason Speyer, Marcel Staroswiecki, Jakob Stoustrup, …