Practice Theory · 2020. 1. 3. · Practice Theory Powerful modeling, simple exploration...

Practice
Powerful modeling, simple exploration
e.g.: Atari Deep Reinforcement Learning

Theory
Sophisticated exploration in small-state MDPs
e.g.: $E^3$, R-MAX algorithms

Limited theory for rich observations

Goal

Develop Reinforcement Learning approaches guaranteed to learn an optimal policy with a small number of samples despite rich observations.

Model                          PAC Guarantees
Small-state MDPs               Known
Structured large-state MDPs    New
Reactive POMDPs                Extended
Reactive PSRs                  New
LQR (continuous actions)       Known

$$ V^\star(x) = \max_a \Big\{ \mathbb{E}_{r \sim R_x}[r(a)] + \mathbb{E}_{x' \sim \Gamma(x,a)}\big[V^\star(x')\big] \Big\} $$

Annotations on the slide: distribution of initial state; distribution of next state; instantaneous reward; optimal action.

$$ Q^\star(x, a) = \mathbb{E}_{r \sim R_x}[r(a)] + \mathbb{E}_{x' \sim \Gamma(x,a)}\big[V^\star(x')\big] $$

$$ \pi^\star(x) = \operatorname{argmax}_a Q^\star(x, a) $$
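The definitions of $Q^\star$ and $\pi^\star$ can be made concrete on a tiny tabular problem. The following is a minimal sketch, assuming an episodic problem with a fixed horizon and no discounting; the two-state MDP, the tables `P` and `R_MEAN`, and the function name `backward_induction` are illustrative assumptions, not taken from the slides.

```python
# Toy finite-horizon MDP (all numbers are illustrative assumptions).
# P[x][a] maps next states to probabilities; R_MEAN[x][a] is the expected reward.
P = {
    0: {0: {0: 1.0}, 1: {1: 1.0}},
    1: {0: {0: 1.0}, 1: {1: 1.0}},
}
R_MEAN = {
    0: {0: 0.0, 1: 0.0},
    1: {0: 0.0, 1: 1.0},
}
H = 3  # number of steps per episode


def backward_induction(P, R_mean, H):
    """Compute Q* and the greedy policy for every step, from the last step back."""
    states = list(P)
    V = {x: 0.0 for x in states}  # value after the final step is zero
    Q_by_step, pi_by_step = [], []
    for _ in range(H):
        # Q*(x, a) = E[r(a)] + E_{x'}[V*(x')]
        Q = {
            x: {
                a: R_mean[x][a] + sum(p * V[x2] for x2, p in P[x][a].items())
                for a in P[x]
            }
            for x in states
        }
        # pi*(x) = argmax_a Q*(x, a)
        pi = {x: max(Q[x], key=Q[x].get) for x in states}
        # V*(x) = max_a Q*(x, a)
        V = {x: Q[x][pi[x]] for x in states}
        Q_by_step.insert(0, Q)
        pi_by_step.insert(0, pi)
    return Q_by_step, pi_by_step, V


Q_by_step, pi_by_step, V_first = backward_induction(P, R_MEAN, H)
print(V_first)        # optimal value at the first step, per state
print(pi_by_step[0])  # greedy action at the first step, per state
```

Exact backward induction like this needs the full transition and reward tables, so it only applies when the state space is small enough to enumerate.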

$$
\begin{aligned}
Q^\star(x, a) &= \mathbb{E}_{r \sim R_x}[r(a)] + \mathbb{E}_{x' \sim \Gamma(x,a)}\big[V^\star(x')\big] \\
&= \mathbb{E}_{r \sim R_x}[r(a)] + \mathbb{E}_{x' \sim \Gamma(x,a)}\big[\max_{a'} Q^\star(x', a')\big] \\
&= \mathbb{E}_{r \sim R_x}[r(a)] + \mathbb{E}_{x' \sim \Gamma(x,a)}\big[Q^\star(x', \pi^\star(x'))\big]
\end{aligned}
$$

$$ \mathbb{E}\big[ f(x_h, a_h) - r_h - f(x_{h+1}, a_{h+1}) \big] $$
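The expectation above has the form of a Bellman error: the gap between a candidate's prediction at step $h$ and the observed reward plus its prediction at step $h+1$. A minimal sketch of estimating it from sampled transitions follows. The transition tuples, the tables `Q_GOOD` and `Q_BAD`, and the function name are hypothetical, and how the actions are chosen when collecting the data is left outside this sketch.

```python
def average_bellman_error(f, transitions):
    """Empirical mean of f(x_h, a_h) - r_h - f(x_next, a_next)."""
    total = 0.0
    for x, a, r, x_next, a_next in transitions:
        total += f(x, a) - r - f(x_next, a_next)
    return total / len(transitions)


# Hypothetical data: from state "s", action "go" yields reward 1 or 0 equally
# often, then the episode ends in state "end", where every prediction is 0.
transitions = [
    ("s", "go", 1.0, "end", "stop"),
    ("s", "go", 0.0, "end", "stop"),
]

Q_GOOD = {("s", "go"): 0.5, ("end", "stop"): 0.0}   # matches the mean reward
Q_BAD = {("s", "go"): 0.75, ("end", "stop"): 0.0}   # over-predicts by 0.25

err_good = average_bellman_error(lambda x, a: Q_GOOD[(x, a)], transitions)
err_bad = average_bellman_error(lambda x, a: Q_BAD[(x, a)], transitions)
print(err_good, err_bad)
```

A candidate that satisfies the Bellman equation has zero error in expectation, so a clearly nonzero estimate is evidence against that candidate.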

Validity condition

$$ \mathbb{E}_{x \sim \Gamma_1}\big[\max_a Q^\star(x, a)\big] = \mathbb{E}_{x \sim \Gamma_1}\big[Q^\star(x, \pi^\star(x))\big] $$

$$ V_f = \mathbb{E}_{x \sim \Gamma_1}\big[f(x, \pi_f(x))\big] $$

Optimism under uncertainty: guess for $V(\pi^\star)$ if $f = Q^\star$

Checking our optimistic belief

Prune the possible solutions
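The three annotations (optimistic guess, check, prune) can be illustrated with a heavily simplified loop. This is a sketch under strong assumptions, not the algorithm from the paper: the problem has a single step, expectations are computed exactly instead of being estimated from samples, and the candidate class, reward table, and all names are hypothetical. With a single step, the pruning test for a candidate reduces to comparing its predicted value with the true value of its own greedy policy.

```python
# Toy one-step problem; all values are illustrative assumptions.
CONTEXTS = ["x0", "x1"]  # initial contexts, assumed equally likely
ACTIONS = ["a0", "a1"]
TRUE_REWARD = {("x0", "a0"): 1.0, ("x0", "a1"): 0.0,
               ("x1", "a0"): 0.0, ("x1", "a1"): 1.0}

# Hypothetical finite candidate class: each candidate is a table f(x, a).
CANDIDATES = {
    "overoptimistic": {("x0", "a0"): 0.0, ("x0", "a1"): 2.0,
                       ("x1", "a0"): 2.0, ("x1", "a1"): 0.0},
    "q_star": dict(TRUE_REWARD),
    "pessimistic": {key: 0.0 for key in TRUE_REWARD},
}


def greedy(f, x):
    """pi_f(x): the action that f rates highest in context x."""
    return max(ACTIONS, key=lambda a: f[(x, a)])


def predicted_value(f):
    """V_f: average over initial contexts of f(x, pi_f(x))."""
    return sum(f[(x, greedy(f, x))] for x in CONTEXTS) / len(CONTEXTS)


def true_value(f):
    """Value actually achieved by pi_f (exact here, estimated in practice)."""
    return sum(TRUE_REWARD[(x, greedy(f, x))] for x in CONTEXTS) / len(CONTEXTS)


def optimistic_search(candidates, eps=1e-9):
    surviving = dict(candidates)
    history = []
    while surviving:
        # Optimism: pick the surviving candidate with the largest predicted value.
        name = max(surviving, key=lambda n: predicted_value(surviving[n]))
        f = surviving[name]
        history.append(name)
        # Check: does the greedy policy really achieve the predicted value?
        if abs(predicted_value(f) - true_value(f)) <= eps:
            return name, history
        # Prune: drop every candidate whose prediction contradicts the data.
        surviving = {n: g for n, g in surviving.items()
                     if abs(predicted_value(g) - true_value(g)) <= eps}
    return None, history


chosen, history = optimistic_search(CANDIDATES)
print(chosen, history)
```

In this toy run the over-optimistic candidate is tried first, fails the check, and is pruned together with the other inconsistent candidate, after which the consistent one is selected.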

Details at: https://arxiv.org/abs/1610.09512
