Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work...

15

Monte Carlo Methods

Upload
dwayne-fisher
Category

Documents
view
215
download
1

Embed Size (px):

Transcript of Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work...

Page 1: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

Monte Carlo Methods

Page 2: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

• Learn from complete sample returns– Only defined for episodic tasks

• Can work in different settings– On-line: no model necessary– Simulated: no need for full model

Page 3: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

• DP doesn’t need any experience

• MC doesn’t need a full model– Learn from actual or simulated experience

• MC expense independent of total # of states• MC has no bootstrapping– Not harmed by Markov violations

Page 4: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

• Every-Visit MC: average returns every time s is visited in an episode

• First-visit MC: average returns only first time s is visited in an episode

• Both converge asymptotically

Page 5: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

1st Visit MC for Vπ

What about control?

Page 6: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

Want to improve policy by greedily following Q

Page 7: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

• MC can estimate Q when model isn’t available– Want to learn Q*

• Converges asymptotically if every state-action pair is visited “enough”– Need to explore: why?

• Exploring starts: every state-action pair has non-zero probability of being starting pair– Why necessary?– Still necessary if π is probabilistic?

Page 8: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

MC for Control (Exploring Starts)

•

Page 9: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

• Q0(.,.) = 0

• π0 = ?• Action: 1 or 2• State: amount of money• Transition: based on state, flip outcome, and

action• Reward: amount of money• Return: money at end of episode

Page 10: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

MC for Control, On-Policy

• Soft policy: – π(s,a) > 0 for all s,a– ε-greedy

• Can prove version of policy improvement for soft policies

Page 11: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

MC for Control, On-Policy

Page 12: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

Off-Policy Control

• Why consider off-policy methods?

Page 13: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

Off Policy Control

• Learn about π while following π’– Behavior policy vs. Estimation policy

Page 14: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

Page 15: Monte Carlo Methods. Learn from complete sample returns – Only defined for episodic tasks Can work in different settings – On-line: no model necessary.

Off policy learning

• Why only learn from tail?• How choose policy to follow?

AS MONACO · TO MONACO Fabulous events in the Principality Football Prince’s Palace Monte-Carlo Monaco Monte-Carlo Monte-Carlo Monaco Monte-Carlo AS Monaco of Monaco Casino Grand

AS MONACO · TO MONACO Fabulous events in the Principality Football Prince’s Palace Monte-Carlo Monaco Monte-Carlo Monte-Carlo Monaco Monte-Carlo AS Monaco of Monaco Casino Grand

CLASSICAL MONTE CARLO & METROPOLIS ALGORITHMstaff.ustc.edu.cn/~yjdeng/CQMC2012/lecture_notes/part1.pdf · CLASSICAL MONTE CARLO & METROPOLIS ALGORITHM Monte Carlo (MC) simulations

CLASSICAL MONTE CARLO & METROPOLIS ALGORITHMstaff.ustc.edu.cn/~yjdeng/CQMC2012/lecture_notes/part1.pdf · CLASSICAL MONTE CARLO & METROPOLIS ALGORITHM Monte Carlo (MC) simulations

14 Monte Carlo methodsquantum.phys.unm.edu/467-17/chap14.pdf · 2017. 4. 5. · 14 Monte Carlo methods 14.1 The Monte Carlo method The Monte Carlo method is simple, robust, and useful.

14 Monte Carlo methodsquantum.phys.unm.edu/467-17/chap14.pdf · 2017. 4. 5. · 14 Monte Carlo methods 14.1 The Monte Carlo method The Monte Carlo method is simple, robust, and useful.

Monte Carlo Simulation of Stochastic Processescacs.usc.edu/education/cs596/09Stochastic.pdf · Monte Carlo Simulation of Stochastic Processes MONTE CARLO METHOD • Monte Carlo ...

Monte Carlo Simulation of Stochastic Processescacs.usc.edu/education/cs596/09Stochastic.pdf · Monte Carlo Simulation of Stochastic Processes MONTE CARLO METHOD • Monte Carlo ...

Monte Carlo Methods - Rupam Mahmood...is extended to the Monte Carlo case in which only sample experience is available. 5.1 Monte Carlo Prediction We begin by considering Monte Carlo

Monte Carlo Methods - Rupam Mahmood...is extended to the Monte Carlo case in which only sample experience is available. 5.1 Monte Carlo Prediction We begin by considering Monte Carlo

Monte Carlo approach to uncertainty analyses in forestry ... Monte Carlo... · inventory Monte Carlo Small or large ... To do Monte Carlo simulations, ... Monte Carlo simulation #

Monte Carlo approach to uncertainty analyses in forestry ... Monte Carlo... · inventory Monte Carlo Small or large ... To do Monte Carlo simulations, ... Monte Carlo simulation #

Monte Carlo Method - Monte Carlo Simulation · Monte Carlo Method Monte Carlo Simulation Peter Frank Perroni December 1, 2015 Peter Frank Perroni Monte Carlo Method

Monte Carlo Method - Monte Carlo Simulation · Monte Carlo Method Monte Carlo Simulation Peter Frank Perroni December 1, 2015 Peter Frank Perroni Monte Carlo Method

Quantum Monte Carlo - academic.sun.ac.zaacademic.sun.ac.za/somerskool/2005/documents/lecture_notes/... · 1 ceperley random walks Quantum Monte Carlo 1. Introduction to Monte Carlo:

Quantum Monte Carlo - academic.sun.ac.zaacademic.sun.ac.za/somerskool/2005/documents/lecture_notes/... · 1 ceperley random walks Quantum Monte Carlo 1. Introduction to Monte Carlo:

Monte Carlo and quasi-Monte Carlo for image synthesisstatweb.stanford.edu/~owen/pubtalks/egsr2014post.pdf · 2014-06-29 · Monte Carlo and quasi-Monte Carlo for image synthesis Art

Monte Carlo and quasi-Monte Carlo for image synthesisstatweb.stanford.edu/~owen/pubtalks/egsr2014post.pdf · 2014-06-29 · Monte Carlo and quasi-Monte Carlo for image synthesis Art

Fatin Sezgin - MCQMC2010 - Monte Carlo and Quasi-Monte Carlo

Fatin Sezgin - MCQMC2010 - Monte Carlo and Quasi-Monte Carlo

Monte Carlo tools for LHC - INFN Lecce web · Monte Carlo tools for LHC Contents Basics of Monte Carlo methods Monte Carlo programs for particle physics 1. Monte Carlo integrators

Monte Carlo tools for LHC - INFN Lecce web · Monte Carlo tools for LHC Contents Basics of Monte Carlo methods Monte Carlo programs for particle physics 1. Monte Carlo integrators

Monte Carlo and Kinetic Monte Carlo Methods – A Tutorial€¦ · Monte Carlo and Kinetic Monte Carlo Methods – A Tutorial Peter Kratzer published in Multiscale Simulation Methods

Monte Carlo and Kinetic Monte Carlo Methods – A Tutorial€¦ · Monte Carlo and Kinetic Monte Carlo Methods – A Tutorial Peter Kratzer published in Multiscale Simulation Methods

Monte Carlo Trails Park Concept Plan Phase One Monte Carlo ... · 5/29/2018 · Monte Carlo Trails Park Concept Plan Phase One Parque de Senderos Monte Carlo Plan de Conceptual Fase

Monte Carlo Trails Park Concept Plan Phase One Monte Carlo ... · 5/29/2018 · Monte Carlo Trails Park Concept Plan Phase One Parque de Senderos Monte Carlo Plan de Conceptual Fase

Markov Chain Monte Carlo Methods By Marc Sobel. Exploring Monte Carlo A biulding in the country of Monte Carlo:

Markov Chain Monte Carlo Methods By Marc Sobel. Exploring Monte Carlo A biulding in the country of Monte Carlo:

Monte Carlo Simulationathreya/Teaching/statistics1/dootika.pdf · Monte Carlo Simulation Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms

Monte Carlo Simulationathreya/Teaching/statistics1/dootika.pdf · Monte Carlo Simulation Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms

Monte Carlo and quasi-Monte Carlo methods - PKUdsec.pku.edu.cn/~tieli/notes/numer_anal/MCQMC_Caflisch.pdf · MONTE CARLO AND QUASI-MONTE CARLO 3 quasi-random points converges more

Monte Carlo and quasi-Monte Carlo methods - PKUdsec.pku.edu.cn/~tieli/notes/numer_anal/MCQMC_Caflisch.pdf · MONTE CARLO AND QUASI-MONTE CARLO 3 quasi-random points converges more

Monte Carlo Methods and Stochastic Algorithmsbl/monte-carlo/poly.pdf · 1.1. ON THE CONVERGENCE RATE OF MONTE-CARLO METHODS 3 1.1 On the convergence rate of Monte-Carlo methods In

Monte Carlo Methods and Stochastic Algorithmsbl/monte-carlo/poly.pdf · 1.1. ON THE CONVERGENCE RATE OF MONTE-CARLO METHODS 3 1.1 On the convergence rate of Monte-Carlo methods In

Quasi-Monte Carlo Variational InferenceQuasi-Monte Carlo Variational Inference 3. Quasi-Monte Carlo Variational Inference In this Section, we introduce Quasi-Monte Carlo Varia-tionalInference(qmcvi),usingrandomizedqmc

Quasi-Monte Carlo Variational InferenceQuasi-Monte Carlo Variational Inference 3. Quasi-Monte Carlo Variational Inference In this Section, we introduce Quasi-Monte Carlo Varia-tionalInference(qmcvi),usingrandomizedqmc

Maximum Entropy Monte-Carlo Planning › paper › 9148-maximum-entropy-monte-carlo-p… · 2.2 Monte Carlo Tree Search and UCT To solve the online planning task, Monte Carlo Tree

Maximum Entropy Monte-Carlo Planning › paper › 9148-maximum-entropy-monte-carlo-p… · 2.2 Monte Carlo Tree Search and UCT To solve the online planning task, Monte Carlo Tree

Languages

Pages

Legal

Copyright © 2022 FDOCUMENTS