RLtoolbox Function
Off-policy Evaluation
Calling Sequence
- [N, D, Q]=Mc_Off_Policy_Evaluation(Episode, tau, PiPrime, N, D, Q)
Parameters
- Episode
: List of the action-values recorded during the episode
- tau
:
- PiPrime
: Policy to be evaluated
- N
: Numerator of the evaluative function
- D
: Denominator of the evaluative function
- Q
: The evaluative function
Description
Evaluates the policy PiPrime on the given Episode and returns the updated N, D and Q.
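The role of N and D as numerator and denominator of Q can be illustrated with a minimal Python sketch. It is not the toolbox's implementation: the function name accumulate, the dictionary storage, and the (key, weight, ret) sample layout are all assumptions made for illustration, and the sketch does not model Episode, tau or PiPrime. It only shows a ratio estimate in which each weighted return is added to N, each weight is added to D, and Q is refreshed as N / D.

```python
def accumulate(N, D, Q, samples):
    """Update the ratio estimate Q = N / D from weighted return samples.

    samples is an iterable of (key, weight, ret) triples. This layout is a
    hypothetical one chosen for the sketch, not the toolbox's Episode format.
    """
    for key, weight, ret in samples:
        # Numerator: running sum of weighted returns for this key.
        N[key] = N.get(key, 0.0) + weight * ret
        # Denominator: running sum of the weights for this key.
        D[key] = D.get(key, 0.0) + weight
        # Refresh the estimate only when the denominator is non-zero.
        if D[key] != 0.0:
            Q[key] = N[key] / D[key]
    return N, D, Q


# Two samples for the same (state, action) key, with weights 2 and 1.
N, D, Q = accumulate({}, {}, {}, [
    (("s0", "a1"), 2.0, 1.0),
    (("s0", "a1"), 1.0, 4.0),
])
```

Because N, D and Q are passed in and returned, the same accumulators can be carried from one call to the next, which mirrors the calling sequence above where N, D and Q appear both as inputs and as outputs.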
Examples
None
See Also
Mc Off Policy Improvement