Policy in reinforcement learning
A deterministic policy π is a function that takes a state s as input and returns an action a.
That is: π(s) → a
A stochastic policy π is a probability distribution over actions given a state: π(a | s) is the probability of taking action a in state s.
state    action    probability ('goodness') of taking the action
1        1         0.6
1        2         0.4
2        1         0.3
2        2         0.7
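
The table above can be read as a lookup structure: for each state, the action probabilities sum to 1. The following is a minimal sketch of that idea, not a definitive implementation. The names POLICY, sample_action, and greedy_action are hypothetical, chosen only for illustration; the numbers are taken from the table.

```python
import random

# Policy table from the example above: POLICY[state][action] = probability.
# (POLICY and the function names below are illustrative assumptions.)
POLICY = {
    1: {1: 0.6, 2: 0.4},
    2: {1: 0.3, 2: 0.7},
}


def sample_action(state, rng=random):
    """Stochastic policy: draw an action with probability pi(a | s)."""
    actions = list(POLICY[state])
    weights = [POLICY[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def greedy_action(state):
    """Deterministic policy pi(s) -> a: pick the most probable action."""
    return max(POLICY[state], key=POLICY[state].get)
```

With this table, greedy_action(1) returns 1 and greedy_action(2) returns 2, while sample_action(1) returns action 1 about 60% of the time and action 2 about 40% of the time.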
Copyright © 2021 Codeinu