Answers for "policy reinforcement learning"


policy reinforcement learning

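A deterministic policy π is a function that takes a state s as input and returns an action a.
That is: π(s) → a
More generally, a stochastic policy π is a probability distribution over actions given the state, written π(a | s): the probability of taking action a in state s.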
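
The two descriptions above (a function from states to actions, and a distribution over actions given a state) can be sketched side by side. This is only a minimal illustration: the integer states, the action names "left"/"right", and the 0.8/0.2 probabilities are assumptions made up for the example, not part of the original answer.

```python
import random
from typing import Dict

# Hypothetical toy action set, chosen only for illustration.
ACTIONS = ["left", "right"]

def deterministic_policy(state: int) -> str:
    """pi(s) -> a: always returns the same action for a given state."""
    return "left" if state % 2 == 0 else "right"

def stochastic_policy(state: int) -> Dict[str, float]:
    """pi(a | s): returns a probability for every action in the given state."""
    p_left = 0.8 if state % 2 == 0 else 0.2  # assumed probabilities
    return {"left": p_left, "right": 1.0 - p_left}

def sample_action(state: int, rng: random.Random) -> str:
    """Draw one action according to the stochastic policy."""
    probs = stochastic_policy(state)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
```

A deterministic policy is the special case of a stochastic one that puts probability 1 on a single action in each state.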
Posted by: Guest on October-10-2021


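A stochastic policy can be written as a table. Each row gives the probability of taking an action in a state, and the probabilities for each state sum to 1:

state   action   probability ('goodness') of taking the action
1       1        0.6
1       2        0.4
2       1        0.3
2       2        0.7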
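
One way to use the table above in code is as a nested lookup, sketched below. The dict layout and the helper names `sample_action` and `greedy_action` are assumptions for illustration; only the four probabilities come from the table.

```python
import random

# The table above as a nested dict: policy[state][action] = probability.
policy = {
    1: {1: 0.6, 2: 0.4},
    2: {1: 0.3, 2: 0.7},
}

def sample_action(state, rng=random):
    """Sample an action for `state` according to the tabulated probabilities."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

def greedy_action(state):
    """Return the action with the highest probability in `state`."""
    return max(policy[state], key=policy[state].get)
```

Sampling follows the probabilities row by row, while the greedy choice always picks the most likely action (action 1 in state 1, action 2 in state 2).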
Posted by: Guest on October-10-2021
