Reinforcement Learning

  • You don’t know the rules; you only get a reward signal. The goal is to maximize total reward
  • State: what you know about the current situation, e.g. recent history
  • Actions: the things you can do in a given state
  • Transition function: gives the next state from the current state and the action taken. The reward attached to each transition determines scoring
  • Minimax: for two-player games. Pick the move that maximizes your own score, assuming the opponent picks the moves that minimize it
  • Don’t start with reinforcement learning. If a human can’t determine good features, the machine won’t be able to either
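
The state, action, transition, and reward bullets above can be made concrete with a small sketch. This one uses tabular Q-learning, which is one common way to learn from a reward signal alone and is not named in these notes. The corridor environment, the `step` function, and all parameter values are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells.
# The only reward is for reaching the rightmost cell.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right
GOAL = N_STATES - 1


def step(state, action):
    """Transition function plus reward. The learner never reads this code;
    it only observes the returned next state and reward."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: try a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: take the best-known action, breaking ties randomly.
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best future value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every non-goal state."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
```

Calling `greedy_policy(train())` should give "move right" in every cell. The agent was never told that rule; it recovered it from the reward alone.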
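
The minimax bullet above can be sketched the same way. This is a minimal version, assuming a game tree is given as nested lists where a leaf is a score from our point of view; the tree values and the `best_move` helper are made up for illustration.

```python
def minimax(node, maximizing):
    """Value of `node` when both players play optimally.

    A leaf is a number (the score from our point of view).
    An internal node is a list of child nodes.
    """
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    # We pick the largest value on our turn; the opponent picks the smallest.
    return max(values) if maximizing else min(values)


def best_move(children):
    """Index of the move the maximizing player should choose at the root."""
    return max(range(len(children)), key=lambda i: minimax(children[i], False))


# Hypothetical two-ply game: we choose a branch, then the opponent chooses a leaf.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree, True))  # value of the game for us
print(best_move(tree))      # index of the branch to take
```

The opponent would answer the three branches with 3, 2, and 0, so the best we can guarantee is 3 by taking the first branch, even though the second branch contains the largest leaf.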