000 nam a22 7a 4500
999 _c29072
_d29072
008 180820b xxu||||| |||| 00| 0 eng d
020 _a9781608454921
082 _a006.31
_bSZE
100 _aSzepesvari, Csaba
245 _aAlgorithms for reinforcement learning
260 _aUK:
_bMorgan & Claypool,
_c2010
300 _axii, 89 p. :
_bill.;
_c23.5 cm.
365 _aUS$
_b35.00
440 _aSynthesis lectures on artificial intelligence and machine learning #9
504 _aIncludes bibliographical references.
520 _aReinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms'
650 _aMachine learning
650 _aNatural gradient
650 _aPolicy gradient
650 _aActor-critic methods
650 _aQ-learning
650 _aPAC-learning
650 _aPlanning
650 _aSimulation
650 _aOnline learning
650 _aActive learning
650 _aBias-variance tradeoff
650 _aOverfitting
650 _aLeast-squares methods
650 _aStochastic gradient methods
650 _aFunction approximation
650 _aSimulation optimization
650 _aTwo-timescale stochastic approximation
650 _aMonte-Carlo methods
650 _aStochastic approximation
650 _aMathematical models
650 _aTemporal difference learning
650 _aEngineering & Applied Sciences
650 _aMarkov decision processes
942 _2ddc
_cBK