000 | nam a22 7a 4500 | ||
---|---|---|---|
999 | _c29072 _d29072 | ||
008 | 180820b xxu||||| |||| 00| 0 eng d | ||
020 | _a9781608454921 | ||
082 | _a006.31 _bSZE | ||
100 | _aSzepesvari, Csaba | ||
245 | _aAlgorithms for reinforcement learning | ||
260 | _aUK : _bMorgan & Claypool, _c2010 | ||
300 | _axii, 89 p. : _bill. ; _c23.5 cm. | ||
365 | _aUS$ _b35.00 | ||
440 | _aSynthesis lectures on artificial intelligence and machine learning #9 | ||
504 | _aIncludes bibliographical references. | ||
520 | _aReinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long-term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. | ||
650 | _aMachine learning | ||
650 | _aNatural gradient | ||
650 | _aPolicy gradient | ||
650 | _aActor-critic methods | ||
650 | _aQ-learning | ||
650 | _aPAC-learning | ||
650 | _aPlanning | ||
650 | _aSimulation | ||
650 | _aOnline learning | ||
650 | _aActive learning | ||
650 | _aBias-variance tradeoff | ||
650 | _aOverfitting | ||
650 | _aLeast-squares methods | ||
650 | _aStochastic gradient methods | ||
650 | _aFunction approximation | ||
650 | _aSimulation optimization | ||
650 | _aTwo-timescale stochastic approximation | ||
650 | _aMonte-Carlo methods | ||
650 | _aStochastic approximation | ||
650 | _aMathematical models | ||
650 | _aTemporal difference learning | ||
650 | _aEngineering & Applied Sciences | ||
650 | _aMarkov decision processes | ||
942 | _2ddc _cBK | ||
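
The abstract (field 520) describes the reinforcement learning setting: an agent receives only partial, delayed feedback and must learn a controller that maximizes long-term reward. As a minimal illustration of one algorithm the book covers (see the Q-learning subject heading), here is a tabular Q-learning sketch on a toy chain MDP. The environment, constants, and function names are invented for this example; they are not taken from the book.

```python
import random

# Toy 5-state chain MDP (illustration only, not from the catalogued book).
# States 0..4; actions: 0 = left, 1 = right; reward 1.0 on reaching state 4.
N_STATES, ACTIONS = 5, (0, 1)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # step size, discount, exploration rate

def step(state, action):
    """Deterministic transition; the episode ends at the rightmost state."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

def train(episodes=500, seed=0):
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-value table
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection: mostly exploit, sometimes explore
            if random.random() < EPSILON:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[s][x])
            s2, r, done = step(s, a)
            # Q-learning update: bootstrap on the best next-state value.
            q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])
            s = s2
    return q

q = train()
# Greedy policy extracted from the learned Q-values.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]
```

After training, the greedy policy chooses "right" in every non-terminal state, which is optimal for this chain: the only reward lies at the rightmost end, and the discount factor makes the shortest path to it the highest-value one.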