
Temporal difference learning with interpolated N-tuple networks: Initial results on pole balancing

Abdullahi, AA and Lucas, SM (2010) Temporal difference learning with interpolated N-tuple networks: Initial results on pole balancing. In: 2010 UK Workshop on Computational Intelligence (UKCI 2010).

Full text not available from this repository.

Abstract

Temporal difference learning (TDL) is perhaps the most widely used reinforcement learning method and gives competitive results on a range of problems, especially when using linear or table-based function approximators. However, it has been shown to give poor results on some continuous control problems, and an important question is how it can be applied to such problems more effectively. The crucial point is how TDL can be generalized and scaled to deal with complex, high-dimensional problems without suffering from the curse of dimensionality. We introduce a new function approximation architecture called the Interpolated N-Tuple network and perform a proof-of-concept test on the classic reinforcement learning problem of pole balancing. The results show the method to be highly effective on this problem. They offer an important counter-example to some recently reported results that showed neuroevolution outperforming TDL. TDL with Interpolated N-Tuple networks learns to balance the pole considerably faster than the leading neuroevolution techniques.
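Since the full text is not available from this repository, the sketch below is only a generic illustration of the idea the abstract describes: an N-tuple network used as a value-function approximator and trained with a TD(0)-style update. The class name, the choice of tuples, the bin count, the state scaling, and the learning rate are all assumptions for illustration; the paper's Interpolated N-Tuple network (which, as the name suggests, presumably interpolates between neighbouring table entries rather than hard-discretising the state) is not reproduced here.

```python
import numpy as np

# Illustrative sketch only (not the paper's architecture): a plain N-tuple
# value approximator with a TD(0)-style update. Each tuple selects a subset
# of state variables, discretises them, and addresses its own lookup table;
# the value estimate is the sum of the addressed weights.

class NTupleValue:
    def __init__(self, tuples, bins):
        self.tuples = tuples              # e.g. [(0, 1), (2, 3)]: indices into the state vector
        self.bins = bins                  # discretisation bins per selected variable (assumed)
        self.luts = [np.zeros(bins ** len(t)) for t in tuples]  # one weight table per tuple

    def _index(self, tup, state):
        # Discretise each selected variable (states assumed scaled to ~[-1, 1])
        # and combine the digits into a single table address.
        idx = 0
        for var in tup:
            b = int(np.clip((state[var] + 1.0) / 2.0 * self.bins, 0, self.bins - 1))
            idx = idx * self.bins + b
        return idx

    def value(self, state):
        return sum(lut[self._index(t, state)]
                   for t, lut in zip(self.tuples, self.luts))

    def td_update(self, state, target, alpha=0.1):
        # Move each addressed weight toward the TD target, sharing the error.
        error = target - self.value(state)
        for t, lut in zip(self.tuples, self.luts):
            lut[self._index(t, state)] += alpha * error / len(self.tuples)


# Usage sketch on a cart-pole-like 4-dimensional state (values are made up):
approx = NTupleValue(tuples=[(0, 1), (2, 3), (0, 2)], bins=8)
s = np.array([0.10, -0.20, 0.05, 0.00])
s_next = np.array([0.12, -0.18, 0.04, 0.01])
reward, gamma = 1.0, 0.99
approx.td_update(s, reward + gamma * approx.value(s_next))  # TD(0) target
```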

Item Type: Conference or Workshop Item (Paper)
Additional Information: Published proceedings: 2010 UK Workshop on Computational Intelligence, UKCI 2010
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Science and Health > Computer Science and Electronic Engineering, School of
Depositing User: Jim Jamieson
Date Deposited: 19 Oct 2012 14:38
Last Modified: 17 Aug 2017 18:07
URI: http://repository.essex.ac.uk/id/eprint/4061
