
Temporal difference learning with interpolated N-tuple networks: Initial results on pole balancing

Abdullahi, AA and Lucas, SM (2010) Temporal difference learning with interpolated N-tuple networks: Initial results on pole balancing. In: 2010 UK Workshop on Computational Intelligence (UKCI 2010).

Full text not available from this repository.

Abstract

Temporal difference learning (TDL) is perhaps the most widely used reinforcement learning method and gives competitive results on a range of problems, especially when using linear or table-based function approximators. However, it has been shown to give poor results on some continuous control problems, and an important question is how it can be applied to such problems more effectively. The crucial point is how TDL can be generalized and scaled to deal with complex, high-dimensional problems without suffering from the curse of dimensionality. We introduce a new function approximation architecture called the Interpolated N-Tuple network and perform a proof-of-concept test on the classic reinforcement learning problem of pole balancing. The results show the method to be highly effective on this problem. They offer an important counter-example to some recently reported results that showed neuroevolution outperforming TDL. TDL with Interpolated N-Tuple networks learns to balance the pole considerably faster than the leading neuroevolution techniques.
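Since the full text is not available from this repository, the sketch below is only a generic illustration of the idea the abstract describes: an N-tuple network used as a value-function approximator and trained with a TD(0)-style update. The class name, the choice of tuples, the bin count, the state scaling, and the learning rate are all assumptions for illustration; the paper's Interpolated N-Tuple network (which, as the name suggests, presumably interpolates between neighbouring table entries rather than hard-discretising the state) is not reproduced here.

```python
import numpy as np

# Illustrative sketch only (not the paper's architecture): a plain N-tuple
# value approximator with a TD(0)-style update. Each tuple selects a subset
# of state variables, discretises them, and addresses its own lookup table;
# the value estimate is the sum of the addressed weights.

class NTupleValue:
    def __init__(self, tuples, bins):
        self.tuples = tuples              # e.g. [(0, 1), (2, 3)]: indices into the state vector
        self.bins = bins                  # discretisation bins per selected variable (assumed)
        self.luts = [np.zeros(bins ** len(t)) for t in tuples]  # one weight table per tuple

    def _index(self, tup, state):
        # Discretise each selected variable (states assumed scaled to ~[-1, 1])
        # and combine the digits into a single table address.
        idx = 0
        for var in tup:
            b = int(np.clip((state[var] + 1.0) / 2.0 * self.bins, 0, self.bins - 1))
            idx = idx * self.bins + b
        return idx

    def value(self, state):
        return sum(lut[self._index(t, state)]
                   for t, lut in zip(self.tuples, self.luts))

    def td_update(self, state, target, alpha=0.1):
        # Move each addressed weight toward the TD target, sharing the error.
        error = target - self.value(state)
        for t, lut in zip(self.tuples, self.luts):
            lut[self._index(t, state)] += alpha * error / len(self.tuples)


# Usage sketch on a cart-pole-like 4-dimensional state (values are made up):
approx = NTupleValue(tuples=[(0, 1), (2, 3), (0, 2)], bins=8)
s = np.array([0.10, -0.20, 0.05, 0.00])
s_next = np.array([0.12, -0.18, 0.04, 0.01])
reward, gamma = 1.0, 0.99
approx.td_update(s, reward + gamma * approx.value(s_next))  # TD(0) target
```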

Item Type: Conference or Workshop Item (Paper)
Additional Information: Published proceedings: 2010 UK Workshop on Computational Intelligence, UKCI 2010
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Science and Health > Computer Science and Electronic Engineering, School of
Depositing User: Jim Jamieson
Date Deposited: 19 Oct 2012 14:38
Last Modified: 17 Aug 2017 18:07
URI: http://repository.essex.ac.uk/id/eprint/4061
