Nisioti, Eleni (2021) Reinforcement Learning-based Optimization of Multiple Access in Wireless Networks. PhD thesis, University of Essex.
Abstract
In this thesis, we study the problem of Multiple Access (MA) in wireless networks and design adaptive solutions based on Reinforcement Learning (RL). We analyze the importance of MA in the current communications landscape, where bandwidth-hungry applications emerge from the co-evolution of technological progress and societal needs, and explain why the improvements brought by new standards cannot overcome the problem of resource scarcity. We focus on resource-constrained networks, where devices have restricted hardware capabilities, there is no centralized point of control, and coordination is prohibited or limited. The protocols that we optimize follow a Random Access (RA) approach, where sensing the common medium prior to transmission is not possible. We begin with the study of time access and provide two RL algorithms for optimizing Irregular Repetition Slotted ALOHA (IRSA), a state-of-the-art RA protocol. First, we focus on ensuring low complexity and propose a Q-learning variant in which learners act independently and converge quickly. We then design an algorithm in the area of coordinated learning, focusing on deriving convergence guarantees for learning while minimizing the complexity of coordination. We provide simulations that show how coordination can strike a balance, in terms of complexity and performance, between fully decentralized and centralized solutions. In addition to time access, we study channel access, a problem that has recently attracted significant attention in cognitive radio. We design learning algorithms in the framework of Multi-player Multi-armed Bandits (MMABs) for both static settings and dynamic settings, in which devices arrive at different time steps. Our focus is on deriving theoretical guarantees and ensuring that performance scales well with the size of the network.
Our work constitutes an important step towards addressing the challenges that decentralization and partial observability, both inherent in resource-constrained networks, pose for RL algorithms.
Item Type: | Thesis (PhD) |
---|---|
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
Depositing User: | Eleni Nisioti |
Date Deposited: | 08 Mar 2021 12:20 |
Last Modified: | 08 Mar 2021 12:20 |
URI: | http://repository.essex.ac.uk/id/eprint/30006 |
Available files
Filename: report.pdf