Publications

Split Learning for Sensing-Aided Single and Multi-Level Beam Selection in Multi-Vendor RAN

Published in IEEE Globecom23, 2023

Proper and efficient beam selection is of great importance to unleash the full potential of mmWave communications. Traditionally, each candidate beam is evaluated using reference signals (beam sweeping); however, this exhaustive search can be time-consuming and incurs high signaling overhead. To avoid these problems in B5G and 6G, sensing information can be used, as in Integrated Sensing and Communication (ISAC) solutions, and Machine Learning (ML) methods can map sensing data inputs to an optimal beam index. When using sensing information sources external to the Radio Access Network (RAN) in a multi-vendor disaggregated environment, these methods must account for issues such as privacy and data ownership. In this work, we apply multi-modal sensing information to the beam selection task. Specifically, we propose a multi-modal sensing-aided ML strategy based on Split Learning (SL) that can cope with deployment challenges in novel RAN architectures. Moreover, the method is applied to single and multi-level beam selection decisions, where the latter considers the case of hierarchical codebook structures. With the proposed approach, accuracy levels above 90% can be achieved while overhead diminishes by 85% or more. SL achieves performance comparable to centralized learning-based strategies, with the added value of accounting for privacy and data ownership issues. We also show that sensing-aided ML-based beam selection decisions in multi-level codebooks are more effective when applied to their first level.

Recommended citation: Y. Dantas, P. E. Iturria-Rivera, H. Zhou, M. Elsayed, M. Bavand, R. Gaigalas, S. Furr, and M. Erol-Kantarci, “Split Learning for Sensing-Aided Single and Multi-Level Beam Selection in Multi-Vendor RAN”, Dec. 2023.
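
The split between a data owner and a RAN-side model described above can be illustrated with a minimal sketch. Layer sizes, weights, and the single cut point are illustrative assumptions, not the architecture used in the paper; the key point is that only cut-layer activations, never raw sensing data, leave the client.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 16 sensing features, 8-unit cut layer, 32 beams.
N_FEAT, N_CUT, N_BEAMS = 16, 8, 32

# Client side (e.g. the sensor owner): holds raw data and the first layers.
W_client = rng.standard_normal((N_FEAT, N_CUT))

def client_forward(x):
    """Compute activations up to the cut layer; only these leave the client."""
    return np.maximum(x @ W_client, 0.0)  # ReLU

# Server side (e.g. the RAN vendor): never sees the raw sensing data.
W_server = rng.standard_normal((N_CUT, N_BEAMS))

def server_forward(smashed):
    """Map cut-layer activations ("smashed data") to beam-index logits."""
    return smashed @ W_server

x = rng.standard_normal(N_FEAT)          # one multi-modal sensing sample
beam = int(np.argmax(server_forward(client_forward(x))))
print("selected beam index:", beam)
```

During training, gradients flow back across the same cut, so each party only updates its own layers.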

Deep Reinforcement Learning-Based Joint User Association and CU–DU Placement in O-RAN

Published in IEEE Transactions on Network and Service Management, 2022

The Open Radio Access Network (O-RAN) architecture is based on disaggregation, virtualization, openness, and intelligence. These features allow the RAN network functions (NFs) to be split into the Central Unit (CU), Distributed Unit (DU), and Radio Unit (RU), and deployed on open hardware and cloud nodes as Virtualized Network Functions (VNFs) or Containerized Network Functions (CNFs). In this paper, we propose strategies for the placement of CU and DU network functions in the regional and edge O-Cloud nodes while jointly associating users to RUs. The aim is to minimize the end-to-end delay of users and the cost of O-RAN deployment. Thus, we first formulate the end-to-end delay, the cost, and the constraints. We then model the problem as a multi-objective optimization problem. The optimization formulation consists of a large number of constraints and variables. To provide a solution, we develop the corresponding Markov Decision Process (MDP) and propose a Deep Q-Network (DQN)-based algorithm. The simulation results demonstrate that our proposed scheme reduces the average user delay by up to 40% and the deployment cost by up to 20% with respect to our baselines.

Recommended citation: R. Joda, T. Pamuklu, P. E. Iturria-Rivera and M. Erol-Kantarci, "Deep Reinforcement Learning-Based Joint User Association and CU–DU Placement in O-RAN," IEEE Transactions on Network and Service Management, vol. 19, no. 4, pp. 4097-4110, Dec. 2022. https://ieeexplore.ieee.org/document/9946423
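
At the core of any DQN-based scheme like the one above is the Bellman target computed over the next state's action values. The sketch below is generic, not the paper's implementation; the reward weights, discount factor, and Q-values are made-up numbers, with the reward shaped to reflect the paper's two objectives (user delay and deployment cost).

```python
import numpy as np

rng = np.random.default_rng(1)
GAMMA = 0.9  # discount factor (assumed value, for illustration only)

def dqn_target(reward, q_next, done):
    """Bellman target y = r + gamma * max_a' Q(s', a'), or just r at episode end."""
    return reward + (0.0 if done else GAMMA * float(np.max(q_next)))

# Toy reward trading off user delay against deployment cost,
# mirroring the paper's two objectives (weights are illustrative).
delay, cost = 4.0, 2.0
w_delay, w_cost = 0.7, 0.3
reward = -(w_delay * delay + w_cost * cost)

q_next = rng.standard_normal(5)  # Q-values over 5 candidate CU/DU placements
y = dqn_target(reward, q_next, done=False)
```

The network is then trained to regress its Q-value for the taken placement action toward `y`.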

Channel Selection for Wi-Fi 7 Multi-Link Operation via Optimistic-Weighted VDN and Parallel Transfer Reinforcement Learning

Published in IEEE PIMRC23, 2023

Dense and unplanned IEEE 802.11 Wireless Fidelity (Wi-Fi) deployments and the continuous increase of throughput- and latency-stringent services for users have led machine learning algorithms to be considered promising techniques in industry and academia. Specifically, the ongoing IEEE 802.11be EHT (Extremely High Throughput) amendment, known as Wi-Fi 7, proposes, for the first time, Multi-Link Operation (MLO). Among others, this new feature will increase the complexity of channel selection due to the proposed multiple interfaces. In this paper, we present a Parallel Transfer Reinforcement Learning (PTRL)-based cooperative Multi-Agent Reinforcement Learning (MARL) algorithm named Parallel Transfer Reinforcement Learning Optimistic-Weighted Value Decomposition Networks (oVDN) to improve intelligent channel selection in IEEE 802.11be MLO-capable networks. Additionally, we compare the impact of different parallel transfer learning alternatives and a centralized non-transfer MARL baseline. Two PTRL methods are presented: Multi-Agent System (MAS) Joint Q-function Transfer, where the joint Q-function is transferred, and MAS Best/Worst Experience Transfer, where the best and worst experiences are transferred among MASs. Simulation results show that oVDNg, in which only the best experiences are utilized, is the best algorithm variant. Moreover, oVDNg offers gains of up to 3%, 7.2% and 11% when compared with the VDN, VDN-nonQ and non-PTRL baselines, respectively. Furthermore, oVDNg experienced a reward convergence gain of 33.3% in the 5 GHz interface over oVDNb and oVDN, where only the worst experiences and both types of experiences are considered, respectively. Finally, our best PTRL alternative showed an improvement over the non-PTRL baseline of up to 40 episodes in speed of convergence and up to 135% in reward.

Recommended citation: P. E. Iturria-Rivera, M. Chenier, B. Herscovici, B. Kantarci and M. Erol-Kantarci, "Channel Selection for Wi-Fi 7 Multi-Link Operation via Optimistic-Weighted VDN and Parallel Transfer Reinforcement Learning", Sept. 2023. https://arxiv.org/abs/2307.05419
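
The VDN factorization that oVDN builds on sums per-agent Q-values into a joint Q-value. The sketch below pairs it with one plausible form of optimistic weighting, where negative TD errors are down-weighted; the exact weighting used in the paper may differ, and all numbers are illustrative.

```python
import numpy as np

def vdn_joint_q(per_agent_q):
    """VDN factorization: Q_tot(s, a) = sum_i Q_i(s_i, a_i)."""
    return float(np.sum(per_agent_q))

def optimistic_weight(td_error, w=0.5):
    """Illustrative optimistic weighting: trust positive TD errors fully,
    down-weight negative ones (the paper's exact scheme may differ)."""
    return 1.0 if td_error >= 0 else w

# Per-agent channel-choice Q-values, e.g. one agent per AP interface.
q_agents = [1.2, -0.3, 0.8]
q_tot = vdn_joint_q(q_agents)

target = 2.0                     # hypothetical TD target
td = target - q_tot
weighted_loss = optimistic_weight(td) * td ** 2
```

Because the factorization is additive, each agent can still act greedily on its own Q-function at execution time.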

Hierarchical Reinforcement Learning Based Traffic Steering in Multi-RAT 5G Deployments (Best Paper Award)

Published in IEEE ICC23, 2023

In 5G non-standalone mode, an intelligent traffic steering mechanism can vastly aid in ensuring a smooth user experience by selecting the best radio access technology (RAT) from a multi-RAT environment for a specific traffic flow. In this paper, we propose a novel load-aware traffic steering algorithm based on hierarchical reinforcement learning (HRL) that satisfies the diverse QoS requirements of different traffic types. HRL can significantly increase system performance using a bi-level architecture with a meta-controller and a controller. In our proposed method, the meta-controller provides an appropriate threshold for load balancing, while the controller performs traffic admission to an appropriate RAT at the lower level. Simulation results show that HRL outperforms a Deep Q-Network (DQN) baseline and a threshold-based heuristic baseline, achieving 8.49% and 12.52% higher average system throughput and 27.74% and 39.13% lower network delay, respectively.

Recommended citation: M. A. Habib, H. Zhou, P. E. Iturria-Rivera, M. Elsayed, M. Bavand, R. Gaigalas, S. Furr, and M. Erol-Kantarci, “Hierarchical Reinforcement Learning Based Traffic Steering in Multi-RAT 5G Deployments”, May 2023. https://arxiv.org/abs/2301.07818
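
The bi-level structure described above can be sketched as follows, with the learned meta-controller and controller policies replaced by simple fixed rules for brevity; the threshold rule, load values, and RAT ordering are illustrative assumptions, not the paper's trained policies.

```python
def meta_controller(rat_loads):
    """Upper level: emit a load-balancing threshold for the lower level.
    (In the paper this is a learned policy; a mean rule stands in here.)"""
    return sum(rat_loads) / len(rat_loads)

def controller(rat_loads, threshold):
    """Lower level: admit the flow to the least-loaded RAT under the threshold;
    if every RAT exceeds it, fall back to the least-loaded RAT overall."""
    candidates = [i for i, load in enumerate(rat_loads) if load < threshold]
    if not candidates:
        candidates = list(range(len(rat_loads)))
    return min(candidates, key=lambda i: rat_loads[i])

rat_loads = [0.8, 0.4]           # e.g. [5G NR, LTE] utilization (made up)
thr = meta_controller(rat_loads)
rat = controller(rat_loads, thr)
```

The separation lets the two levels learn at different timescales: thresholds change slowly, per-flow admissions happen constantly.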

RL meets Multi-Link Operation in IEEE 802.11be: Multi-Headed Recurrent Soft-Actor Critic-based Traffic Allocation (Best Paper Award)

Published in IEEE ICC23, 2023

IEEE 802.11be (Extremely High Throughput), commercially known as Wireless Fidelity (Wi-Fi) 7, is the newest IEEE 802.11 amendment, designed to address increasingly throughput-hungry services such as Ultra High Definition (4K/8K) video and Virtual/Augmented Reality (VR/AR). To do so, IEEE 802.11be presents a set of novel features that will push Wi-Fi technology to its edge. Among them, Multi-Link Operation (MLO) devices are anticipated to become a reality, leaving Single-Link Operation (SLO) Wi-Fi in the past. To achieve superior throughput and very low latency, a careful design approach must be taken regarding how incoming traffic is distributed in MLO-capable devices. In this paper, we present a Reinforcement Learning (RL) algorithm named Multi-Headed Recurrent Soft-Actor Critic (MH-RSAC) to distribute incoming traffic in 802.11be MLO-capable networks. Moreover, we compare our results with two non-RL baselines previously proposed in the literature: Single Link Less Congested Interface (SLCI) and Multi-Link Congestion-aware Load balancing at flow arrivals (MCAA). Simulation results reveal that the MH-RSAC algorithm obtains gains in terms of Throughput Drop Ratio (TDR) of up to 35.2% and 6% when compared with the SLCI and MCAA algorithms, respectively. Finally, we observed that our scheme responds more efficiently to high-throughput and dynamic traffic, such as VR and Web Browsing (WB), than the baselines, with an improvement in terms of Flow Satisfaction (FS) of up to 25.6% and 6% over the SLCI and MCAA algorithms, respectively.

Recommended citation: P. E. Iturria-Rivera, M. Chenier, B. Herscovici, B. Kantarci and M. Erol-Kantarci, "RL meets Multi-Link Operation in IEEE 802.11be: Multi-Headed Recurrent Soft-Actor Critic-based Traffic Allocation", May 2023. https://arxiv.org/abs/2303.08959

Uplink Scheduling in Federated Learning: an Importance-Aware Approach via Graph Representation Learning

Published in IEEE Workshops ICC23, 2023

Federated Learning (FL) has emerged as a promising framework for distributed training of AI-based services, applications, and network procedures in 6G. One of the major challenges affecting the performance and efficiency of 6G wireless FL systems is the massive scheduling of user devices over resource-constrained channels. In this work, we argue that the uplink scheduling of FL client devices is a problem with a rich relational structure. To address this challenge, we propose a novel, energy-efficient, and importance-aware metric for client scheduling in FL applications by leveraging Unsupervised Graph Representation Learning (UGRL). Our proposed approach introduces a relational inductive bias in the scheduling process and does not require the collection of training feedback information from client devices, unlike state-of-the-art importance-aware mechanisms. We evaluate our proposed solution against baseline scheduling algorithms based on recently proposed metrics in the literature. Results show that, when considering scenarios of nodes exhibiting spatial relations, our approach can achieve an average gain of up to 10% in model accuracy and up to 17 times in energy efficiency compared to state-of-the-art importance-aware policies.

Recommended citation: M. Skojac, P. E. Iturria-Rivera, M. Erol-Kantarci and R. Verdone, “Uplink Scheduling in Federated Learning: an Importance-Aware Approach via Graph Representation Learning”, May 2023. https://arxiv.org/abs/2301.11903
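
The idea of scheduling clients by relational importance, without collecting training feedback, can be shown with a toy sketch. Eigenvector centrality is used here as a crude stand-in for the learned UGRL embedding; client positions, the connection radius, and the budget K are all made up for illustration.

```python
import numpy as np

# Toy client positions; edges connect clients closer than a radius R.
pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.8], [5.0, 5.0]])
R = 1.5
d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
A = ((d < R) & (d > 0)).astype(float)      # symmetric adjacency matrix

# Stand-in for an unsupervised embedding: the leading eigenvector of A
# (eigenvector centrality). The paper learns a UGRL embedding instead.
vals, vecs = np.linalg.eigh(A)             # eigenvalues in ascending order
centrality = np.abs(vecs[:, -1])

# Schedule the K most "relationally important" clients for uplink,
# using no training feedback from the devices.
K = 2
scheduled = np.argsort(-centrality)[:K]
```

In this toy graph, clients 0-2 form a triangle while client 3 is isolated, so the isolated client is never scheduled under the relational metric.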

Beam Selection for Energy-Efficient mmWave Network Using Advantage Actor Critic Learning

Published in IEEE ICC23, 2023

The growing adoption of mmWave frequency bands to realize the full potential of 5G turns beamforming into a key enabler for current and next-generation wireless technologies. Many mmWave networks rely on beam selection with a Grid-of-Beams (GoB) approach to handle user-beam association. In beam selection with GoB, users select the appropriate beam from a set of pre-defined beams, and the overhead of the beam selection process is a common challenge in this area. In this paper, we propose an Advantage Actor Critic (A2C) learning-based framework to improve the GoB and the beam selection process, as well as optimize transmission power in a mmWave network. The proposed beam selection technique allows performance improvement, while considering transmission power improves Energy Efficiency (EE) and ensures coverage is maintained in the network. We further investigate how the proposed algorithm can be deployed in a Service Management and Orchestration (SMO) platform. Our simulations show that A2C-based joint optimization of beam selection and transmission power is more effective than using Equally Spaced Beams (ESB) with a fixed power strategy, or optimizing beam selection and transmission power disjointly. Compared to the ESB and fixed transmission power strategy, the proposed approach achieves more than twice the average EE in the scenarios under test and is closer to the maximum theoretical EE.

Recommended citation: Y. Dantas, P. E. Iturria-Rivera, H. Zhou, M. Elsayed, M. Bavand, R. Gaigalas, S. Furr, and M. Erol-Kantarci, “Beam Selection for Energy-Efficient mmWave Network Using Advantage Actor Critic Learning”, May. 2023. https://arxiv.org/pdf/2302.00156
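
At the heart of A2C is the one-step advantage estimate that weights the policy gradient. The sketch below is generic A2C machinery, not the paper's model; the energy-efficiency-style reward (throughput per unit transmit power), discount factor, and critic values are illustrative.

```python
GAMMA = 0.99  # discount factor (assumed value, for illustration)

def advantage(reward, v_s, v_next, done):
    """One-step A2C advantage: A = r + gamma * V(s') - V(s).
    The actor's log-probability gradient is scaled by this quantity."""
    return reward + (0.0 if done else GAMMA * v_next) - v_s

# Toy energy-efficiency reward: throughput per unit transmit power,
# echoing the paper's joint beam/power objective (units are made up).
throughput, tx_power = 120.0, 4.0
reward = throughput / tx_power

adv = advantage(reward, v_s=25.0, v_next=26.0, done=False)
```

A positive advantage increases the probability of the chosen (beam, power) action; a negative one decreases it.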

Meta-Bandit: Spatial Reuse Adaptation via Meta-Learning in Distributed Wi-Fi 802.11ax

Published in IEEE Networking Letters, 2023

IEEE 802.11ax introduces several amendments to previous standards, with special interest in spatial reuse (SR) to respond to dense user scenarios with highly demanding services. In dynamic scenarios with more than one Access Point, the joint adjustment of Transmission Power (TP) and the Clear Channel Assessment (CCA) threshold remains a challenge. With the aim of mitigating Quality of Service (QoS) degradation, we introduce a solution that builds on meta-learning and multi-armed bandits. Simulation results show that the proposed solution adapts with, on average, 1250 fewer environment steps and a 72% average improvement in terms of fairness and starvation compared to a transfer learning baseline.

Recommended citation: P. E. Iturria-Rivera, M. Chenier, B. Herscovici, B. Kantarci and M. Erol-Kantarci, "Meta-Bandit: Spatial Reuse Adaptation via Meta-Learning in Distributed Wi-Fi 802.11ax", IEEE Networking Letters, Apr. 2023. https://ieeexplore.ieee.org/document/10105943
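
The bandit layer of such a solution can be sketched with UCB1 over joint (TP, CCA) configurations as arms. The meta-learning warm-start is omitted for brevity, and the arm counts, reward estimates, and exploration constant are illustrative; the paper's bandit formulation may differ.

```python
import math

def ucb_select(counts, values, t, c=2.0):
    """UCB1: pick the arm maximizing mean reward + exploration bonus;
    any arm never pulled is tried first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(range(len(counts)),
               key=lambda a: values[a] + c * math.sqrt(math.log(t) / counts[a]))

# Arms = joint (TP, CCA) configurations, e.g. 4 discrete combinations.
counts = [3, 3, 3, 3]                    # pulls per arm so far
values = [0.2, 0.5, 0.4, 0.1]            # running mean QoS reward per arm
arm = ucb_select(counts, values, t=12)   # t = total pulls so far
```

Meta-learning then amounts to initializing `counts`/`values` (or the underlying priors) from related deployments so fewer environment steps are needed, matching the adaptation gain reported above.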

Traffic Steering for 5G Multi-RAT Deployments using Deep Reinforcement Learning

Published in IEEE CCNC23, 2023

In 5G non-standalone mode, traffic steering is a critical technique to take full advantage of 5G New Radio while optimizing dual connectivity of 5G and LTE networks in a multiple radio access technology (RAT) setting. An intelligent traffic steering mechanism can play an important role in maintaining a seamless user experience by dynamically choosing the appropriate RAT (5G or LTE) for a specific user traffic flow with certain QoS requirements. In this paper, we propose a novel traffic steering mechanism based on Deep Q-learning that can automate traffic steering decisions in a dynamic environment with multiple RATs and maintain diverse QoS requirements for different traffic classes. The proposed method is compared with two baseline algorithms: a heuristic-based algorithm and Q-learning-based traffic steering. Our results show that, compared to the Q-learning and heuristic baselines, the proposed algorithm achieves 6% and 10% higher average system throughput and 23% and 33% lower network delay, respectively.

Recommended citation: M. A. Habib, H. Zhou, P. E. Iturria-Rivera, M. Elsayed, M. Bavand, R. Gaigalas, S. Furr, and M. Erol-Kantarci, “Traffic Steering for 5G Multi-RAT Deployments using Deep Reinforcement Learning,” (Accepted to 2023 IEEE CCNC), pp. 164-169, Jan. 2023. https://arxiv.org/abs/2301.05316

Hierarchical Deep Q-Learning Based Handover in Wireless Networks with Dual Connectivity

Published in IEEE Globecom22, 2022

5G New Radio proposes the usage of frequencies above 10 GHz to exceed LTE’s existing maximum data rates. However, the effective size of 5G antennas and the consequent signal degradation in urban scenarios make it a challenge to maintain stable coverage and connectivity. In order to obtain the best from both technologies, recent dual connectivity solutions have proved their ability to improve performance compared with coexistent standalone 5G and 4G technologies. Reinforcement learning (RL) has shown huge potential in wireless scenarios where parameter learning is required, given the dynamic nature of such contexts. In this paper, we propose two reinforcement learning algorithms to improve Multiple Radio Access Technology (multi-RAT) dual-connectivity handover: a single-agent RL algorithm named Clipped Double Q-Learning (CDQL) and a hierarchical Deep Q-Learning (HiDQL) algorithm. We compare our proposal with two baselines: a fixed-parameter and a dynamic-parameter solution. Simulation results reveal significant latency improvements over the existing solutions, with gains of 47.6% and 26.1% for Digital-Analog beamforming (BF), 17.1% and 21.6% for Hybrid-Analog BF, and 24.7% and 39% for Analog-Analog BF for the RL schemes HiDQL and CDQL, respectively. HiDQL presented a slower convergence time but obtained a better solution than CDQL. Additionally, we foresee the advantages of utilizing context information, such as the geo-location of UEs, to reduce the beam exploration sector and thus further improve multi-RAT handover latency.

Recommended citation: P. E. Iturria-Rivera, M. Elsayed, M. Bavand, R. Gaigalas, S. Furr and M. Erol-Kantarci, "Hierarchical Deep Q-Learning Based Handover in Wireless Networks with Dual Connectivity", pp. 6553-6558, Dec. 2022. https://ieeexplore.ieee.org/document/10000894

Multi-Agent Team Learning in Virtualized Open Radio Access Networks (O-RAN)

Published in Sensors, 2022

Starting from the concept of the Cloud Radio Access Network (C-RAN), continuing with the virtual Radio Access Network (vRAN), and most recently with the Open RAN (O-RAN) initiative, Radio Access Network (RAN) architectures have significantly evolved in the past decade. In the last few years, the wireless industry has witnessed a strong trend towards disaggregated, virtualized and open RANs, with numerous tests and deployments worldwide. One unique aspect that motivates this paper is the availability of new opportunities that arise from using machine learning, more specifically multi-agent team learning (MATL), to optimize the RAN in a closed loop where the complexity of disaggregation and virtualization makes well-known Self-Organized Networking (SON) solutions inadequate. In our view, Multi-Agent Systems (MASs) with MATL can play an essential role in the orchestration of O-RAN controllers, i.e., the near-real-time and non-real-time RAN Intelligent Controllers (RICs). In this article, we first provide an overview of the landscape in RAN disaggregation, virtualization and O-RAN, then we present the state-of-the-art research in multi-agent systems and team learning as well as their application to O-RAN. We present a case study for team learning where the agents are two distinct xApps: power allocation and radio resource allocation. We demonstrate how network performance is enhanced when team learning is used instead of individual learning agents. Finally, we identify challenges and open issues to provide a roadmap for researchers in the area of MATL-based O-RAN optimization.

Recommended citation: P. E. Iturria-Rivera, H. Zhang, H. Zhou, S. Mollahasani, and M. Erol-Kantarci, “Multi-Agent Team Learning in Virtualized Open Radio Access Networks (O-RAN),” Sensors, vol.22, no.14, pp.1-13, Jul. 2022. https://www.mdpi.com/1424-8220/22/14/5375

Competitive Multi-Agent Load Balancing with Adaptive Policies in Wireless Networks

Published in IEEE CCNC22, 2022

Using Machine Learning (ML) techniques in next-generation wireless networks has shown promising results in recent years, owing to the high learning and adaptation capability of ML algorithms. More specifically, ML techniques have been used for load balancing in Self-Organizing Networks (SON). In the context of load balancing and ML, several studies propose network management automation (NMA) from the perspective of a single, centralized agent. However, a single-agent setting does not consider the interaction among agents. In this paper, we propose a more realistic load balancing approach using a novel Multi-Agent Deep Deterministic Policy Gradient with Adaptive Policies (MADDPG-AP) scheme that considers throughput, resource block utilization and latency in the network. We compare our proposal with a single-agent RL algorithm named Clipped Double Q-Learning (CDQL). Simulation results reveal a significant improvement in latency, packet loss ratio and convergence time.

Recommended citation: P. E. Iturria-Rivera and M. Erol-Kantarci, "Competitive Multi-Agent Load Balancing with Adaptive Policies in Wireless Networks" (Accepted to 2022 IEEE CCNC), pp. 796-801, Jan. 2022. https://ieeexplore.ieee.org/document/9946423

QoS-Aware Load Balancing in Wireless Networks using Clipped Double Q-Learning

Published in IEEE MASS21, 2021

In recent years, long-term evolution (LTE) and 5G NR (5th Generation New Radio) technologies have shown great potential for utilizing Machine Learning (ML) algorithms to optimize their operations, thanks both to the availability of fine-grained data from the field and to the need arising from the growing complexity of networks. This complexity has drawn mobile operators’ attention to network management automation (NMA) as a way to reduce the capital expenditures (CAPEX) and operational expenditures (OPEX) of their networks. NMA falls under the umbrella of Self-Organizing Networks (SON), in which 3GPP has identified challenges and opportunities in load balancing mechanisms for Radio Access Networks (RANs). In the context of machine learning and load balancing, several studies have focused on maximizing the overall network throughput or the resource block utilization (RBU). In this paper, we propose a novel Clipped Double Q-Learning (CDQL)-based load balancing approach considering resource block utilization, latency and the Channel Quality Indicator (CQI). We compare our proposal with a traditional handover algorithm and a resource-block-utilization-based handover mechanism. Simulation results reveal that our scheme improves throughput, latency, jitter and packet loss ratio in comparison to the baseline algorithms.

Recommended citation: P. E. Iturria-Rivera and M. Erol-Kantarci, "QoS-Aware Load Balancing in Wireless Networks using Clipped Double Q-Learning", (Accepted to 2021 IEEE MASS), pp. 10-16, Oct. 2021. https://ieeexplore.ieee.org/abstract/document/9637122
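
Clipped Double Q-Learning maintains two Q-estimates and evaluates the greedy action of one with the minimum of both, which curbs the overestimation bias of vanilla Q-learning. The sketch below shows the standard target computation, not the paper's full load-balancing agent; the Q-values and discount factor are illustrative.

```python
def cdql_target(reward, q1_next, q2_next, gamma=0.9, done=False):
    """Clipped Double Q-Learning target:
    a* = argmax_a Q1(s', a);  y = r + gamma * min(Q1(s', a*), Q2(s', a*))."""
    if done:
        return reward
    a_star = max(range(len(q1_next)), key=lambda a: q1_next[a])
    return reward + gamma * min(q1_next[a_star], q2_next[a_star])

# Hypothetical next-state Q-values over 3 handover target cells.
q1 = [1.0, 2.0, 1.5]   # estimates from the first Q-function
q2 = [0.8, 1.6, 2.2]   # estimates from the second Q-function
y = cdql_target(reward=1.0, q1_next=q1, q2_next=q2)
```

Here the greedy action under Q1 is action 1, but its value is taken as min(2.0, 1.6) = 1.6, yielding a more conservative target than single-estimator Q-learning would produce.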