[PhD Thesis] Unsupervised learning for parametric optimization in wireless networks

Author: Rasoul Nikbakht Silab

Supervisor: Àngel Lozano Solsona

This thesis studies parametric optimization in cellular and cell-free networks, exploring data-based and expert-based paradigms. Power allocation and power control, which adjust the transmit power to meet different fairness criteria such as max-min or max-product, are crucial tasks in wireless communications that fall into the parametric optimization category. The state-of-the-art approaches for power control and power allocation often demand huge computational costs and are not suitable for real-time applications. To address this issue, we develop a general-purpose unsupervised-learning approach for solving parametric optimizations and extend the well-known fractional power control algorithm. In the data-based paradigm, we create an unsupervised learning framework that defines a custom neural network (NN), incorporating expert knowledge into the NN loss function to solve the power control and power allocation problems. In this approach, a feedforward NN is trained by repeatedly sampling the parameter space, but, rather than solving the associated optimization problem completely, a single step is taken along the gradient of the objective function. The resulting method is applicable to both convex and non-convex optimization problems. It offers a speedup of two to three orders of magnitude in the power control and power allocation problems compared to a convex solver, whenever one is applicable. In the expert-driven paradigm, we investigate the extension of fractional power control to cell-free networks. The resulting closed-form solution can be evaluated for uplink and downlink effortlessly and reaches an (almost) optimum solution in the uplink case. In both paradigms, we place a particular focus on large-scale gains, i.e., the amount of attenuation experienced by the local-average received power. The slow-varying nature of the large-scale gains relaxes the need for a frequent update of the solutions in both the data-driven and expert-driven paradigms, enabling real-time application for both methods.
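The training idea described in the abstract (sample a parameter, let the model propose a solution, take one gradient step on the objective instead of solving the problem) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the scalar objective J(x; theta) = 0.5*x^2 - theta*x (whose minimizer is x = theta), the linear model x = w*theta + b standing in for the neural network, and the function name `train`.

```python
import random


def train(steps=5000, lr=0.1, seed=0):
    """Fit x = w*theta + b so that x minimises J(x; theta) = 0.5*x**2 - theta*x.

    No labelled solutions are used: the objective itself is the loss.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        theta = rng.uniform(-1.0, 1.0)  # sample the parameter space
        x = w * theta + b               # solution proposed by the model
        dj_dx = x - theta               # gradient of the objective w.r.t. x
        # Single gradient step on the model weights (chain rule);
        # the optimization problem for this theta is never solved fully.
        w -= lr * dj_dx * theta
        b -= lr * dj_dx
    return w, b


w, b = train()
print(f"w = {w:.6f}, b = {b:.6f}")  # the exact minimizer corresponds to w = 1, b = 0
```

Once trained, evaluating the model for a new parameter is a single forward pass, which is where a speedup over running an iterative solver per instance would come from in a setting like this.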

Link to manuscript: http://hdl.handle.net/10803/671246


[PhD Thesis] Audio-visual deep learning methods for musical instrument classification and separation

Author: Olga Slizovskaia

Supervisors: Emilia Gómez, Gloria Haro

In music perception, the information we receive from the visual and auditory systems is often complementary. Moreover, visual perception plays an important role in the overall experience of attending a music performance. This fact brings attention to machine learning methods that could combine audio and visual information for automatic music analysis. This thesis addresses two research problems: instrument classification and source separation in the context of music performance videos. A multimodal approach for each task is developed using deep learning techniques to train an encoded representation for each modality. For source separation, we also study two approaches conditioned on instrument labels and examine the influence that two extra sources of information have on separation performance compared with a conventional model. Another important aspect of this work is the exploration of different fusion methods, which allow for better multimodal integration of information sources from associated domains.

 

Link to manuscript: http://hdl.handle.net/10803/669963

 


[PhD Thesis] Responsive spectrum management for wireless local area networks: from heuristic-based policies to model-free reinforcement learning

Author: Sergio Barrachina

Supervisor: Boris Bellalta

In this thesis, we focus on the so-called joint problem of spectrum management: the efficient allocation of primary and secondary channels in channel-bonding wireless local area networks (WLANs). From IEEE 802.11n to more recent standards like 802.11ax and 802.11be, bonding channels together is permitted to increase the transmission bandwidth. While such an increase favors the potential network capacity and the activation of higher transmission rates, it comes at the price of reduced power per Hertz and accentuated issues of contention and interference with neighboring nodes. So, if WLANs were per se complex deployments, they are becoming even more complicated due to the increasing node density and the new technical features required by novel highly bandwidth-demanding applications. This dissertation provides an in-depth study of channel allocation and channel bonding in WLANs and discusses the suitability of solutions ranging from heuristic-based to reinforcement learning (RL)-based. To characterize channel bonding in saturated WLANs, we first propose an analytical model based on continuous-time Markov networks (CTMNs). This model relies on a novel, purpose-designed algorithm that generates CTMNs from spatially distributed scenarios, where nodes are not required to be within the carrier sense range of each other. We identify the key factors affecting the throughput and fairness of different channel bonding policies and expose critical interrelations among nodes in the spatial domain. By extending the analytical model to support unsaturated regimes, we highlight the benefits of allocating channels as wide as possible together with adaptive policies to cope with unfair situations. Apart from the analytical model, this thesis relies on simulations to generalize channel bonding in dense scenarios while avoiding costly, sometimes infeasible, experimental testbeds. Unfortunately, existing wireless network simulators tend to be too simplistic or too computationally demanding. That is why we develop the Komondor wireless network simulator, whose essential advantage over other well-known simulators lies in its high event processing rate. We then deviate from analytical models and simulations and tackle real measurements through the Wi-Fi All-Channel Analyzer (WACA), the first system specifically designed to simultaneously measure the energy in all 24 bondable Wi-Fi channels in the 5 GHz band. With WACA, we perform a first-of-its-kind spectrum measurement in areas including urban hotspots, residential neighborhoods, universities, and even a football match at Futbol Club Barcelona’s Camp Nou stadium. Our experimental findings reveal the underpinning factors controlling throughput gain, from which we highlight the inter-channel correlation. We show the significance of the gathered dataset for finding new insights, which would not be possible otherwise, given that simple channel occupancy models severely underestimate the potential gains. As for solution proposals, we first cover heuristic-based approaches to find satisfactory configurations quickly. In this regard, we propose dynamic-wise (DyWi), a lightweight, decentralized, online primary channel selection algorithm for dynamic channel bonding. DyWi improves the expected WLAN throughput by considering not only the occupancy of the target primary channel but also the activity in the secondary channels. Even when assuming significant delays due to primary channel switching, simulations reveal important throughput and delay improvements. Finally, we identify machine learning (ML) approaches applicable to the spectrum management problem in WLANs and justify why model-free RL suits it the most. In particular, we focus on the adequate performance of stateless variations of RL and anticipate multi-armed bandits as the right solution since i) we need fast adaptability to suit user experience in dynamic Wi-Fi scenarios and ii) the number of multichannel configurations a network can adopt is limited; thus, agents can fully explore the action space in a reasonable time.
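To make the multi-armed bandit framing concrete, the following is a minimal sketch of a stateless epsilon-greedy agent choosing among a small set of channel configurations. It is an illustration under stated assumptions, not the algorithm evaluated in the thesis: the three hypothetical configurations, their normalized mean throughputs, the Gaussian noise model, and the function name `epsilon_greedy` are all invented for the example.

```python
import random


def epsilon_greedy(mean_throughput, rounds=2000, epsilon=0.1, noise=0.05, seed=0):
    """Stateless epsilon-greedy bandit over a small set of configurations.

    mean_throughput holds hypothetical normalized throughputs, one per
    (primary channel, bandwidth) configuration; the agent never sees them
    directly, only noisy per-round rewards.
    """
    rng = random.Random(seed)
    n_arms = len(mean_throughput)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for t in range(rounds):
        if t < n_arms:
            arm = t                          # try every configuration once
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)      # explore a random configuration
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(mean_throughput[arm], noise)  # assumed noise model
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


estimates, counts = epsilon_greedy([0.2, 0.5, 0.9])
best = max(range(len(estimates)), key=lambda a: estimates[a])
print("preferred configuration:", best, "pull counts:", counts)
```

Because the action set is small, a few exploratory pulls per configuration are enough for the estimates to separate, which mirrors the argument in point ii) above.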

Link to manuscript: http://hdl.handle.net/10803/670782


[PhD thesis] The Orchestration of computer-supported collaboration scripts with learning analytics

Author: Ishari Amarasinghe

Supervisors: Davinia Hernández-Leo; Anders Jonsson

Computer-supported collaborative learning (CSCL) creates avenues for productive collaboration between students. In CSCL, collaborative learning flow patterns (CLFPs) provide pedagogical rationale and constraints for structuring the collaboration process. While structured collaboration facilitates the design of favourable learning conditions, orchestration of collaboration becomes an important factor, as learner participation and real-world constraints can create deviations in real time. On the one hand, limited research has examined the orchestration challenges related to collaborative learning situations scripted according to CLFPs in authentic educational contexts to resolve collaboration at different scales. On the other hand, learning analytics (LA) can be used to provide proper technological tooling, infrastructure and support to orchestrate collaboration. To this end, this dissertation addresses the following research question: How can LA support orchestration mechanisms for scripted CSCL? To address this question, this dissertation first focuses on studying the orchestration challenges associated with scripted CSCL situations on small scales (in the classroom learning context) and large scales (in the distance learning context, specifically in massive open online courses [MOOCs]). In the classroom learning context, lack of teacher access to activity regulation mechanisms constituted a key challenge. In MOOCs, sustained student participation in multiple phases of the script was a primary challenge. The dissertation also focuses on studying the design of LA interventions that might address the orchestration challenges under examination. The proposed LA interventions range from human-in-control to machine-in-control in nature given the feasibility and regulation needs of the learning contexts under investigation. 
Following a design-based research (DBR) methodology, evaluation studies were conducted in naturalistic classrooms and in MOOCs to evaluate the effects of the proposed LA interventions and to understand the conditions for their successful implementation. The results of the evaluation studies conducted in the classroom context shed light on how teachers interpret LA data and how they action the resulting knowledge in authentic collaborative learning situations. In the distance learning context, the proposed interventions were critical in sustaining continuous flows of collaboration. The practical benefits and limitations of deploying LA solutions in real-world settings, as well as future research directions, are outlined.

Link to manuscript: http://hdl.handle.net/10803/670420


[PhD thesis] Machine learning and deep neural networks approach to modelling musical gestures

Author: David Cabrera Dalmazzo

Supervisor: Rafael Ramírez

Gestures can be defined as a form of non-verbal communication associated with an intention or the articulation of an emotional state. They are not only an intrinsic part of human language, but also explain specific details of the execution of body knowledge. Gestures are studied not only in language research but also in dance, sports, rehabilitation, and music, where the term is understood as a “learned technique of the body”. In music education, gestures are therefore regarded as automatic motor abilities learned through repetitive practice, used to self-teach and fine-tune motor actions optimally. Hence, those gestures are intended to become part of the performer’s technical repertoire for taking fast actions and decisions on the fly, on the assumption that they are relevant not only to musical expressive capabilities but also as a method for developing correct ‘energy-consumption’ habits that avoid injuries. In this thesis, we applied state-of-the-art machine learning (ML) techniques to model violin bowing gestures in professional players. Concretely, we recorded a database of expert performers and students of different levels and developed three strategies to classify and recognise those gestures in real time: a) First, we developed a multimodal synchronisation system to record audio, video and IMU sensor data with a unified time reference. We programmed a custom C++ application to visualise the output from the ML models. We implemented a Hidden Markov Model to detect fingering disposition and bow-stroke gesture performance. b) A second approach is a system that extracts general time features from the gesture samples, creating a dataset of audio and motion data from expert performers and applying a deep neural network algorithm. To do so, we implemented a hybrid CNN-LSTM architecture. c) Finally, a Mel-spectrogram-based analysis reads and extracts patterns from audio data alone, opening the option of recognising relevant information from the audio recordings without the need for external sensors while achieving similar results. All of these techniques are complementary and are also incorporated into an education application that acts as a computer assistant, enhancing music learners’ practice by providing useful real-time feedback. The application will be tested in a professional education institution.
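As background for the Mel-spectrogram analysis in strategy c), the sketch below shows how frequencies are mapped onto the mel scale and how band edges for a mel filter bank can be spaced. The abstract does not say which mel formula or parameters were used; the widely used formula mel = 2595 * log10(1 + f/700), the 0 to 8000 Hz range, the 10 bands, and all function names are assumptions chosen for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595*log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 edge frequencies (Hz), equally spaced in mels.

    Consecutive triples of edges would define the triangular filters
    of a mel filter bank.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


edges = mel_band_edges(0.0, 8000.0, 10)
print([round(e, 1) for e in edges])
```

The edges are equally spaced in mels but grow further apart in Hz, so low frequencies are resolved more finely than high ones.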

Link to manuscript: http://hdl.handle.net/10803/670399