Publications

Ramoneda P; Suzuki M; Maezawa A; Serra X. Difficulty-Aware Score Generation for Piano Sight-Reading. Expert Systems with Applications 2026.
Nuttall T; Manickavasakan B; Serra X; Pearson L. Boundary characteristics of Sancara segments in Carnatic music. In: Neumann J (ed.). DLfM '26: Proceedings of the 13th International Conference on Digital Libraries for Musicology. Association for Computing Machinery; 2026. p. 63-70.
Ibáñez-Martínez L; Nkama C; Poltronieri A; Serra X; Rocamora M. Evaluating disentangled representations for controllable music generation. In: ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE; 2026. p. 15092-15096.
Serra X; Araz RO; Batlle-Roca R; Juvela L; López D; Rocamora M. Technical solutions for marking and detecting AI-generated audio content in the context of article 50(2) AI Act: final study report. European Commission; 2026.
Plaja-Roglans G; Hung YN; Serra X; Pereira I. Generating Separated Singing Vocals Using a Diffusion Model Conditioned on Music Mixtures. arXiv.org; 2025.
Cortès-Sebastià G; Miron M; Molina E; Ciurana A; Serra X. Enhanced television broadcast monitoring with source separation-assisted audio fingerprinting: A case study. Multimedia Tools and Applications 2025.
Kim H; Benetos E; Serra X. Velocity2DMs: A Contextual Modeling Approach to Dynamics Marking Prediction in Piano Performance. IEEE Signal Processing Letters 2025; 32: 4459-4463.
Pérez M; Kirchhoff H; Grosche P; Serra X. Singing Voice Accompaniment Data Augmentation with Generative Models. In: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops. Piscataway: IEEE; 2025.
Nuttall T; Serra X; Pearson L. Svara-Forms in Carnatic Music: Contextual Influences on the Performance of Svara. In: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops. Piscataway: IEEE; 2025.
Shankar A; Schweinitz S; Plaja-Roglans G; Serra X; Rocamora M. Disentangling Overlapping Sources: Improving Vocal and Violin Source Separation in Carnatic Music. In: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops. Piscataway: IEEE; 2025.
Plaja-Roglans G; Serra X; Rocamora M. Leveraging Carnatic live recordings for singing voice separation using regression-guided latent diffusion. In: 26th International Society for Music Information Retrieval Conference (ISMIR 2025). 2025.
Poltronieri A; Serra X; Rocamora M. From discord to harmony: decomposed consonance-based training for improved audio chord estimation. In: 26th International Society for Music Information Retrieval Conference (ISMIR 2025). 2025.
Batlle-Roca R; Ibáñez-Martínez L; Serra X; Gómez E; Rocamora M. MusGO: a community-driven framework for assessing openness in music-generative AI. In: 26th International Society for Music Information Retrieval Conference (ISMIR 2025). 2025.
Plaja-Roglans G; Hung YN; Serra X; Pereira I. Efficient and Fast Generative-Based Singing Voice Separation using a Latent Diffusion Model. In: International Joint Conference on Neural Networks (IJCNN 2025). IEEE; 2025.
Perez M; Kirchhoff H; Grosche P; Serra X. Improving Singing Voice Transcription Generalization with AI Generated Accompaniments. In: Ide I; Kompatsiaris I; Xu C; Yanai K; Chu WT; Nitta N; Riegler M; Yamasaki T (eds.). MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science. Springer; 2025. p. 115-128.
Anastasopoulou P; Ardan Dal Rí F; Serra X; Font F. Hierarchical and multimodal learning for heterogeneous sound classification. In: Benetos E; Font F; Fuentes M; Martin Morato I; Rocamora M. Proceedings of the 10th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2025). DCASE; 2025. p. 105-109.
Fernández MP; Grosche P; Kirchhoff H; Serra X. Refining audio-to-score alignment for singing voice transcription. In: Proceedings of the 22nd Sound and Music Computing Conference (SMC 2025). Graz: Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz; 2025. p. 168-175.
Morsi A; Chiruthapudi S; Peter S; Pilkov I; Bishop L; Maezawa A; Serra X; Cancino-Chacón CE. Enabling Empirical Analysis of Piano Performance Rehearsal with the Rach3 MIDI Dataset. In: Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR 2025). International Society for Music Information Retrieval; 2025. p. 484-491.
Araz RO; Cortès-Sebastià G; Molina E; Serrà J; Serra X; Mitsufuji Y; Bogdanov D. Enhancing neural audio fingerprint robustness to audio degradation for music identification. In: Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR 2025). International Society for Music Information Retrieval; 2025. p. 413-420.
Ramoneda P; Jeong D; Eremenko V; Tamer NC; Miron M; Serra X. Combining piano performance dimensions for score difficulty classification. Expert Systems with Applications 2024; 238(Part B): 1-15.