Towards Intelligible and Conversational Speech Synthesis Engines

(Text by Mónica Domínguez Bajo, details on her activity in her repository)

With the funding from my MdM Award for reproducibility at the PhD workshop, I attended the 4th Speech Processing Course in Crete (SPCC2017) from 24 to 28 July 2017. This year's title was "Towards Intelligible and Conversational Speech Synthesis Engines", and one of the topics was prosody generation, which is directly related to my PhD dissertation.

The programme covered a wide spectrum of aspects of speech synthesis: analysis of different synthesis techniques (including unit selection and parametric), text normalization, neural networks applied to speech generation, and speech signal processing. The most interesting part of the course, in my opinion, was its balance of theoretical sessions and hands-on practice. We got to train models, generate voices and play around with the concepts explained in the lectures. Furthermore, the panel of lecturers was made up of outstanding researchers from both academia and industry.

Surprisingly enough, not everything was hard work in the course. Social activities included lessons on traditional Greek dances at lunchtime. As weird and somewhat scary as it may sound at first, it turned out to be a convenient way to interact with the rest of the participants in an amusing and relaxing atmosphere. There are pictures to prove this. And, hey! I got the best dancer award in the final competition! These performing skills of mine seem to pop up every now and then.

Anyway, let’s get back to business. I presented my demo on thematicity-based prosody enrichment for CTS, which was accepted for publication at Interspeech 2017 (available from my repository). I was really amazed by the great expectations this work aroused among both participants and lecturers. I even got a job offer from industry! Unfortunately, the vacancy was related more to automatic speaker recognition than to speech synthesis. In any case, seeing these positive reactions has been a great boost of energy and motivation for the last stretch of the race towards finalizing my PhD dissertation. All in all, it has been a great experience from both a professional and a personal point of view. Not only have I learnt a lot, but I've also met really interesting people working in my field. The only drawback I can find is that the weather was extremely hot, with temperatures of 29 °C from 8 a.m. onwards. Apart from that, SPCC2017 was really worth it.