Björn W. Schuller
Professor and ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing
University of Augsburg / Germany
Professor of Artificial Intelligence
Head GLAM – Group on Language, Audio & Music
Imperial College London / UK
“Social Speech Processing and Deep Learning Toolkits”
Speech carries a plethora of socially relevant cues, such as laughter, sighs, addressee, and empathy and mimicry in dyadic conversations, or simply the affective and cognitive state of the speakers alongside their traits. This talk discusses the state of the art in Computational Paralinguistics, the field concerned with computational solutions for the automatic recognition of such cues. It largely centres on deep learning for highly automated modelling of the information of interest. Speech is thereby considered as a blend of acoustic and linguistic information. The methods touched upon include semi-supervised and active learning to learn cooperatively with users. In the frequently encountered case of sparse data, transfer learning and generative adversarial approaches provide further aid. Furthermore, attention models are presented that learn which parts of the speech input to focus on. End-to-end learning from the raw speech signal or text string allows novel problems to be modelled without expert knowledge of representation. In addition, automated machine learning enables suitable learning architectures to be shaped automatically. The discussion is rounded off with multimodal integration for cases where video or physiological information is available.
“Deep Learning Toolkits”
In the hands-on presentation, recent toolkits in the field are introduced, including openSMILE, openXBOW, auDeep, and end2you, for representing the information of interest. Furthermore, the DeepSpectrum toolkit allows transfer learning from pre-trained networks when little data is available. Finally, the iHEARu-PLAY platform is featured: a gamified crowdsourcing platform with a highly efficient cooperative learning backend.
Björn W. Schuller received his diploma, doctoral degree, habilitation, and Adjunct Teaching Professorship, all in EE/IT, from TUM in Munich/Germany. He is Professor of Artificial Intelligence and Head of GLAM (the Group on Language, Audio & Music) at Imperial College London/UK, Full Professor and ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing at the University of Augsburg/Germany, co-founding CEO and current CSO of audEERING, and permanent Visiting Professor at HIT/China. Previously, he was Full Professor and Chair of Complex and Intelligent Systems and Chair of Sensor Systems at the University of Passau/Germany, and worked with Joanneum Research in Graz/Austria and the CNRS-LIMSI in Orsay/France, among other positions. He is a Fellow of the IEEE, President-Emeritus of the AAAC, and Senior Member of the ACM. He has (co-)authored 800+ publications (22000+ citations, h-index 69), and has served or serves as Editor in Chief of the IEEE Transactions on Affective Computing, General Chair of ACII 2019, ACII Asia 2018, and ACM ICMI 2014, and Program Chair of Interspeech 2019, ACM ICMI 2019/2013, ACII 2015/2011, and IEEE SocialCom 2012. He was honoured as one of 40 extraordinary scientists under the age of 40 by the WEF in 2015/16, has served as Coordinator/PI in more than a dozen European Projects, is an ERC Starting Grantee, and is a consultant to companies such as Barclays, GN, Huawei, and Samsung.