The ecological role of sound – understanding patterns and drivers of soundscape dynamics and composition

Date: 4th October 2023, 2:00 PM


Location: Room 004, Building F1 (Alte Universität) + Zoom


Speakers: Dr. Sandra Müller & Dominik Arend, M.Sc. (both Chair of Geobotany, University of Freiburg)


Bio: Sandra Müller specialises in vegetation ecology, with a focus on how biodiversity influences ecosystem processes. Since 2014, her special interest has been the role of acoustic diversity within the broader spectrum of biodiversity. She is interested in basic research aimed at a better understanding of sound's ecological role in landscapes, and she is also involved in adapting ecoacoustic methods for ecosystem monitoring and conservation.

Dominik Arend focuses on acoustic data and the ecological information it carries. The composition of soundscapes (the acoustic environment) and the relationships between their sources are his main research topics. He aims to study how local and regional factors shape soundscapes and how these, in turn, affect the acoustic community.


Abstract: Every habitat has its own acoustic signature. Sounds play an important ecological role in communication among individuals of a species and in interspecific interactions. This exchange of acoustic signals happens in the interplay of noises and sounds of the abiotic environment. A variety of information is thus incorporated into the soundscape, such as the composition of the vocalising species community, time of day and season, vegetation zones, landscape structure, and land use. Changes due to species decline and climate change can also be recorded acoustically. This talk will present particular acoustic aspects and insights from different ecosystems and introduce "HearTheSpecies", a joint DFG project between our ecoacoustics working group and the Chair of Embedded Intelligence for Health Care and Wellbeing, where we hope to improve our analytic toolkit to gain a better understanding of what soundscapes can tell us about a local species community and how it interacts with the abiotic (acoustic) environment.

Inter-Emotion Transfer in Speech

Date: 20th July 2023, 11:00 AM
Location: tba
Speaker: Dr. Xinzhou Xu
Abstract: This talk will briefly present our work in recent years on inter-emotion transfer in speech, which implicitly employs inter-modality transfer for learning low-resource emotional states in speech. Within this work, we aim to answer four questions: 1) Is autonomous zero-shot learning feasible in Speech Emotion Recognition (SER)? 2) Is conventional zero-shot learning effective in zero-shot SER tasks? 3) Which type of descriptors (Auditory Affective Descriptors; AADs) can better describe affective perception for spoken signals? 4) Can we achieve better performance on zero-shot SER tasks by reducing the gap between the descriptors and the speech signals? The talk will close with an overview of our possible future work on inter-emotion transfer in speech.

Biosignal-Adaptive Cognitive Systems

Speaker: Prof. Dr.-Ing. Tanja Schultz

Date: 14th July 2023, 11:00 AM

Location: Eichleitnerstraße 30, Room 004



In my talk, I will describe technical cognitive systems that automatically adapt to users’ needs by interpreting their biosignals. Human behavior includes physical, mental, and social actions that emit a range of biosignals which can be captured by a variety of sensors. The processing and interpretation of such biosignals provides an inside perspective on human physical and mental activities, complementing the traditional approach of merely observing human behavior. As great strides have been made in recent years in integrating sensor technologies into ubiquitous devices and in machine learning methods for processing and learning from data, I argue that the time has come to harness the full spectrum of biosignals to understand user needs. I will present illustrative cases ranging from silent and imagined speech interfaces that convert myographic and neural signals directly into audible speech, to interpretation of human attention and decision making from multimodal biosignals. 


Tanja Schultz is Professor for Cognitive Systems at the Faculty of Mathematics & Computer Science of the University of Bremen, Germany. Prior to Bremen, she spent 7 years as Professor for Cognitive Systems at KIT (2007-2015) and over 20 years as Researcher (2000-2007) and adjunct Research Professor (2007-2022) at the Language Technologies Institute at Carnegie Mellon University, Pittsburgh, PA, USA. She received her diploma and doctoral degrees in Informatics from the University of Karlsruhe and a Master's degree in Mathematics and Sport Sciences from Heidelberg University, both in Germany. In 2007, she founded the Cognitive Systems Lab (CSL) and has served as its director ever since. She is the spokesperson of the University of Bremen high-profile area "Minds, Media, Machines" and of the DFG Research Unit "Lifespan AI: From Longitudinal Data to Lifespan Inference in Health". She also serves on the boards of directors of the DFG CRC 1320 EASE, the DFG RTG 2739 KD2School (Designing Adaptive Systems for Economic Decision Making), and the Leibniz ScienceCampus Digital Public Health. Professor Schultz is a recognised scholar in the field of multilingual speech recognition and cognitive technical systems, where she combines machine learning methods with innovations in biosignal processing to create technologies such as "Silent Speech Communication" and "Brain-to-Speech". She is a Fellow of the IEEE, elected in 2020 "for contributions to multilingual speech recognition and biosignal processing"; a Fellow of the International Speech Communication Association, elected in 2016 "for contributions to multilingual speech recognition and biosignal processing for human-machine interaction"; a Fellow of the European Academy of Sciences and Arts (2017); and a Fellow of the Asia-Pacific Artificial Intelligence Association (2021).
Her recent awards include the Google Faculty Research Award (2020 and 2013), the ISCA/EURASIP Best Journal Paper Award (2015 and 2001), the Otto Haxel Award (2013), and the Alcatel-Lucent Research Award for Technical Communication (2012) "for her overall scientific work on the interaction of human and technology in communication systems".

Bachelor Presentations July 2023


Date: 4th July 2023, 9:00 AM

Location: Online



  • Rares Tincu: Sound Event Classification using AI-generated Audio Captions (Supervisors: Alexander Gebhard and Andreas Triantafyllopoulos)
  • Anneli Sokolov: Earth Mover Distance-Guided Learning for Age Prediction from Speech (Supervisor: Manuel Milling)

Emotion Modelling for Speech Generation

Speaker: Kun Zhou, M.Sc.

Date: 24th May 2023, 11:00 AM

Location: Eichleitnerstraße 30, Room 004



Speech generation aims to synthesise human-like voices from text or speech input. Current speech generation techniques can produce natural-sounding speech but fail to convey the emotional context found in human-human interaction. Despite significant research efforts, several open issues remain, such as the limited generalisability and controllability of the generated emotions, which restrict the scope of applications. This talk introduces recent advances in emotion modelling for speech generation that aim to overcome the limitations of existing approaches and bring us one step closer to achieving emotional intelligence, by: 1) improving the generalisability of emotion modelling for seen and unseen speakers and emotions; 2) studying sequence-to-sequence emotion modelling to enable spectrum and duration manipulation; 3) explicitly modelling and controlling emotion intensity; and 4) synthesising and controlling the rendering of mixed emotions.


Kun Zhou received his B.Eng. degree from the School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2018, and his M.Sc. degree from the Department of Electrical and Computer Engineering, National University of Singapore (NUS), Singapore, in 2019. He is a PhD student at the National University of Singapore and a visiting researcher at the Cognitive Systems Lab, University of Bremen, Germany (2023). He was a visiting PhD student at the Center for Robust Speech Systems (CRSS), University of Texas at Dallas, USA (2022). His research interests focus on emotion analysis and synthesis in speech, including emotional voice conversion and emotional text-to-speech. He is the recipient of the PREMIA Best Student Paper Award 2022. He served on the organising committees of IEEE ASRU 2019, SIGDIAL 2021, IWSDS 2021, O-COCOSDA 2021, and ICASSP 2022. He is a reviewer for multiple leading conferences and journals, including INTERSPEECH, ICASSP, IEEE SLT, IEEE Signal Processing Letters, Speech Communication, and IEEE/ACM Transactions on Audio, Speech, and Language Processing.

Final Presentation April 2023

Date: 17th April 2023, 10:00 AM

Location: Zoom


Philipp Wagner

Audio-based step-count estimation using deep neural networks


Final Presentation March 2023

Date: 29th March 2023, 10:00 AM

Location: Zoom


Jakub Šimon

Transformers-Based Long Document Summarization for Medical Scientific Reports


Final Presentation February 2023

Date: 1st February 2023, 11:30 AM

Location: Lecture Room 004


Regina Kushtanova

Towards AI-Based Recognition of Defensive Communication: A Novel Dataset and Results


Seminar Presentations Winter Semester 2022/23

Date: 1st February 2023, 9:30 AM

Location: Eichleitnerstraße 30