Past Projects

© Universität Augsburg


Start date: 01.02.2016


End date: 30.11.2019


Funding body: Horizon 2020, EU




DE-ENIGMA is developing artificial intelligence for a commercial robot (Robokind’s Zeno). The robot will be used in an emotion-recognition and emotion-expression teaching programme for school-aged autistic children. This approach combines the most common interests of school-aged children: technology, cartoon characters (which Zeno resembles) and socializing with peers.


During the project, Zeno will go through several design phases, getting ‘smarter’ each time. It will be able to process children’s motions, vocalizations, and facial expressions in order to present emotion activities adaptively and autonomously, and to engage in feedback, support, and play.






Assistance system for recognising the emotional state of workshop employees




Start date: 01.06.2015


End date: 30.09.2019


Funding body: EU


Engaging children with ASC (Autism Spectrum Conditions) in communication-centred activities during educational therapy is one of the cardinal challenges posed by ASC and contributes to its poor outcome. To this end, therapists have recently started using humanoid robots (e.g., NAO) as assistive tools. However, this technology lacks the ability to engage with children autonomously, which is key to improving the therapy and, thus, learning opportunities. Existing approaches typically use machine learning algorithms to estimate the engagement of children with ASC from head pose or eye gaze inferred from face videos. These approaches are rather limited in modelling the atypical behavioural displays of engagement of children with ASC, which can vary considerably from child to child.


The first objective of EngageME is to deliver novel machine learning models that can, for the first time, effectively leverage multi-modal behavioural cues, including facial expressions, head pose, and vocal and physiological cues, to realize fully automated, context-sensitive estimation of the engagement levels of children with ASC. These models build upon dynamic graph models for multi-modal ordinal data, based on state-of-the-art machine learning approaches to sequence classification and domain adaptation, which can adapt to each child while still generalizing across children and cultures. To realize this, the second objective of EngageME is to provide the candidate with cutting-edge training, from leading experts in these fields, aimed at expanding his current expertise in visual processing with expertise in wearable/physiological and audio technologies.


EngageME is expected to deliver novel technology and models that endow assistive robots with the ability to accurately ‘sense’ the engagement levels of children with ASC during robot-assisted therapy, while providing the candidate with the skills needed to become one of the frontrunners in the emerging field of affect-sensitive assistive technology.



Intelligent systems’ Holistic Evolving Analysis of Real-life Universal speaker characteristics




Start date: 01.01.2014


End date: 31.12.2018


Funding body: EU




Recently, automatic speech and speaker recognition has matured to the degree that it has entered the daily lives of thousands of Europe’s citizens, e.g., on their smartphones or in call services. Over the coming years, speech processing technology will move to a new level of social awareness to make interaction more intuitive, speech retrieval more efficient, and lend additional competence to computer-mediated communication and speech-analysis services in the commercial, health, security, and further sectors. To reach this goal, rich speaker traits and states such as age, height, personality and physical and mental state, as carried by the tone of the voice and the spoken words, must be reliably identified by machines.


In the iHEARu project, ground-breaking methodology, including novel techniques for multi-task and semi-supervised learning, will deliver for the first time intelligent, holistic and evolving analysis under real-life conditions of universal speaker characteristics, which have so far been considered only in isolation. Today’s sparseness of annotated realistic speech data will be overcome by large-scale speech and meta-data mining from public sources such as social media, crowd-sourcing for labelling and quality control, and shared semi-automatic annotation.


All stages, from pre-processing and feature extraction to statistical modelling, will evolve in “life-long learning” according to new data, by utilising feedback, deep, and evolutionary learning methods. Human-in-the-loop system validation and novel perception studies will analyse the self-organising systems and the relation of automatic signal processing to human interpretation in a previously unseen variety of speaker classification tasks.


The project’s work plan gives the unique opportunity to transfer current world-leading expertise in this field into a new de-facto standard of speaker characterisation methods and open-source tools ready for tomorrow’s challenge of socially aware speech analysis.



Assistance system for recognising the emotional state of workshop employees




Start date: 01.06.2015


End date: 31.05.2018


Funded by: EU




The project aims to develop an emotion-sensitive, voice-controlled assistance system that reliably recognises the emotional state of workshop employees from their interaction with the voice assistant. In addition to the work on speech and emotion recognition that this requires, a psychologically grounded user profile is created that captures individual characteristics. The consortium takes the associated requirements regarding personality rights and data protection into account. This semi-automatic system is intended to reliably derive the individual need for support. This makes it possible to adapt workflows optimally, e.g., by explaining and adjusting individual work steps or by encouraging a break.



Automatic Sentiment Estimation in the Wild




Start date: 01.02.2015


End date: 31.07.2018


Funded by: EU




The main aim of SEWA is to deploy and capitalise on existing state-of-the-art methodologies, models and algorithms for machine analysis of facial, vocal and verbal behaviour, and then adjust and combine them to realise naturalistic, human-centric human-computer interaction (HCI) and computer-mediated face-to-face interaction (FF-HCI). This will involve the development of computer vision, speech processing and machine learning tools for automated understanding of human interactive behaviour in naturalistic contexts. The envisioned technology will be based on findings in the cognitive sciences and will comprise a set of audio and visual spatiotemporal methods for automatic analysis of spontaneous (as opposed to posed and exaggerated) human patterns of behavioural cues, including continuous and discrete analysis of sentiment, liking and empathy.