The paper titled "Towards Learning Monocular 3D Object Localization Using the Physical Laws of Motion" by Daniel Kienzle, Julian Lorenz, Katja Ludwig and Rainer Lienhart was accepted to the International Conference on 3D Vision (3DV) 2024. The paper describes a new method for localizing objects in 3D without the need for 3D ground truth. Instead, the method uses knowledge of physical laws to learn the task.
See https://kiedani.github.io/3DV2024/ for additional information on the paper.
The paper “Haystack: A Panoptic Scene Graph Dataset to Evaluate Rare Predicate Classes” by Julian Lorenz, Florian Barthel, Daniel Kienzle, and Rainer Lienhart has been accepted at the First ICCV Workshop on Scene Graphs and Graph Representation Learning (SG2RL). The authors present Haystack, a new dataset for scene graph generation that addresses shortcomings of evaluating on existing scene graph datasets. Most notably, Haystack contains rare predicate classes and explicit negative annotations, both of which are required to evaluate rare relationships reliably. Based on the design of Haystack, the authors introduce three new scene graph metrics that give more detailed insight into the prediction of rare predicate classes.
The paper titled "Detecting Arbitrary Keypoints on Limbs and Skis with Sparse Partly Correct Segmentation Masks" by Katja Ludwig, Daniel Kienzle, Julian Lorenz and Rainer Lienhart has been accepted for the workshop Computer Vision for Winter Sports at the IEEE/CVF Winter Conference on Applications in Computer Vision (WACV) 2023. In this paper, the authors describe how to detect arbitrary keypoints on the limbs and skis of ski jumpers. The presented method requires only a few, partly correct segmentation masks in the dataset.
The paper titled "Pseudo-Label Noise Suppression Techniques for Semi-Supervised Semantic Segmentation" by Sebastian Scherer, Robin Schön and Rainer Lienhart was accepted to the British Machine Vision Conference (BMVC) 2023. In this paper, the authors describe a method that reduces the need for large labeled datasets by including unlabeled data in training. As applications, the authors use human pose estimation and semantic segmentation. The latter is of particular interest because annotating data for it is extremely time-consuming.
Daniel Kienzle, Julian Lorenz, Katja Ludwig, Robin Schön and Rainer Lienhart from the Chair for Machine Learning and Computer Vision achieved first place in the STOIC challenge. The goal of the challenge was to predict severe outcomes of COVID-19 one month ahead using CT scans. To this end, the researchers employed convolutional neural networks and transfer learning on various tasks. The challenge was organized by Assistance Publique – Hôpitaux de Paris, Radboud University Medical Center, and Amazon Web Services.
The paper "Uplift and Upsample: Efficient 3D Human Pose Estimation with Uplifting Transformers" by Moritz Einfalt, Katja Ludwig and Rainer Lienhart has been accepted at the IEEE/CVF Winter Conference on Applications in Computer Vision (WACV) 2023. In this paper, the authors present a method to drastically reduce the computational complexity of 3D human pose estimation with Transformers while maintaining smooth and precise 3D motion sequences.
Paper accepted for the Workshop on AI-enabled Medical Image Analysis at the European Conference on Computer Vision 2022
The paper titled "COVID detection and severity prediction with 3D-ConvNeXt and custom pretrainings" by Daniel Kienzle, Julian Lorenz, Robin Schön, Katja Ludwig and Rainer Lienhart is accepted at the Workshop on AI-enabled Medical Image Analysis at ECCV 2022.
In this paper, the authors show how the ConvNeXt architecture can be leveraged for the classification of 3D CT scans. In particular, they explore various transfer learning methods that support its application to 3D medical data. With the insights presented in this paper, the authors achieved 2nd place in the 1st COVID19 Severity Detection Challenge and 3rd place in the 2nd COVID19 Detection Challenge.
The paper titled "Synchronized Audio-Visual Frames With Fractional Positional Encoding for Transformers in Video-to-Text Translation" by Philipp Harzig, Moritz Einfalt and Rainer Lienhart has been accepted for the IEEE International Conference on Image Processing 2022. This paper presents a novel way to synchronize audio and video features for the automated generation of textual video descriptions.