Dr. Philipp Harzig

Former research associate
Chair of Machine Learning and Computer Vision
Phone: N/A


  • Computer Vision
  • Machine Learning
  • Visual Question Answering
  • Image Captioning
  • Deep Convolutional Neural Networks
  • Recurrent Neural Networks
  • Machine Learning Optimization


  • Automatic Image Captioning of Scenes Depicting Branded Products
  • Company Logo Detection



Philipp Harzig. Automatic Generation of Natural Language Descriptions of Visual Data: Describing Images and Videos using Recurrent and Self-Attentive Models.
Dissertation, University of Augsburg, February 04, 2022.




Philipp Harzig. 2022. Automatic generation of natural language descriptions of visual data: describing images and videos using recurrent and self-attentive models.

Katja Ludwig, Philipp Harzig and Rainer Lienhart. 2022. Detecting arbitrary intermediate keypoints for human pose estimation with vision transformers. DOI: 10.1109/WACVW54805.2022.00073

Philipp Harzig, Moritz Einfalt and Rainer Lienhart. 2022. Synchronized audio-visual frames with fractional positional encoding for transformers in video-to-text translation. DOI: 10.1109/ICIP46576.2022.9897804


Stephan Brehm, Philipp Harzig, Moritz Einfalt and Rainer Lienhart. 2020. Learning segmentation from object color. DOI: 10.1109/mipr49039.2020.00036

Philipp Harzig, Moritz Einfalt, Katja Ludwig and Rainer Lienhart. 2020. Transforming Videos to Text (VTT Task) Team: MMCUniAugsburg.


Philipp Harzig, Yan-Ying Chen, Francine Chen and Rainer Lienhart. 2019. Addressing data bias problems for chest x-ray image report generation.

Philipp Harzig, Moritz Einfalt and Rainer Lienhart. 2019. Automatic disease detection and report generation for gastrointestinal tract examinations. DOI: 10.1145/3343031.3356066

Philipp Harzig, Dan Zecha, Rainer Lienhart, Carolin Kaiser and René Schallner. 2019. Image captioning with clause-focused metrics in a multi-modal setting for marketing. DOI: 10.1109/MIPR.2019.00085


Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser and René Schallner. 2018. Multimodal image captioning for marketing analysis. DOI: 10.1109/mipr.2018.00035

Philipp Harzig, Christian Eggert and Rainer Lienhart. 2018. Visual question answering with a hybrid convolution recurrent model. DOI: 10.1145/3206025.3206054


Philipp Harzig. 2016. Implementation of frequency domain convolution for the caffe-framework.


Philipp Harzig. Implementation of Frequency Domain Convolution for the Caffe-Framework.
Master Thesis, February 2016.