Retrieval-powered Zero-shot Text Classification

Date: 16 January 2023, 5:30 pm to 6:30 pm
Location: N2045, Universitätsstraße 1, 86159 Augsburg
Organizer: Chair of Biomedical Informatics, Data Mining and Data Analytics
Topics: Computer Science, Health and Medicine
Event series: Medical Information Sciences
Event type: Lecture
Speaker: Prof. Dr. Carsten Eickhoff

During the winter semester, the Medical Information Sciences lecture series takes place every Monday at 5:30 pm in lecture hall N2045. Renowned researchers from a range of disciplines and research institutions offer insights into current questions, research areas, and applications in this increasingly important field.

Unstructured data, especially in the form of natural language text, is one of the most prevalent and rapidly growing information types available to humankind. Unlocking the (often hidden) potential of such resources via natural language processing and understanding techniques can greatly support, or altogether enable, an exciting range of downstream applications. In this talk, I will give a brief high-level overview of ongoing NLP and IR efforts in my lab, before moving on to an investigation of zero-shot text classification in a diagnostic decision support setting. More than most other clinical inference tasks, primary care diagnosis faces severely imbalanced, long-tailed class distributions, under which a few classes are very well represented (e.g., congestive heart failure) while most others remain sparsely populated or even entirely unobserved in many clinical centers (e.g., rare diseases such as Danbolt-Closs syndrome). Such few- or zero-shot settings make a challenging stage on which to field conventional class-conditional machine learning models. To address this issue, we draw on massive unsupervised domain resources such as PubMed, which we incorporate via an information retrieval step, and observe significant performance improvements without the need for additional supervised training data.
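
To make the general idea concrete, the following is a minimal sketch of retrieval-based zero-shot classification. It is not the method presented in the talk: the tiny in-memory corpus stands in for a resource such as PubMed, and the TF-IDF cosine retrieval, the score-summing vote, and all names (CORPUS, classify, k) are assumptions chosen for illustration. No labeled patient notes are used; the class is inferred only from the retrieved reference texts.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for a literature resource such as PubMed.
# Each entry pairs a candidate diagnosis with a short free-text description.
CORPUS = [
    ("congestive heart failure",
     "shortness of breath leg swelling fatigue fluid retention"),
    ("congestive heart failure",
     "reduced ejection fraction edema breathlessness when lying flat"),
    ("migraine",
     "throbbing headache nausea sensitivity to light visual aura"),
    ("acrodermatitis enteropathica",
     "zinc deficiency skin rash around mouth diarrhea hair loss"),
]


def tokenize(text):
    """Lowercase whitespace tokenization (deliberately simplistic)."""
    return text.lower().split()


def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def vectorize(text, idf):
    """Sparse TF-IDF vector; terms unseen in the corpus are dropped."""
    tf = Counter(tokenize(text))
    return {term: freq * idf[term] for term, freq in tf.items() if term in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(note, corpus=CORPUS, k=2):
    """Retrieve the k most similar documents and sum their scores per label."""
    idf = build_idf([text for _, text in corpus])
    query = vectorize(note, idf)
    ranked = sorted(
        ((cosine(query, vectorize(text, idf)), label) for label, text in corpus),
        reverse=True,
    )
    votes = defaultdict(float)
    for score, label in ranked[:k]:
        votes[label] += score
    return max(votes, key=votes.get)


if __name__ == "__main__":
    print(classify("patient with skin rash hair loss and diarrhea"))
    print(classify("fatigue and leg swelling with breathlessness"))
```

In this sketch a rarely seen diagnosis can still be predicted as long as some reference text describes it, which is the property that makes retrieval attractive for long-tailed class distributions. A realistic system would replace the toy corpus with a proper search index and the vote with a learned model.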

Carsten Eickhoff is a Professor of E-Health and Medical Data Science at the University of Tübingen, where his lab specializes in the development of machine learning and natural language processing techniques with the goal of improving patient safety, individual health, and quality of medical care. Prior to joining Tübingen, he was the Manning Assistant Professor of Medical and Computer Science at Brown University. He received degrees from the University of Edinburgh and TU Delft, and was a postdoctoral fellow at ETH Zurich and Harvard University. Carsten has authored more than 100 articles in computer science conferences (e.g., ICLR, ACL, SIGIR, WWW, KDD) and clinical journals (e.g., Nature Digital Medicine, The Lancet - Respiratory Medicine, Radiology, European Heart Journal). His research has been supported by the Swiss National Science Foundation, NSF, DARPA, IARPA, Google, Amazon, Microsoft, and others. Aside from his academic endeavors, he is a founder and board member of several deep technology startups in the health sector that strive to translate technological innovation into improved safety and quality of life for patients.
