Rückert, Johannes and Ben Abacha, Asma and Garcia Seco De Herrera, Alba and Bloch, Louise and Brüngel, Raphael and Idrissi-Yaghir, Ahmad and Schäfer, Henning and Müller, Henning and Friedrich, Christoph M (2022) Overview of ImageCLEFmedical 2022 – Caption Prediction and Concept Detection. In: Conference and Labs of the Evaluation Forum (CLEF), 2022-09-05 - 2022-09-08, Bologna, Italy.
Abstract
The 2022 ImageCLEFmedical caption prediction and concept detection tasks continue similar challenges that ran from 2017 to 2021. The objective is to extract Unified Medical Language System (UMLS) concept annotations and/or captions from the image data, which are then compared against the original text captions of the images. The images used for both tasks are a subset of the extended Radiology Objects in COntext (ROCO) data set, which was used in ImageCLEFmedical 2020. In the caption prediction task, lexical similarity with the original image captions is evaluated with the BiLingual Evaluation Understudy (BLEU) score. In the concept detection task, UMLS terms are extracted from the original text captions, combined with manually curated concepts for image modality and anatomy, and compared against the predicted concepts in a multi-label way. The F1-score is used to assess performance. The task attracted strong participation, with 20 registered teams. In the end, 12 teams submitted 157 graded runs for the two subtasks. Results show that a variety of techniques can lead to good prediction results for the two tasks. Participants used image retrieval systems for both tasks, while multi-label classification systems were used mainly for concept detection, and Transformer-based architectures primarily for the caption prediction subtask.
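To make the multi-label comparison in the concept detection task concrete, the following is a minimal sketch of an F1-score computed per image over sets of concept identifiers and then averaged. The averaging scheme, the handling of empty sets, the function names, and the placeholder identifiers are all assumptions for illustration; the abstract does not specify these details, and this is not the official evaluation script.

```python
from typing import Dict, Set


def f1_for_image(predicted: Set[str], reference: Set[str]) -> float:
    """F1 between predicted and reference concept sets for one image."""
    # Assumption: two empty sets count as a perfect match.
    if not predicted and not reference:
        return 1.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions: Dict[str, Set[str]],
            references: Dict[str, Set[str]]) -> float:
    """Average the per-image F1 over all reference images (assumed scheme)."""
    scores = [
        # Images without a prediction are scored against an empty set.
        f1_for_image(predictions.get(image_id, set()), reference)
        for image_id, reference in references.items()
    ]
    return sum(scores) / len(scores)


# Placeholder identifiers, not real UMLS concepts.
references = {"img1": {"C0000001"}, "img2": {"C0000002"}}
predictions = {"img1": {"C0000001", "C0000003"}, "img2": {"C0000002"}}
print(mean_f1(predictions, references))
```

In this example the first image has one correct concept and one spurious one (precision 0.5, recall 1.0), and the second image is predicted exactly, so the mean lies between the two per-image scores.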
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 07 Sep 2022 08:27 |
Last Modified: | 23 Sep 2022 19:55 |
URI: | http://repository.essex.ac.uk/id/eprint/33424 |
Available files
Filename: paper-95.pdf
Licence: Creative Commons: Attribution 3.0