Abstract
We present an overview of Task 2 of the seventh edition of the CheckThat! lab at the 2024 iteration of the Conference and Labs of the Evaluation Forum (CLEF). The task focuses on subjectivity detection in news articles and was offered in five languages: Arabic, Bulgarian, English, German, and Italian, as well as in a multilingual setting. The datasets for each language were carefully curated and annotated, comprising over 10,000 sentences from news articles. The task challenged participants to develop systems capable of distinguishing between subjective statements (reflecting personal opinions or biases) and objective ones (presenting factual information) at the sentence level. A total of 15 teams participated in the task, submitting 36 valid runs across all language tracks. The participants used a variety of approaches, with transformer-based models being the most popular choice. Strategies included fine-tuning monolingual and multilingual models, and leveraging English models with automatic translation for the non-English datasets. Some teams also explored ensembles, feature engineering, and innovative techniques such as few-shot learning and in-context learning with large language models. The evaluation was based on macro-averaged F1 score. The results varied across languages, with the best performance achieved for Italian and German, followed by English. The Arabic track proved particularly challenging, with no team surpassing an F1 score of 0.50. This task contributes to the broader goal of enhancing the reliability of automated content analysis in the context of misinformation detection and fact-checking. The paper provides detailed insights into the datasets, participant approaches, and results, offering a benchmark for the current state of subjectivity detection across multiple languages.
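As an illustration of the evaluation measure named in the abstract, the following is a minimal sketch of a macro-averaged F1 computation for a two-class sentence labelling task. It is not the lab's official scorer: the function name `macro_f1` and the label strings `"SUBJ"` and `"OBJ"` are assumptions made for this example.

```python
def macro_f1(gold, pred, labels=("SUBJ", "OBJ")):
    """Return the unweighted mean of per-class F1 scores.

    The label names are illustrative assumptions, not taken from the task data.
    """
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        # Count true positives, false positives and false negatives for this class.
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall (0 when both are 0).
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    # Macro-averaging gives every class equal weight, regardless of its frequency.
    return sum(scores) / len(scores)


gold = ["SUBJ", "SUBJ", "OBJ", "OBJ"]
pred = ["SUBJ", "OBJ", "OBJ", "OBJ"]
print(round(macro_f1(gold, pred), 4))
```

Because each class contributes equally to the average, a system that labels every sentence with the majority class is penalised more heavily than it would be under plain accuracy.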
Original language | English (US) |
---|---|
Pages (from-to) | 287-298 |
Number of pages | 12 |
Journal | CEUR Workshop Proceedings |
Volume | 3740 |
State | Published - 2024 |
Event | 25th Working Notes of the Conference and Labs of the Evaluation Forum, CLEF 2024 - Grenoble, France; Duration: Sep 9 2024 → Sep 12 2024 |
Funding
The work related to the German data has partially been funded by the BMBF (German Federal Ministry of Education and Research) under grant no. 01FP20031J. The responsibility for the contents of this publication lies with the authors. The work of M. Hasanain, R. Suwaileh, F. Alam and W. Zaghouani is partially supported by NPRP 14C-0916-210015 from the Qatar National Research Fund, which is a part of Qatar Research Development and Innovation Council (QRDI). The work of D. Dimitrov, G. Pachov, and I. Koychev is partially financed by the European Union-NextGenerationEU, through the National Recovery and Resilience Plan of the Republic of Bulgaria, project SUMMIT, No BG-RRP-2.004-0008. The work of A. Galassi is funded by the European Commission's NextGenerationEU programme, PNRRM4C2-Investimento 1.3, PE00000013 “FAIR”, Spoke 8.
Keywords
- fact-checking
- misinformation detection
- subjectivity classification
ASJC Scopus subject areas
- General Computer Science