Abstract
In this paper, we introduce MADARi, a joint morphological annotation and spelling correction system for texts in Standard and Dialectal Arabic. The MADARi framework provides intuitive interfaces for annotating text and for managing the annotation process across a large number of sizable documents. Morphological annotation consists of indicating, for each word in context, its baseword, clitics, part of speech, lemma, gloss, and dialect. MADARi includes a suite of utilities that support annotator productivity. For example, annotators are provided with pre-computed analyses that assist them in their task and reduce the amount of work needed to complete it. MADARi also allows annotators to query a morphological analyzer for a list of possible analyses in multiple dialects, or to look up previously submitted analyses. The MADARi management interface enables a lead annotator to manage and organize the whole annotation process remotely and concurrently. We describe the motivation, design, and implementation of this interface, and we present details from a user study conducted with the system.
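To make the annotation fields listed in the abstract concrete, the following is a minimal sketch of how one word's annotation could be represented. It is not MADARi's actual data model: the class name `WordAnnotation`, the field names, the `missing_fields` helper, and the choice of which fields count as required are all assumptions made for illustration. The sample word is a textbook-style example in Buckwalter transliteration, not data from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordAnnotation:
    """Hypothetical record for one word in context (not MADARi's real schema)."""

    raw_token: str            # the word as it appears in the source text
    corrected: str            # the spelling-corrected form
    baseword: str = ""        # the word stripped of its clitics
    clitics: List[str] = field(default_factory=list)  # may legitimately be empty
    pos: str = ""             # part of speech
    lemma: str = ""
    gloss: str = ""           # English gloss
    dialect: str = ""         # dialect label for this word

    # Assumption: every field except `clitics` must be filled in
    # before an annotation counts as complete.
    REQUIRED = ("baseword", "pos", "lemma", "gloss", "dialect")

    def missing_fields(self) -> List[str]:
        """Return the names of required fields that are still empty."""
        return [name for name in self.REQUIRED if not getattr(self, name)]


# Illustrative example: "wktAbhm" ("and their book") = w+ ktAb +hm
example = WordAnnotation(
    raw_token="wktAbhm",
    corrected="wktAbhm",
    baseword="ktAb",
    clitics=["w+", "+hm"],
    pos="noun",
    lemma="kitAb",
    gloss="book",
    dialect="MSA",
)
```

A record like this could be pre-filled from an analyzer's suggestion and then checked with `missing_fields()` before submission, which mirrors the pre-computed analyses described in the abstract only in spirit.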
Original language | English (US) |
---|---|
Title of host publication | LREC 2018 - 11th International Conference on Language Resources and Evaluation |
Editors | Hitoshi Isahara, Bente Maegaard, Stelios Piperidis, Christopher Cieri, Thierry Declerck, Koiti Hasida, Helene Mazo, Khalid Choukri, Sara Goggi, Joseph Mariani, Asuncion Moreno, Nicoletta Calzolari, Jan Odijk, Takenobu Tokunaga |
Publisher | European Language Resources Association (ELRA) |
Pages | 2616-2622 |
Number of pages | 7 |
ISBN (Electronic) | 9791095546009 |
State | Published - 2019 |
Event | 11th International Conference on Language Resources and Evaluation, LREC 2018 - Miyazaki, Japan (Duration: May 7 2018 → May 12 2018) |
Publication series
Name | LREC 2018 - 11th International Conference on Language Resources and Evaluation |
---|---|
Conference
Conference | 11th International Conference on Language Resources and Evaluation, LREC 2018 |
---|---|
Country/Territory | Japan |
City | Miyazaki |
Period | 5/7/18 → 5/12/18 |
Funding
This publication was made possible by grant NPRP7-290-1-047 from the Qatar National Research Fund (a member of the Qatar Foundation). The statements made herein are solely the responsibility of the authors.
Keywords
- Annotation
- Arabic
- Morphology
- Spelling Correction
ASJC Scopus subject areas
- Linguistics and Language
- Education
- Library and Information Sciences
- Language and Linguistics