Althobaiti, M and Kruschwitz, U and Poesio, M (2014) AraNLP: A Java-based library for the processing of Arabic text. In: UNSPECIFIED, ? - ?.
Abstract
We present a free, Java-based library named "AraNLP" that covers various Arabic text preprocessing tools. Although a good number of tools for processing Arabic text already exist, integration and compatibility problems continually occur. AraNLP is an attempt to gather most of the vital Arabic text preprocessing tools into one easily accessible library, by integrating or accurately adapting existing tools and by developing new ones when required. The library includes a sentence detector, tokenizer, light stemmer, root stemmer, part-of-speech tagger (POS-tagger), word segmenter, normalizer, and a punctuation and diacritic remover.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Additional Information: | Published proceedings: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014 |
Uncontrolled Keywords: | Arabic Natural Language Processing; Java; Tools |
Subjects: | P Language and Literature > P Philology. Linguistics; Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 04 Dec 2014 13:19 |
Last Modified: | 30 Oct 2024 19:58 |
URI: | http://repository.essex.ac.uk/id/eprint/11979 |
Available files
Filename: 621_Paper.pdf