Alnaied, Ali, Elbendak, Mosa and Bulbul, Abdullah (2020) An intelligent use of stemmer and morphology analysis for Arabic information retrieval. Egyptian Informatics Journal, 21 (4). pp. 209-217. ISSN 1110-8665
Text
1-s2.0-S1110866519303469-main.pdf - Published Version. Available under License Creative Commons Attribution Non-commercial No Derivatives 4.0.
Abstract
Arabic information retrieval has gained significant attention due to the increasing use of Arabic text on the web and social media networks. This paper presents a new approach to Arabic stemming, called Arabic Morphology Information Retrieval (AMIR), which generates/extracts stems by applying a set of rules on the relationships among Arabic letters to find the root/stem of each word; these roots/stems serve as indexing terms for text search in Arabic retrieval systems. To demonstrate the usefulness of the proposed algorithm, we highlight the benefits of the proposed rules for different Arabic information retrieval systems. Finally, we evaluate the AMIR system by comparing its performance with LUCENE, FARASA, and a no-stemmer baseline in terms of mean average precision. The results show that AMIR achieves a mean average precision of 0.34, while LUCENE, FARASA, and the no-stemmer baseline achieve 0.27, 0.28, and 0.21, respectively. This demonstrates that AMIR improves Arabic stemming and retrieval effectiveness and is robust across stem types.
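For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how mean average precision is commonly computed. It follows the standard textbook definition and is not the authors' evaluation code; the function names and the toy document identifiers are illustrative assumptions, and each ranked list is assumed to contain no duplicate documents.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked result list against the relevant set."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant doc appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of per-query average precision over (ranked, relevant) pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two queries with made-up document ids.
toy_runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),
    (["d2", "d1"], {"d1"}),
]
toy_map = mean_average_precision(toy_runs)
```

Under this definition the score lies between 0 and 1, so a system comparison such as the one in the abstract amounts to computing this value once per stemmer over the same query set.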
Item Type: | Article |
---|---|
Additional Information: | Funding Information: We would like to thank the anonymous reviewers for their valuable comments, which have helped us to improve this paper. This work is partially supported by the National Natural Science Foundation of China under Grant No. 60775028, the Major Projects of Technology Bureau of Dalian No. 2007A14GXD42, and IT Industry Development of Jilin Province. |
Uncontrolled Keywords: | Arabic morphological analysis, Arabic stemmer, Information retrieval systems, Natural language processing |
Subjects: | G400 Computer Science; G500 Information Systems; G900 Others in Mathematical and Computing Sciences |
Department: | Faculties > Engineering and Environment > Computer and Information Sciences |
Depositing User: | Rachel Branson |
Date Deposited: | 25 May 2022 13:36 |
Last Modified: | 25 May 2022 13:45 |
URI: | http://nrl.northumbria.ac.uk/id/eprint/49193 |