Alharthi, Raneem and Alharthi, Rajwa and Shekhar, Ravi and Zubiaga, Arkaitz (2023) Target-Oriented Investigation of Online Abusive Attacks: A Dataset and Analysis. IEEE Access, 11. pp. 64114-64127. DOI https://doi.org/10.1109/access.2023.3289148
Abstract
Despite a body of research revolving around online abusive language, aiming at different objectives such as detection, diffusion prediction, and mitigation, existing research has seldom looked at factors motivating this behaviour. To further research in this direction, we investigate the motivations behind online abuse by looking at the characteristics of the targets of such abuse, asking whether abuse is more prominent for targets with specific characteristics. To enable target-oriented research into online abuse, we introduce the Online Abusive Attacks (OAA) dataset, the first benchmark dataset providing a holistic view of online abusive attacks, including social media profile data and metadata for both targets and perpetrators, in addition to context. The dataset contains 2.3K Twitter accounts, 5M tweets, and 106.9K categorised conversations. Further, we conduct an in-depth statistical analysis of online abuse centred around the targets’ characteristics. We identify two types of abusive attacks: those motivated by characteristics of the targets (identity-based attacks) and others (behavioural attacks). We find that online abusive attacks are predominantly motivated by the targets’ identities (97%), with behavioural attacks accounting for a much smaller proportion (3%). Abuse is also more likely to target users who are popular and have a verified status. Interestingly, an analysis of the user bios shows no clear indication that keywords used in the bios are likely to trigger abuse. We also look at the frequency with which perpetrators perform online abusive attacks. Our analysis shows a large number of infrequent perpetrators, with only a few recurrent perpetrators. Findings from our study have important implications for the development of abusive language detection models that incorporate an awareness of the targets to improve their potential for prediction.
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Social networking (online), Hate speech, Task analysis, Statistical analysis, Blogs, Data collection, Surveillance |
| Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 20 Mar 2026 11:58 |
| Last Modified: | 20 Mar 2026 11:58 |
| URI: | http://repository.essex.ac.uk/id/eprint/35837 |
Available files
Filename: Target-Oriented Investigation of Online Abusive Attacks A Dataset and Analysis.pdf
Licence: Creative Commons: Attribution-Noncommercial-No Derivative Works 4.0