Mind Your Language: Abuse and Offense Detection for Code-Switched Languages

Kapoor, Raghav; Kumar, Yaman; Rajput, Kshitij; Shah, Rajiv Ratn; Kumaraguru, Ponnurangam; Zimmermann, Roger

arXiv:1809.08652 (cs)

[Submitted on 23 Sep 2018]

Abstract: In multilingual societies like the Indian subcontinent, code-switched languages are popular and convenient for users. In this paper, we study offense and abuse detection in the code-switched pair of Hindi and English (i.e., Hinglish), the most widely spoken such pair. The task is made difficult by the non-fixed grammar, vocabulary, semantics, and spellings of Hinglish. We apply transfer learning and build an LSTM-based model for hate speech classification. This model surpasses the performance of the current best models, establishing itself as the state of the art in the unexplored domain of Hinglish offensive text classification. We also release our model and the trained embeddings for research purposes.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1809.08652 [cs.CL]
	(or arXiv:1809.08652v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1809.08652

Submission history

From: Yaman Kumar
[v1] Sun, 23 Sep 2018 18:19:46 UTC (1,657 KB)
