Exploring Mechanisms for Detecting Violent Content in Sinhala Image Posts: Rationale with Unsupervised vs Supervised Techniques
Date
2024-01
Author
Dikwatta, U
Fernando, TGI
Ariyaratne, MKA
Abstract
This research explores different machine learning approaches to classifying Sinhala image posts. Image posts
on social media are a powerful tool that conveys information directly to people, and they contain both visuals and text.
English-based research is common in this area, but only a handful of studies address other languages. The target
language here is Sinhala, a low-resource language. Unsupervised algorithms were used to classify the image posts, and
supervised algorithms were used to classify text manually extracted from them. The classification decides whether a post
is violent or nonviolent. The trained supervised models were then examined with interpretability models to identify the
words that drive the violent or nonviolent decision. The findings reveal that supervised algorithms perform better than
unsupervised algorithms in classifying image posts. However, improved results could be obtained by increasing the size
and variety of the dataset.
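
As a rough illustration of the supervised, text-based route and of word-level interpretability, the following is a minimal sketch only. It assumes a multinomial Naive Bayes classifier with add-one smoothing and reads each word's log-odds as its contribution to the decision. The abstract does not name the algorithms or interpretability models actually used, so this choice, the function names, and the toy data (English stand-ins for text extracted from image posts) are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: English stand-ins for text manually extracted from
# image posts. These are NOT samples from the study's dataset.
TRAIN = [
    ("attack them now", "violent"),
    ("destroy and attack", "violent"),
    ("peace and love", "nonviolent"),
    ("love the festival", "nonviolent"),
]

LABELS = ("violent", "nonviolent")


def train(samples):
    """Count words per class for a multinomial Naive Bayes model."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["violent"]) | set(word_counts["nonviolent"])
    return word_counts, doc_counts, vocab


def log_likelihood(word, label, model):
    """Log P(word | label) with add-one (Laplace) smoothing."""
    word_counts, _, vocab = model
    total = sum(word_counts[label].values())
    return math.log((word_counts[label][word] + 1) / (total + len(vocab)))


def word_evidence(word, model):
    """Log-odds of a word: positive values push toward 'violent'."""
    return (log_likelihood(word, "violent", model)
            - log_likelihood(word, "nonviolent", model))


def classify(text, model):
    """Return the predicted label and the per-word log-odds behind it."""
    _, doc_counts, vocab = model
    # Words never seen in training carry no evidence and are skipped.
    tokens = [w for w in text.split() if w in vocab]
    prior = math.log(doc_counts["violent"] / doc_counts["nonviolent"])
    score = prior + sum(word_evidence(w, model) for w in tokens)
    contributions = {w: word_evidence(w, model) for w in set(tokens)}
    label = "violent" if score > 0 else "nonviolent"
    return label, contributions


if __name__ == "__main__":
    model = train(TRAIN)
    label, contributions = classify("attack them", model)
    print(label)
    # List words from most 'violent' evidence to most 'nonviolent' evidence.
    for word, weight in sorted(contributions.items(), key=lambda kv: -kv[1]):
        print(f"{word}: {weight:+.3f}")
```

In this sketch the per-word log-odds play the role that an interpretability model would play for a more complex classifier: they show which words pushed a post toward the violent or nonviolent label.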