Ethics of Using LLMs in Content Moderation on Twitter


Authors : Daniyal Ganiuly; Assel Smaiyl

Volume/Issue : Volume 9 - 2024, Issue 10 - October


Google Scholar : https://tinyurl.com/mrxzadmz

Scribd : https://tinyurl.com/2u2dnp9r

DOI : https://doi.org/10.38124/ijisrt/IJISRT24OCT1959

Note : A published paper may take 4-5 working days from the publication date to appear in PlumX Metrics, Semantic Scholar, and ResearchGate.


Abstract : As the number of social media users grows each year, the volume of posts grows with it. This includes harmful content such as hate speech, misinformation, explicit material, and cyberbullying, all of which degrade the user experience. This paper examines content moderation with LLMs and the issues of bias, transparency, free speech, and accountability that it raises. Several experiments were conducted with pre-trained models to measure their efficiency and to surface the ethical concerns that arise when moderating posted data. Our findings reveal that LLMs exhibit bias when moderating content from different demographic groups and minority communities. One of the most significant challenges identified was the lack of transparency in the LLM's decision-making process. Despite these ethical concerns, the LLM processed large volumes of content efficiently, significantly reducing the time required to flag potentially harmful posts. This research highlights the need for a balanced approach that protects freedom of speech while ensuring the ethical and responsible use of NLP on online platforms.
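
The paper's experiment code is not reproduced on this page, but a minimal sketch of the kind of LLM-based flagging it describes might look like the following, assuming an OpenAI-style chat API. The model name, prompt, and label set are illustrative assumptions, not the authors' configuration.

# A minimal sketch of LLM-based post flagging, assuming an OpenAI-style
# chat API; the model, prompt, and labels are illustrative assumptions,
# not the setup used in the paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["hate_speech", "misinformation", "explicit", "cyberbullying", "none"]

def flag_post(text: str) -> str:
    """Ask the model to assign exactly one moderation label to a post."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any chat-capable LLM works
        messages=[
            {"role": "system",
             "content": "You are a content moderator. Reply with exactly one "
                        "label from this list: " + ", ".join(LABELS) + "."},
            {"role": "user", "content": text},
        ],
        temperature=0,  # deterministic output makes flagging reproducible
    )
    label = response.choices[0].message.content.strip().lower()
    return label if label in LABELS else "none"  # fall back on unexpected output

print(flag_post("You people are all worthless."))  # e.g. hate_speech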

Keywords : LLM; NLP; Content Moderation; Social Media.
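
The bias finding reported above can be made concrete by comparing how often a moderator wrongly flags benign posts from different demographic groups. Below is a minimal false-positive-rate comparison; the group names and records are invented for illustration and are not the paper's data.

# A minimal sketch of a demographic-bias check for a moderation model:
# compare false-positive rates (benign posts wrongly flagged) across groups.
# The groups and records below are illustrative, not the paper's dataset.
from collections import defaultdict

# (group, model_flagged, truly_harmful) for each moderated post
records = [
    ("group_a", True,  False),  # benign post flagged -> false positive
    ("group_a", False, False),
    ("group_b", True,  True),
    ("group_b", True,  False),
    ("group_b", True,  False),
]

flagged_benign = defaultdict(int)
total_benign = defaultdict(int)
for group, flagged, harmful in records:
    if not harmful:  # only benign posts can produce false positives
        total_benign[group] += 1
        flagged_benign[group] += flagged

for group in sorted(total_benign):
    fpr = flagged_benign[group] / total_benign[group]
    print(f"{group}: false-positive rate = {fpr:.2f}")
# A large gap between the groups' false-positive rates is the kind of
# demographic bias the paper reports.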

