Machine Learning Based Detection of Deceptive Tweets on Covid-19
Amisha Sinha1, Mohnish Raval2, S Sindhu3
1Amisha Sinha*, Department of Information Technology, SRM Institute of Science and Technology, Chennai (Tamil Nadu), India.
2Mohnish Raval, Department of Information Technology, SRM Institute of Science and Technology, Chennai (Tamil Nadu), India.
3S Sindhu, Assistant Professor, Department of Information Technology, SRM Institute of Science and Technology, Chennai (Tamil Nadu), India.
Manuscript received on June 14, 2021. | Revised Manuscript received on June 19, 2021. | Manuscript published on June 30, 2021. | PP: 375-380 | Volume-10 Issue-5, June 2021. | Retrieval Number: 100.1/ijeat.E28310610521 | DOI: 10.35940/ijeat.E2831.0610521
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Social media plays a vital role in connecting people around the world and in building relationships. Because its potential audience is huge, any information circulated on it can affect a large population. With the surge of Covid-19, a great deal of fake news and many fake tweets have circulated about remedies, medicines, and general information related to the pandemic. In this paper, we present a machine learning-based approach to detecting deceptive information about Covid-19 and describe a project that automatically determines whether a tweet is fake or real. The process uses a labeled dataset of tweets obtained from the arXiv repository, to which various methods are applied for cleaning, training, and testing. Preprocessing, tokenization, stemming, stop-word removal, and classification are performed to extract the most relevant information from the dataset and to achieve better accuracy than the existing system. For feature extraction, we use two techniques, TF-IDF and Bag of Words. For classification, we use two methods, SVM and Random Forest, and achieve an F1-score of 0.94 using SVM.
Keywords: Artificial Intelligence, Fake News, Social Media, SVM
Scope of the Article: Machine Learning
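The steps named in the abstract (tokenization, stop-word removal, stemming, Bag of Words, TF-IDF, and the F1-score) can be illustrated with a minimal sketch that uses only the Python standard library. It is not the authors' implementation: the stop-word list, the suffix-stripping stemmer, the TF-IDF weighting variant, and all function names are assumptions made for illustration, and the SVM and Random Forest classifiers themselves are not reproduced here.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the paper does not give one).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "for", "on"}


def naive_stem(token):
    """Crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Lowercase, drop URLs, tokenize, remove stop words, then stem."""
    text = re.sub(r"https?://\S+", " ", tweet.lower())
    tokens = re.findall(r"[a-z0-9]+", text)
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Bag of Words: raw term counts for one document."""
    return Counter(tokens)


def tf_idf(docs):
    """TF-IDF for a list of token lists, one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors


def f1_score(y_true, y_pred, positive="fake"):
    """F1 = harmonic mean of precision and recall for the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In such a setup, each tweet would pass through `preprocess`, the resulting token lists would be turned into feature vectors with `bag_of_words` or `tf_idf`, and a classifier's predictions on held-out tweets would be scored with `f1_score`. A term that appears in every document receives a TF-IDF weight of zero under this variant, which is why it tends to emphasize distinctive words over common ones.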