Abstract:

Short message service (SMS) has evolved as a new means of communication alongside the proliferation of mobile devices, networks, and data transmission over the last several decades. SMS users, too, face the problem of spam. SMS spam, or bulk texting, is any unsolicited message sent over a mobile network [2]. Several factors drive the excess of spam messages: the sheer number of mobile phone users raises the stakes for unsolicited bulk messages [1], and sending spam is cheap for attackers. Spam identification is a very active area of research with a number of well-established algorithms [15]. This work investigates a Multinomial Naive Bayes-Linear SVC methodology [4] for accurately detecting spam messages. The input dataset is first pre-processed to remove irrelevant or inappropriate characters and information [5]. The model is then trained with the Multinomial Naive Bayes-Linear SVC approach to predict spam messages [10]. With an accuracy of 98.38%, the Multinomial Naive Bayes-Linear SVC model outperforms previous spam detection models, including LSTM, SVM, and Naive Bayes.

Keywords— Hash Vectorizer, Deep Learning Algorithms, LSTM, Naïve Bayes, SMS Spam, SVM.