Abstract:

As online social networks have grown in popularity, spammers have found these platforms an easy way to lure users into malicious activities by posting spam messages in the comments sections of videos. In this work, YouTube comments are collected and spam detection is performed on them. To stop spammers, Google Safe Browsing and YouTube Bookmaker tools detect and block spam on YouTube. These tools block malicious links, but they cannot protect the user in real time, as early as possible. Industry and researchers have therefore applied a variety of approaches to build spam-free social network platforms. A survey of spam comment detection methods has been carried out, covering the most notable procedures and their suitability to the problem of spam. The bag-of-words model does exactly what we want: it breaks phrases or sentences into words and counts the number of times each word appears. In computer science, a bag is a data structure that keeps track of objects as an array or list does, except that order does not matter, and if an object appears more than once we keep a count of it rather than repeating it. If we also want to down-weight very common words and highlight rare ones, we need to record the relative importance of each word rather than its raw count. This is known as term frequency-inverse document frequency (TF-IDF), which weights a term by how often it appears in a document, discounted by how common the term is across all documents.
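
The bag-of-words and TF-IDF ideas described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used in this work: the function names (`tokenize`, `bag_of_words`, `tf_idf`) and the sample comments are assumptions made for the example, tokenization is a plain lowercase whitespace split, and the inverse document frequency uses the simple unsmoothed form log(N / df), which is one of several common variants.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace only.
    return text.lower().split()


def bag_of_words(text):
    # A bag ignores word order and stores one count per distinct word.
    return Counter(tokenize(text))


def tf_idf(docs):
    # Returns one {word: weight} dict per document.
    bags = [bag_of_words(d) for d in docs]
    n_docs = len(bags)

    # Document frequency: in how many documents each word occurs.
    df = Counter()
    for bag in bags:
        df.update(bag.keys())

    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            # Term frequency times unsmoothed inverse document frequency.
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in bag.items()
        })
    return weights


# Hypothetical sample comments, invented for illustration.
comments = [
    "check out my channel",
    "great song love it",
    "check out this great video",
]

bag = bag_of_words("check out my channel check it out")
weights = tf_idf(comments)
```

In this sketch, a word that occurs in only one comment (such as "channel") receives a higher weight than a word shared by several comments (such as "check"), and a word that occurs in every comment receives a weight of zero.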