Spam Detection by Machine Learning-Based Content Analysis

Davino, D.; Camastra, F.; Ciaramella, A.; Staiano, A.

doi:10.1007/978-981-15-5093-5_37

This paper presents a spam detection system based on machine-learning content analysis. The system is composed of six units: Tokenization and Word Cleaning, Lemmatization, Stop Word Removal and Synonym Replacement, Term Selection, Bag-of-Words Representer, and Classifier. Experiments on two datasets, SpamAssassin and Trec2007, show satisfactory results comparable with the state of the art.
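
The six units named in the abstract can be read as a linear pipeline. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the lemmatizer, the term-selection criterion, or the classifier, so the suffix-stripping `lemmatize`, the document-frequency filter in `select_terms`, the `NaiveBayes` class, and the tiny `STOP_WORDS` and `SYNONYMS` tables are all hypothetical stand-ins chosen only to keep the example self-contained.

```python
import math
import re
from collections import Counter

# Hypothetical toy resources; the paper's actual lists are not given in the abstract.
STOP_WORDS = {"the", "a", "to", "you", "is", "at", "for", "your", "now"}
SYNONYMS = {"cash": "money", "prize": "money"}


def tokenize_and_clean(text):
    """Unit 1: lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def lemmatize(token):
    """Unit 2: crude plural stripping, a stand-in for a real lemmatizer."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def normalize(tokens):
    """Unit 3: drop stop words, then map synonyms to one representative term."""
    return [SYNONYMS.get(t, t) for t in tokens if t not in STOP_WORDS]


def preprocess(text):
    """Run units 1 to 3 on a raw message."""
    return normalize([lemmatize(t) for t in tokenize_and_clean(text)])


def select_terms(docs, min_df=1):
    """Unit 4: keep terms found in at least min_df documents (assumed criterion)."""
    df = Counter(t for doc in docs for t in set(doc))
    return sorted(t for t, n in df.items() if n >= min_df)


def bag_of_words(tokens, vocab):
    """Unit 5: term-count vector over the selected vocabulary."""
    counts = Counter(tokens)
    return [counts[t] for t in vocab]


class NaiveBayes:
    """Unit 6: multinomial Naive Bayes with add-one smoothing (assumed classifier)."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.log_prior, self.log_like = {}, {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Per-term counts in class c, plus one for smoothing.
            totals = [sum(col) + 1 for col in zip(*rows)]
            denom = sum(totals)
            self.log_like[c] = [math.log(t / denom) for t in totals]
        return self

    def predict(self, x):
        def score(c):
            return self.log_prior[c] + sum(n * w for n, w in zip(x, self.log_like[c]))
        return max(self.classes, key=score)


def train(texts, labels):
    """Fit the whole pipeline on labelled messages."""
    docs = [preprocess(t) for t in texts]
    vocab = select_terms(docs)
    model = NaiveBayes().fit([bag_of_words(d, vocab) for d in docs], labels)
    return vocab, model


def classify(text, vocab, model):
    """Label a new message with the fitted vocabulary and model."""
    return model.predict(bag_of_words(preprocess(text), vocab))
```

Calling `train` on a list of messages and their labels returns the selected vocabulary and the fitted model, and `classify` then labels a new message. A real system would replace each stand-in with a proper component, for example a dictionary-based lemmatizer and a statistically motivated term-selection score.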

2020-01-01

Year: 2020

ISBN: 978-981-15-5092-8, 978-981-15-5093-5

Publication type: 2.1 Contribution in volume (Chapter or Essay)

Use this identifier to cite or link to this document: https://hdl.handle.net/11367/87250
