This paper presents a study of the effect of the size of the time window from which network features are derived on the predictive ability of a Random Forest classifier implemented as a network intrusion detection component. The network data is processed using granular computing principles, gradually increasing the time windows to allow the detection algorithm to find patterns in the data at different levels of granularity. Experiments were conducted iteratively with time windows ranging in size from 2 to 1024 seconds. Each iteration involved time-based clustering of the data, followed by splitting into training and test sets at a ratio of 67% - 33%. The Random Forest algorithm was applied as part of a 10-fold cross-validation. Assessments included standard detection metrics: accuracy, precision, F1 score, BCC, MCC and recall. The results show a statistically significant improvement in the detection of cyber attacks in network traffic with a larger time window size (p-value 0.001953125). These results highlight the effectiveness of using longer time intervals in network data analysis, resulting in increased anomaly detection.
Enhancing Network Security Through Granular Computing: A Clustering-by-Time Approach to NetFlow Traffic Analysis
D'Antonio S.;
2024-01-01
Abstract
This paper presents a study of the effect of the size of the time window from which network features are derived on the predictive ability of a Random Forest classifier implemented as a network intrusion detection component. The network data is processed using granular computing principles, gradually increasing the time windows to allow the detection algorithm to find patterns in the data at different levels of granularity. Experiments were conducted iteratively with time windows ranging in size from 2 to 1024 seconds. Each iteration involved time-based clustering of the data, followed by splitting into training and test sets at a ratio of 67% - 33%. The Random Forest algorithm was applied as part of a 10-fold cross-validation. Assessments included standard detection metrics: accuracy, precision, F1 score, BCC, MCC and recall. The results show a statistically significant improvement in the detection of cyber attacks in network traffic with a larger time window size (p-value 0.001953125). These results highlight the effectiveness of using longer time intervals in network data analysis, resulting in increased anomaly detection.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.