Generation of the database gurekddcup
Date
2017-02-10
Author
Martín, José Ignacio
Muguerza Rivero, Javier Francisco
Abstract
The database gureKDDCup has been generated within the UADI project (Unsupervised Anomaly Detection for Intrusion detection system), in which a classifier that detects intrusions or attacks in network-based systems was developed. This classifier is built with unsupervised classification techniques. The main distinctive feature of the project is that it uses the payload (the body of network packets) to detect attacks in network connections. Payload analysis for classifying connections has not been studied in depth; however, it appears to be essential for detecting attacks such as R2L (Remote to Local, whose goal is to use resources without permission) and U2R (User to Root, whose goal is to obtain root or administrative privileges without being entitled to them).
In the classification process we have to handle a huge number of connections and discover useful patterns among them, which leads us to the Data Mining field. Moreover, we want our UADI system to discover patterns or generate the model of network traffic automatically; that is, we want the learning process to be automatic, and to make this possible we use Machine Learning techniques.
First, however, it is essential to generate an appropriate database to work with. The aim of this report is therefore to explain the process we followed to generate the database used in the UADI project. The objective is to generate a database with characteristics similar to those of KDDCup99, a database widely used in the scientific community, taking Darpa98 (DARPA Intrusion Detection Data Sets) as the starting point. The generated database is called gureKDDCup. It has features similar to those of KDDCup99, to which we added payload information and other connection-related features such as IP addresses and port numbers. The report explains the steps that were followed to generate the KDDCup99 database, because our aim is to repeat those steps as accurately as possible in order to create the database we need in the UADI project: an extension of KDDCup99 (KDDCup99 + payload) that we call gureKDDCup.