Title: A Hybrid Intrusion Detection with Decision Tree for Feature Selection
Authors: Mubarak Umar; Zhanfang Chen; Yan Liu
Year: 2021
Volume: 49
Keywords: big data; feature selection; hybrid IDS; IDS dataset; intrusion detection; machine learning algorithms
Abstract:

Intrusion detection systems (IDS) typically incur high computational cost when examining data features and identifying intrusion patterns, owing to the size and nature of current intrusion detection datasets. Data pre-processing techniques such as feature selection are used to reduce this cost by eliminating irrelevant and redundant features from such datasets. The objective of this study is to analyse the effectiveness and efficiency of two families of feature selection approaches: wrapper-based and filter-based. To that end, machine learning models are built in a hybrid approach that uses either a wrapper or a filter selection process. Five machine learning algorithms are applied with the wrapper and filter-based feature selection methods to build the IDS models using the UNSW-NB15 dataset. The wrapper-based hybrid intrusion detection model uses a decision tree algorithm to guide the selection process; three filter-based methods (information gain, gain ratio, and Relief) are used for comparison to determine the efficiency and effectiveness of the wrapper approach. Furthermore, a comparison with other state-of-the-art intrusion detection approaches is performed. The experimental results show that the wrapper-based method is quite effective in comparison to state-of-the-art works; however, it requires more computation time than the filter-based methods while achieving similar results. Our work also revealed previously unobserved issues with the conformity of the UNSW-NB15 dataset.
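
The difference between the filter and wrapper approaches named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The four-row dataset is invented for illustration, the helper names are hypothetical, and a simple majority-vote lookup table stands in for the decision tree that guides the wrapper in the paper. Relief is omitted for brevity. Filter methods score each feature on its own, whereas the wrapper repeatedly retrains an evaluator on candidate subsets, which is why it tends to cost more time.

```python
import math
from collections import Counter, defaultdict

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(column, labels):
    """Filter score: H(labels) - H(labels | feature) for one discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def gain_ratio(column, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(column)
    return information_gain(column, labels) / split_info if split_info > 0 else 0.0

def lookup_accuracy(rows, labels, subset):
    """Hypothetical stand-in for the decision tree evaluator: training
    accuracy of a majority-vote table keyed on the chosen features."""
    table = defaultdict(Counter)
    for row, label in zip(rows, labels):
        table[tuple(row[i] for i in subset)][label] += 1
    correct = sum(max(counts.values()) for counts in table.values())
    return correct / len(labels)

def wrapper_forward_selection(rows, labels, evaluate):
    """Greedy wrapper: keep adding the feature that most improves the score."""
    selected = []
    best = evaluate(rows, labels, selected)
    remaining = set(range(len(rows[0])))
    while remaining:
        candidates = [(evaluate(rows, labels, selected + [f]), f)
                      for f in sorted(remaining)]
        score, feature = max(candidates, key=lambda c: c[0])
        if score <= best:  # stop when no candidate improves the score
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best

# Invented toy data: feature 0 determines the label, feature 1 is
# uninformative, feature 2 is constant.
rows = [("tcp", "a", "x"), ("tcp", "b", "x"), ("udp", "a", "x"), ("udp", "b", "x")]
labels = ["attack", "attack", "normal", "normal"]

if __name__ == "__main__":
    for i in range(len(rows[0])):
        column = [r[i] for r in rows]
        print(i, information_gain(column, labels), gain_ratio(column, labels))
    print(wrapper_forward_selection(rows, labels, lookup_accuracy))
```

On this toy data both approaches single out feature 0. The filter needs one pass per feature, while the wrapper calls the evaluator once per candidate subset at every step.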