On the role of information compaction to intrusion detection
Chapter in Scopus
- Additional Document Info
- View All
An intrusion detection system (IDS) usually has to analyse Giga-bytes of audit information. In the case of anomaly IDS, the information is used to build a user profile characterising normal behaviour. Whereas for misuse IDSs, it is used to test against known attacks. Probabilistic methods, e.g. hidden Markov models, have proved to be suitable to profile formation but are prohibitively expensive. To bring these methods into practise, this paper aims to reduce the audit information by folding up subsequences that commonly occur within it. Using n-grams language models, we have been able to successfully identify the n-grams that appear most frequently. The main contribution of this paper is a n-gram extraction and identification process that significantly reduces an input log file keeping key information for intrusion detection. We reduced log files by a factor of 3.6 in the worst case and 4.8 in the best case. We also tested reduced data using hidden Markov models (HMMs) for intrusion detection. The time needed to train the HMMs is greatly reduced by using our reduced log files, but most importantly, the impact on both the detection and false positive ratios are negligible. © Springer-Verlag Berlin Heidelberg 2005.