CCS2011: Enemy of the Good


NOTE: This page is obsolete, please visit the mercurial repository.

Title

The Enemy of the Good: Re-evaluating Research Directions in Intrusion Detection

Abstract

Research in intrusion detection is in decline---less and less work in the field is being published in competitive venues. Here we argue that a key reason for this decline is a misunderstanding of the significance and nature of false positive rates. False positives---legitimate behavior that is misclassified as being potentially malicious---have a huge impact on the viability of any intrusion detection method in the real world. A survey of the literature, however, shows that false positive rates have remained persistently high in published reports. In this paper we argue that this persistence is due to the nature of the data sources used by intrusion detection systems. In support of this position, we present the requirements for viable intrusion detection systems, correlate those requirements with those of accurate detection methods, and then show that existing data sources cannot be modeled with the required accuracy. To address these observations, we argue that research in intrusion detection must move away from the pure study of detection methods and towards the study of deployable detection/response mechanisms that directly accommodate relatively high false positive rates.

Introduction

As a research field, intrusion detection should be booming. Attackers are using increasingly sophisticated attacks, allowing them to compromise systems and exfiltrate data from major corporations and governments. At the same time, regular users are threatened daily with everything from drive-by downloads to phishing attacks to cross-site scripting exploits. In this environment where attackers are routinely circumventing existing defenses, we clearly need better ways to detect security violations. Intrusion detection should be an obvious approach to addressing this problem. Despite the clear need, however, there are signs everywhere that the field is in decline. General security venues publish few papers in intrusion detection, and even specialized venues such as RAID are publishing fewer and fewer intrusion detection papers.

It is not hard to see why interest in the field is fading: existing intrusion detection systems are just not that useful, and research systems don't seem to improve on this situation. Mainstream signature-based IDSs require extensive tuning to deliver reasonable rates of false alarms, and even then they are extremely limited in their ability to detect novel attacks. Research systems, particularly ones based on machine learning methods, are often computationally expensive and/or can be circumvented by sophisticated attackers. Worse, these systems also have high false alarm rates, and their rates are not so easily tuned. Given this dismal situation, is it any wonder that researchers would look for greener pastures?

In this paper we aim to reframe the intrusion detection problem such that the limitations of past work can be seen not as failures, but merely as incomplete portions of a larger whole. To this end, we focus on the problem of false alarms. Expensive detection methods can be addressed by better algorithms and data collection methods. Limited attack coverage or susceptibility to evasive attackers can be addressed by using additional defense mechanisms. High false alarm rates, however, are crippling: they turn potentially useful detectors into mechanisms that system administrators and users will turn off or simply ignore. Simply put, intrusion detection is not a viable defense strategy unless the false alarm problem can be adequately solved.

Much past work in intrusion detection has glossed over the false positive problem by arguing that false positives can be arbitrarily reduced through appropriate engineering, e.g., by correlating alarms with the output of other sensors and detectors. While we agree that there are a variety of ways false positives can be reduced in theory, it is remarkable that in published work virtually nobody is able to get false alarm rates low enough to make intrusion detection viable, even with the use of false alarm mitigation strategies.

The false positive problem is the central problem of intrusion detection, and it is a problem that arises not from the limitations of algorithms but from the nature of the problem itself. In particular, at the high data rates that most intrusion detection systems deal with, "rare events" (events of low probability) happen at a surprisingly large absolute frequency---they can sometimes occur thousands of times a day. And, when they do happen, they can generate alarms in any kind of IDS, whether it be a signature-, specification-, or anomaly-based system. The challenge, then, is not to model these rare events (that's impossible) but to instead develop intrusion detection and response systems that perform well even in the face of rare, unmodeled events.
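To make this concrete, consider a back-of-the-envelope calculation (a sketch with hypothetical volumes, not measurements from any particular deployment): even a false positive rate that sounds excellent produces an unmanageable number of alarms at realistic event rates.

 # Hypothetical volumes for illustration: a "low" false positive rate
 # still yields a flood of alarms at realistic event rates.
 events_per_day = 10_000_000   # e.g., packets or system calls monitored per day
 fp_rate = 0.001               # a 0.1% false positive rate, often reported as low
 print(events_per_day * fp_rate, "false alarms per day")  # -> 10000.0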

To explain why the direction of IDS research needs to change, we start with a set of requirements for a viable (deployable) intrusion detection system. Next, in Section 3, we explore machine learning methods and examine under what circumstances machine learning methods can perform with the accuracy required for traditional framings of the intrusion detection problem. In Section 4 we then examine the nature of the standard data sources used by IDSs and show that they are fundamentally not amenable to the kind of accurate modeling required to give sufficiently low false positive rates. Section 5 surveys the literature and shows that past work in IDS is consistent with our conclusions. We explore the implications of our observations on the field in Section 6. Section 7 concludes.

Intrusion Detection Requirements

State of the Art in Machine Learning

Colin's section

Characteristics of IDS Data

Luc's section

The False Alarm Problem

Stolfo's comment from his paper titled "Anomalous Payload-Based Network Intrusion Detection":

"The False Positive rate of Anomaly Detection systems are typically regarded as an inhibitor to their wide spread use. In this work, for example, 0.1% FP rate means that about 1 per thousand packets are flagged as anomalous. Such a rate might be considered untenable, rendering anomaly detection systems unusable. This argument is not quite correct. We shall not argue that the False Negative rate of signature-based misuse detection systems causes far more problems than a false alarm. Rather, we make the assertion that it may be better to generate more anomaly detector alerts (and consequently possibly more false alerts) to provide more evidence to correlate with other sensors to better detect a true attack. Those anomaly detector alerts that have no other confirmatory evidence of an attack from another sensor might be ignored. Those that correlate with other anomalous events would tend to strengthen the likelihood that a security event has indeed occurred hence generating very interesting alarms. This means that one should not view an anomaly detection system as a singular monolithic detector, but a component in a correlated set of detectors, including possibly misuse detectors"



Paper 1: A Geometric Framework for Unsupervised Anomaly Detection: Detecting Intrusions in Unlabeled Data


  1. Citations:

Cited by 455

BibTex: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.119.5533

@INPROCEEDINGS{Eskin02ageometric,

   author = {Eleazar Eskin and Andrew Arnold and Michael Prerau and Leonid Portnoy and Sal Stolfo},
   title = {A Geometric Framework for Unsupervised Anomaly Detection: Detecting Intrusions in Unlabeled Data},
   booktitle = {Applications of Data Mining in Computer Security},
   year = {2002},
   publisher = {Kluwer}

}

Abstract: Most current intrusion detection systems employ signature-based methods or data mining-based methods which rely on labeled training data. This training data is typically expensive to produce. We present a new geometric framework for unsupervised anomaly detection, which are algorithms that are designed to process unlabeled data. In our framework, data elements are mapped to a feature space which is typically a vector space o.....


Algorithm:

Geometric framework for unsupervised anomaly detection (unlabeled data + outlier detection).
3 algorithms evaluated separately (see the one-class SVM sketch after the results table):

- First algorithm: cluster-based algorithm presented in Portnoy et al.
- Second algorithm: k-nearest neighbor based algorithm
- Third algorithm: One Class Support Vector Machine-based algorithm

Data:

 Network records from KDD CUP 1999 data set. Contains wide variety of intrusions simulated in a military network environment.
 System call traces from the 1999 Lincoln Labs DARPA evaluation (BSM, Basic security module)

Results: KDD Cup 1999 data.

 Algorithm   Detection rate   False positive rate
 Cluster     93%              10%
 Cluster     66%              2%
 Cluster     47%              1%
 Cluster     28%              0.5%
 K-NN        91%              8%
 K-NN        23%              6%
 K-NN        11%              4%
 K-NN        5%               2%
 SVM         98%              10%
 SVM         91%              6%
 SVM         67%              4%
 SVM         5%               3%
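As a rough illustration of the paper's third algorithm, here is a minimal one-class SVM sketch using scikit-learn. The feature mapping from KDD records is omitted; the Gaussian data is a synthetic stand-in, so the output means nothing beyond demonstrating the approach.

 # Minimal sketch of one-class SVM anomaly detection over unlabeled
 # feature vectors; training and test data are synthetic placeholders.
 import numpy as np
 from sklearn.svm import OneClassSVM
 
 rng = np.random.default_rng(0)
 train = rng.normal(0, 1, size=(1000, 4))            # stand-in mapped records
 test = np.vstack([rng.normal(0, 1, size=(95, 4)),   # mostly normal points
                   rng.normal(6, 1, size=(5, 4))])   # plus a few outliers
 
 clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(train)
 pred = clf.predict(test)                            # +1 = normal, -1 = anomalous
 print(int((pred == -1).sum()), "of", len(test), "test points flagged")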





Paper 2: Modeling System Calls for Intrusion Detection with Dynamic Window Sizes


  1. Citations:

106

Eleazar Eskin, Salvatore J. Stolfo, Wenke Lee, "Modeling System Calls for Intrusion Detection with Dynamic Window Sizes," Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX II'01), Volume I, p. 165, 2001.

BibTex:

@article{10.1109/DISCEX.2001.932213,

   author = {Eleazar Eskin and Salvatore J. Stolfo and Wenke Lee},
   title = {Modeling System Calls for Intrusion Detection with Dynamic Window Sizes},
   journal = {DARPA Information Survivability Conference and Exposition},
   volume = {1},
   isbn = {0-7695-1212-7},
   year = {2001},
   pages = {0165},
   doi = {http://doi.ieeecomputersociety.org/10.1109/DISCEX.2001.932213},
   publisher = {IEEE Computer Society},
   address = {Los Alamitos, CA, USA}

}

Abstract:

We extend prior research on system call anomaly detection modeling methods for intrusion detection by incorporating dynamic window sizes. The window size is the length of the subsequence of a system call trace which is used as the basic unit for modeling program or process behavior. In this work we incorporate dynamic window sizes and show marked improvements in anomaly detection. We present two methods for estimating the optimal window size based on the available training data. The first method is an entropy modeling method which determines the optimal single window size for the data. The second method is a probability modeling method that takes into account context dependent window sizes. A context dependent window size model is motivated by the way that system calls are generated by processes. Sparse Markov transducers (SMTs) are used to compute the context dependent window size model. We show over actual system call traces that the entropy modeling methods lead to the optimal single window size. We also show that context dependent window sizes outperform traditional system call modeling methods.

Algorithm: To model system calls with context dependent window sizes, we use sparse Markov transducers (SMTs) [3]. SMTs are an extension of probabilistic suffix trees. SMTs use sparse prediction trees which explicitly define the context dependent choices of window sizes and wild card placements.
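The following toy sketch approximates only the first (entropy-based) method: it measures the empirical conditional entropy of the next system call for several candidate window sizes. It is a simplification for illustration, not the paper's estimator, and the trace is a stand-in.

 # Toy approximation of the entropy-based method: conditional entropy of
 # the next system call given the preceding (w - 1) calls, for several
 # candidate window sizes w. Trace and sizes are illustrative.
 from collections import Counter, defaultdict
 from math import log2
 
 def conditional_entropy(trace, w):
     ctx_counts = defaultdict(Counter)
     for i in range(len(trace) - w + 1):
         ctx, nxt = tuple(trace[i:i + w - 1]), trace[i + w - 1]
         ctx_counts[ctx][nxt] += 1
     total = sum(sum(c.values()) for c in ctx_counts.values())
     h = 0.0
     for counts in ctx_counts.values():
         n = sum(counts.values())
         for c in counts.values():
             h -= (c / total) * log2(c / n)  # weighted H(next | context)
     return h
 
 trace = ["open", "read", "read", "write", "close"] * 50  # stand-in trace
 for w in (2, 3, 4, 5):
     print(w, round(conditional_entropy(trace, w), 3))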

Data: We perform experiments over the 1999 DARPA Intrusion Detection Evaluation data created by MIT Lincoln Labs [12]. The second set of data was obtained from Stephanie Forrest’s group at the University of New Mexico. This data contains up to 15 months of normal traces for certain programs as well as intrusion traces.

Results: hard to summarize concisely; see page 9 of the PDF.




Paper 3: Intrusion Detection using Sequences of System Calls


  1. Citations:

809

RIS:

TY  - JOUR
JF  - Journal of Computer Security
T1  - Intrusion detection using sequences of system calls
VL  - 6
IS  - 3
SP  - 151
EP  - 180
PY  - 1998/01/01/
UR  - http://iospress.metapress.com/content/M19JJ5LNHBEB0BVF
AU  - Steven A. Hofmeyr
AU  - Stephanie Forrest
AU  - Anil Somayaji
ER  -


Abstract: A method is introduced for detecting intrusions at the level of privileged processes. Evidence is given that short sequences of system calls executed by running processes are a good discriminator between normal and abnormal operating characteristics of several common UNIX programs. Normal behavior is collected in two ways: Synthetically, by exercising as many normal modes of usage of a program as possible, and in a live user environment by tracing the actual execution of the program. In the former case several types of intrusive behavior were studied; in the latter case, results were analyzed for false positives.

Algorithm:

- Comparing short sequences of system calls
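A minimal sketch in the spirit of this method, assuming a window length k=6 and a simple unseen-sequence fraction as the anomaly score (the paper's actual scoring differs):

 # Record every length-k system-call sequence seen in normal traces, then
 # score a new trace by the fraction of its sequences not in the normal
 # database. k and the traces are illustrative stand-ins.
 def ngrams(trace, k=6):
     return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]
 
 normal_traces = [["open", "read", "write", "close"] * 10]  # stand-in data
 normal_db = {g for t in normal_traces for g in ngrams(t)}
 
 def mismatch_rate(trace, k=6):
     grams = ngrams(trace, k)
     return sum(g not in normal_db for g in grams) / max(len(grams), 1)
 
 print(mismatch_rate(["open", "read", "read", "exec", "write", "close"] * 3))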

Data:

- Live data for the false positive test, obtained by tracing the actual execution of the program

Results:

- 0.01 +- 0.004 false positives per lpr print job, i.e., about 1 in every 100 print jobs


Paper 4: A data mining framework for building intrusion detection models


http://ieeexplore.ieee.org.proxy.library.carleton.ca/xpls/abs_all.jsp?arnumber=766909

  1. Citations:

844

BibTex:

@INPROCEEDINGS{766909,

   author = {Wenke Lee and Stolfo, S.J. and Mok, K.W.},
   booktitle = {Security and Privacy, 1999. Proceedings of the 1999 IEEE Symposium on},
   title = {A data mining framework for building intrusion detection models},
   year = {1999},
   pages = {120--132},
   doi = {10.1109/SECPRI.1999.766909}

}

Abstract: There is often the need to update an installed intrusion detection system (IDS) due to new attack methods or upgraded computing environments. Since many current IDSs are constructed by manual encoding of expert knowledge, changes to IDSs are expensive and slow. We describe a data mining framework for adaptively building Intrusion Detection (ID) models. The central idea is to utilize auditing programs to extract an extensive set of features that describe each network connection or host session, and apply data mining programs to learn rules that accurately capture the behavior of intrusions and normal activities. These rules can then be used for misuse detection and anomaly detection. New detection models are incorporated into an existing IDS through a meta-learning (or co-operative learning) process, which produces a meta detection model that combines evidence from multiple models. We discuss the strengths of our data mining programs, namely, classification, meta-learning, association rules, and frequent episodes. We report on the results of applying these programs to the extensively gathered network audit data for the 1998 DARPA Intrusion Detection Evaluation Program.

Algorithm:

- Several Data Mining programs: classification, meta-learning, association rules, and frequent episodes.

Data:

- Audit data for the 1998 DARPA Intrusion Detection Evaluation Program.

Results:

- See page 10/13 of the PDF. I don't really understand how there could be a straight line; I would say around 2%.


Paper 5: Intrusion detection with unlabeled data using clustering


http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.2131
http://freeworld.thc.org/root/docs/intrusion_detection/nids/ID-with-Unlabeled-Data-Using-Clustering.pdf

  1. Citations: 472

BibTex: @INPROCEEDINGS{Portnoy01intrusiondetection,

   author = {Leonid Portnoy and Eleazar Eskin and Sal Stolfo},
   title = {Intrusion detection with unlabeled data using clustering},
   booktitle = {Proceedings of the ACM CSS Workshop on Data Mining Applied to Security (DMSA-2001)},
   year = {2001},
   pages = {5--8}

}

Abstract: Intrusions pose a serious security risk in a network environment. Although systems can be hardened against many types of intrusions, often intrusions are successful, making systems for detecting these intrusions critical to the security of these systems. New intrusion types, of which detection systems are unaware, are the most difficult to detect. Current signature-based methods and learning algorithms which rely on labeled data to train generally can not detect these new intrusions. In addition, labeled training data in order to train misuse and anomaly detection systems is typically very expensive. We present a new type of clustering-based intrusion detection algorithm, unsupervised anomaly detection, which trains on unlabeled data in order to detect new intrusions. In our system, no manually or otherwise classified data is necessary for training. Our method is able to detect many different types of intrusions, while maintaining a low false positive rate as verified over the KDD CUP 1999 dataset.


Algorithm:

- Clustering (a variant of single-linkage clustering) using a simple distance-based metric (unlabeled data)
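A rough sketch of the fixed-width clustering idea, assuming normalized numeric features, a hand-picked width, and a "small clusters are anomalous" labeling rule; all values are illustrative:

 # A point joins the first cluster whose center is within a fixed width,
 # else it starts a new cluster; points in small clusters are labeled
 # anomalous. Width, threshold, and data are illustrative.
 import numpy as np
 
 def fixed_width_clusters(points, width):
     centers, members = [], []
     for p in points:
         for i, c in enumerate(centers):
             if np.linalg.norm(p - c) <= width:
                 members[i].append(p)
                 break
         else:
             centers.append(p)
             members.append([p])
     return members
 
 rng = np.random.default_rng(1)
 data = np.vstack([rng.normal(0, 0.3, (200, 2)),   # dense "normal" region
                   rng.normal(5, 0.3, (3, 2))])    # a few outliers
 clusters = fixed_width_clusters(data, width=1.0)
 anomalous = sum(len(c) for c in clusters if len(c) < 0.02 * len(data))
 print(len(clusters), "clusters;", anomalous, "points in small (anomalous) clusters")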

Data:

- KDD CUP 1999 dataset

Results:

- 1.3% - 2.3%

Paper 6: Learning Program Behavior Profiles for Intrusion Detection


http://portal.acm.org.proxy.library.carleton.ca/citation.cfm?id=1267886

  1. Citations: 333

BibTex: @inproceedings{Ghosh:1999:LPB:1267880.1267886,

author = {Ghosh, Anup K. and Schwartzbard, Aaron and Schatz, Michael},
title = {Learning program behavior profiles for intrusion detection},
booktitle = {Proceedings of the 1st conference on Workshop on Intrusion Detection and Network Monitoring - Volume 1},
year = {1999},
location = {Santa Clara, California},
pages = {6--6},
numpages = {1},
url = {http://portal.acm.org/citation.cfm?id=1267880.1267886},
acmid = {1267886},
publisher = {USENIX Association},
address = {Berkeley, CA, USA},

}

Abstract: Profiling the behavior of programs can be a useful reference for detecting potential intrusions against systems. This paper presents three anomaly detection techniques for profiling program behavior that evolve from memorization to generalization. The goal of monitoring program behavior is to be able to detect potential intrusions by noting irregularities in program behavior. The techniques start from a simple equality matching algorithm for determining anomalous behavior, and evolve to a feed-forward backpropagation neural network for learning program behavior, and finally to an Elman network for recognizing recurrent features in program execution traces. In order to detect future attacks against systems, intrusion detection systems must be able to generalize from past observed behavior. The goal of this research is to employ machine learning techniques that can generalize from past observed behavior to the problem of intrusion detection. The performance of these systems is compared by testing them with data provided by the DARPA Intrusion Detection Evaluation program.

Algorithm:

- Simple equality matching algorithm for determining anomalous behavior.
- feed-forward backpropagation neural network for learning program behavior
- Elman network for recognizing recurrent features in program execution traces.

Data:

- DARPA Intrusion Detection Evaluation program

Results:

- Equality matching: at 68.2% detection rate, 1.4% fp; at 86.4% detection rate, 4.3% fp
- Feed-forward backpropagation neural network: at 77.3% detection rate, 2.2% fp
- Elman networks: 77.3% detection rate with 0% fp, with a leak rate of 0.7% (what is that?)

Paper 7: Detecting Intrusions Using System Calls: Alternative Data Models


http://www.computer.org/portal/web/csdl/doi/10.1109/SECPRI.1999.766910

  1. Citations:774

BibTex: @article{10.1109/SECPRI.1999.766910, author = {C. Warrender and S. Forrest and B. Pearlmutter}, title = {Detecting Intrusions using System Calls: Alternative Data Models}, journal ={Security and Privacy, IEEE Symposium on}, volume = {0}, issn = {1540-7993}, year = {1999}, pages = {0133}, doi = {http://doi.ieeecomputersociety.org/10.1109/SECPRI.1999.766910}, publisher = {IEEE Computer Society}, address = {Los Alamitos, CA, USA}, }

Abstract: Intrusion detection systems rely on a wide variety of observable data to distinguish between legitimate and illegitimate activities. In this paper we study one such observable---sequences of system calls into the kernel of an operating system. Using system-call data sets generated by several different programs, we compare the ability of different data modeling methods to represent normal behavior accurately and to recognize intrusions. We compare the following methods: Simple enumeration of observed sequences, comparison of relative frequencies of different sequences, a rule induction technique, and Hidden Markov Models (HMMs). We discuss the factors affecting the performance of each method, and conclude that for this particular problem, weaker methods than HMMs are likely sufficient.

Algorithm:

- Hidden Markov models (they actually compare multiple methods; HMMs give the best results)
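For intuition, here is a small sketch of HMM-based scoring: compute a sequence's log-likelihood with the forward algorithm and treat unusually low likelihood as anomalous. The two-state model below is hand-written for illustration, not trained, and the symbol alphabet is a stand-in for system calls.

 # Forward algorithm in log space; obs is a list of symbol indices.
 import numpy as np
 
 def sequence_loglik(obs, start, trans, emit):
     alpha = np.log(start) + np.log(emit[:, obs[0]])
     for o in obs[1:]:
         alpha = np.logaddexp.reduce(alpha[:, None] + np.log(trans), axis=0)
         alpha += np.log(emit[:, o])
     return np.logaddexp.reduce(alpha)
 
 start = np.array([0.6, 0.4])                 # 2 hidden states
 trans = np.array([[0.9, 0.1], [0.2, 0.8]])   # state transition probabilities
 emit = np.array([[0.7, 0.2, 0.1],            # 3 observable system-call types
                  [0.1, 0.3, 0.6]])
 print(sequence_loglik([0, 1, 0, 2], start, trans, emit))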

Data:

- live traces of production system programs

Results:

- Confusing; need to examine these results more closely.

Paper 8: A Fast Automaton-Based Method for Detecting Anomalous Program Behaviors


  1. Citations: 370

BibTex: @INPROCEEDINGS{924295,

   author = {Sekar, R. and Bendre, M. and Dhurjati, D. and Bollineni, P.},
   booktitle = {Security and Privacy, 2001. S P 2001. Proceedings. 2001 IEEE Symposium on},
   title = {A fast automaton-based method for detecting anomalous program behaviors},
   year = {2001},
   pages = {144--155},
   doi = {10.1109/SECPRI.2001.924295}

}

Abstract: Anomaly detection on system call sequences has become perhaps the most successful approach for detecting novel intrusions. A natural way for learning sequences is to use a finite-state automaton (FSA). However, previous research indicates that FSA-learning is computationally expensive, that it cannot be completely automated, or that the space usage of the FSA may be excessive. We present a new approach that overcomes these difficulties. Our approach builds a compact FSA in a fully automatic and efficient manner, without requiring access to source code for programs. The space requirements for the FSA are low, of the order of a few kilobytes for typical programs. The FSA uses only a constant time per system call during the learning as well as the detection period. This factor leads to low overheads for intrusion detection. Unlike many of the previous techniques, our FSA-technique can capture both short term and long term temporal relationships among system calls, and thus perform more accurate detection. This enables our approach to generalize and predict future behaviors from past behaviors. As a result, the training periods needed for our FSA based approach are shorter. Moreover, false positives are reduced without increasing the likelihood of missing attacks. This paper describes our FSA based technique and presents a comprehensive experimental evaluation of the technique.

Algorithm:

 - FSA, finite state automaton

Data:

 - training scripts (simulated requests)
 - live data

Results:

- Not very sure; roughly 0.01 for the shortest training period

Paper 9: A multi-model approach to the detection of web-based attacks


http://www.sciencedirect.com.proxy.library.carleton.ca/science?_ob=ArticleURL&_udi=B6VRG-4FJXJN8-1&_user=4799849&_coverDate=08%2F05%2F2005&_rdoc=1&_fmt=high&_orig=gateway&_origin=gateway&_sort=d&_docanchor=&view=c&_searchStrId=1698446568&_rerunOrigin=scholar.google&_acct=C000051236&_version=1&_urlVersion=0&_userid=4799849&md5=20368fc5ed26ad7085246763bc9ae0c1&searchtype=a

  1. Citations: 128

BibTex:

@article{Kruegel2005717,

   title = {A multi-model approach to the detection of web-based attacks},
   journal = {Computer Networks},
   volume = {48},
   number = {5},
   pages = {717--738},
   year = {2005},
   note = {Web Security},
   issn = {1389-1286},
   doi = {10.1016/j.comnet.2005.01.009},
   url = {http://www.sciencedirect.com/science/article/B6VRG-4FJXJN8-1/2/d73d1c84a28acd03936b083c68759482},
   author = {Christopher Kruegel and Giovanni Vigna and William Robertson},
   keywords = {Intrusion Detection, World-Wide Web, Machine Learning, Anomaly Models}

}

Abstract: Web-based vulnerabilities represent a substantial portion of the security exposures of computer networks. In order to detect known web-based attacks, misuse detection systems are equipped with a large number of signatures. Unfortunately, it is difficult to keep up with the daily disclosure of web-related vulnerabilities, and, in addition, vulnerabilities may be introduced by installation-specific web-based applications. Therefore, misuse detection systems should be complemented with anomaly detection systems. This paper presents an intrusion detection system that uses a number of different anomaly detection techniques to detect attacks against web servers and web-based applications. The system analyzes client queries that reference server-side programs and creates models for a wide range of different features of these queries. Examples of such features are access patterns of server-side programs or values of individual parameters in their invocation. In particular, the use of application-specific characterization of the invocation parameters allows the system to perform focused analysis and produce a reduced number of false positives. The system derives automatically the parameter profiles associated with web applications (e.g., length and structure of parameters) and relationships between queries (e.g., access times and sequences) from the analyzed data. Therefore, it can be deployed in very different application environments without having to perform time-consuming tuning and configuration.

Algorithm:

- Multiple anomaly models over features of web requests (e.g., length and structure of parameters, access patterns of server-side programs), per the abstract

Data:

- Apache log files from a production web server at Google, Inc.
- Log files of two computer science dept. universities

Results:

- 0.06% (they reported other false positive rates measured on data without attacks; those were a lot smaller than this).

Paper 10: Anagram: A Content Anomaly Detector Resistant to Mimicry Attack


http://www.springerlink.com.proxy.library.carleton.ca/content/t0808158078h3w42/

  1. Citations:

BibTex: @incollection {springerlink:10.1007/11856214_12,

  author = {Wang, Ke and Parekh, Janak and Stolfo, Salvatore},
  affiliation = {Computer Science Department, Columbia University, 500 West 120th Street, New York, NY 10027},
  title = {Anagram: A Content Anomaly Detector Resistant to Mimicry Attack},
  booktitle = {Recent Advances in Intrusion Detection},
  series = {Lecture Notes in Computer Science},
  editor = {Zamboni, Diego and Kruegel, Christopher},
  publisher = {Springer Berlin / Heidelberg},
  isbn = {},
  pages = {226-248},
  volume = {4219},
  url = {http://dx.doi.org/10.1007/11856214_12},
  note = {10.1007/11856214_12},
  abstract = {In this paper, we present Anagram, a content anomaly detector that models a mixture of high-order n-grams (n > 1) designed to detect anomalous and "suspicious" network packet payloads. By using higher-order n-grams, Anagram can detect significant anomalous byte sequences and generate robust signatures of validated malicious packet content. The Anagram content models are implemented using highly efficient Bloom filters, reducing space requirements and enabling privacy-preserving cross-site correlation. The sensor models the distinct content flow of a network or host using a semi-supervised training regimen. Previously known exploits, extracted from the signatures of an IDS, are likewise modeled in a Bloom filter and are used during training as well as detection time. We demonstrate that Anagram can identify anomalous traffic with high accuracy and low false positive rates. Anagram's high-order n-gram analysis technique is also resilient against simple mimicry attacks that blend exploits with "normal" appearing byte padding, such as the blended polymorphic attack recently demonstrated in [1]. We discuss randomized n-gram models, which further raises the bar and makes it more difficult for attackers to build precise packet structures to evade Anagram even if they know the distribution of the local site content flow. Finally, Anagram's speed and high detection rate makes it valuable not only as a standalone sensor, but also as a network anomaly flow classifier in an instrumented fault-tolerant host-based environment; this enables significant cost amortization and the possibility of a "symbiotic" feedback loop that can improve accuracy and reduce false positive rates over time.},
  year = {2006}

}

Algorithm:

- Anagram: Uses a Bloom filter and a binary-based detection model.
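A much-simplified sketch of the Anagram idea, assuming 5-byte n-grams, a single small Bloom filter, and an unseen-n-gram fraction as the score (the paper's training regimen and signature handling are omitted):

 # Store byte n-grams from normal payloads in a Bloom filter; score new
 # payloads by the fraction of unseen n-grams. Sizes and traffic are
 # illustrative stand-ins.
 import hashlib
 
 class BloomFilter:
     def __init__(self, m=2**20, k=3):
         self.m, self.k, self.bits = m, k, bytearray(m // 8)
     def _hashes(self, item):
         for i in range(self.k):
             h = hashlib.sha256(bytes([i]) + item).digest()
             yield int.from_bytes(h[:8], "big") % self.m
     def add(self, item):
         for h in self._hashes(item):
             self.bits[h // 8] |= 1 << (h % 8)
     def __contains__(self, item):
         return all(self.bits[h // 8] & (1 << (h % 8)) for h in self._hashes(item))
 
 def ngrams(payload, n=5):
     return [payload[i:i + n] for i in range(len(payload) - n + 1)]
 
 bf = BloomFilter()
 for g in ngrams(b"GET /index.html HTTP/1.0\r\n"):  # stand-in normal traffic
     bf.add(g)
 
 suspect = ngrams(b"GET /cgi-bin/\x90\x90\x90\x90 HTTP/1.0\r\n")
 print(sum(g not in bf for g in suspect) / len(suspect), "unseen n-gram fraction")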

Data:

Results:

- 0.01% 

Paper 11: Learning DFA representations of HTTP for protecting web applications


http://www.sciencedirect.com.proxy.library.carleton.ca/science?_ob=ArticleURL&_udi=B6VRG-4M813XK-1&_user=4799849&_coverDate=04%2F11%2F2007&_rdoc=1&_fmt=high&_orig=gateway&_origin=gateway&_sort=d&_docanchor=&view=c&_searchStrId=1698577181&_rerunOrigin=scholar.google&_acct=C000051236&_version=1&_urlVersion=0&_userid=4799849&md5=25216ad5e615851284bfc80ab4fcffd8&searchtype=a

  1. Citations:52

BibTex: @article{Ingham20071239,

   title = {Learning DFA representations of HTTP for protecting web applications},
   journal = {Computer Networks},
   volume = {51},
   number = {5},
   pages = {1239--1255},
   year = {2007},
   note = {From Intrusion Detection to Self-Protection},
   issn = {1389-1286},
   doi = {10.1016/j.comnet.2006.09.016},
   url = {http://www.sciencedirect.com/science/article/B6VRG-4M813XK-1/2/2b207df754252a0173b5a2ab7a340d6f},
   author = {Kenneth L. Ingham and Anil Somayaji and John Burge and Stephanie Forrest},
   keywords = {Anomaly intrusion detection, Finite automata induction, Web server security}

}

Abstract: Intrusion detection is a key technology for self-healing systems designed to prevent or manage damage caused by security threats. Protecting web server-based applications using intrusion detection is challenging, especially when autonomy is required (i.e., without signature updates or extensive administrative overhead). Web applications are difficult to protect because they are large, complex, highly customized, and often created by programmers with little security background. Anomaly-based intrusion detection has been proposed as a strategy to meet these requirements. This paper describes how DFA (Deterministic Finite Automata) induction can be used to detect malicious web requests. The method is used in combination with rules for reducing variability among requests and heuristics for filtering and grouping anomalies. With this setup a wide variety of attacks is detectable with few false-positives, even when the system is trained on data containing benign attacks (e.g., attacks that fail against properly patched servers).

Algorithm:

- DFA induction
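A loose sketch of the flavor of DFA-style modeling: learn observed transitions between request tokens from normal logs, then count missed transitions in a new request. The whitespace tokenizer and the two-line training "log" are far cruder than the paper's setup.

 # Learn token-transition structure from normal requests; unseen
 # transitions in a new request count as anomalies.
 from collections import defaultdict
 
 transitions = defaultdict(set)
 for line in ["GET /index.html HTTP/1.1", "GET /about.html HTTP/1.1"]:
     toks = ["<start>"] + line.split()
     for a, b in zip(toks, toks[1:]):
         transitions[a].add(b)
 
 def missed_transitions(request_line):
     toks = ["<start>"] + request_line.split()
     return sum(b not in transitions[a] for a, b in zip(toks, toks[1:]))
 
 print(missed_transitions("GET /index.html HTTP/1.1"))     # 0: fully explained
 print(missed_transitions("GET /index.html%00 HTTP/1.1"))  # >0: anomalous tokens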

Data:

- a data set consisting of 65 attacks against web servers.
- normal data collected from 4 different web sites

Results: 40 false alarms per day. ASK ???


Paper 12: Behavioral Distance Measurement Using Hidden Markov Models


http://www.springerlink.com.proxy.library.carleton.ca/content/cr2425318t240n4l/

  1. Citations:43

BibTex: @incollection {springerlink:10.1007/11856214_2,

  author = {Gao, Debin and Reiter, Michael and Song, Dawn},
  affiliation = {Carnegie Mellon University},
  title = {Behavioral Distance Measurement Using Hidden Markov Models},
  booktitle = {Recent Advances in Intrusion Detection},
  series = {Lecture Notes in Computer Science},
  editor = {Zamboni, Diego and Kruegel, Christopher},
  publisher = {Springer Berlin / Heidelberg},
  isbn = {},
  pages = {19-40},
  volume = {4219},
  url = {http://dx.doi.org/10.1007/11856214_2},
  note = {10.1007/11856214_2},
  abstract = {The behavioral distance between two processes is a measure of the deviation of their behaviors. Behavioral distance has been proposed for detecting the compromise of a process, by computing its behavioral distance from another process executed on the same input. Provided that the two processes are diverse and so unlikely to fall prey to the same attacks, an increase in behavioral distance might indicate the compromise of one of them. In this paper we propose a new approach to behavioral distance calculation using a new type of Hidden Markov Model. We also empirically evaluate the intrusion detection capability of our proposal when used to measure the distance between the system-call behaviors of diverse web servers. Our experiments show that it detects intrusions with substantially greater accuracy and with performance overhead comparable to that of prior proposals.},
  year = {2006}

}

Algorithm:

- a new type of Hidden Markov Model

Data:

- 2 machines running web servers. Requests were generated by the generic test suite WebBench 5.0, and the system calls of the web servers were then recorded.

Results:

- 0% to 1.44%. Need to re-examine that result.

Paper 13: A Framework for Constructing Features and Models for Intrusion Detection Systems


http://portal.acm.org/citation.cfm?id=382914

  1. Citations: 540

BibTex: @article{Lee:2000:FCF:382912.382914,

author = {Lee, Wenke and Stolfo, Salvatore J.},
title = {A framework for constructing features and models for intrusion detection systems},
journal = {ACM Trans. Inf. Syst. Secur.},
volume = {3},
issue = {4},
month = {November},
year = {2000},
issn = {1094-9224},
pages = {227--261},
numpages = {35},
url = {http://doi.acm.org/10.1145/382912.382914},
doi = {http://doi.acm.org/10.1145/382912.382914},
acmid = {382914},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {data mining, feature construction, intrusion detection},

}

Abstract: Intrusion detection (ID) is an important component of infrastructure protection mechanisms. Intrusion detection systems (IDSs) need to be accurate, adaptive, and extensible. Given these requirements and the complexities of today’s network environments, we need a more systematic and automated IDS development process rather than the pure knowledge encoding and engineering approaches. This article describes a novel framework, MADAM ID, for Mining Audit Data for Automated Models for Intrusion Detection. This framework uses data mining algorithms to compute activity patterns from system audit data and extracts predictive features from the patterns. It then applies machine learning algorithms to the audit records that are processed according to the feature definitions to generate intrusion detection rules. Results from the 1998 DARPA Intrusion Detection Evaluation showed that our ID model was one of the best performing of all the participating systems. We also briefly discuss our experience in converting the detection models produced by off-line data mining programs to real-time modules of existing IDSs.


Algorithm:

- MADAM ID (Mining Audit Data for Automated Models for Intrusion Detection): data mining algorithms compute activity patterns from system audit data and extract predictive features from the patterns; machine learning algorithms are then applied to the audit records, processed according to the feature definitions, to generate intrusion detection rules.

Data:

- 1998 DARPA Intrusion Detection Evaluation
- We trained our intrusion detection models (i.e., the base models and the meta-level classifier) using the 7 weeks of labeled data, and used them to make predictions on the 2 weeks of unlabeled test data

Results:

- Look at page 21 of the PDF. I would say 0.02.

Paper 14: Anomalous Payload-based Network Intrusion Detection


  1. Citations: 391

BibTex: @incollection {springerlink:10.1007/978-3-540-30143-1_11,

  author = {Wang, Ke and Stolfo, Salvatore J.},
  affiliation = {Computer Science Department, Columbia University, 500 West 120th Street, New York, NY, 10027  },
  title = {Anomalous Payload-Based Network Intrusion Detection},
  booktitle = {Recent Advances in Intrusion Detection},
  series = {Lecture Notes in Computer Science},
  editor = {Jonsson, Erland and Valdes, Alfonso and Almgren, Magnus},
  publisher = {Springer Berlin / Heidelberg},
  isbn = {},
  pages = {203-222},
  volume = {3224},
  url = {http://dx.doi.org/10.1007/978-3-540-30143-1_11},
  note = {10.1007/978-3-540-30143-1_11},
  abstract = {We present a payload-based anomaly detector, we call PAYL, for intrusion detection. PAYL models the normal application payload of network traffic in a fully automatic, unsupervised and very efficient fashion. We first compute during a training phase a profile byte frequency distribution and their standard deviation of the application payload flowing to a single host and port. We then use Mahalanobis distance during the detection phase to calculate the similarity of new data against the pre-computed profile. The detector compares this measure against a threshold and generates an alert when the distance of the new input exceeds this threshold. We demonstrate the surprising effectiveness of the method on the 1999 DARPA IDS dataset and a live dataset we collected on the Columbia CS department network. In one case nearly 100% accuracy is achieved with 0.1% false positive rate for port 80 traffic.},
  year = {2004}

}

Algorithm:

 - For detection: Mahalanobis distance
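A sketch of PAYL-style scoring, assuming per-payload byte-frequency vectors and the simplified Mahalanobis distance (absolute deviation weighted by 1/(std + alpha)); the training payloads are tiny stand-ins for real traffic.

 # Byte-frequency profile (one profile here, rather than per port/length)
 # and simplified Mahalanobis distance sum(|x - mean| / (std + alpha)).
 import numpy as np
 
 def byte_freq(payload):
     counts = np.bincount(np.frombuffer(payload, dtype=np.uint8), minlength=256)
     return counts / max(len(payload), 1)
 
 train = [b"GET /index.html HTTP/1.0\r\n", b"GET /news.html HTTP/1.0\r\n"]
 profiles = np.array([byte_freq(p) for p in train])
 mean, std = profiles.mean(axis=0), profiles.std(axis=0)
 
 def payl_distance(payload, alpha=0.001):
     return float(np.sum(np.abs(byte_freq(payload) - mean) / (std + alpha)))
 
 print(payl_distance(b"GET /index.html HTTP/1.0\r\n"))  # low: close to the profile
 print(payl_distance(b"\x90" * 64 + b"/bin/sh"))        # high: shellcode-like bytes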

Data:

- 1999 DARPA IDS dataset
- a live dataset collected on the Columbia CS department network

Results:

- 0.1% fp

Paper 15: Data Mining Methods for Detection of New Malicious Executables


http://www.computer.org/portal/web/csdl/doi/10.1109/SECPRI.2001.924286

  1. Citations: 247

BibTex:

Abstract: A serious security threat today is malicious executables, especially new, unseen malicious executables often arriving as email attachments. These new malicious executables are created at the rate of thousands every year and pose a serious security threat. Current anti-virus systems attempt to detect these new malicious programs with heuristics generated by hand. This approach is costly and oftentimes ineffective. In this paper, we present a data-mining framework that detects new, previously unseen malicious executables accurately and automatically. The data-mining framework automatically found patterns in our data set and used these patterns to detect a set of new malicious binaries. Comparing our detection methods with a traditional signature-based method, our method more than doubles the current detection rates for new malicious executables.

Algorithm:

- 3 learning algorithms (a sketch of the Naive Bayes variant follows this list):
   - RIPPER: an inductive rule-based learner that generates boolean rules based on feature attributes
   - Naive Bayes: a probabilistic method that generates the probability that an example is in a class given a set of features
   - Multi Naive Bayes: a multi-classifier system that combines the outputs from several classifiers to generate a prediction
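As an illustration of the Naive Bayes variant only, here is a hedged sketch using Bernoulli Naive Bayes over binary feature vectors; the features and labels are synthetic placeholders rather than the byte-sequence and resource features used in the paper.

 # Binary features (presence/absence of strings or resources extracted
 # from executables) fed to Bernoulli Naive Bayes. Random placeholders,
 # so the score only demonstrates the pipeline.
 import numpy as np
 from sklearn.naive_bayes import BernoulliNB
 
 rng = np.random.default_rng(2)
 X = rng.integers(0, 2, size=(200, 50))  # 200 binaries, 50 binary features
 y = rng.integers(0, 2, size=200)        # 1 = malicious (antivirus-derived labels)
 
 clf = BernoulliNB().fit(X[:150], y[:150])
 print("held-out accuracy:", clf.score(X[150:], y[150:]))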

Data:

- Training and test data are live data labeled using an antivirus program.

Results:

- 0.39%

Paper 16: Anomalous Payload-based Worm Detection and Signature Generation


http://www.springerlink.com.proxy.library.carleton.ca/content/75h308806288v3p1/

  1. Citations: 159

BibTex: @incollection {springerlink:10.1007/11663812_12,

  author = {Wang, Ke and Cretu, Gabriela and Stolfo, Salvatore},
  affiliation = {Computer Science Department, Columbia University, 500 West 120th Street, New York, NY 10027, USA},
  title = {Anomalous Payload-Based Worm Detection and Signature Generation},
  booktitle = {Recent Advances in Intrusion Detection},
  series = {Lecture Notes in Computer Science},
  editor = {Valdes, Alfonso and Zamboni, Diego},
  publisher = {Springer Berlin / Heidelberg},
  isbn = {},
  pages = {227-246},
  volume = {3858},
  url = {http://dx.doi.org/10.1007/11663812_12},
  note = {10.1007/11663812_12},
  abstract = {New features of the PAYL anomalous payload detection sensor are demonstrated to accurately detect and generate signatures for zero-day worms. Experimental evidence demonstrates that site-specific packet content models are capable of detecting new worms with high accuracy in a collaborative security system. A new approach is proposed that correlates ingress/egress payload alerts to identify the worm’s initial propagation. The method also enables automatic signature generation that can be deployed immediately to network firewalls and content filters to proactively protect other hosts. We also propose a collaborative privacy-preserving security strategy whereby different hosts can exchange PAYL signatures to increase accuracy and mitigate against false positives. The important principle demonstrated is that correlating multiple alerts identifies true positives from the set of anomaly alerts and reduces incorrect decisions producing accurate mitigation.},
  year = {2006}

}

Algorithm: PAYL with tweaks.

Data:

- 3 real life data sets

Results: 0.1% (this is 5.8 alerts per hour for one of the datasets). (Why is this lower than everyone else?)


Paper 17: Unsupervised Anomaly Detection in Network Intrusion Detection Using Clusters


http://portal.acm.org.proxy.library.carleton.ca/citation.cfm?id=1082198

  1. Citations: 80

BibTex: @inproceedings{Leung:2005:UAD:1082161.1082198,

author = {Leung, Kingsly and Leckie, Christopher},
title = {Unsupervised anomaly detection in network intrusion detection using clusters},
booktitle = {Proceedings of the Twenty-eighth Australasian conference on Computer Science - Volume 38},
series = {ACSC '05},
year = {2005},
isbn = {1-920-68220-1},
location = {Newcastle, Australia},
pages = {333--342},
numpages = {10},
url = {http://portal.acm.org/citation.cfm?id=1082161.1082198},
acmid = {1082198},
publisher = {Australian Computer Society, Inc.},
address = {Darlinghurst, Australia, Australia},

}

Abstract: Most current network intrusion detection systems employ signature-based methods or data mining-based methods which rely on labelled training data. This training data is typically expensive to produce. Moreover, these methods have difficulty in detecting new types of attack. Using unsupervised anomaly detection techniques, however, the system can be trained with unlabelled data and is capable of detecting previously "unseen" attacks. In this paper, we present a new density-based and grid-based clustering algorithm that is suitable for unsupervised anomaly detection. We evaluated our methods using the 1999 KDD Cup data set. Our evaluation shows that the accuracy of our approach is close to that of existing techniques reported in the literature, and has several advantages in terms of computational complexity.

Algorithm:

- Clustering algorithm fpMAFIA (a density-based and grid-based high-dimensional clustering algorithm for large data sets)

Data:

- 1999 KDD CUP data set

Results:

- Confusing; the false positive rate looks really high on the ROC curve.

Paper 18: Specification-based Anomaly Detection: A New Approach for Detecting Network Intrusions


  1. Citations: 222

BibTex:

@inproceedings{Sekar:2002:SAD:586110.586146,
author = {Sekar, R. and Gupta, A. and Frullo, J. and Shanbhag, T. and Tiwari, A. and Yang, H. and Zhou, S.},
title = {Specification-based anomaly detection: a new approach for detecting network intrusions},
booktitle = {Proceedings of the 9th ACM conference on Computer and communications security},
series = {CCS '02},
year = {2002},
isbn = {1-58113-612-9},
location = {Washington, DC, USA},
pages = {265--274},
numpages = {10},
url = {http://doi.acm.org/10.1145/586110.586146},
doi = {http://doi.acm.org/10.1145/586110.586146},
acmid = {586146},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {anomaly detection, intrusion detection, network monitoring},

}


Abstract:

Unlike signature or misuse based intrusion detection techniques, anomaly detection is capable of detecting novel attacks. However, the use of anomaly detection in practice is hampered by a high rate of false alarms. Specification-based techniques have been shown to produce a low rate of false alarms, but are not as effective as anomaly detection in detecting novel attacks, especially when it comes to network probing and denial-of-service attacks. This paper presents a new approach that combines specification-based and anomaly-based intrusion detection, mitigating the weaknesses of the two approaches while magnifying their strengths. Our approach begins with state-machine specifications of network protocols, and augments these state machines with information about statistics that need to be maintained to detect anomalies. We present a specification language in which all of this information can be captured in a succinct manner. We demonstrate the effectiveness of the approach on the 1999 Lincoln Labs intrusion detection evaluation data, where we are able to detect all of the probing and denial-of-service attacks with a low rate of false alarms (less than 10 per day). Whereas feature selection was a crucial step that required a great deal of expertise and insight in the case of previous anomaly detection approaches, we show that the use of protocol specifications in our approach simplifies this problem. Moreover, the machine learning component of our approach is robust enough to operate without human supervision, and fast enough that no sampling techniques need to be employed. As further evidence of effectiveness, we present results of applying our approach to detect stealthy email viruses in an intranet environment.

Algorithm:

- Network protocol state machines derived from specifications, augmented with statistics maintained by a machine learning component (per the abstract; see the sketch below)
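A toy illustration of the combined approach, assuming a hand-written (and hypothetical) protocol state-machine fragment augmented with learned transition frequencies:

 # Events outside the specification, or allowed but never seen in
 # training, raise anomalies. Spec and events are illustrative.
 from collections import Counter
 
 SPEC = {  # allowed transitions, per the "specification"
     "LISTEN": {"SYN": "SYN_RCVD"},
     "SYN_RCVD": {"ACK": "ESTABLISHED", "RST": "LISTEN"},
     "ESTABLISHED": {"FIN": "CLOSED"},
 }
 stats = Counter()  # transition frequencies observed during training
 
 def process(events, learn=False):
     state, anomalies = "LISTEN", 0
     for ev in events:
         nxt = SPEC.get(state, {}).get(ev)
         if nxt is None:
             anomalies += 1  # event not allowed by the specification
             continue
         if learn:
             stats[(state, ev)] += 1
         elif stats[(state, ev)] == 0:
             anomalies += 1  # allowed by spec, but never seen in training
         state = nxt
     return anomalies
 
 process(["SYN", "ACK", "FIN"], learn=True)           # training pass
 print(process(["SYN", "RST", "SYN", "ACK", "FIN"]))  # RST path unseen -> 1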

Data:

- 1999 Lincoln Labs intrusion detection evaluation data

Results:

- All probing and denial-of-service attacks detected with a low rate of false alarms (less than 10 per day), per the abstract


Other Critiques of IDS

Discuss past work criticizing IDS research

Potential Solutions

Discussion

- synthetic versus real data issue
- attack distribution issue

Conclusion

References