CCS2011: Enemy of the Good

ToDo

  • Gather data from different IDS observables to show they aren't Gaussian (see the Gaussianity-check sketch after this list)
    • system calls (Luc)
    • network traffic
    • log files
  • Machine learning
    • standard machine learning methods approximate distributions
    • approximation works best if Gaussian but has limits (show mathematically)
    • non-Gaussian distributions place much harsher restrictions on error rates; they may not shrink in proportion to sample size (more math; see the threshold sketch after this list)
  • Survey of results in IDS literature
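
A minimal Gaussianity-check sketch for the data-gathering item above, in Python. The filename syscall_counts.txt (one count per line, e.g. system calls per process per second) and the synthetic Pareto fallback are illustrative assumptions, not project data:

    # Check whether an IDS observable looks Gaussian or heavy-tailed.
    import numpy as np
    from scipy import stats

    try:
        counts = np.loadtxt("syscall_counts.txt")   # hypothetical data file
    except OSError:
        # Fallback: synthetic heavy-tailed (Pareto) samples standing in for real traces.
        counts = (np.random.default_rng(0).pareto(2.5, 100_000) + 1) * 10

    # Formal normality test (D'Agostino-Pearson); a tiny p-value rejects Gaussianity.
    stat, p = stats.normaltest(counts)
    print(f"normaltest statistic = {stat:.1f}, p-value = {p:.3g}")

    # Tail check: a power law gives a straight-line CCDF on log-log axes,
    # while a Gaussian CCDF falls off much faster.
    x = np.sort(counts[counts > 0])          # drop zeros before the log-log fit
    ccdf = 1.0 - np.arange(1, len(x) + 1) / len(x)
    slope, _ = np.polyfit(np.log(x[:-1]), np.log(ccdf[:-1]), 1)
    print(f"log-log CCDF slope ~ {slope:.2f} (roughly linear tail suggests a power law)")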
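
A minimal threshold sketch for the machine-learning items above: fit a Gaussian to heavy-tailed "legitimate" data, set a 3-sigma alarm threshold, and the false alarm rate stays far above the Gaussian prediction no matter how large the training sample gets. The Pareto tail index (2.5) and the 3-sigma threshold are illustrative choices:

    # A Gaussian model fit to heavy-tailed data underestimates false alarms,
    # and more training data does not close the gap.
    import numpy as np

    rng = np.random.default_rng(1)
    for n in (10**3, 10**4, 10**5, 10**6):
        train = rng.pareto(2.5, n) + 1      # heavy-tailed stand-in for legitimate behavior
        mu, sigma = train.mean(), train.std()
        threshold = mu + 3 * sigma          # a true Gaussian puts ~0.13% of data above this
        test = rng.pareto(2.5, 10**6) + 1
        fp_rate = np.mean(test > threshold)
        print(f"n = {n:>7}: Gaussian prediction 0.13%, observed {100 * fp_rate:.2f}%")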


Title

The Enemy of the Good: Re-evaluating Research Directions in Intrusion Detection

Abstract

Introduction

  • For IDS to work, we need very accurate detectors
    • base rate fallacy (see the worked example after this list)
    • specifically, very low false alarm rates
  • To date, no approach has achieved false alarm rates low enough to be universally applicable
    • signature and specification-based methods can be tuned ad hoc to be good enough, but then have poor coverage of new attacks
    • adaptive methods cannot be sufficiently tuned
  • We argue that we can't get low enough false alarm rates, that there are fundamental limits on IDS performance due to the underlying distributions of legitimate and attacker behavior.
  • Reasons:
    • legitimate behavior is non-Gaussian, largely power-law-like, meaning its distributions have fat tails
    • attacker behavior cannot be sampled sufficiently to learn its distribution
    • moreover, attacker behavior keeps changing: it follows new attack innovations (more like the spread of a disease than a Gaussian process, fundamentally not stationary) and mimics legitimate behavior to evade defenders
    • if we could get good samples of both classes, we might be able to separate them; instead we must do one-class learning, and one-class learning cannot deal well with very long tails.


  • Related work: Classifier technology and the illusion of progress [1]
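
A worked instance of the base rate fallacy noted above, as a few lines of Python; the numbers (two attack events per 100,000 monitored events, a perfect detection rate, a 0.1% false alarm rate) are illustrative, not measured:

    # Base rate fallacy: when attacks are rare, almost all alarms are false,
    # even for a very accurate detector. All three rates are assumptions.
    p_attack = 2e-5        # prior probability that an event belongs to an attack
    p_detect = 1.0         # true positive rate (optimistically perfect)
    p_false_alarm = 1e-3   # false positive rate (0.1% sounds very good for an IDS)

    # Bayes' rule: probability that a given alarm corresponds to a real attack
    p_attack_given_alarm = (p_detect * p_attack) / (
        p_detect * p_attack + p_false_alarm * (1 - p_attack))
    print(f"P(attack | alarm) = {p_attack_given_alarm:.1%}")   # roughly 2%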

Sections:

  • Problem

What Goes Wrong

  • Poor results
    • datasets do not represent real-world usage or scenarios accurately
    • insufficient or misleading tests of false positive rates
    • even when rates are accurate, they are misinterpreted: high FP rates are not considered to be high (wrong time scale, lack of attention to scalability; see the arithmetic sketch after this list)
    • misleading integration of attacks into legitimate behavior
  • Administrative overhead
    • rules that can only be created by experts, yet the system requires end users to create custom rules
    • experts required to interpret output
    • insufficient context for even experts to interpret output
    • assumption that security personnel exist, when in many important contexts they won't
  • Computational overhead
    • can system keep up with normal workloads?
    • can system keep up with attacker-generated workloads?
  • Anomalies versus attacks
    • why is one a good proxy for the other?
    • why are the chosen feature(s) particularly good at detecting attacks?
  • Out-of-the-box algorithms applied without understanding the security problem
  • Attacker evasion: how can an attacker manipulate the system? Can the system create an environment that is easier to attack?
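
A back-of-the-envelope sketch of the time-scale point in the list above: a false positive rate that sounds low still buries any realistic operations team once event rates are taken into account. The event rate and analyst throughput are illustrative assumptions:

    # Why a "low" false positive rate is still operationally high.
    events_per_second = 1_000    # e.g., monitored system calls or packets on one sensor
    fp_rate = 1e-3               # 0.1% false positives, often reported as "low"

    false_alarms_per_day = events_per_second * fp_rate * 86_400
    print(f"{false_alarms_per_day:,.0f} false alarms per day per sensor")   # 86,400

    # One analyst clearing an alarm per minute for an 8-hour shift handles only 480 of them.
    analyst_capacity_per_shift = 8 * 60
    print(f"an analyst can triage at most ~{analyst_capacity_per_shift} per shift")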

Discussion

Conclusion

References

[1] David J. Hand. Classifier technology and the illusion of progress. Statistical Science, 21(1), 2006.