Understanding and Teaching Heuristics
Randy Abrams, Director of Technical Education, ESET
This paper was originally presented at the AVAR conference in Seoul in 2007, and included in the Conference Proceedings.
Table of Contents
Abstract
So who is Hugh Ristic anyway?
What is the problem to be solved?
Signature based detection
A note on false positives
Some different types of heuristics
Some Signatures or Detection
Passive Heuristics
Active Heuristics
Understanding Compression
Understanding Encryption
Poly want a cracker?
Active Heuristics Continued . . .
Minimizing Heuristic False Positives
Detections of threats
How effective are heuristics?
A little Trivia!
White Paper: Understanding and Teaching Heuristics
Abstract
This paper is designed to provide a basic understanding of what heuristics are and how they are used in the anti-malware industry. Topics covered include signature-based detection, generic signatures, passive heuristics, and active heuristics or emulation. A very basic compression algorithm is developed and taught so as to enhance understanding of how compression works and why it poses problems for signature-based detection. Encryption and polymorphism are also explained in easy-to-understand terms and examples.
A variety of false positives from a variety of unspecified products are used to reveal some of the types of thinking that go into creating heuristic approaches.
For those who already understand the subject, the approach used should provide insight into effective methods of teaching complex technical subjects to less technical students, or even to technical people who are simply unfamiliar with the subject.
So who is Hugh Ristic anyway?
Heuristics are used by virtually every anti-malware product in one form or another, but most people do not know what the word means. Those who look up the various definitions are left without a functional understanding of how heuristics relate to security or why they are important. Highly technical definitions leave non-technical and semi-technical users substantially uninformed. One successful approach to teaching the subject is through the extensive use of analogies to demonstrate the concepts.
The Oxford English Dictionary defines heuristics as "Enabling a person to discover or learn something for themselves." The American Heritage Dictionary of the English Language defines heuristics as "Of or relating to a usually speculative formulation serving as a guide in the investigation or solution of a problem." Neither of these definitions is very helpful in describing why heuristics are important and how they work.
Visual analogies can be quite useful. The goal of this exercise is to identify a breed of dog that most people will not have a "signature" for; that is, most people have never seen one. The "Catahoula Leopard Dog" works well for this purpose.
As we try to decide which animal is the one we seek, we will probably rule out the fish, because we have probably decided unconsciously that we are looking for a mammal. We have "ruled out" the fish. The bear is an interesting example. Computers have to be given very detailed instructions about anything they do. Consider how you would define a dog in such a manner that a person who has never seen either a dog or a bear would not mistake a bear for a dog upon viewing a picture of a bear. The point is that heuristics can be quite difficult. The Doberman below the fish is generally familiar to people and so it is eliminated from the search. Few people are fooled by the cat, even though the word "Catahoula" has the word "cat" in it. Most people are expecting a dog with spots since leopards have spots, but the beagle is a known quantity. We arrive at the Catahoula Leopard Dog in the lower right corner through the process of elimination, but in order to use this approach we had to use some heuristics.
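The process of elimination in the picture exercise can be sketched in a few lines of Python. This is only an illustration of the analogy, not a description of how any anti-malware product works: the `Animal` type, the three rules, and the trait values are all assumptions made up for this sketch. Note that `looks_canine` is simply labelled by hand here, which hides exactly the difficulty the bear example points out.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Animal(NamedTuple):
    name: str
    is_mammal: bool
    looks_canine: bool            # hand-labelled; defining this precisely is the hard part
    recognized_as: Optional[str]  # a breed the viewer already knows, if any


# Each rule is (reason, predicate). A candidate is dropped by the first rule that matches.
RULES: List[Tuple[str, Callable[[Animal], bool]]] = [
    ("not a mammal", lambda a: not a.is_mammal),
    ("does not look like a dog", lambda a: not a.looks_canine),
    ("already known as another breed", lambda a: a.recognized_as is not None),
]


def eliminate(candidates):
    """Return (names that survive every rule, {eliminated name: reason})."""
    survivors, reasons = [], {}
    for animal in candidates:
        for reason, matches in RULES:
            if matches(animal):
                reasons[animal.name] = reason
                break
        else:
            # No rule matched, so this candidate is still in the running.
            survivors.append(animal.name)
    return survivors, reasons


# Hypothetical stand-ins for the six pictures in the exercise.
PICTURES = [
    Animal("fish", False, False, None),
    Animal("bear", True, False, None),
    Animal("Doberman", True, True, "Doberman"),
    Animal("cat", True, False, None),
    Animal("beagle", True, True, "beagle"),
    Animal("unfamiliar spotted dog", True, True, None),
]

survivors, reasons = eliminate(PICTURES)
print(survivors)
print(reasons)
```

Nothing in the rules names the Catahoula Leopard Dog; the one remaining candidate is identified only because every other picture was ruled out.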
The National Center for Supercomputing Applications (NCSA) describes heuristics as "Guidelines that a system administrator uses to intervene where the two-phase commit or abort would otherwise fail."
Signature-based detection is an example of where a two-phase commit or abort is likely to fail. If a file contains an unknown threat, the signature will fail. ...
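A minimal sketch of why a signature fails on an unknown threat, assuming the simplest possible model in which a signature is a fixed byte sequence searched for inside a file. The signature name, the byte pattern, and both sample files are invented for illustration and do not correspond to any real malware or product.

```python
# Hypothetical signature database: detection name -> exact byte pattern.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0102"),
}


def scan(data: bytes) -> list:
    """Return the names of all signatures whose pattern occurs in data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]


# A sample that contains the known pattern exactly.
known_sample = b"HEADER" + bytes.fromhex("deadbeef0102") + b"TRAILER"

# A sample that differs from the known pattern in a single byte.
unknown_sample = b"HEADER" + bytes.fromhex("deadbeef0103") + b"TRAILER"

known_hits = scan(known_sample)
unknown_hits = scan(unknown_sample)
print(known_hits)
print(unknown_hits)
```

The second sample is reported as clean because no stored pattern matches it, even though it differs from the known one by a single byte.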