The Game of the Name
Malware Naming, Shape Shifters and Sympathetic Magic
David Harley BA CISSP FBCS CITP Director of Malware Intelligence, ESET
ESET LLC, 610 West Ash Street, Suite 1900, San Diego, CA 92101 dharley@eset.com; +1 619 204 6461
CFET 2009 3rd International Conference on
Cybercrime Forensics Education & Training
Abstract
Once upon a time, one infection by specific malware looked much like another infection, to an antivirus scanner if not to the naked eye. Even back then, virus naming wasn't very consistent between vendors, but at least virus encyclopaedias and third-party resources like vgrep made it generally straightforward to map one vendor's name for a virus to another vendor's name for the same malware.
In 2009, though, the threat landscape looks very different. Viruses and other replicative malware, while far from extinct, pose a relatively manageable problem compared with other threats whose single common characteristic is malicious intent. Proof-of-Concept code with sophisticated self-replicating mechanisms is of less interest to today's malware authors than shape-shifting Trojans that change their appearance frequently to evade detection and are intended to make money for criminals rather than to win adolescent admiration and bragging rights.
Sheer sample glut makes it impossible to categorize and standardize on naming for each and every unique sample out of tens of thousands processed each day.
Detection techniques such as generic signatures, heuristics and sandboxing have also changed the ways in which malware is detected and therefore how it is classified, confounding the old assumptions of a simple one-to-one relationship between a detection label and a malicious program. This presentation will explain how one-to-many, many-to-one, or many-to-many models are at least as likely as the old one-detection-per-variant model; why "Do you detect Win32/UnpleasantVirus.EG?" is such a difficult question to answer; and why exact identification is not a prerequisite for detection and remediation of malware, and actually militates against the most effective use of analysis and development time and resources. But what is the information that the end-user or end-site really needs to know about an incoming threat?
Introduction
Damon Knight's short story "Babel II" (a science fiction story from 1953: strange how often Sci-Fi crops up in this field!) tells of a world where the protagonist's encounter with an alien he calls the "Hooligan" results in a state of affairs where speech and writing are scrambled so that no human being can understand the speech of any other human being: all written material has also been rendered unintelligible.
Unfortunately, the way in which we (the anti-malware industry) identify malware in terms of naming has become more and more like the North American city of Knight's story.
In the early days of anti-virus, it didn't matter so much. One infection by specific malware looked much like another infection: not to a human observer perhaps (unless you happened to be one of the relatively few people with the knowledge and resources to inspect a disk's boot sector and see that something wasn't right, for instance), but certainly to an antivirus scanner.
It's perfectly true that there were complaints even in the early 1990s or earlier about inconsistent virus naming between vendors, but at least virus encyclopaedias and third-party resources like vgrep [1] made it generally straightforward to map one vendor's name for a virus to another vendor's name for the same malware. In fact, vendors still try to maintain a correlation in their description databases between their naming and that used by other vendors (see Figure 1). Furthermore, vgrep (Figure 2), a utility made available under the auspices of Virus Bulletin (the most influential periodical in the anti-malware industry) for online and offline correlation of virus names, is still in existence, though of debatable value in today's threatscape. (There are also other tools which have never been publicly available.)
Figure 1: Cross-Reference Between Vendor Detection Names
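The kind of cross-referencing described above can be pictured as a simple lookup from one vendor's name to another's. The following is only a minimal sketch under stated assumptions: the vendors (VendorA, VendorB), the sample records and every detection label other than the abstract's own example name are invented for illustration, and nothing here is meant to reflect how vgrep or any vendor's database is actually implemented. It does show why a single name may map to several names elsewhere once generic detections are involved.

```python
from collections import defaultdict

# Hypothetical data: one record per sample, giving the label that each
# (invented) vendor assigns to it. None of these are real detections.
SAMPLES = [
    {"VendorA": "Win32/UnpleasantVirus.EG", "VendorB": "W32.Unpleasant.C"},
    {"VendorA": "Win32/UnpleasantVirus.EG", "VendorB": "Trojan.Generic.1"},
    {"VendorA": "Win32/OtherThing.A", "VendorB": "Trojan.Generic.1"},
]


def build_index(samples):
    """Map each (vendor, name) pair to the indices of samples carrying it."""
    index = defaultdict(set)
    for i, labels in enumerate(samples):
        for vendor, name in labels.items():
            index[(vendor, name)].add(i)
    return index


def cross_reference(samples, vendor, name, other_vendor):
    """Return, sorted, every name other_vendor gives to samples that
    vendor detects as name. An unknown name yields an empty list."""
    index = build_index(samples)
    matches = index.get((vendor, name), set())
    return sorted(
        {samples[i][other_vendor] for i in matches if other_vendor in samples[i]}
    )


if __name__ == "__main__":
    # One VendorA name corresponds to two VendorB names (one-to-many) ...
    print(cross_reference(SAMPLES, "VendorA", "Win32/UnpleasantVirus.EG", "VendorB"))
    # ... and one generic VendorB name covers two VendorA names (many-to-one).
    print(cross_reference(SAMPLES, "VendorB", "Trojan.Generic.1", "VendorA"))
```

With this invented data, the first lookup returns two VendorB names and the second returns two VendorA names, so neither direction gives a single definitive answer to "what does the other vendor call this?".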
So what has changed? In the early 1990s, the virus problem was pretty well contained, and Trojans were hardly a problem at all. Most malware spread fairly slowly, and didn't change shape too often. The...