Find White Papers
Home
About Us
List Your Papers
    
> GFI > Why Bayesian Filtering is the Most Effective Anti-Spam Technology

Why Bayesian Filtering is the Most Effective Anti-Spam Technology

White Paper Published By: GFI

This white paper describes how Bayesian mathematics can be applied to the spam problem, resulting in an adaptive, 'statistical intelligence' technique that achieves very high spam detection rates.



Tags : 
anti spam, anti-spam, filter, block, blocking, email security, phishing, baysian

GFI
Published:  Jun 12, 2007
Type:  White Paper
Length:  7 pages









Why Bayesian filtering is the most
effective anti-spam technology
Achieving a 98%+ spam detection rate using a
mathematical approach
This white paper describes how Bayesian filtering works and explains why it is the best way to combat spam.

WWW.GFI.COM Why Bayesian filtering is the most effective anti-spam technology . 2
Introduction This white paper describes how Bayesian mathematics can be applied to the spam problem, resulting in an adaptive, 'statistical intelligence' technique that achieves very high spam detection rates.
It also explains why the Bayesian approach is the best way to tackle spam once and for all, as it overcomes the obstacles faced by more static technologies such as blacklist checking, comparing to databases of known spam and keyword checking. These technologies are not obsolete, but cannot be relied upon without a Bayesian filter.
Introduction....................................................................................................................................2 Current spam detection techniques...............................................................................................2 How the Bayesian spam filter works .............................................................................................2 Why Bayesian filtering is better.....................................................................................................4 About GFI MailEssentials ..............................................................................................................6 About GFI ......................................................................................................................................7
Current spam detection techniques Spam is an ever-increasing problem. The number of spam mails is increasing daily - studies show that over 50% of all current email is spam; the Radicati Group predicts this will reach 70% by 2007. Added to this, spammers are becoming more sophisticated and are constantly managing to outsmart 'static' methods of fighting spam.
The techniques currently used by most anti-spam software are static, meaning that it is fairly easy to evade by tweaking the message a little. To do this, spammers simply examine the latest anti-spam techniques and find ways how to dodge them.
To effectively combat spam, an adaptive new technique is needed. This method must be familiar with spammers' tactics as they change over time. It must also be able to adapt to the particular organization that it is protecting from spam. The answer lies in Bayesian mathematics.
How the Bayesian spam filter works Bayesian filtering is based on the principle that most events are dependent and that the probability of an event occurring in the future can be inferred from the previous occurrences of that event. (More information about the mathematical basis of Bayesian filtering is available at Bayesian Parameter Estimation -
WWW.GFI.COM Why Bayesian filtering is the most effective anti-spam technology . 3
http://www-ccrma.stanford.edu/~jos/bayes/Bayesian_Parameter_Estimation.html
and An Introduction to Bayesian Networks and their Contemporary Applications - http://www.niedermayer.ca/papers/bayesian/bayes.html).
This same technique can be used to classify spam. If some piece of text occurs often in spam but not in legitimate mail, then it would be reasonable to assume that this email is probably spam.
Creating a tailor-made Bayesian word database Before mail can be filtered using this method, the user needs to generate a database with words and tokens (such as the $ sign, IP addresses and domains, and so on), collected from a sample of spam mail and valid mail (referred to as 'ham').
Creating a word database for the filter
A probability value is then assigned to each word or token; the probability is based on calculations that take into account how often that word occurs in spam as opposed to legitimate mail (ham). This is done by analyzing the users' outbound mail and by analyzing known spam: All the words and tokens in both pools of mail are analyzed to generate the probabil... [download for more]

Browse Technology Topics

Data Center

Virtualization, Cloud Computing, Infrastructure, Design and Facilities, Power and Cooling, Green Computing  
    

Data Management

Application Integration, Analytical Applications, Business Intelligence, Configuration Management, Database Development, Data Integration, Data Mining, Data Protection, Data Quality, Data Replication, Database Security, EDI, SOAP, Service Oriented Architecture, Web Service Management, Data Warehousing  
    

Enterprise Applications

Application Integration, Application Performance Management, Best Practices, Business Activity Monitoring, Business Analytics, Business Integration, Business Intelligence, Business Management, Business Metrics, Business Process Automation, Business Process Management, Call Center Management, Call Center Software, Change Management, Corporate Governance, Customer Interaction Service, Customer Relationship Management, Customer Satisfaction, Customer Service, EBusiness, Enterprise Resource Planning, Enterprise Software, EProcurement, Extranets, Groupware Workflow, HIPAA Compliance, IP Faxing, IT Spending, Marketing Automation, Performance Testing, Product Lifecycle Management, Project Management, Return On Investment, Risk Management, Sales & Marketing Software, Sales Automation, Server Virtualization, Simulation Software, Supply Chain Management, System Management Software, Total Cost of Ownership, Video Conferencing, Voice Recognition, Voice Over IP, Workforce Management, Incentive Compensation, Spend Management, Manufacturing Execution Systems, International Computing  

Human Resource Technology

Human Resources Services, Payroll Software, Time and Attendance Software, Workforce Management Software, Financial Management, Employee Monitoring Software, Employee Training Software, Recruiting Software/Services, Employee Performance Management, ELearning, Benefits Management, Expense Management  
    

IT Career Advancement

Cisco Certification, Microsoft Certification, Linux Certification, Network Security Certification, Software Development Certification  

IT Management

Employee Performance, ITIL, Productivity, Project Management, Software Compliance, Sarbanes Oxley Compliance, Service Management, Desktop Management  
    

Knowledge Management

Collaboration, Collaborative Commerce, Contact Management, Content Delivery, Content Integration, Content Management System, Corporate Portals, Customer Experience Management, Document Management, Information Management, Intranets, Messaging, Records Management, Search And Retrieval, Search Engines, Secure Content Management, SLA  

Networking

Active Directory, Bandwidth Management, Convergence, Distributed Computing, Ethernet Networking, Fibre Channel, Gigabit Networking, Governance, Grid Computing, Infrastructure, Internetworking Hardware, Interoperability, IP Networks, IP Telephony, Local Area Networking, Load Balancing, Migration, Monitoring, Network Architecture, Network Management, Network Performance, Network Performance Management, Network Provisioning, Network Security, OLAP, Optical Networking, Quality Of Service, Remote Access, Remote Network Management, Server Hardware, Servers, Small Business Networks, TCP/IP Protocol, Test And Measurement, Traffic Management, Tunneling, Utility Computing, VPN, Wide Area Networks, Green Computing, Cloud Computing, Power and Cooling, Data Center Design and Management, Colocation and Web Hosting  
    

Platforms

AS/400, Domino, Linux, Microsoft Exchange, Oracle, PeopleSoft, SAP, Siebel, Solaris, Tivoli, Unix, Web Sphere, Windows, Windows Server  

Security

Access Control, Anti Spam, Anti Spyware, Anti Virus, Application Security, Auditing, Authentication, Biometrics, Business Continuity, Compliance, DDoS, Disaster Recovery, Email Security, Encryption, Firewalls, Hacker Detection, High Availability, Identity Management, Internet Security, Intrusion Detection, Intrusion Prevention, IPSec, Network Security Appliance, Password Management, Patch Management, Phishing, PKI, Policy Based Management, Security Management, Security Policies, Single Sign On, SSL, Secure Instant Messaging, Web Service Security, PCI Compliance, Vulnerability Management  
    

Software Development

.NET, C++, Database Development, Java, Middleware, Open Source, Software Outsourcing, Quality Assurance, Scripting, SOAP, Software Testing, Visual Basic, Web Development, Web Services, Web Service Security, XML  

Storage

Backup And Recovery, Blade Servers, Clustering, IP Storage, ISCSI, Network Attached Storage, RAID, Storage Area Networks, Storage Management, Storage Virtualization, Email Archiving, Data Deduplication  
    

Wireless

802.11, Bluetooth, CDMA, GPS, Mobile Computing, Mobile Data Systems, Mobile Workers, PDA, RFID, Smart Phones, WiFi, Wireless Application Software, Wireless Communications, Wireless Hardware, Wireless Infrastructure, Wireless Messaging, Wireless Phones, Wireless Security, Wireless Service Providers, WLAN  
Search