Machine Learning for Spam Detection Resources

This is my (very) small contribution to solving the spam problem. I maintain this page of selected resources for those interested in using Machine Learning techniques to detect and filter spam email (building Bayesian filters). Comments and resource proposals are welcome; send them to jmgomez at uem dot es. More about me at my home page.

[Collections] [Community] [Conferences] [Features of Spam] [Lists and Newsletters] [(Opensource) software] [Papers and Tutorials]


Collections

Message collections for training or testing an automatic spam filter.


Community

Researchers with an interest in, and relevant ideas about, spam filtering, in alphabetical order.


Conferences

Conferences that deal with spam from a research perspective, many of them with a special focus on Bayesian filters. Natural Language, Internet, Machine Learning, and other conferences may accept or have accepted research papers on the topic.


Features of Spam

Sources of information about spammers' tactics and about features for checking whether a message is spam.


Lists and Newsletters

E-mail lists and newsletters devoted to the topic.


(Opensource) Software

(Opensource) libraries for Machine Learning.

(Opensource) software systems that learn to detect spam (aka Bayesian filters). You may find others at SourceForge (search for spam).


Papers and Tutorials

I maintain an outdated, searchable Bibliography on Machine Learning for Spam Detection at The Collection of Computer Science Bibliographies, which you can download in BibTeX or HTML. This list of references includes not only papers that use Machine Learning techniques to detect spam messages, but also research papers on other methods or discussing the spam problem, and even industry evaluations of anti-spam filters. I believe that this latter kind of reference may help when writing a research paper on the topic, or serve as a source of integrated approaches (e.g. testing a Bayesian filter with user-defined white lists and community black lists in a corporate environment).

Please help me keep it up to date by sending additions and corrections to jmgomez at uem dot es.

Disclaimer: I only list online references.


José María Gómez Hidalgo