March 4, 2009

Gmail and Spam Filtering

"Many Google teams provide pieces of the spam-protection puzzle, from distributed computing to language detection. For example, we use optical character recognition (OCR) developed by the Google Book Search team to protect Gmail users from image spam. And machine-learning algorithms developed to merge and rank large sets of Google search results allow us to combine hundreds of factors to classify spam," explains Google. "Gmail supports multiple authentication systems, including SPF (Sender Policy Framework), DomainKeys, and DKIM (DomainKeys Identified Mail), so we can be more certain that your mail is from who it says it's from. Also, unlike many other providers that automatically let through all mail from certain senders, making it possible for their messages to bypass spam filters, Gmail puts all senders through the same rigorous checks."
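As a rough illustration of what "combining factors" can look like, here is a minimal sketch that adds up weighted signals and squashes the total into a probability with a logistic function. This is not Gmail's actual method: the factor names, weights, bias, and threshold are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical factors with hand-picked weights (illustration only).
# Positive weights push toward "spam", negative weights toward "not spam".
WEIGHTS = {
    "spf_pass": -1.5,                  # sender passed an SPF check
    "dkim_pass": -1.5,                 # message carried a valid DKIM signature
    "image_text_has_spam_terms": 2.0,  # OCR on an image found spam-like text
    "sender_reported_before": 2.5,     # users reported this sender earlier
}
BIAS = -1.0  # assumed prior leaning toward "not spam"


def spam_probability(factors):
    """Combine boolean factors into a single probability-like score."""
    score = BIAS + sum(
        WEIGHTS.get(name, 0.0) for name, present in factors.items() if present
    )
    return 1.0 / (1.0 + math.exp(-score))


def is_spam(factors, threshold=0.5):
    """Classify a message as spam when its score reaches the threshold."""
    return spam_probability(factors) >= threshold


authenticated = {"spf_pass": True, "dkim_pass": True}
authenticated_but_spammy = {
    "spf_pass": True,
    "dkim_pass": True,
    "image_text_has_spam_terms": True,
    "sender_reported_before": True,
}
print(is_spam(authenticated))             # False
print(is_spam(authenticated_but_spammy))  # True
```

In this toy model, passing authentication lowers the score but does not exempt a sender from the other checks, which mirrors the point in the quote that no sender bypasses the filter. A real system would learn its weights from labeled mail rather than use hand-picked values.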

See also:

- Official Gmail Blog: How our spam filter works
- A Distributed Bayesian Spam Filtering using Hadoop Map/Reduce
- Parallelizing Support Vector Machines on Distributed Computers
- Sender Reputation in a Large Webmail Service
- Spam Filtering using Google/GMAIL

5 comments:

  1. Anonymous 9/3/09 13:56

    The problem for me is chain mail that really does come from who it says it's coming from. That doesn't make it wanted mail, though.

  2. Did you mean chain letters? Yeah, that seems really tricky. That's why a Naive Bayes classifier or an SVM filter will be more helpful than a rule-based filter.

  3. I've noticed over the past few weeks that a LOT more spam is reaching my inbox. I dutifully report it all, but since I don't look at it I have no idea whether the spammers have come up with new techniques. Is this a general problem? FWIW, I've been in Australia rather than UK for the past month, but that may just be a coincidence!

  4. Well, I guess the spam filter becomes more and more intelligent as users report spam, so it seems surprising that the amount of spam has increased.

    Do you use Gmail? I thought Gmail was very good at filtering spam.

  5. Help! I am getting a LOT of spam mail with someone else's user name and MY email address. How can that be? When I try creating a filter, I end up filtering out my own emails. HELP!
