Gmail’s Spam Fighting Abilities Get a Major Upgrade

Written by Staff
    A new upgrade is making Gmail much better at fighting spam, thanks to an innovation Google has been testing for the past year.

    In a blog post, the company explains that platforms like Gmail rely on text classification to identify spam and other harmful content. To make those classifiers more robust, Google has been working on a new text vectorizer called RETVec.

    “To help make text classifiers more robust and efficient, we’ve developed a novel, multilingual text vectorizer called RETVec (Resilient & Efficient Text Vectorizer) that helps models achieve state-of-the-art classification performance and drastically reduces computational cost. Today, we’re sharing how RETVec has been used to help protect Gmail inboxes.”

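    In the company’s internal testing, replacing the Gmail spam classifier’s previous text vectorizer with RETVec improved the spam detection rate by 38% over the baseline while reducing the false positive rate by 19.4%. It also reduced the model’s TPU usage by 83%.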

    “RETVec achieves these improvements by combining a novel, highly-compact character encoder, an augmentation-driven training regime, and the use of metric learning. The architecture details and benchmark evaluations are available in our NeurIPS 2023 paper and we open-source RETVec on Github.”
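
To make the idea of augmentation-driven training concrete, here is a minimal Python sketch that generates typo-like variants of a word. The blog post does not say which augmentations RETVec uses, so the four edit types below, the ASCII-only replacement alphabet, and the function names `perturb` and `augment` are all illustrative assumptions, not RETVec's actual training code. Pairs of a word and its variants are the kind of examples a metric-learning objective could be trained to treat as similar.

```python
import random
import string


def perturb(word: str, rng: random.Random) -> str:
    """Apply one random character-level edit to `word` (hypothetical helper)."""
    if len(word) < 2:
        return word  # too short to edit safely
    op = rng.choice(["delete", "insert", "substitute", "swap"])
    i = rng.randrange(len(word) - 1)       # position of the edit
    c = rng.choice(string.ascii_lowercase)  # replacement or inserted letter
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + c + word[i:]
    if op == "substitute":
        return word[:i] + c + word[i + 1:]
    # swap: exchange the characters at positions i and i + 1
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def augment(word: str, n: int, seed: int = 0) -> list[str]:
    """Return `n` perturbed variants of `word`, reproducible for a given seed."""
    rng = random.Random(seed)
    return [perturb(word, rng) for _ in range(n)]


if __name__ == "__main__":
    for variant in augment("discount", 5):
        print(variant)
```

Each variant differs from the original by at most one edit, which loosely mimics the typos and deliberate misspellings a spam classifier has to tolerate.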

    Due to its novel architecture, RETVec works out-of-the-box on every language and all UTF-8 characters without the need for text preprocessing, making it the ideal candidate for on-device, web, and large-scale text classification deployments. Models trained with RETVec exhibit faster inference speed due to its compact representation. Having smaller models reduces computational costs and decreases latency, which is critical for large-scale applications and on-device models.
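
One way to see why a character-level approach needs no vocabulary or preprocessing is to encode each character directly from its Unicode code point. The sketch below is a simplified illustration, not RETVec's implementation: the 24-bit width, the 16-character word limit, and the names `encode_char` and `encode_word` are assumptions chosen for the example, and the paper remains the authority on the real encoder.

```python
def encode_char(ch: str, bits: int = 24) -> list[int]:
    """Return the code point of `ch` as a fixed-width bit list, lowest bit first."""
    cp = ord(ch)
    if cp >= 1 << bits:
        raise ValueError(f"code point {cp} does not fit in {bits} bits")
    return [(cp >> i) & 1 for i in range(bits)]


def encode_word(word: str, max_chars: int = 16, bits: int = 24) -> list[list[int]]:
    """Encode a word as `max_chars` rows of `bits` bits each.

    Longer words are truncated and shorter ones are padded with zero rows,
    so every word maps to a grid of the same shape.
    """
    rows = [encode_char(ch, bits) for ch in word[:max_chars]]
    rows += [[0] * bits for _ in range(max_chars - len(rows))]
    return rows


if __name__ == "__main__":
    # The same function handles Latin, Cyrillic, Japanese, and obfuscated text.
    for word in ["spam", "спам", "スパム", "sp@m"]:
        grid = encode_word(word)
        print(word, len(grid), "x", len(grid[0]))
```

Because every Unicode code point fits in 24 bits, no character is ever out of vocabulary, and the fixed-size output can be fed to a small model without a lookup table.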

    Perhaps best of all, Google is making RETVec available as an open-source project that organizations can customize and use.

    “RETVec is a novel open-source text vectorizer that allows you to build more resilient and efficient server-side and on-device text classifiers. The Gmail spam filter uses it to help protect Gmail inboxes against malicious emails.

    “If you would like to use RETVec for your own use cases or research, we created a tutorial to help you get started.”
