Machine Learning Protects Businesses and Consumers from Cyber Threats

With the overwhelming amount of news coverage surrounding the novel coronavirus, cybercriminals have taken to exploiting public fears about the virus. Since January 1, the FTC has received more than 18,000 reports related to COVID-19 scams, which have resulted in upwards of $13 million in fraud losses.

In a blog post, Google executives Neil Kumaran and Sam Lugani shared some staggering statistics on recent cyber threats. Gmail blocks more than 100 million phishing emails every day. In early April, the company saw 18 million daily malware and phishing emails related to COVID-19, in addition to more than 240 million COVID-related daily spam messages. Its machine learning models have evolved to understand and filter these threats, and they block more than 99.9% of spam, phishing, and malware from reaching users’ inboxes.

How do these scams work? Cybercriminals send emails that claim to come from legitimate organizations and offer information about COVID-19. This can take many forms. A scammer may imitate a government institution to phish small businesses that hope to benefit from government stimulus packages. Others take advantage of work-from-home employees, asking them to open a malware-laced attachment to verify payroll information. Still others impersonate authoritative organizations like the World Health Organization to solicit fraudulent donations.

Google stays ahead of these threats by using document scanners that rely on deep learning to improve malware detection across the more than 300 billion attachments scanned each week. Even though over half of the malicious documents blocked by Gmail differ from one day to the next, these machine learning capabilities help maintain a high rate of detection.

At Extract, we have developed machine learning capabilities to better capture discrete data fields from unstructured documents. Our software is “smart” enough that it doesn’t need the documents to all be in the same format or file type. Extract machine learning can be applied across data and document types for tasks including classification, pagination, indexing, and redaction. Our software is constantly learning, which means it is constantly improving. Models are automatically trained on operator verification output, so they continue to improve without extended supervision from Extract’s Data Capture team.

To learn more about how we capture data and continuously improve our systems, please reach out today.


About the Author: Andrew Loeffler

Andrew is a Project Manager and Customer Support Specialist at Extract with 10 years’ experience in the healthcare IT industry. He worked at Epic Systems on the technical services and vendor relations teams for a decade and has partnered with IT teams and clinicians at major healthcare organizations. Andrew is experienced in managing development, working with customer teams on version upgrades and implementation projects to deliver healthcare software solutions on time and under budget. He is passionate about developing products and innovative solutions to tackle healthcare challenges.