Madison, Wisconsin
Extract Systems
Healthcare

Machine Learning Protects Businesses and Consumers from Cyber Threats

April 23, 2020

With the overwhelming amount of news coverage surrounding the novel coronavirus, cybercriminals have taken to exploiting public fears about the virus. Since January 1, the FTC has received more than 18,000 reports related to COVID-19 scams, which have resulted in upwards of $13 million in fraud losses.

In a blog post, Google executives Neil Kumaran and Sam Lugani shared some staggering statistics regarding recent cyber threats. Gmail blocks more than 100 million phishing emails every day. In early April, Google saw 18 million daily malware and phishing emails related to COVID-19, in addition to more than 240 million COVID-related daily spam messages. Google’s machine learning models have evolved to understand and filter these threats, and they block more than 99.9% of spam, phishing, and malware from reaching users’ inboxes.

How do these scams work? Cybercriminals send emails that claim to come from legitimate organizations and to contain information about COVID-19. The scams take many forms. Some scammers, hoping to capitalize on government stimulus packages, imitate a government institution to phish small businesses. Others take advantage of work-from-home employees, asking them to open a malware-laced attachment to verify payroll information. Still others impersonate authoritative health organizations like the World Health Organization to solicit fraudulent donations.

Google stays ahead of these threats by using document scanners that rely on deep learning to improve malware detection across the more than 300 billion attachments scanned each week. Even though more than half of the malicious documents blocked by Gmail differ from day to day, these machine learning capabilities help maintain a high rate of detection.

At Extract, we have developed machine learning capabilities to better capture discrete data fields from unstructured documents. Our software is “smart” enough that it doesn’t need the documents to all be in the same format or file type. Extract machine learning can be applied to all data and document types, for tasks including classification, pagination, indexing, and redaction. Our software is constantly learning, which means it is constantly improving. Models are automatically retrained on operator verification output, so they continue to improve without extended supervision from Extract’s Data Capture team.
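For readers curious what learning from operator verification can look like in general, here is a minimal Python sketch. It is not Extract's implementation: the `FeedbackClassifier` class, the `verify` helper, and the document labels are all hypothetical, and a tiny naive Bayes text classifier stands in for whatever models a production system would use. The point is only the loop itself: the model predicts, an operator confirms or corrects, and the verified answer is folded straight back into the model.

```python
import math
from collections import Counter, defaultdict


class FeedbackClassifier:
    """Hypothetical naive Bayes document classifier that learns one verified example at a time."""

    def __init__(self):
        self.label_counts = Counter()            # verified documents seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()

    def learn(self, text, verified_label):
        """Fold one operator-verified example into the running counts."""
        words = text.lower().split()
        self.label_counts[verified_label] += 1
        self.word_counts[verified_label].update(words)
        self.vocabulary.update(words)

    def predict(self, text):
        """Return the most likely label, or None before any feedback has arrived."""
        if not self.label_counts:
            return None
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary) + 1  # +1 reserves a slot for unseen words
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.label_counts.items():
            score = math.log(doc_count / total_docs)
            label_total = sum(self.word_counts[label].values())
            for word in words:
                # Laplace smoothing keeps an unseen word from ruling out a label.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def verify(model, text, operator_label):
    """Simulate one verification step: predict first, then learn from the operator's answer."""
    predicted = model.predict(text)
    model.learn(text, operator_label)
    return predicted


model = FeedbackClassifier()
verify(model, "invoice total amount due", "invoice")
verify(model, "patient lab result glucose", "lab_report")
print(model.predict("amount due on invoice"))
```

Each call to `verify` stands for one document passing an operator's desk. Because the counts are updated immediately, the next prediction already reflects the correction, with no separate retraining pass.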

To learn more about how we capture data and continuously improve our systems, please reach out today.

Meet The Author
Chris Mack
Chris is a Marketing Manager at Extract with experience in product development, data analysis, and both traditional and digital marketing. Chris received his bachelor’s degree in English from Bucknell University and has an MBA from the University of Notre Dame. A passionate marketer, Chris strives to make complex ideas more accessible to those around him in a compelling way.