Machine Learning Best Practices

Machine learning is used in everything from healthcare to shopping to even flavor mapping beers.  It has existed for over 50 years and is now nearly everywhere.  There’s a reason Netflix knows that you’re really going to like the new Pixar movie they just added, and there’s a reason Amazon.com knows that a new scarf would really complement the clothes you just bought.  It’s because they’re always learning, using large volumes of data to understand how individual preferences and hard rules interact.

The difficulty in implementing machine learning has often come down to the raw computing power it takes to run algorithms on massive amounts of data.  Rapid advances in technology have eased this concern, allowing a whole new wave of companies to adopt machine learning.

Understanding Machine Learning Challenges

While machine learning is the wave of the future, and even the present, it’s not without its drawbacks.  If machines learn from limited data sets, the risk of bias increases significantly.  At Extract, our experience with billions of pages of often unstructured documents gives us confidence that when we use machine learning, partiality is out the window and the learning is truly agnostic.

So how can you implement best practices for machine learning?

  • Use all your relevant data: Machine learning is all about making accurate predictions with the highest possible confidence, so include all the applicable data you have.  This isn’t to say that just anything should make your dataset, but the more relevant data you have, the faster your model can learn.
  • Account for variables: While you’re using all the data you can, make sure you’re aware of the data you don’t have.  It can take a full year’s worth of data or more to understand seasonality, and whether your information changes when the seasons do, or when the days of the week do.
  • Don’t let the machine be the only one learning: Machine learning isn’t a “set it and forget it” endeavor.  It requires ongoing analysis to be sure you’re not overfitting your model, and that it is learning general concepts rather than memorizing individual examples.
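
To make the last point concrete, one common way to spot overfitting is to hold some data back and compare how a model scores on the data it trained on versus the data it has never seen. The sketch below is only an illustration, not Extract's method: the data is synthetic, and the names fit_line, fit_memorizer, mse, and overfit_gap are made up for this example. It compares a simple line fit (which learns the underlying trend) against a "memorizer" that just repeats the answer from the closest training example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Nearest-neighbor lookup: repeats the answer of the closest training example."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of the model's predictions on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def overfit_gap(model, train, holdout):
    """Held-out error minus training error; a large positive gap suggests overfitting."""
    return mse(model, *holdout) - mse(model, *train)


# Synthetic data (an assumption for illustration): a linear trend plus random noise.
rng = random.Random(0)
xs = [i / 10 for i in range(60)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

# Hold out every other point so the models never see it during fitting.
train = (xs[0::2], ys[0::2])
holdout = (xs[1::2], ys[1::2])

for name, fit in [("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(*train)
    print(name, "train:", round(mse(model, *train), 3),
          "holdout:", round(mse(model, *holdout), 3))
```

The memorizer scores perfectly on its own training data because it has stored every example, so its training error says nothing about how it handles new inputs. Comparing the two error figures for each model, rather than looking at training performance alone, is what reveals whether concepts or only examples were learned.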

Why Machine Learning?

McKinsey posits that machine learning has brought the error rate of image identification down to that of humans.  When Extract supplements this with our AI-enabled algorithms and rulesets, we reduce that error rate even further while reducing data collection times.

Learn more about how humans stack up against intelligent technology.

This raises the question: why wouldn’t you want your own company or your vendors investing in this technology?  Machine learning is a proven way to reduce errors, improve turn times, and free up your staff to use their talents in the most productive way.

At Extract, we’ve been putting machine learning to good use to automate tasks that would previously have required heavy doses of high-touch effort.  With over four billion pages sent through our redaction software, our algorithms get put through the wringer and get ample opportunity for continuous improvement.  Our software allows us to learn what strict rules to implement while at the same time giving clients breathing room for preference.

Interested in learning more about, well, learning?  Shoot us a note and we’ll show you how understanding machine learning can impact your business, and how we use it at Extract.


About the author: Chris Mack

Chris is a Marketing Manager at Extract with experience in product development, data analysis, and both traditional and digital marketing.  Chris received his bachelor’s degree in English from Bucknell University and has an MBA from the University of Notre Dame.  A passionate marketer, Chris strives to make complex ideas more accessible to those around him in a compelling way.