What is Machine Learning? 

Machine learning is a type of artificial intelligence in which analytical models improve automatically as they are exposed to more data. It is based on the idea that systems can identify patterns, learn from data, and make decisions with minimal human input. The goal is to make computers "think" and act the way humans do without relying on explicitly programmed rules.


Machine learning systems are made up of three major parts:

  1. Model - the system making predictions or identifications

  2. Parameters - the criteria used by the model to formulate its decisions

  3. Learner - the system that adjusts the model by looking at the differences in actual outcomes vs. predictions
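
To make the three parts concrete, here is a minimal sketch in Python. It is a generic illustration, not Extract's implementation: the function names, the made-up data, and the choice of a one-feature linear model trained by gradient descent are all assumptions chosen for brevity.

```python
# Illustrative only: a one-feature linear model trained by gradient descent.
# All names and data here are hypothetical.

def model(x, params):
    """Model: turns an input into a prediction using the current parameters."""
    return params["weight"] * x + params["bias"]


def learner(params, examples, learning_rate=0.05):
    """Learner: adjusts the parameters based on prediction vs. actual outcome."""
    grad_w = 0.0
    grad_b = 0.0
    for x, actual in examples:
        error = model(x, params) - actual  # difference between prediction and outcome
        grad_w += error * x
        grad_b += error
    n = len(examples)
    return {
        "weight": params["weight"] - learning_rate * grad_w / n,
        "bias": params["bias"] - learning_rate * grad_b / n,
    }


# Made-up examples that follow y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# Parameters: the values the model consults to make its decisions.
params = {"weight": 0.0, "bias": 0.0}
for _ in range(2000):
    params = learner(params, examples)

print(params)
```

After enough passes, the parameters should settle close to a weight of 2 and a bias of 1, the values used to generate the examples.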

Machine Learning Processes

There are four approaches to machine learning, categorized by how the machine learns:

  1. Supervised Learning - the model is taught by example, using data that people have already labeled with the correct answers

  2. Unsupervised Learning - the model finds patterns in unlabeled data on its own, grouping similar items and spotting outliers without human guidance

  3. Semi-Supervised Learning - people label a small, specific portion of the data, and the model uses those labels alongside the unlabeled remainder to produce better results

  4. Reinforcement Learning - a trial-and-error process in which the model is rewarded for taking the proper actions
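
The difference between the first two approaches can be sketched in a few lines of Python. This is a toy example under stated assumptions, not a description of Extract's algorithms: the measurements, the "short" and "long" labels, and the function names are all hypothetical. The supervised version uses a nearest-centroid classifier, and the unsupervised version uses two-cluster k-means on the same values with the labels removed.

```python
# Illustrative only: the same 1-D measurements, learned with and without labels.
# All names and data here are hypothetical.

def mean(values):
    return sum(values) / len(values)


def supervised_fit(labeled):
    """Supervised: human-provided labels decide which values form each class."""
    groups = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in groups.items()}


def supervised_predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def unsupervised_fit(values, steps=10):
    """Unsupervised: two-cluster k-means groups the values with no labels."""
    centers = [min(values), max(values)]
    for _ in range(steps):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers


labeled = [
    (1.0, "short"), (1.2, "short"), (0.8, "short"),
    (5.0, "long"), (5.5, "long"), (4.5, "long"),
]

centroids = supervised_fit(labeled)
print(supervised_predict(centroids, 1.4))
print(supervised_predict(centroids, 4.0))

unlabeled = [value for value, _ in labeled]
centers = unsupervised_fit(unlabeled)
print(centers)
```

Both versions arrive at roughly the same two groups here, but only the supervised model can name them, because the names come from the labels a person supplied.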

How Does Extract Use Machine Learning?

Extract machine learning can be used with all data and document types, for tasks including classification, pagination, indexing, and redaction. Models are automatically trained on operator verification output, so they continuously improve without extended supervision from Extract's Data Capture team. Your rules are enhanced, and the software gets better at recognizing your document types.

Extract regularly updates its software and has made multiple enhancements to its machine learning algorithms, making machine learning easier to apply, more accurate, and less demanding of system resources.

We continue to make improvements across the board, including better Natural Language Processing capabilities, higher pagination accuracy, and lower memory usage while running machine learning.

With machine learning improvements, rules can update automatically, allowing data capture to become more accurate over time. Capture rates will also continue to improve after deployment: as verifiers move through documents, our algorithm learns from their actions.

For further information, check out some of our recent blogs that highlight machine learning: