Extract uses machine learning to enhance current algorithms.
We've all heard of machine learning... but what does that really mean?
While machine learning may sound like science-fiction technology, the term describes a process that is really quite mundane: performing a lot of math problems very quickly in order to automate processes. Machine learning is a tool, much as a spreadsheet application is a tool for an accounting clerk. Where machine learning differs from a spreadsheet application, however, is that it uses math to generate predictions rather than concrete answers (and, for bragging rights, it also uses calculus rather than the algebra your spreadsheet application is generally limited to).
At its most basic level, machine learning is able to predict answers for two types of questions:
- Multiple choice
- Numerical
To come up with a prediction, machine learning must be fed the inputs the prediction is to be based on, where each input is (you guessed it) either multiple choice or numerical. The predictions machine learning generates can help us in many ways, yet it is far from replacing all our jobs and taking over the world (for now).
What machine learning's predictions do accomplish is to replace the need to write explicit code for a large number of specific scenarios. Classically, computer programming is a process of writing explicit logic, much like writing a precise recipe: every step in a sequence of actions is written out for the computer to follow one at a time. This work is necessarily very detailed and often hard to get right. With machine learning, as long as we can find a common set of input factors that distinguishes the answers to a particular question, machine learning can fill in the gaps for us, making this detailed logic unnecessary.
Here's an Example
Let's consider a "smart" home thermostat. There are really two separate questions a thermostat needs to answer in order to automatically set the temperature:
- Is anyone home? (multiple choice)
- If they are home, what temperature would the owners be likely to set right now? (numerical)
To use machine learning to predict when the owners are home, we need to provide inputs specific to each owner. A good starting point is a yes/no input recording whether an owner was detected at home during each hour of every previous day. This allows machine learning to learn about daily schedules, but it needs more to account for people who work during the week but not on the weekends.
We can "teach" the machine about weekly schedules by also inputting the day of the week for all previous days (multiple choice). Now the thermostat can account for the weekend, but what about when an owner gets a given Friday off of work? We wouldn't want the thermostat to keep switching off just because the owner had been away every previous Friday. Thus, the count of hours (numerical) an owner has been detected at home so far that day may be a good input for the machine to predict that the owner has the day off and to override the normal weekday schedule.
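To make the inputs above concrete, here is a minimal sketch in Python. It is not how any real thermostat (or Extract's engine) works. It assumes a very simple learning method, voting among the three most similar past observations, and the history data, the function names (encode, distance, predict_home) and the 8-hour and 14-hour figures are all made up for illustration.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def encode(hour, day, hours_home_today):
    """Turn one observation into a plain list of numbers."""
    # Multiple-choice input: one slot per day, 1.0 in the matching slot.
    day_slots = [1.0 if day == d else 0.0 for d in DAYS]
    # Numerical inputs are used as they are.
    return [float(hour), float(hours_home_today)] + day_slots

def distance(a, b):
    """Straight-line distance between two encoded observations."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def predict_home(history, hour, day, hours_home_today, k=3):
    """Majority vote among the k most similar past observations."""
    query = encode(hour, day, hours_home_today)
    nearest = sorted(
        history, key=lambda row: distance(encode(*row[:3]), query)
    )[:k]
    votes_home = sum(1 for row in nearest if row[3])
    return votes_home * 2 > k

# Hypothetical history rows:
# (hour, day, hours detected at home so far today, was the owner home?)
history = []
for day in DAYS[:5]:  # weekdays: away in the afternoon, home in the evening
    history.append((14, day, 8, False))
    history.append((20, day, 10, True))
for day in DAYS[5:]:  # weekend: home all day
    history.append((14, day, 14, True))
    history.append((20, day, 20, True))

print(predict_home(history, 14, "Fri", 8))   # ordinary Friday afternoon: False
print(predict_home(history, 14, "Fri", 14))  # Friday off, home all day: True
```

Nothing in the code mentions a day off. The second prediction comes out as "home" only because an afternoon with 14 hours already spent at home looks more like a past weekend than a past Friday.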
The process for predicting the temperature to be used would be similar, except the inputs would revolve around the temperatures the owners have set (numerical). Additional considerations may be the season (multiple choice) or the outside temperature (numerical).
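A numerical prediction can be sketched the same way. The example below assumes the simplest possible case: a straight line fitted by least squares that relates the outside temperature to the temperature the owners chose. The readings, the Celsius units and the names fit_line and predict_setpoint are invented for illustration. A real system would likely use more inputs, such as the season.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical past observations (degrees Celsius):
outside_c = [0.0, 10.0, 20.0, 30.0]    # outside temperature at the time
setpoint_c = [22.0, 21.0, 20.0, 19.0]  # temperature the owners set

slope, intercept = fit_line(outside_c, setpoint_c)

def predict_setpoint(outside):
    """Predict the setting the owners would likely choose."""
    return intercept + slope * outside

print(round(predict_setpoint(15.0), 2))  # 20.5
```

The answer for 15 degrees was never observed directly. The fitted line fills in the gap between the readings the owners did provide.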
From there, machine learning takes over. There is no need for explicit logic to be added to the thermostat such as "If the owners were gone 9am-5pm for the previous 3 Mondays and they have not been detected by 10am today, they are probably gone now"... The machine "learns" these rules on its own. More importantly, the machine "learns" rules that are specific to each owner, rather than requiring explicit programming that would try to manually account for all owners.
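The contrast between explicit logic and learned, per-owner rules can be sketched as follows. This is only an illustration under stated assumptions: the "learning" here is a simple tally of how often an owner was home at a given day and hour, and the two owners, their histories and the function names (explicit_rule, learn_schedule) are hypothetical.

```python
from collections import defaultdict

WEEKDAYS = {"Mon", "Tue", "Wed", "Thu", "Fri"}

def explicit_rule(day, hour):
    """Hand-written logic: assumes every owner works 9am-5pm on weekdays."""
    return not (day in WEEKDAYS and 9 <= hour < 17)

def learn_schedule(observations):
    """Build a predictor from (day, hour, was_home) observations."""
    counts = defaultdict(lambda: [0, 0])  # (day, hour) -> [times home, times seen]
    for day, hour, was_home in observations:
        counts[(day, hour)][0] += int(was_home)
        counts[(day, hour)][1] += 1

    def predict(day, hour):
        home, seen = counts.get((day, hour), (0, 0))
        if seen == 0:
            return True  # no history yet: assume someone is home
        return home * 2 >= seen  # home at least half the time

    return predict

# Two hypothetical owners with opposite schedules.
office_worker = [("Mon", 10, False)] * 3 + [("Mon", 23, True)] * 3
night_shift = [("Mon", 10, True)] * 3 + [("Mon", 23, False)] * 3

predict_office = learn_schedule(office_worker)
predict_night = learn_schedule(night_shift)

print(explicit_rule("Mon", 10))   # False for everyone, right or wrong
print(predict_office("Mon", 10))  # False: learned from this owner's history
print(predict_night("Mon", 10))   # True: same code, different owner
```

The hand-written rule gives the night-shift owner the wrong answer, while the learned predictor adapts to each history without any change to the code.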
At Extract Systems, the rules engine we have been building and improving for more than a decade has been developed to identify a huge number of factors on scanned or faxed documents. What machine learning does for us is to correlate those factors in order to make accurate judgements about data on the document. This frees our data capture analysts to focus on defining better inputs and, just as the home thermostat helps customize climate control to each home, helps customize our engine to work best on your documents.
So what does machine learning really mean? It means Extract Systems delivers better results to you.
About the Author: Steve Kurth
Steve Kurth is the Software Development Manager at Extract. With 15 years of experience, Steve is an expert in the design, development and testing of primarily Windows-based enterprise software. Steve is always eager to find creative software solutions for complex customer requests.