When we ask a computer for help with a task, what are we asking for?
1) Help with automating a repetitive task
2) Help with a decision
1) Help with automating a repetitive task
There are various ways you can automate a repetitive task. You can:
a) Ask your computer to do the same thing again and again, regardless of input (display the home page)
b) Give it some simple rules to follow (if they try to navigate to a non-existent page, show them a 404)
c) Give it some complex or not fully understood rules to follow (based on our tests, these are the solutions you should attempt, in this order)
d) Give it inputs, and have it adapt (‘Watch me perform this industrial assembly task, now you do it’)
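To make the first two options concrete, here is a minimal Python sketch. The page table and the function names are invented for illustration; they don't come from any real web framework.

```python
# Hypothetical site content, invented for this sketch.
PAGES = {"/": "Welcome to the home page", "/about": "About us"}


def always_home(_path):
    """Option a): do the same thing regardless of input."""
    return 200, PAGES["/"]


def route(path):
    """Option b): one simple rule. Unknown pages get a 404."""
    if path in PAGES:
        return 200, PAGES[path]
    return 404, "Page not found"
```

Options c) and d) are harder to show in a few lines, because the rules there come from tests or from watching examples rather than from something you can write down up front.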
2) Help with a decision
There are various ways you can use a computer to help with a decision. You can:
a) Display data in various interesting ways (Data Visualization)
b) Give it the data and some rules to follow (Standard decision automation)
c) Give it the data and a desired output/scoring function (Supervised/Reinforcement Learning)
d) Give it the data and nothing else* (Unsupervised Learning)
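The difference between b) and c) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the data is made up, the "learning" is just trying each observed value as a cut-off, and real supervised learning is a good deal more involved.

```python
# Made-up data: (hours of machine use, whether it needed servicing).
DATA = [(10, False), (25, False), (40, False), (55, True), (70, True), (90, True)]


def rule_based(hours, threshold=60):
    """Option b): a human picks the threshold and the computer follows it."""
    return hours >= threshold


def accuracy(threshold, data):
    """Scoring function: fraction of examples the threshold gets right."""
    correct = sum((hours >= threshold) == label for hours, label in data)
    return correct / len(data)


def learn_threshold(data):
    """Option c): try each observed value as a threshold, keep the best scorer."""
    candidates = sorted(hours for hours, _ in data)
    return max(candidates, key=lambda t: accuracy(t, data))
```

On this toy data the hand-picked threshold of 60 misclassifies one example, while `learn_threshold(DATA)` settles on 55, which gets all six right. The point is only that in c) you supply the score and let the computer find the rule.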
Splitting things into automation and decisions is somewhat of a false dichotomy, as each new type of decision a computer can make allows more and more automation.
– Search (inputting words, pictures, or video into a search engine and asking for a result) generally started with 2.a) (Data Visualization), and seems to be trying to move up the decision hierarchy, anticipating questions and the rules the user would want it to follow. This seems to be done mostly with statistics, but I expect it to switch over to pattern-finding neural nets.
– Clustering (throwing a bunch of data into the hopper and getting groupings back) is also mostly in the Data Visualization bucket. It could also be an input into a machine learning algorithm, which would then be trained to make decisions based on these clusters.
– Machine Learning (giving a bunch of data and getting a decision or pattern out) can be used for most or all of the options above, and similar to how computers have gotten ‘fast enough’, Machine Learning is becoming ‘good enough’ or ‘easy enough’ to replace many of them.
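As a rough illustration of 'throwing data into the hopper and getting groupings back', here is a tiny one-dimensional k-means sketch in plain Python. The function name, the starting centers, and the sample numbers are all assumptions for the example; a real project would more likely use a library.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center,
    then move each center to the mean of its group, and repeat."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Made-up readings that fall into two obvious clumps.
centers, groups = kmeans_1d([1, 2, 3, 20, 21, 22], centers=[0, 10])
```

With these numbers the two groups come back as the low readings and the high readings. Whether groupings like that mean anything for your decision is still up to you, which is why clustering sits mostly in the Data Visualization bucket.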
So, as a human, when do you choose each of these? Assuming the options get more difficult going down the list, you would:
1) Start by googling various things (mostly to see what has been done before***).
2) Then look at the data, clean it, and try clustering it into groups, to see if any of them make sense for the decision you want to make.
3) If neither of these works, or if you want more, derive a scoring function for the output you want, then supply a Machine Learning algorithm with a substantial amount of data, and see how good a decision it can make.
4) If you don’t even know what decision you want, or are having difficulty making a scoring function, throw the data into an unsupervised learning hopper and see what comes out.
At each of the steps above, you can hive off parts and automate them, either using rules derived from the patterns you’ve found, or using flexible rules from the Machine Learning algorithm. You may find you can accomplish most of your task without having to resort to complex or incompletely understood algorithms.
More examples in subsequent posts. Stay tuned. As you can tell, the categories above have not fully crystallized.
*Unsupervised Learning has a number of levels** in it, such as ‘Find Features’, ‘What is the Question?’, ‘Why?’, etc…
**Not that everything is hierarchical, but this is convenient for discussion
***This is the ‘literature review’ portion of anything we do now