Machine Learning FAQ

“Machine Learning is a field of study that gives computers the ability to learn without being explicitly programmed.” This definition, often attributed to computer pioneer Arthur L. Samuel, is actually a paraphrase of ideas in his 1959 paper, “Some Studies in Machine Learning Using the Game of Checkers,” published in the IBM Journal of Research and Development. The notion that computers can learn from data and outcomes still holds up as a useful description of Machine Learning today. Samuel correctly predicted, “Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort.”

There are three types of Machine Learning: Supervised Learning, Unsupervised Learning, and Reinforcement Learning.

  • Supervised Learning – Learning from data sets containing labels or known outcomes, where the algorithms build models based on the patterns in that “training” data. The resulting models are generalized and can be applied to new, never-before-seen data. Supervised Learning is used for classification and regression problems.
  • Unsupervised Learning – Learning from the inherent structure of an unlabeled data set, where the algorithms build models based on commonalities in the data itself. Unsupervised Learning is used for clustering, association, anomaly detection, and recommendation engines.
  • Reinforcement Learning – Learning from unlabeled data based on reward-punishment feedback, with successive tries at stochastic (random) solutions to problems. Reinforcement Learning is useful when there are rules but no pre-defined methods for approaching a problem, such as in games or autonomous navigation.
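
As a rough illustration of Supervised Learning, the sketch below "trains" on a handful of labeled points and then labels new, never-before-seen points by finding the nearest training example. The data, the labels, and the `predict` function are all hypothetical and chosen only to keep the example small; real projects would use far more data and a more robust algorithm.

```python
import math

# Hypothetical labeled training data: (feature vector, known outcome).
TRAINING_DATA = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((5.0, 5.0), "high"),
    ((6.0, 5.5), "high"),
]


def predict(point, training_data=TRAINING_DATA):
    """Return the label of the training example closest to `point`."""
    _, nearest_label = min(
        training_data,
        key=lambda example: math.dist(example[0], point),
    )
    return nearest_label


# Apply the "model" to points that were not in the training data.
print(predict((1.2, 1.4)))
print(predict((5.5, 5.0)))
```

The labels are what make this supervised: without the "low" and "high" outcomes in the training data, the algorithm would have nothing to generalize from.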

Automated Machine Learning (AutoML) is the technique of automating the entire machine learning process from end to end. AutoML does not require data science knowledge or programming ability. AutoML is simple to apply to classic business prediction problems, such as personalization and forecasting, with easy point-and-click user interfaces.

AutoML performs complicated, underlying processes automatically, including:

  • Data Preparation
  • Feature Extraction
  • Feature Selection
  • Algorithm Selection
  • Hyperparameter Optimization
  • Cross-validation
  • Leakage Detection
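
To make one of these steps concrete, here is a minimal sketch of k-fold cross-validation in plain Python. It is an illustration only, not how any particular AutoML product implements the step. The function names `k_fold_splits` and `cross_validated_error` are hypothetical, and the "model" is a placeholder that simply predicts the mean of its training values.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validated_error(values, k):
    """Average absolute error of a 'predict the training mean' placeholder model."""
    fold_errors = []
    for train, test in k_fold_splits(len(values), k):
        mean = sum(values[i] for i in train) / len(train)
        fold_errors.append(sum(abs(values[i] - mean) for i in test) / len(test))
    return sum(fold_errors) / len(fold_errors)


# Hypothetical outcome values, used only to exercise the functions.
values = [10.0, 12.0, 11.0, 13.0, 9.0, 12.0, 10.0]
for train, test in k_fold_splits(len(values), 3):
    print("train:", train, "test:", test)
print("mean absolute error across folds:", cross_validated_error(values, 3))
```

Every record is held out exactly once, so the averaged error reflects performance on data the model did not see during fitting.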

Artificial Intelligence is a general term that describes efforts to allow computers to perform tasks such as vision, speech recognition, and goal achievement that resemble human intelligence.

Machine Learning is a subfield of Artificial Intelligence that allows computers to solve problems for which they were not explicitly programmed through processes of learning and iteration.

Deep Learning is a subfield of Machine Learning that employs artificial neural networks with many hierarchical layers, which tend to perform better on large, complicated data sets than simple neural networks.

Machine Learning is most commonly applied to predicting future outcomes or finding patterns in complicated data.

Supervised Learning Use Cases

Binary Classification (yes-no, in-out) or Multi-Class Classification (likelihood of each of three or more possible outcomes):

  • Which customers are in danger of churning?
  • What ad will perform best for this individual?
  • Which opportunity has the best chance of closing?
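
A question such as "Which customers are in danger of churning?" can be sketched as binary classification with a one-feature logistic regression trained by gradient descent. The feature (support tickets filed), the data, and the function names below are hypothetical and exist only for illustration; a real churn model would use many features and a tested library.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit a weight and bias for one-feature logistic regression."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical history: tickets filed last quarter, and whether the
# customer churned (1) or stayed (0).
tickets = [0, 1, 1, 2, 5, 6, 7, 8]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(tickets, churned)
for new_customer_tickets in (1, 7):
    probability = sigmoid(w * new_customer_tickets + b)
    print(new_customer_tickets, "tickets -> churn probability", round(probability, 3))
```

The output is a likelihood rather than a hard yes or no, which lets a business rank customers by risk and choose its own cutoff.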

Regression (scalar values, such as integers or decimal amounts):

  • What is the expected lifetime value of this customer?
  • What is the sales forecast for next period?
  • What price is optimal for this insurance policy relative to risk?
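
A sales forecast is a typical regression problem. The sketch below fits a straight line to past periods by ordinary least squares and extends it one period forward. The sales figures are made up (and deliberately perfectly linear so the arithmetic is easy to check), and `fit_line` is a hypothetical helper, not part of any product.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sales for periods 1 through 5.
periods = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(periods, sales)
forecast = slope * 6 + intercept  # predicted sales for period 6
print(forecast)  # 150.0
```

Unlike classification, the prediction here is a quantity on a continuous scale, not a category.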

Unsupervised Learning Use Cases

  • Which groups of my customers exhibit similar buying patterns?
  • When someone buys product X, do they tend to buy product Y?
  • Are any of these transactions anomalous?
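
For the anomaly question, one simple unsupervised approach is to flag values that sit unusually far from the mean of the data set, measured in standard deviations (a z-score). No labels are involved; the structure of the data itself defines what is "normal". The transaction amounts, the threshold of 2.0, and the function name are assumptions made for this sketch.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds `threshold`."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical transaction amounts; one is far larger than the rest.
amounts = [20.0, 22.0, 19.0, 21.0, 20.0, 23.0, 500.0]
print(find_anomalies(amounts))  # [500.0]
```

A large outlier also inflates the standard deviation, so production systems often prefer more robust measures; this version is kept minimal on purpose.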

Reinforcement Learning Use Cases

  • What is the best strategy for winning this game?
  • Which traffic light pattern minimizes delays?
  • What path through this course will improve student performance?
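
The reward-punishment loop can be sketched with an epsilon-greedy strategy: most of the time the learner repeats the action that has paid off best so far, and occasionally it tries a random one. Everything here is hypothetical, including the three actions, their hidden success probabilities, and the +1/-1 feedback; it is a toy stand-in for problems such as choosing among traffic light patterns.

```python
import random


def run_epsilon_greedy(success_probabilities, steps=5000, epsilon=0.1, seed=0):
    """Learn value estimates for each action from +1/-1 feedback."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    n_actions = len(success_probabilities)
    counts = [0] * n_actions
    estimates = [0.0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore a random action
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        # Reward (+1) or punishment (-1), drawn from the hidden probability.
        reward = 1.0 if rng.random() < success_probabilities[action] else -1.0
        counts[action] += 1
        # Incremental update of the running average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated values:", [round(e, 2) for e in estimates])
print("times each action was tried:", counts)
print("best action index:", best_action)
```

The learner is never told which action is correct. It discovers that by trial, which is the defining difference from Supervised Learning.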

Machine Learning (ML) is implemented either by dedicating people and computer resources to building custom instances, or by using Automated Machine Learning (AutoML) tools that simplify the process and reduce learning requirements.

AutoML is the choice for organizations that:

  • Need to make better business decisions using predictive analytics
  • Do not have spare data science and programming resources
  • Have access to structured information in systems such as CRM, ERP, or BI

Custom ML is the choice for organizations that:

  • Have highly customized requirements
  • Have deep data science and programming teams
  • Solve academic or research-oriented problems as opposed to business predictions

Process for AutoML

  1. Obtain an AutoML tool license.
  2. Identify and load a training data set with known outcomes.
  3. Identify and load a production data set with the same columns as the training data set, except for the value(s) to be predicted on this new data.
  4. Run the AutoML tool, and retrieve the predictions. If the AutoML tool automatically compares and uses the best-fitting algorithm, you have the predictions. If the AutoML tool requires you to specify what algorithm(s) to use, iterate through the options and choose the most accurate for predictions.
  5. Put the predictive model into production.
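
The "compare and use the best-fitting algorithm" idea in step 4 can be sketched as a loop that fits several candidate models on training data and keeps the one with the lowest error on held-out data. This is only a toy illustration of the principle: the two candidates, the data, and all function names are hypothetical, and commercial AutoML tools evaluate many more algorithms and settings.

```python
def fit_mean(xs, ys):
    """Candidate 1: always predict the average training outcome."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Candidate 2: least-squares straight line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


CANDIDATES = {"mean": fit_mean, "linear": fit_linear}


def mean_absolute_error(model, xs, ys):
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


def select_best(train_x, train_y, valid_x, valid_y):
    """Fit every candidate and return the one with the lowest held-out error."""
    models = {name: fit(train_x, train_y) for name, fit in CANDIDATES.items()}
    scores = {
        name: mean_absolute_error(model, valid_x, valid_y)
        for name, model in models.items()
    }
    best_name = min(scores, key=scores.get)
    return best_name, models[best_name], scores


# Hypothetical training data (known outcomes) and held-out validation data.
train_x, train_y = [1, 2, 3, 4], [2.0, 4.1, 5.9, 8.0]
valid_x, valid_y = [5, 6], [10.1, 11.9]

best_name, best_model, scores = select_best(train_x, train_y, valid_x, valid_y)
print("selected:", best_name)
print("prediction for new input 7:", best_model(7))
```

Scoring on data that was not used for fitting is what keeps the comparison honest; a model that merely memorizes its training data would not win here.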

Process for Custom ML

  1. Choose a production platform with appropriate computational power and AI programming resource stack.
  2. Select programming language.
  3. Select algorithm(s).
  4. Select canonical test problem.
  5. Research algorithm performance and select the optimal algorithm.
  6. Test functionality.
  7. Experiment/iterate.
  8. Specialize algorithm and feature engineering protocols.
  9. Generalize the algorithm implementations for new data.
  10. Put the model into production.

Data science skills are appropriate for many ML projects, but may not be required for all. Automated Machine Learning (AutoML) is a variety of ML that does not require coding, and consequently relies less on data scientists. AutoML users who are data-savvy and have access to the required data sets can produce good results without any data science training. Data scientists may benefit from AutoML as well, by obtaining fast test results that help guide custom work.

Organizations that develop custom ML applications, especially for academic research or for embedding in other proprietary technologies, will require data scientists.

Copyright © All Rights Reserved. Squark Is A Unit Of Vizadata, LLC