How AutoML Beats Scoring Formulas

AutoML’s ability to detect patterns and make predictions can outperform algebraic formulas and Boolean logic in common tasks.

Anyone who has written a formula in Excel or adjusted parameters in an online application knows how even the smallest change can produce dramatically different results. The power of mass calculation is exactly what puts algebraic and Boolean logic at the core of nearly every productivity tool you use. But there are costs and risks.

“I just blew up my whole workbook.” “That one little setting changed the whole forecast?” “You should have told me that was an important criterion when we wrote the code.” Sound familiar? Programmed logic is limited by our ability to understand the problem and represent it in reliable formulas. Precision and timeliness are critical, so hard-coded logic requires careful maintenance too.

AutoML works by letting data speak for itself. By finding patterns and applying them to new data, machine learning can obviate coded logic and go straight to the answers. Here is just one example, lead scoring:

CRM and marketing automation systems have features that rank leads by creating a “score.” Sales teams use lead scores to prioritize activities. But scores are generated by applying weights that users set manually. How many points for a whitepaper download? How much do page visits and time-on-page bump the number? Do three clicks on email calls to action mean I have a hot prospect? In practice, it is nearly impossible to fine-tune scoring models well enough, and quickly enough, for them to be useful.
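The manual approach those questions describe can be sketched in a few lines of Python; every number in the weight table below is an invented guess that a human must keep tuning by hand:

```python
# A hand-tuned lead score: every weight below is a manual guess
# that someone has to maintain as buyer behavior changes.
WEIGHTS = {
    "whitepaper_download": 10,  # how many points? nobody really knows
    "page_visit": 2,
    "email_cta_click": 5,
}

def lead_score(activity: dict) -> int:
    """Sum manually chosen weights over a lead's activity counts."""
    return sum(WEIGHTS.get(k, 0) * v for k, v in activity.items())

hot = lead_score({"whitepaper_download": 1, "email_cta_click": 3})  # 25
cold = lead_score({"page_visit": 4})                                # 8
```

The model's quality is capped by the quality of the guesses; AutoML replaces the weight table with weights learned from known outcomes.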

By learning from patterns of known outcomes in existing data, AutoML can make predictions on new data. Lead tables contain tens or hundreds of columns of data, and AutoML discovers which are most predictive automatically. That means a prioritized list of leads can be generated in minutes based on the very latest trends with no coding. Squark AutoML even reports which variables were most important, so you learn your buyer personas as a nice side benefit.

The takeaway: Find opportunities where AutoML can free you from the shackles of programming logic.

What Are Machine Learning Hyperparameters and How Are They Used?

Parameters are functions of training data. Hyperparameters are settings used to tune model algorithm performance.

In Automated Machine Learning (AutoML), data sets containing known outcomes are used to train models to make predictions. The actual values in training data sets never directly become parts of models. Instead, AutoML algorithms learn patterns in the features (columns) and instances (rows) of training data and express them as parameters that are the basis for the model’s predictions on new data. Parameters are always a function of the data itself, and are never set externally.

Hyperparameters are variables external to and not directly related to the training data. They are configuration variables that are used to optimize model performance. Think of them as instructions to the ML algorithms on how to approach model building. Each modeling algorithm can be set with hyperparameters appropriate to the particular classification or regression prediction problem.
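The distinction shows up clearly in a toy one-variable ridge regression (a minimal sketch, not Squark’s internals): the slope is a parameter learned from the data, while the regularization strength `lam` is a hyperparameter chosen before training ever sees the data:

```python
def fit_ridge_1d(xs, ys, lam=0.0):
    """Learn a slope (a parameter) from data.

    lam is a hyperparameter: set externally, never derived from the data.
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs, ys = [1, 2, 3, 4], [2, 4, 6, 8]
w_plain = fit_ridge_1d(xs, ys, lam=0.0)  # slope learned from the data: 2.0
w_reg = fit_ridge_1d(xs, ys, lam=3.0)    # stronger regularization shrinks it
```

Change the data and the parameter `w` changes; change `lam` and you have told the algorithm to approach the same data differently.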

Hyperparameter tuning in Squark is automatic. Squark makes multiple training passes, keeps track of the results of each trial run, and makes hyperparameter adjustments for subsequent runs. The progressive improvement in configuration values converges on the most accurate model faster.
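That try-and-compare loop can be sketched with an invented one-variable model and made-up data: each candidate hyperparameter value is scored on held-out data, and the best performer is kept:

```python
# Sketch of automatic hyperparameter search: try settings, keep the best.
# fit_1d and all the numbers are illustrative stand-ins, not Squark's internals.
def fit_1d(xs, ys, lam):
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def val_error(w, xs, ys):
    """Squared error of slope w on held-out validation data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))

train_x, train_y = [1, 2, 3], [2.1, 3.9, 6.2]
val_x, val_y = [4, 5], [8.1, 9.8]

candidates = [0.0, 0.5, 1.0, 5.0]  # hyperparameter values to trial
best = min(candidates,
           key=lambda lam: val_error(fit_1d(train_x, train_y, lam), val_x, val_y))
```

Real AutoML searches many hyperparameters across many algorithms at once, but the principle is the same: measure each trial, then adjust.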

The takeaway: Squark uses hyperparameters to learn how to learn better as it works through each model—inventing shortcuts and best practices—the way people do when attacking problems.

What is Bias in Machine Learning?

Bias occurs when ML does not separate the true signal from the noise in training data.

Biases in AI systems make headlines for results such as favoring gender in hiring, recommending loans based on ethnicity, or recognizing faces differently based on race. Some of these cases were due to biases baked into the algorithms written by (human) data scientists, but the majority merely learned from data that was itself biased.

How do you know if your business predictions are biased? Testing against broader sets of known outcomes is the best way. Since you don’t necessarily know which factors may be introducing bias, examination of the predictive importance placed on data features can help reveal them. Squark shows lists of Variable Importance for the models it generates. Click on the model name link in the Squark Leaderboard to see them. Different algorithms can produce different ranks for variable importance, which may lend insight.
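As a rough stand-in for model-based variable importance, the sketch below ranks invented features by their absolute correlation with the outcome. A proxy variable that encodes bias can surface at the top of exactly this kind of ranking:

```python
# Crude stand-in for variable importance: rank features by |correlation|
# with the outcome. Model-based importances are richer, but the idea of
# inspecting rankings to spot suspicious drivers is the same.
def corr(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

features = {
    "income":   [30, 60, 45, 90],
    "zip_code": [7, 3, 9, 1],  # a proxy variable that may encode bias
}
outcome = [0, 1, 0, 1]  # known results in the training data

ranked = sorted(features, key=lambda f: abs(corr(features[f], outcome)),
                reverse=True)
```

In this made-up data the zip-code proxy outranks income, which is the kind of result that should prompt a closer look at the training data.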

Bias in Training Data
Selecting training data wisely is the best way to reduce bias. For instance, if the training data set you select is dominated by outcomes that you expect, it should be no surprise that the model will reflect confirmation bias.

Bias in Algorithms
Algorithmic bias occurs when model building takes too few training variables into account. In data sets with large numbers of features (columns), algorithms that can handle only fixed or limited numbers of training variables show high bias and result in underfitting. Certain algorithms such as Linear Regression, Linear Discriminant Analysis, and Logistic Regression are prone to high bias.
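Underfitting from high bias is easy to demonstrate: a straight-line model cannot represent a quadratic pattern, so its training error stays high no matter how it is fit. The data and models below are purely illustrative:

```python
# High-bias illustration: a straight line (one learned slope) cannot
# capture a quadratic pattern, so training error stays high -- underfitting.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]  # the true pattern is quadratic

# best least-squares slope for a line through the origin
w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
linear_err = sum((w * x - y) ** 2 for x, y in zip(xs, ys))

# a model family that can represent x**2 fits this data perfectly
quad_err = sum((x * x - y) ** 2 for x, y in zip(xs, ys))
```

The linear model’s error comes from its limited shape, not from noise in the data; no amount of extra training data fixes it.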

The takeaway: If you think your predictions may show bias, experiment. Go back to the variable selection and select/deselect suspicious columns. Iterate as many times as you need to understand your data. At that point you may decide to revise the training and production files to reflect reality with less of a “thumb on the scale.”

Monte Carlo Simulation vs. Machine Learning

Simulation uses models constructed by experts to predict probabilities. Machine Learning builds its own models to predict future outcomes.

Monte Carlo (the place) is the iconic capital of gambling—an endeavor that relies exclusively on chance probabilities to determine winners and losers. Monte Carlo (the method) employs random inputs to models to make predictions on how a system will behave.

When subject matter experts create good Simulation models, they can be valuable in revealing probabilities in complex systems with large numbers of variables—such as predicting human behaviors in markets. “What if?” scenarios can be tested because individual data points or sets of data points can be manipulated to show their effects on the entirety.
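A minimal Monte Carlo sketch, with invented cost ranges standing in for an expert-built model: random inputs are drawn many times, and the fraction of favorable outcomes estimates the probability:

```python
import random

# Minimal Monte Carlo sketch: estimate the chance a two-task project
# stays under budget when each task's cost is uncertain.
# The cost ranges are invented assumptions -- they ARE the expert's model.
random.seed(42)  # fixed seed so the estimate is reproducible

def simulate_once():
    task_a = random.uniform(80, 120)
    task_b = random.uniform(50, 150)
    return task_a + task_b <= 220  # favorable outcome: under budget

trials = 100_000
p_under_budget = sum(simulate_once() for _ in range(trials)) / trials
```

With these ranges the true probability works out to 0.7, and the estimate converges toward it as trials increase. The “what if?” power comes from editing the assumed ranges and rerunning.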

Machine Learning builds its own models based on data sets of known outcomes. Predictions are done automatically by applying these models to new sets of data. This methodology is perfect for business analyses such as identifying customers who will churn or predicting customer lifetime value. No human input or modelling skill is required. “The cards call themselves,” as you might say for hands at the Baccarat table.

The takeaway: Simulation excels where domain expertise can be captured to build accurate models to enable experimentation—even creating data inputs to see what happens. Machine Learning is best for fast, automatic predictions on new data based on observations of known outcomes. They are not mutually exclusive. In fact, Machine Learning can be handy to test and refine Simulation models.

Data Mining vs. Machine Learning

Data Mining describes patterns, correlations, and anomalies in data.

Mines are not the best analogies for the processes referred to as Data Mining. Never mind that we call data storage places bases, warehouses, and lakes. Extraction of raw data material is not the goal of data mining, but rather identifying characteristics within data sets that can be used to make decisions and predictions.

Think of Data Mining as applying statistics to make it easier for humans to understand past events recorded in data. By making assumptions and testing them, insights may be generated to help make decisions or predict general behavior in the future. Since all of its variables are known and static, data mining by itself cannot predict specific outcomes for new records.

Data Mining Processes
Here are some of the commonly used terms for tasks in data mining:

  • Anomaly Detection – identifying records that are different enough from others to be checked as errors or outliers.
  • Dependency Modelling – Identifying relationships among variables, such as market basket analysis for items frequently bought together.
  • Clustering – Identifying characteristics of groups of records that are more similar to each other than to other groups.
  • Classification – Calculating the probability that a record belongs to one or more predefined categories.
  • Regression – Estimating the relationship between a dependent variable and one or more independent variables.
  • Summarization – Creating a shortened example set of data, including reports and graphical representations.
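One of the tasks above, anomaly detection, can be sketched with a simple z-score rule; the data and the two-standard-deviation threshold are invented for illustration:

```python
# Anomaly detection sketch: flag records far from the mean.
values = [102, 98, 101, 99, 100, 97, 250]  # 250 looks like an error/outlier

mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5

# flag anything more than two standard deviations from the mean
anomalies = [v for v in values if abs(v - mean) > 2 * std]
```

Flagged records would then be checked as data-entry errors or genuine outliers before any modeling begins.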

Data Mining is good for preparing data and understanding variables that may be useful for predictions. The constraints of time and human analytical capacity to query, join, parse, and process large data sets make Data Mining ill-suited to production predictive analysis.

Machine Learning to the Rescue
Automated Machine Learning (AutoML) automatically makes assumptions and iterates the models until it understands patterns—without the need for human intervention. This means that programming to account for every possible data relationship is unnecessary. The speed of results—even for large data sets—is remarkable. Best of all, the AI models can be applied to fresh data automatically, which is the essence of prediction.

The takeaway: Data mining is useful to gain insights and to prepare data for predictive analytics, including AutoML. Machine Learning uses data patterns to predict future outcomes for new records.

Deep Learning vs. Machine Learning

Deep Learning is a category of machine learning with special advantages for some tasks and disadvantages for others.

Machine learning workflows begin by identifying features within data sets. For structured information with relatively few columns and rows, this is straightforward. Most practical business predictions such as classification and regression fall into this category.

Unstructured data, such as image and voice, have vast numbers of “features” in the form of individual pixels or wave forms. Identifying those features to structured AI algorithms is tedious or impossible. Deep Learning is a technique where the AI algorithm itself extracts progressively higher levels of feature recognition, passing information through potentially hundreds of neural network layers. Deep learning algorithms power image and speech recognition for driverless cars and hands-free speakers.
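The idea of layers extracting progressively higher-level features can be caricatured in a few lines. The weights below are made up purely to show the layered structure; a real network would learn them from data across hundreds of layers:

```python
# Toy forward pass through two neural-network layers.
def relu(v):
    """Common activation: pass positives through, zero out negatives."""
    return [max(0.0, x) for x in v]

def layer(inputs, weights):
    # each output unit is a weighted sum of all inputs, then ReLU
    return relu([sum(w * x for w, x in zip(row, inputs)) for row in weights])

pixels = [0.0, 1.0, 1.0, 0.0]                       # a tiny "image"
h1 = layer(pixels, [[1, -1, -1, 1], [0, 1, 1, 0]])  # low-level features
h2 = layer(h1, [[0.5, 0.5]])                        # higher-level feature
```

Each layer’s output becomes the next layer’s input, which is how deep networks build abstract features (edges, shapes, objects) out of raw pixels.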

Plusses of Deep Learning

  • Scale – Deep learners can handle vast amounts of data, and they always improve with more data. Shallow learning converges and stops improving with additional data.
  • Dimensions – Deep learners can move past the limitations of a few hundred columns to perform well on very wide structured data sets.
  • Non-Numeric – Deep learning brings AI into the human realm of speech and vision, which serve people in new and valuable ways.

Minuses of Deep Learning

  • Training Data – Deep learners need labeled data from which to learn. Amassing sufficient examples for recognition accuracy to be learned can be daunting.
  • Not for Small Data Sets – Data sets that are too simple or too small cause deep learners to fail by overfitting.
  • Resource Consumption – Deep learning on vast data stores can require days or weeks of processing on a single problem. 

The takeaway: Deep learners are great for unstructured data and may be useful for classification with large and detailed structured data sets. Squark includes deep learners in the stack of algorithms it uses for AutoML. You will know from the Squark Leaderboard whether deep learning was a winner.

Classification Types and Uses

Classifications are the most frequently used—and most useful—prediction types.

Classifications are predictions that separate data into groups. Binary Classification produces “yes-no” or “in-out” answers when there are only two choices. Multi-Class Classification applies when there are three or more possibilities and shows probabilities for each.
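Classifiers typically emit a probability per class. One common way to turn raw model scores into probabilities is the softmax function; the scores below are invented for illustration:

```python
import math

# Softmax turns raw model scores into class probabilities that sum to 1.
def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# multi-class example: three possible products for a cross-sell offer
probs = softmax([2.0, 1.0, 0.1])
best_class = probs.index(max(probs))  # index of the most likely product
```

A binary classification is the two-class special case; either way, each row gets not just a predicted class but a probability for every class.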

Binary Examples

  • Churn – Which customers are in danger of leaving? If you knew, you could implement targeted retention tactics such as special offers or customer service outreach.
  • Conversion – Which prospects are most likely to be ready to move to the next step in the buying cycle? Knowing means focusing sales resources on the best leads.
  • Risk – Which populations are likely to experience negative outcomes? Understanding helps guide actions to mitigate risks.

Multi-Class Examples

  • Cross-Sell/Up-Sell – Which customers are most likely to buy which additional products or services? Targeting them with the right offers lifts sales with high efficiency.
  • Personalization – Which content will resonate with which person? Optimizing websites, social media, and email is easy when you know.
  • Ad Targeting – Which prospects are most likely to respond to your multiplicity of ads and media? Spending is more effective when you know your audiences.

The takeaway: Classifications are among the most accessible and highest-return prediction types for AutoML. Predictions on each row include not only the classes, but the probabilities associated with each. Think of a burning classification question and Squark can help you begin predicting right away.

Operationalizing AutoML: ML Ops

Predictions are interesting on their own. They are valuable when put into production.

Operationalizing AutoML – often called “ML Ops” – means putting AutoML predictions into regular workflows to change business outcomes. Here are a few ways to do it.

Graphical Interface
SaaS AutoML tools have a GUI that enables training data ingestion, data prep, feature engineering, and model building. Once an optimal model is made, predictions are created on production data. Simply exporting the predictions from the GUI delivers a data file for people or other systems to act upon. For example, predictions that prioritize sales lead follow-up could be handed to sales ops as a daily call list, or could be imported to a marketing automation system as an email segmentation list.
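That hand-off can be sketched in a few lines, with invented lead names and scores: predictions are sorted into a prioritized call list and written as CSV for sales ops or an import job:

```python
import csv
import io

# Sketch of handing predictions to sales ops as a prioritized call list.
# Lead names and probabilities are invented for illustration.
predictions = [("Acme Co", 0.92), ("Globex", 0.35), ("Initech", 0.78)]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["lead", "conversion_probability"])
for lead, p in sorted(predictions, key=lambda r: r[1], reverse=True):
    writer.writerow([lead, p])

call_list = buf.getvalue()  # highest-probability leads first
```

In production the same file could feed a marketing automation import instead of a human call list.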

APIs
Application Programming Interfaces (APIs) are specifications that describe how dissimilar systems can communicate reliably. AutoML APIs can be hooked into enterprise systems to automate the export of predictions, eliminating manual file handling.

Model Export
AutoML produces executable code, typically Java bytecode, that can be run wholly outside the AutoML tool itself. Models so exported can be run again and again on production data as often as required. When models are improved based on new training data, ML ops can simply replace the executable code with the new models.

Custom Deployment
Some applications, such as real-time predictions, require close integration with cooperating systems. Customized data pipelines can be created to manage these processes.

The takeaway: You can begin using AutoML results to improve business performance right now. As needs expand, there are many options for blending ML Ops into automated workflows, and Squark can help with all of them.

AI Changed Dramatically in Only 9 Months

Productive uses for AI are closer at hand than ever due to the rise of AutoML.

Beginning in 2019, advancements in AI have removed obstacles and delivered tools that put its power within reach. Foremost among them is Automated Machine Learning (AutoML), which does not require programming or scripting of any kind. Here are some examples.

Business Analysts
“AI is the new BI” is a theme repeated for years by journalists and product marketers. AI is now simple enough that business analysts can make reliable predictions without being data scientists or programmers. Moving from visualizing trends with BI to predicting the future—record-by-record—with AI is transforming the way businesses run. AutoML is what makes that possible.

Citizen Data Scientists
Researchers and designers who need to understand patterns know their problems and data very well, but not necessarily the AI algorithms that can extract information they need. AutoML abstracts algorithm selection and use to produce results quickly.

Data Science Professionals
AutoML does not replace the need for true data scientists, whose expertise in solving complicated AI problems cannot be replicated. Nevertheless, AI pros use AutoML to glean insights on problems to make custom work more efficient. In addition, offloading simpler BI problems to AutoML keeps them focused where they are most needed.

Conclusion: AutoML makes achieving the benefits of AI simpler for everyone.

Why AI for Marketing and Sales?

Follow the money to see why marketing and sales are the most common applications for AI.

Instant Payback
Small improvements in marketing and sales can produce large returns quickly. Think of the impact of gaining a few percentage points on lead conversions, forecast accuracy, content targeting, and ad performance. Knowing which customers will buy, what they will buy, and when they will buy delivers value on both revenue and cost sides of the ledger.

Plenty of Data
More information than ever is available in CRM, marketing automation, and customer data platforms. AI—in the form of Automated Machine Learning (AutoML)—is really good at finding patterns in all that data to predict the future.

AutoML does not require programming or formula creation in order to make accurate predictions. Models can be made and refined rapidly. This is particularly important in supporting nimble marketing and sales processes.

AutoML insights for marketing and sales are easy to monetize and straightforward to execute. That makes them great places to amplify the benefits of AI.