Linear Algebra

A branch of mathematics concerning vector spaces and linear mappings between them. It includes the study of lines, planes, and subspaces, but is also concerned with properties common to all vector spaces.

Logloss

Logloss (or Logarithmic Loss) measures classification performance; specifically, the uncertainty of the predictions. This metric evaluates how close a model’s predicted probabilities are to the actual target values. For example, does the model tend to assign a high predicted value like .90 to the positive class, or does it show a poor ability to identify the positive class and assign a lower predicted value like .40? Logloss ranges from 0 upward with no upper bound; a value of 0 means the model assigns a probability of 100% to the correct class (and 0% to the incorrect one). Logloss is especially sensitive to predictions that assign a very low probability to the class that actually occurs.
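
As a rough illustration, here is a minimal, dependency-free sketch of binary logloss; the function name, the clipping constant eps, and the sample values are assumptions for the example, not part of any particular library.

```python
import math

def binary_logloss(y_true, y_pred, eps=1e-15):
    """Average negative log-likelihood of the true labels, given predicted
    probabilities of the positive class. Illustrative sketch only."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) is never taken
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# A confident but wrong prediction (0.01 for an actual positive) dominates
# the score, which is why the result here is well above 1.
print(binary_logloss([1, 0, 1], [0.90, 0.10, 0.01]))  # ~1.61
```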

Machine learning

“A field of study that gives computers the ability to learn without being explicitly programmed.” (Arthur Samuel, 1959)

Max F1

F1 is a score between 0 (worst) and 1 (best) that shows how well a classification model performed on your dataset. It is a different check from accuracy: it measures how well the model distinguishes among the classes. For instance, if you are classifying 100 types of wine – 99 red and one white – and your model predicts that all 100 are red, then it is 99% accurate. But the high accuracy masks the model’s inability to tell red wines from white ones.

F1 is particularly revealing when class frequencies are imbalanced, as in the wine example. The F1 calculation considers both the Precision and the Recall of the model:

Precision = How likely is a positive classification to be correct? = True Positives/(True Positives + False Positives)

Recall = How likely is the classifier to detect a positive? = True Positives/(True Positives + False Negatives)

F1 = 2 * ((Precision * Recall) / (Precision + Recall))

Max F1 is the probability cut-off used to turn predicted probabilities into class predictions: it is the threshold at which the F1 score is maximized. When a row’s P1 (probability the outcome will occur) value is at or above this threshold, the outcome is predicted to happen; when P1 is below it, the outcome is predicted not to happen. This is why the cut-off point is not always the 50% you might expect. A small sketch of the threshold search follows below.
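
To make the threshold search concrete, here is a small, dependency-free sketch that computes Precision, Recall, and F1 at each candidate cut-off and keeps the one with the highest F1; the function names and sample data are illustrative assumptions.

```python
def precision_recall_f1(y_true, y_prob, threshold):
    """Confusion counts and metrics at a given probability threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

def max_f1_threshold(y_true, y_prob):
    """Scan candidate cut-offs and keep the one with the highest F1."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(y_prob)):
        _, _, f1 = precision_recall_f1(y_true, y_prob, t)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Imbalanced toy data: the F1-maximizing cut-off lands at 0.45, not 0.50.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80]
print(max_f1_threshold(y_true, y_prob))
```

In the toy data above, the best cut-off falls at 0.45 rather than 0.50, mirroring the point made earlier about why the threshold is not always 50%.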

Mean Absolute Error or MAE

MAE, or the Mean Absolute Error, is the average of the absolute errors (the absolute differences between predicted and actual values). The smaller the MAE, the better the model’s performance. MAE is expressed in the same units as your data’s dependent variable/target (so if that’s dollars, the error is in dollars), which is useful for judging whether the size of the error is meaningful. MAE is relatively insensitive to outliers; if your data has a lot of outliers, also examine the Root Mean Square Error (RMSE), which is sensitive to them.
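
A minimal sketch of the calculation, assuming plain Python lists; the function name and the dollar-valued sample are illustrative.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

# If the target is in dollars, the error is in dollars too: about $6.67 here.
print(mean_absolute_error([100, 200, 300], [90, 210, 300]))
```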

Mean Per Class Error

Mean Per Class Error (in Multi-class Classification only) is the average of the error rates of each class in your multi-class data set: each class’s misclassification rate is computed separately and then averaged across classes. The lower this metric, the better.
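
A minimal sketch under the assumption of hard class predictions (no probabilities); the function name and toy labels are illustrative.

```python
def mean_per_class_error(y_true, y_pred):
    """Average, over classes, of each class's misclassification rate."""
    classes = set(y_true)
    per_class = []
    for c in classes:
        rows = [(y, p) for y, p in zip(y_true, y_pred) if y == c]
        errors = sum(1 for y, p in rows if p != y)
        per_class.append(errors / len(rows))
    return sum(per_class) / len(per_class)

# Class "a" is always right, class "b" is wrong half the time -> (0 + 0.5) / 2 = 0.25
print(mean_per_class_error(["a", "a", "b", "b"], ["a", "a", "b", "c"]))
```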

Mean Square Error or MSE

MSE is the Mean Square Error and is a model quality metric; closer to zero is better. MSE measures the average of the squares of the errors or deviations: it takes the distances from the points to the regression line (these distances are the “errors”) and squares them, which removes any negative signs. MSE incorporates both the variance and the bias of the predictor, and because the errors are squared, it gives more weight to large errors than MAE does.
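
A minimal sketch of the calculation, parallel to the MAE example above; the function name and sample values are illustrative.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared errors; squaring removes signs and amplifies large misses."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

# One large miss (30 off) dominates: MSE = (100 + 0 + 900) / 3 ~ 333.3,
# while MAE on the same data would be (10 + 0 + 30) / 3 ~ 13.3.
print(mean_squared_error([100, 200, 300], [110, 200, 270]))
```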

Pragmatic AI

Pragmatic AI is designed to solve well-defined problems, as opposed to being allowed to seek its own purpose.

Predictive Analytics

Statistical techniques drawn from predictive modeling, machine learning, and data mining that analyze current and historical data to make predictions about future or otherwise unknown events.