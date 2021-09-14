This article is excerpted from the course "Fundamental Machine Learning," part of the Machine Learning Specialist certification program from Arcitura Education. It is the final part of the 13-part series, "Using machine learning algorithms, practices and patterns."

In this, the conclusion of our machine learning series, we cover two more machine learning model optimization techniques -- specifically, the lightweight model implementation and incremental model learning techniques. These final two techniques conclude this series' coverage of model optimization practices. As explained in Part 4, these techniques are documented in a standard pattern profile format.

Lightweight model implementation: Overview

How can prediction latency be kept to a minimum in a real-time data processing system while guaranteeing acceptable accuracy?

Problem. A complex model with high accuracy generally incurs high prediction latency when deployed in real-time systems, leading to degraded system performance with the further consequences of potential business loss.

Solution. A lightweight model with acceptable accuracy is trained and deployed to make real-time predictions.

Application. A statistical or a simple machine learning algorithm, such as naive Bayes, is used to train a predictive model that is then deployed in the real-time system to make low-latency predictions.
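As a rough illustration of the application described above, the following sketch trains a Gaussian naive Bayes classifier offline and then scores single data points. It is a minimal pure-Python sketch rather than a production implementation; the function names, the sensor features and the toy readings are all assumptions made for the example.

```python
import math
from collections import defaultdict


def train_gaussian_nb(rows, labels):
    """Offline step: fit a log-prior plus per-feature mean and variance per class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    total = len(rows)
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / total), means, variances)
    return model


def predict(model, row):
    """Real-time step: return the class with the highest log-posterior."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical sensor readings: (temperature, vibration).
train_rows = [(20.1, 0.2), (19.8, 0.3), (20.5, 0.1),
              (35.2, 1.9), (36.0, 2.2), (34.7, 2.0)]
train_labels = ["normal", "normal", "normal", "fault", "fault", "fault"]

nb_model = train_gaussian_nb(train_rows, train_labels)
print(predict(nb_model, (20.0, 0.25)))  # normal
print(predict(nb_model, (35.5, 2.1)))   # fault
```

Each prediction touches only one mean and one variance per feature per class, which is what keeps the per-point cost small.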

Lightweight model implementation: Explained

Problem

A real-time data processing system, such as an event stream processing (ESP) system, requires the minimum possible delay between the time data is ingested and the time predictions become available. However, aiming for the maximum possible accuracy generally results in a complex model, which increases prediction time. Although the system generates accurate results, the window of opportunity may already have closed by the time the results are accessible. In a very high-velocity data processing system, such as an IoT-driven system, this not only incurs prediction lag but also results in excessive use of memory and processing resources. (See Figure 1.)

Figure 1. A training data set is prepared (1). It is then used to train a model (2). The result is a complex model (3). The model is deployed to a real-time prediction system (4). Streaming data enters the real-time prediction system at time t-0 (5). At time t-1, the real-time prediction system makes the first prediction related to the first data point, and at time t-2 the real-time prediction system makes the second prediction, and so on (6). However, the time taken to make each prediction results in high latency (7).

Solution

A real-time system suffers from the limitations described by the speed consistency volume (SCV) principle. To be able to operate at high velocity (S) while making sure that all data gets processed (V), the accuracy of the predictions needs to be relaxed (C) by employing a lightweight model. Such models are either probability-based or use a linear classifier to keep the memory and computation overhead to a minimum. Although the accuracy of such models is not on par with that of complex models, their ability to predict faster offsets the reduced accuracy.
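To make the overhead argument concrete, the sketch below compares the per-prediction work of a linear classifier with that of an RBF kernel classifier, which stands in here for a complex model. The operation count is only a crude proxy for latency, and all weights, support vectors and parameter values are hypothetical rather than taken from any trained model.

```python
import math


def linear_predict(weights, bias, x):
    """Linear classifier: one multiply-add per feature."""
    score = bias + sum(w * xi for w, xi in zip(weights, x))
    label = 1 if score >= 0 else 0
    return label, len(weights)  # (prediction, per-feature operations)


def rbf_kernel_predict(support_vectors, coefs, bias, gamma, x):
    """Kernel classifier: one kernel evaluation per stored support vector."""
    score, ops = bias, 0
    for sv, coef in zip(support_vectors, coefs):
        sq_dist = sum((si - xi) ** 2 for si, xi in zip(sv, x))
        score += coef * math.exp(-gamma * sq_dist)
        ops += len(sv)  # one squared difference per feature
    label = 1 if score >= 0 else 0
    return label, ops


# Hypothetical parameters, chosen only to show the difference in work done.
point = (1.5, 1.5, 1.5)
light = linear_predict((1.0, 1.0, 1.0), 0.0, point)
heavy = rbf_kernel_predict(
    [(1, 1, 1), (2, 2, 2), (-1, -1, -1), (-2, -2, -2)],
    [1.0, 1.0, -1.0, -1.0],
    0.0,
    0.5,
    point,
)
print(light)  # (1, 3)
print(heavy)  # (1, 12)
```

The linear model's work grows only with the number of features, whereas the kernel model's work also grows with the number of support vectors it has to store and compare against.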
To reduce false positives or to gain further insight into the data points, the data points can then be fed to a more complex model operating in near real time that may make use of other contextual data not available to the real-time prediction pipeline.

Application

A lightweight algorithm is chosen according to whether the problem is one of regression or classification. For regression, linear regression or a pruned decision tree can be used; for classification, logistic regression, a linear support vector machine (SVM) or naive Bayes can be used. In either case, the model is trained offline using as much data as possible. The trained model can then be deployed in the real-time system.

With IoT devices, once the lightweight classification model predicts that a data value falls within a class of interest, or the regression model's prediction falls within a range of interest, one strategy to gain insight or to confirm the model's prediction is to reconfigure the IoT device to gather more granular data that can then be fed to complex models, such as kernel SVM. The same strategy can also be employed for non-IoT systems.

The use of complex models as a second stage may also be required when the prediction must be explainable to a decision maker or a third party, as with the detection of fraudulent bank transactions. Models based on rule-centered algorithms, such as decision trees and classification rules, can be used in such scenarios.

The lightweight model implementation pattern can further benefit from the application of the incremental model learning pattern, which keeps retraining time and resources to a minimum. This makes it practical to update the production model more frequently, thereby achieving more accurate predictions. (See Figure 2.)

Figure 2. A training data set is prepared (1). It is then used to train a model using a lightweight algorithm (2). This results in a lightweight model (3). The model is then deployed to a real-time prediction system (4). Streaming data enters the real-time prediction system at time t-0 (5). At time t-1, the real-time prediction system makes the first prediction related to the first data point, and at time t-2 the real-time prediction system makes the second prediction, and so on (6). The latency that occurs as a result of making predictions is minimal (7).
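The two-stage arrangement described in the Application section might be sketched as follows: a logistic regression model is trained offline, scores each streamed data point, and places only the points of interest on a queue for a slower second-stage model. This is a simplified sketch under stated assumptions. The feature values, the 0.5 threshold and the names `fast_score` and `second_stage_queue` are hypothetical, and the second-stage model itself is left out.

```python
import math
from collections import deque


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, epochs=500, lr=0.1):
    """Offline step: fit weights and bias with plain batch gradient descent."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - y
            grad_b += error
            for j, x in enumerate(row):
                grad_w[j] += error * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def fast_score(model, row):
    """Real-time step: probability that the point belongs to the class of interest."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


# Hypothetical standardized features; label 1 marks the class of interest.
train_rows = [(-1.0, -0.8), (-0.7, -1.1), (-0.9, -0.6),
              (1.1, 0.9), (0.8, 1.2), (1.0, 0.7)]
train_labels = [0, 0, 0, 1, 1, 1]
model = train_logistic_regression(train_rows, train_labels)

# Only flagged points are handed to the slower, near-real-time second stage.
second_stage_queue = deque()
stream = [(-0.8, -0.9), (1.2, 1.0), (-0.5, -0.7)]
for point in stream:
    if fast_score(model, point) >= 0.5:
        second_stage_queue.append(point)

print(list(second_stage_queue))  # [(1.2, 1.0)]
```

Keeping the second stage behind a queue means the real-time path never waits on the complex model, so its latency stays bounded by the lightweight scorer alone.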