Date: 2021-08-20

This article is excerpted from the course "Fundamental Machine Learning," part of the Machine Learning Specialist certification program from Arcitura Education. It is the eleventh part of the 13-part series, "Using machine learning algorithms, practices and patterns."

Model optimization techniques are used to improve the performance of machine learning models. This and the next, final article in this series cover a set of optimization techniques that are normally applied toward the end of a machine learning problem-solving task, after a given model has been trained but when there exist opportunities to make it more effective.

This article describes the first two of four optimization practices: the ensemble learning and the frequent model retraining techniques. As explained in Part 4, these techniques are documented in a standard pattern profile format.

Ensemble learning: Overview

How can the accuracy of a prediction task be increased when different prediction models provide varying levels of accuracy?

Ensemble learning: Explained

Problem

In the context of a classification or regression task, different models carry different strengths, depending on whether they are trained using the same or different algorithms, and capture different aspects and relationships hidden in a data set. Even if the model with the highest accuracy is chosen, there is no guarantee that it will generalize well when exposed to unseen data in the production environment. Consequently, relying on a single model generally results in accuracy that fluctuates depending on whether the unseen data is of the type the model is good at predicting. (See Figure 1.)

Figure 1: A training data set is prepared to tackle a classification problem for the purpose of training three different models and then selecting the most effective model of the three (1). The K-NN algorithm is chosen to train the first model (2, 3). The Naïve Bayes algorithm is chosen to train the second model (4, 5). The decision trees algorithm is chosen to train the third model (6, 7). After training, it is determined based on the accuracy metric that all three models provide the same level of performance. However, upon comparison of a few test instances, the predicted labels vary between the three models. Based on the results, it is unclear which model to choose (8).

Solution

A single meta-model is generated that builds upon the strengths of each of its constituent models. The constituent models can be generated either by using the same type of algorithm for all models (each model differs in terms of its parameters) or by using a different algorithm for each model (each applicable to the same type of machine learning problem, such as regression or classification). The results of all the constituent models are combined using strategies such as voting or averaging.
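The voting and averaging strategies can be illustrated with a minimal sketch. All labels and numbers below are hypothetical and were chosen so that three classifiers have the same accuracy but are wrong on different test instances, as in the Figure 1 scenario. The function names are illustrative and do not come from the course material.

```python
from collections import Counter
from statistics import mean


def majority_vote(labels):
    """Return the label predicted by most of the constituent models."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical true labels and predictions for four test instances.
true_labels = ["spam", "spam", "spam", "ham"]
knn_preds = ["spam", "ham", "spam", "ham"]    # wrong on instance 2
nb_preds = ["spam", "spam", "ham", "ham"]     # wrong on instance 3
tree_preds = ["ham", "spam", "spam", "ham"]   # wrong on instance 1

# Classification: combine the three predictions per instance by voting.
ensemble_preds = [
    majority_vote(votes) for votes in zip(knn_preds, nb_preds, tree_preds)
]

for name, preds in [("k-NN", knn_preds), ("Naive Bayes", nb_preds),
                    ("decision tree", tree_preds), ("ensemble", ensemble_preds)]:
    print(name, accuracy(preds, true_labels))

# Regression: combine hypothetical numeric predictions by averaging.
print(mean([10.0, 12.0, 14.0]))
```

In this contrived data each classifier scores 0.75 while the vote scores 1.0, because no two models are wrong on the same instance. Real models rarely err this independently, so the gain is usually smaller.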
The meta-model works best when the constituent models are highly accurate but disagree among themselves, because voting on or averaging an identical set of results adds no value over the results obtained from a single model.

Application

This pattern can be applied via one of the following strategies for creating a meta-model:

Bagging. Bagging is short for bootstrap aggregating. The same algorithm is used to train multiple models in parallel such that each is trained on a subset of the training data constructed by bootstrap sampling. Bootstrap sampling is a sampling method where a sample is generated by randomly selecting items, with replacement, from a data set. That is, each time an item is selected, it is returned to the data set. Therefore, the same item can be selected more than once for the same sample. The meta-model is generated by aggregating the results of the different models via either voting (classification task) or averaging (regression task). A specialized form of bagging is random forests, whereby the underlying algorithm for the constituent models is a decision tree. In a random forest, however, each tree is also exposed to a different subset of features, which results in randomized trees that, upon aggregation, provide better accuracy than each of the constituent trees.
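The bagging procedure described above can be sketched with the Python standard library. This is a toy illustration under stated assumptions, not a production implementation. The base learner is a hypothetical one-feature threshold classifier (a depth-1 decision tree), the data is made up, and the number of models (11) is arbitrary. The models are trained one after another here, although they are independent and could be trained in parallel.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data at random, with replacement."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Fit a one-feature threshold classifier on (feature, label) pairs.

    Each observed feature value is tried as a threshold for the rule
    "predict 1 if x >= threshold"; the one with the fewest errors is kept.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x >= threshold) != bool(y) for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return lambda x: int(x >= best_threshold)


def train_bagged_ensemble(data, n_models, seed=0):
    """Train n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]


def predict_by_vote(models, x):
    """Aggregate the constituent predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical (feature, label) pairs: the label is 1 for larger values.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]
models = train_bagged_ensemble(data, n_models=11)
print(predict_by_vote(models, 0.5), predict_by_vote(models, 6.5))
```

For a regression task, the vote in `predict_by_vote` would be replaced by the mean of the constituent predictions. A random forest would additionally restrict each tree to a random subset of the features, which this single-feature sketch cannot show.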

Frequent model retraining: Overview

How can the efficacy of a model be guaranteed after its initial deployment?

Problem

After a model is deployed to the production environment, there is a strong possibility that the accuracy of the model may decrease over time, resulting in degraded system performance.

Solution

The machine learning model is kept in sync with the changing data by keeping the model up to date.

Application

The model is retrained at regular intervals by preparing a training data set that includes both historic and current data.
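As a rough sketch of retraining at regular intervals, the following example checks whether a retraining interval has elapsed and, if so, retrains on the historic data merged with the current data. Everything in it is an assumption made for illustration: the 30-day interval, the dates, the data rows, and the stand-in "model", which simply predicts the mean target value.

```python
from datetime import date, timedelta
from statistics import mean

# Assumed interval; the article does not prescribe one.
RETRAIN_INTERVAL = timedelta(days=30)


def retraining_due(last_trained, today):
    """Return True once a full interval has passed since the last training."""
    return today - last_trained >= RETRAIN_INTERVAL


def train_mean_model(rows):
    """Stand-in training step: the 'model' predicts the mean target value."""
    average = mean(target for _, target in rows)
    return lambda features: average


def retrain(historic_rows, current_rows):
    """Train on historic plus current data; return the model and merged data."""
    training_rows = historic_rows + current_rows
    return train_mean_model(training_rows), training_rows


# Hypothetical (features, target) rows whose targets drift upward over time.
historic = [((1,), 10.0), ((2,), 12.0)]
current = [((3,), 20.0), ((4,), 22.0)]

model = train_mean_model(historic)
last_trained = date(2021, 1, 1)
today = date(2021, 2, 1)

if retraining_due(last_trained, today):
    model, historic = retrain(historic, current)
    last_trained = today

print(model((5,)))
```

Before retraining, the stand-in model predicts 11.0. After retraining on the merged data it predicts 16.0, which moves it toward the more recent values. In practice, `train_mean_model` would be replaced by the actual training routine.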