Model development is not a one-size-fits-all affair -- there are different types of machine learning algorithms for different business goals and data sets. For example, the relatively straightforward linear regression algorithm is easier to train and implement than other machine learning algorithms, but it may fail to add value to a model requiring complex predictions.
The nine machine learning algorithms that follow are among the most popular and commonly used to train enterprise models. The algorithms each support different goals, vary in user-friendliness and use one or more of the following machine learning approaches: supervised learning, unsupervised learning, semi-supervised learning or reinforcement learning.
Supervised machine learning algorithms
Supervised learning models require data scientists to provide the algorithm with input data sets and the corresponding labeled outputs, as well as feedback on accuracy during the training process. They are task-based and are tested on labeled data sets.
Linear regression
The most popular type of machine learning algorithm is arguably linear regression. Linear regression algorithms map simple correlations between two variables in a set of data. A set of inputs and their corresponding outputs are examined and quantified to show a relationship, including how a change in one variable affects the other. Linear regressions are plotted via a line on a graph.
Linear regression's popularity is due to its simplicity: The algorithm is easily explainable, relatively transparent and requires little to no parameter tuning. Linear regression is frequently used in sales forecasting and risk assessment for enterprises that seek to make long-term business decisions.
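To make the idea concrete, here is a minimal sketch of fitting a line with ordinary least squares in plain Python. The ad spend and sales figures are made up for illustration, and `fit_line` is a hypothetical helper, not a function from any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: monthly ad spend (in thousands) and units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(ad_spend, units_sold)
forecast = slope * 6.0 + intercept  # predicted units sold at a spend of 6
```

The slope is the part that answers "how does a change in one variable affect the other": in this toy data, each extra unit of spend adds two units sold.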
Linear regression is best for when "you are looking at predicting your value or predicting a class," said Shekhar Vemuri, CTO of technology service company Clairvoyant, based in Chandler, Ariz.
Support vector machines
A support vector machine (SVM) is a machine learning algorithm that separates data into classes. During model training, an SVM finds a line that separates data in a given set into specific classes and maximizes the margin between that line and the nearest points of each class. After learning these classification lines, the model can then apply them to future data.
This algorithm works best for training data that can clearly be separated by a line or, in higher dimensions, a hyperplane. Data that can't be separated this way can be handled by nonlinear SVMs, which use kernel functions to draw curved boundaries. But, with training data that's hyper-complex -- faces, personality traits, genomes and genetic material -- the class systems become smaller and harder to identify and require a bit more human assistance.
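As a rough illustration of the linear case, the sketch below trains a two-feature SVM by sub-gradient descent on the regularized hinge loss, which is one common way to fit a linear SVM. The data points, the hyperparameter values and the helper names `train_linear_svm` and `predict` are assumptions made for this example.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=200):
    """Fit w and b for the line w[0]*x1 + w[1]*x2 + b = 0 by sub-gradient
    descent on the regularized hinge loss. Labels must be +1 or -1."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # Regularization step: keeping w small is what widens the margin.
            w[0] -= lr * lam * w[0]
            w[1] -= lr * lam * w[1]
            if margin < 1:
                # The point is misclassified or inside the margin,
                # so nudge the line away from it.
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b


def predict(w, b, point):
    """Classify a point by which side of the learned line it falls on."""
    score = w[0] * point[0] + w[1] * point[1] + b
    return 1 if score >= 0 else -1


# Hypothetical, clearly separable data: three points per class.
points = [(-2.0, -2.0), (-3.0, -2.0), (-2.0, -3.0),
          (2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
new_prediction = predict(w, b, (4.0, 4.0))
```

Only points that fall on the wrong side of the line or inside the margin trigger an update, which is why the boundary ends up determined by the points nearest to it.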
SVMs are used heavily in the financial sector, as they offer high accuracy on both current and future data sets. The algorithms can be used to compare relative financial performance, value and investment gains virtually.
Companies with nonlinear data and different kinds of data sets often use SVM, Vemuri said.
Decision trees
A decision tree algorithm takes data and graphs it out in branches to show the possible outcomes of a variety of decisions. Decision trees classify or predict response variables based on past decisions.
Decision trees are a visual method of mapping out decisions. Their results are easy to explain and can be accessible to citizen data scientists. A decision tree algorithm maps out various decisions and their likely impact on an end result and can even be used with incomplete data sets.
Decision trees, due to their long-tail visuals, work best for small data sets, low-stakes decisions and concrete variables. Because of this, common decision tree use cases involve augmenting option pricing -- from mortgage lenders classifying borrowers to product management teams quantifying the shift in market that would occur if they changed a major ingredient.
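The sketch below shows how a tree might pick a single branch for the mortgage example. It assumes Gini impurity as the splitting criterion, which is one common choice rather than anything the sources above specify, and the borrower records and the helper names `gini` and `best_split` are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best binary split."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must send rows down both branches
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Hypothetical borrowers: (annual income in thousands, debt-to-income percent).
borrowers = [(30, 45), (35, 20), (40, 50), (60, 40), (75, 25), (90, 30)]
decisions = ["deny", "deny", "deny", "approve", "approve", "approve"]

feature, threshold, impurity = best_split(borrowers, decisions)
```

A full tree would apply `best_split` again to the rows in each branch until the branches are pure or too small to split further.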
Decision trees remain popular because they can outline multiple outcomes and tests without requiring data scientists to deploy multiple algorithms, said Jeff Fried, director of product management for InterSystems, a software company based in Cambridge, Mass.
Unsupervised machine learning algorithms
Unsupervised machine learning algorithms are not trained on labeled examples supplied by data scientists. Instead, they identify patterns in data by combing through sets of unlabeled training data and observing correlations. Unsupervised learning models receive no information about what to look for in the data or which data features to examine.
Apriori
The Apriori algorithm, based on the Apriori principle, is most commonly used in market basket analysis to mine item sets and generate association rules. The algorithm checks how often items appear together in a data set to determine whether there's a positive or negative association between them.
The Apriori algorithm is primed for sales teams that seek to notice which products customers are more likely to buy in combination with other products. If a high percentage of customers who purchase bread also purchase butter, the algorithm can conclude that purchase of A (bread) will often lead to purchase of B (butter). This can be cross-referenced in data sets, data points and purchase ratios.
Apriori algorithms can also determine that purchase of A (bread) is only 10% likely to lead to the purchase of C (corn). Marketing teams can use this information to inform things like product placement strategies. Beyond sales functions, Apriori algorithms are favored by e-commerce giants, like Amazon and Alibaba, and are also used by search sites, like Bing and Google, to understand searcher intent and predict searches by correlating associated words.
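The bread-and-butter reasoning can be sketched in a few lines of Python. The ten baskets, the 0.25 support threshold and the helper names below are assumptions for illustration, so the percentages they produce won't match the figures quoted above.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated likelihood of buying consequent given that antecedent was bought."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


def apriori(transactions, min_support):
    """Return {itemset: support} for every item set that meets min_support."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support(frozenset([i]), transactions) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Join step: combine frequent sets into candidates with k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (the Apriori principle): a set can only be frequent
        # if every one of its smaller subsets is frequent too.
        current = [c for c in candidates
                   if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
                   and support(c, transactions) >= min_support]
    return frequent


# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter"}, {"bread", "butter", "milk"}, {"bread", "butter"},
    {"bread", "milk"}, {"bread", "butter", "corn"}, {"milk", "corn"},
    {"bread", "butter", "milk"}, {"butter", "milk"}, {"bread", "butter"},
    {"bread", "butter", "milk"},
]

frequent = apriori(baskets, min_support=0.25)
bread_to_butter = confidence(frozenset({"bread"}), frozenset({"butter"}), baskets)
bread_to_corn = confidence(frozenset({"bread"}), frozenset({"corn"}), baskets)
```

In this toy data, corn appears too rarely to count as frequent, so the prune step means no larger item set containing corn is ever counted.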
K-means clustering
The K-means algorithm is an iterative method of sorting data points into groups based on similar characteristics. For example, a K-means cluster algorithm would sort web results for the word "civic" into groups relating to Honda Civic and civic as in municipal or civil.
K-means clustering has a reputation for accurate, streamlined groupings processed in a relatively short period of time, compared to other algorithms. K-means clustering is popular among search engines to produce relevant information, among enterprises looking to group user behaviors by connotative meaning and in IT performance monitoring.
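The iterative part is easier to see in code. This is a minimal sketch in plain Python, assuming two features per data point and hand-picked starting centroids; the data and the `kmeans` helper are hypothetical.

```python
def kmeans(points, centroids, max_iters=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # nothing moved, so the grouping is stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical user behavior: (sessions per week, pages per session).
users = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5),
         (8.0, 9.0), (9.0, 8.0), (8.5, 8.5)]

centroids, clusters = kmeans(users, centroids=[users[0], users[3]])
```

Real implementations usually pick the starting centroids at random and rerun several times, because the final grouping can depend on where the centroids start.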
Semi-supervised machine learning algorithms
Semi-supervised learning teaches an algorithm through a mix of labeled and unlabeled data. The algorithm first learns certain information through a set of labeled categories, suggestions and examples. Semi-supervised algorithms then create their own labels by exploring the data set or virtual world on their own, following a rough outline or some data scientist feedback.
Generative adversarial networks
Generative adversarial networks (GANs) are deep generative models that have gained popularity. GANs have the ability to imitate data in order to model and predict. They work by essentially pitting two models against each other in a competition. One neural network, the generator, creates new data, while another, the discriminator, tries to tell the generator's output apart from real data. The discriminator's feedback pushes the generator to improve, so after many iterations the generated data becomes more and more lifelike and realistic. Popular media uses GANs to do things like face creation and audio manipulation. GANs are also impactful for creating large data sets using limited training points, optimizing models and improving manufacturing processes.
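The competition between the two networks is usually written as a minimax objective. The notation here is an assumption borrowed from the standard GAN formulation, not from the text above: G is the generator, D is the discriminator's estimated probability that a sample is real, x is a real data point and z is the random noise fed to the generator.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to make both terms large by scoring real data near 1 and generated data near 0, while the generator tries to make the second term small by producing samples the discriminator scores as real.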
Self-trained Naïve Bayes classifier
Self-training algorithms are examples of semi-supervised learning. Developers can add to these models a Naïve Bayes classifier, which allows self-trained algorithms to perform classification tasks simply and easily. When developing a self-trained model, researchers train the algorithm to recognize object classes on a labeled training set. Then the researchers have the model classify unlabeled data. Once that cycle is finished, researchers add the model's most confident self-assigned labels to the training data and retrain. Self-trained models are popular in natural language processing (NLP) and among organizations with limited labeled data sets.
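The train, classify and retrain cycle might look like the sketch below. It assumes a Gaussian Naïve Bayes classifier over two numeric features and a 0.95 confidence cutoff for accepting a self-assigned label; the messages, the feature meanings and the helper names are all hypothetical.

```python
import math


def fit_nb(rows, labels):
    """Fit Gaussian Naive Bayes: per class, a prior plus a mean and variance per feature."""
    model = {}
    for c in sorted(set(labels)):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-6
            stats.append((mean, var))
        model[c] = (len(members) / len(rows), stats)
    return model


def predict_nb(model, row):
    """Return (most likely class, its posterior probability) for one row."""
    log_scores = {}
    for c, (prior, stats) in model.items():
        log_p = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            # The "naive" step: features are treated as independent,
            # so their log-likelihoods simply add up.
            log_p += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        log_scores[c] = log_p
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    best = max(log_scores, key=log_scores.get)
    return best, math.exp(log_scores[best] - top) / total


def self_train(rows, labels, pool, threshold=0.95, max_rounds=10):
    """Repeatedly label the pool, keep only confident labels, and retrain."""
    rows, labels, pool = list(rows), list(labels), list(pool)
    for _ in range(max_rounds):
        model = fit_nb(rows, labels)
        remaining = []
        for row in pool:
            label, prob = predict_nb(model, row)
            if prob >= threshold:
                rows.append(row)
                labels.append(label)
            else:
                remaining.append(row)
        if len(remaining) == len(pool):
            break  # no new confident labels, so stop
        pool = remaining
    return fit_nb(rows, labels), labels, pool


# Hypothetical messages described by (link count, exclamation mark count).
labeled_rows = [(0.0, 1.0), (1.0, 0.0), (5.0, 6.0), (6.0, 5.0)]
labeled_tags = ["ham", "ham", "spam", "spam"]
unlabeled = [(0.5, 0.5), (1.5, 1.0), (5.5, 5.5), (4.5, 5.0), (3.0, 3.0)]

model, final_tags, still_unlabeled = self_train(labeled_rows, labeled_tags, unlabeled)
```

The confidence cutoff matters: a point the model can't place with confidence stays unlabeled rather than feeding a guess back into the training data.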
Reinforcement learning algorithms
Reinforcement learning algorithms are based on a system of rewards and punishments learned through trial and error. The model is given a goal and seeks maximum reward for getting closer to that goal based on limited information and learns from its previous actions. Reinforcement learning algorithms can be model-free -- creating interpretations of data through constant trial and error -- or model-based -- adhering more closely to a set of predefined steps with minimal trial and error.
Q-learning
Q-learning algorithms are model-free, which means they learn the best way to reach a defined goal by trying many actions and keeping a running estimate of the reward each one eventually earns. Q-learning is often paired with deep learning models in research projects, including those at Google's DeepMind. Related techniques that build on Q-learning include deep deterministic policy gradient (DDPG) and hindsight experience replay (HER).
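A tabular version of the idea fits in a short sketch. The environment here is an assumption made for illustration: a corridor of five positions where the agent starts at the left end and earns a reward of 1 only on reaching the right end. The function name and hyperparameter values are hypothetical too.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a corridor. Action 0 moves left, action 1 moves right.
    Reaching the last state ends the episode with a reward of 1."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore by acting at random. Q-learning is off-policy, so it
            # can still learn the values of the best actions this way.
            action = rng.randrange(2)
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Move the estimate toward: reward now + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# The learned policy picks the action with the higher estimate in each state.
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
```

No map of the corridor is ever given to the agent, which is what makes the approach model-free: the table is built purely from the rewards observed after each action.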
Model-based value estimation
Unlike model-free approaches like Q-learning, model-based algorithms have limited freedom to create potential states and actions and are statistically more efficient. Such algorithms, like the popular MBVE, are fitted with a specific data set and base action using supervised learning. Designers of MBVE note that "model-based methods can quickly arrive at near-optimal control with learned models under fairly restricted dynamics classes." Model-based methods are designed for specific use cases.