Explore the foundations of artificial neural network modeling

Dive into Giuseppe Bonaccorso's recent book 'Mastering Machine Learning Algorithms' with a chapter excerpt on modeling neural networks.

Published: 31 Aug 2020

Deep learning neural networks are usually rife with challenges. For all their layered capabilities, the algorithms themselves are hard to create and even harder to manage. From the demand for millions of data points used in model training to the black box decision-making process, data scientists are fighting an uphill battle right from the start of neural network creation. However, with bigger risk comes bigger reward: Deep learning artificial neural networks can produce state-of-the-art performance in regression, image classification and business applications.

This anxiety around training methods and limitations of the algorithms is not lost on Giuseppe Bonaccorso, who recently wrote a 700-page how-to manual. For enterprises that want to take a dive into artificial neural network modeling, Bonaccorso, who is the global head of innovative data science at Bayer, offers his take on popular issues, training strategies and why building a model can work.

What are the benefits of creating an artificial neural network from scratch, especially when there are so many prepackaged vendor offerings?

Giuseppe Bonaccorso: The rationale behind the choice of a new model or an existing one should be rooted in the nature of the problem. For example, in image recognition, there are several high-performance networks that can be adapted to specific roles, but there are also problems that require more customized solutions. In these cases, building a model from scratch is likely to be the optimal strategy.

Neural networks are very flexible models. There are cases when existing architectures can simplify the work, as well as pretrained models where only some layers are retrained to meet specific requirements in transfer learning. Start with simple networks. If the results are poor, it's possible to increase complexity. However, the simplest model that guarantees both accuracy and generalization ability [is the right one].

Which toolkits are best suited to model and create a neural network?

Bonaccorso: My primary choice is TensorFlow 2, which now includes Keras as its high-level module. Using TensorFlow, a data scientist can easily start with Keras models based on predefined layer structures and, if necessary, switch to more advanced features. There are also other frameworks, like PyTorch. I believe there are no silver bullets, but it's important that once a framework is chosen, all its features are thoroughly studied and evaluated. Even if not immediately helpful, some features can, in fact, become essential to solving some problems in the most effective way.

Neural networks are famous for being difficult and hard to manage. What are the most common problems when modeling a deep learning network?

Bonaccorso: Deep neural networks are extremely complex models with tens of millions of parameters. Training them means finding the optimal set of parameters to achieve a predefined goal -- and the training can easily remain stuck in suboptimal solutions. In order to mitigate this problem, several optimization algorithms have been proposed. The role of a data scientist is to pick the most appropriate algorithm and tune up its hyperparameters to maximize both the training speed and the final accuracy. Moreover, these kinds of models have an intrinsically large capacity; the more parameters you introduce, the more complex the system becomes and, consequently, its ability to learn the training set increases very quickly.
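The point about training getting stuck in suboptimal solutions can be illustrated with a minimal sketch. It assumes a made-up one-dimensional loss function (not from the book) and plain gradient descent; the names `loss`, `grad` and `gradient_descent` are hypothetical. Real networks have millions of parameters, but the same effect appears: the starting point and the learning rate decide which minimum the optimizer reaches.

```python
def loss(x):
    # Hypothetical non-convex loss: global minimum near x = -1.30,
    # shallower local minimum near x = 1.13.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytical derivative of the loss above.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, learning_rate=0.01, steps=1000):
    # Plain gradient descent: repeatedly step against the gradient.
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Same algorithm and hyperparameters, different starting points.
x_stuck = gradient_descent(2.0)   # settles in the local minimum
x_best = gradient_descent(-2.0)   # settles in the global minimum

print(f"start  2.0 -> x = {x_stuck:.3f}, loss = {loss(x_stuck):.3f}")
print(f"start -2.0 -> x = {x_best:.3f}, loss = {loss(x_best):.3f}")
```

Optimizers with momentum or adaptive step sizes are designed to reduce this sensitivity, though none of them guarantees reaching the global minimum.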

When small data sets are employed, deep learning models can easily overfit and learn to associate each training input with the correct label but lose the ability to generalize. Generalizing is a key concept in learning because we'd like to model systems that can abstract from some examples to derive a generic 'concept' representing a specific class.
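A toy sketch can make the difference between memorizing and generalizing concrete. It assumes a tiny hand-made data set and two hypothetical models: a lookup table that stores every training pair, and a one-parameter threshold rule. Neither is a neural network; the example only illustrates the train/test gap that signals overfitting.

```python
# Tiny hand-made data set: the label is 1 when x is above roughly 0.5.
train = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
         (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
test = [(0.15, 0), (0.25, 0), (0.65, 1), (0.85, 1)]


def fit_memorizer(samples):
    # Stores every training pair; unseen inputs fall back to class 0.
    table = dict(samples)
    return lambda x: table.get(x, 0)


def fit_threshold(samples):
    # Learns one parameter: the midpoint between the two classes.
    highest_zero = max(x for x, y in samples if y == 0)
    lowest_one = min(x for x, y in samples if y == 1)
    threshold = (highest_zero + lowest_one) / 2
    return lambda x: int(x > threshold)


def accuracy(model, samples):
    return sum(model(x) == y for x, y in samples) / len(samples)


memorizer = fit_memorizer(train)
rule = fit_threshold(train)

print("memorizer:", accuracy(memorizer, train), accuracy(memorizer, test))
print("threshold:", accuracy(rule, train), accuracy(rule, test))
```

Both models are perfect on the training set, but only the threshold rule has abstracted a concept that carries over to unseen inputs.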

Unfortunately, overfitting is a very common issue when working with deep neural networks. However, data scientists can employ regularization, dropout and batch normalization techniques to mitigate it.
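As an illustration of one of these techniques, the following is a minimal sketch of dropout in its common "inverted" form, written in plain Python instead of a framework. The function name `dropout` and its parameters are assumptions for this example; frameworks such as Keras ship their own dropout layers.

```python
import random


def dropout(activations, rate, training, rng):
    # At inference time, dropout is a no-op.
    if not training or rate == 0.0:
        return list(activations)
    keep_prob = 1.0 - rate
    # Zero each activation with probability `rate` and scale the
    # survivors so the expected value of every unit is unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)
hidden = [1.0] * 10
print(dropout(hidden, rate=0.5, training=True, rng=rng))
print(dropout(hidden, rate=0.5, training=False, rng=rng))
```

Because a different random subset of units is silenced at each training step, the network cannot rely on any single unit, which tends to reduce overfitting.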

How can data scientists keep their models accurate, fast and optimized over time with a model that is hard to retrain?

Bonaccorso: Once a model is properly trained, it is stable with respect to its underlying data-generating process. However, many models are based on training sets that represent time-changing processes.

In fact, one common problem when retraining networks is that they tend to forget past knowledge when new data is submitted. In order to avoid this problem, the new training sets must contain data sampled from the new data-generating process. For example, if we have trained a model to distinguish between cats and dogs and we want to extend it also to tigers, we cannot simply create a tiger data set -- we need to create a new set containing all three classes to learn to distinguish among features.
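The cats, dogs and tigers example can be sketched as a small data-preparation step. The helper `build_retraining_set` and the file names are hypothetical placeholders; the only point is that the retraining set mixes old and new classes instead of containing tigers alone.

```python
import random


def build_retraining_set(old_samples, new_samples, seed=0):
    # Combine old and new data so every class is represented, then
    # shuffle so training batches mix all of them.
    combined = list(old_samples) + list(new_samples)
    random.Random(seed).shuffle(combined)
    return combined


# Hypothetical (file name, label) pairs.
old_data = [("cat_001.jpg", "cat"), ("dog_001.jpg", "dog")]
new_data = [("tiger_001.jpg", "tiger")]

retraining_set = build_retraining_set(old_data, new_data)
labels = {label for _, label in retraining_set}
print(sorted(labels))
```

In practice the classes would also need to be reasonably balanced, which this sketch does not attempt.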

Current learning algorithms are very sensitive to drastic changes in the training sets, so it's important to keep this concept in mind when you need to update or retrain an algorithm. A more complex problem arises when the current architecture doesn't have enough capacity to learn more classes. In this case, the model would underfit, showing very low accuracy, and the data scientist would have to consider a deeper or more complex architecture, along with a larger data set to avoid overfitting it.

Images and volumetric data contain features that are easily detected using special operators (like convolutions), which work with the geometric structure of the samples. This intuition is based on direct observations of the structure of biological vision systems, where subsequent layers are responsible for extracting more and more detailed features.

Convolutional deep neural networks are the starting point of every image-related problem, and, given the advancements in neural computation, their complexity is becoming easier to manage. Of course, convolutional layers are not enough to solve all the problems. Other helpful layers (like pooling, padding and up/down-sampling) are necessary to achieve specific goals. Aspiring data scientists should study and learn how to apply all layers in the most accurate and reasonable way.
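To make the convolution and pooling operators concrete, here is a minimal pure-Python sketch on a tiny hand-made image. The function names `conv2d_valid` and `max_pool2d` are assumptions for this example, and the "convolution" is the cross-correlation form that deep learning frameworks typically use. The kernel is a hand-picked vertical-edge detector, whereas a real network would learn its kernels during training.

```python
def conv2d_valid(image, kernel):
    # Slide the kernel over the image without padding and take the
    # weighted sum of the pixels under it at each position.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def max_pool2d(feature_map, size=2):
    # Down-sample by keeping the maximum of each size x size block.
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[max(feature_map[i * size + di][j * size + dj]
                 for di in range(size) for dj in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


# 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Responds where brightness increases from left to right.
kernel = [[-1, 1],
          [-1, 1]]

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features)           # 2x2 after pooling

print(features)
print(pooled)
```

The feature map is nonzero only along the vertical edge, and pooling keeps that response while halving each spatial dimension.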

Dive into Mastering Machine Learning Algorithms

Read Chapter 17, "Modeling Neural Networks," of Bonaccorso's book, published by Packt Publishing.
