generative adversarial network (GAN)

By

Kinza Yasar, Technical Writer
Sarah Lewis

What is a generative adversarial network (GAN)?

A generative adversarial network (GAN) is a machine learning (ML) model in which two neural networks compete with each other by using deep learning methods to become more accurate in their predictions. GANs typically run unsupervised and use an adversarial zero-sum game framework to learn, where one network's gain equals the other network's loss.

The two neural networks that make up a GAN are referred to as the generator and the discriminator. In image applications, the generator is typically a deconvolutional neural network and the discriminator is a convolutional neural network. The goal of the generator is to artificially manufacture outputs that could easily be mistaken for real data. The goal of the discriminator is to identify which of the outputs it receives have been artificially created.
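The zero-sum relationship between the two networks can be illustrated with the value function commonly used to describe GAN training: the discriminator tries to push a single score up, while the generator tries to push the same score down. The following is a minimal sketch only. The function name `value_function` and the sample probabilities are assumptions made up for illustration and don't come from any library.

```python
import math

def value_function(d_real, d_fake):
    """Mean of log D(x) over real samples plus mean of log(1 - D(G(z))) over fakes.

    d_real: discriminator probabilities assigned to real samples.
    d_fake: discriminator probabilities assigned to generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical discriminator outputs (probability that the input is real).
strong_discriminator = value_function(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
fooled_discriminator = value_function(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

# Zero-sum: whatever the discriminator gains, the generator loses.
discriminator_payoff = strong_discriminator
generator_payoff = -strong_discriminator
```

In this sketch, a discriminator that separates real from fake well earns a higher score than one that outputs 0.5 for everything, and the two payoffs always sum to zero.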

Essentially, GANs create their own training data. While the generator is trained to produce false data, the discriminator network is taught to distinguish between the generator's manufactured data and true examples. If the discriminator easily recognizes the fake data that the generator produces -- such as an image that doesn't look like a human face -- the generator suffers a penalty. As the feedback loop between the adversarial networks continues, the generator begins to produce higher-quality and more believable output, and the discriminator becomes better at flagging data that has been artificially created. For instance, a generative adversarial network can be trained to create realistic-looking images of human faces that don't belong to any real person.
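One way to picture the generator's penalty is as a loss that grows when the discriminator confidently flags the generator's output as fake. The sketch below assumes the widely used "non-saturating" generator loss, -log D(G(z)); the article doesn't specify which loss is meant, and the function name and scores are hypothetical.

```python
import math

def generator_penalty(d_on_fake):
    """Average of -log D(G(z)) over a batch of generated samples.

    d_on_fake: probabilities the discriminator assigns to generated samples
    being real. Low probabilities mean the fakes were easy to spot.
    """
    return -sum(math.log(p) for p in d_on_fake) / len(d_on_fake)

# Hypothetical scores: early fakes are spotted easily, later fakes are not.
early_penalty = generator_penalty([0.02, 0.05])
late_penalty = generator_penalty([0.45, 0.55])
```

With these made-up scores, the early penalty is several times larger than the late one, which is the pressure that drives the generator toward more believable output.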

How GANs work

The name GAN can be broken down into the following three parts:

Generative. This describes how data is generated in terms of a probabilistic model.
Adversarial. A model is trained in an adversarial setting.
Networks. Deep neural networks are used as the artificial intelligence (AI) algorithms that are trained.

The first step in establishing a GAN is to identify the desired end output and gather an initial training data set based on those parameters. The generator is then fed randomized input and trained until it acquires basic accuracy in producing outputs.

Next, the generated samples or images are fed into the discriminator along with actual data points from the original data set. After the generator and discriminator models have processed the data, optimization with backpropagation starts. The discriminator filters through the information and returns a probability between 0 and 1 to represent each image's authenticity -- 1 correlates with real images and 0 correlates with fake. These values are then compared against the known real and fake labels, and the process is repeated until the desired outcome is reached.
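A common way to obtain a probability between 0 and 1 is to pass the discriminator's raw score through a sigmoid function. The sketch below is illustrative: the raw scores are invented, and the 0.5 cutoff used to turn a probability into a "real" or "fake" label is an assumption, not something the article prescribes.

```python
import math

def sigmoid(score):
    """Squash a raw discriminator score into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical raw scores for three images.
raw_scores = [4.0, 0.0, -4.0]
probabilities = [sigmoid(s) for s in raw_scores]

# Assumed decision rule: 0.5 or higher is treated as real.
labels = ["real" if p >= 0.5 else "fake" for p in probabilities]
```

A large positive score maps close to 1 (likely real), a large negative score maps close to 0 (likely fake), and a score of 0 maps to exactly 0.5.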

A GAN typically takes the following steps:

The generator outputs an image after accepting random numbers.
The discriminator receives this created image in addition to a stream of photos from the real, ground-truth data set.
The discriminator takes in both real and fake images and outputs probabilities -- a value between 0 and 1 -- where 1 indicates a prediction of authenticity and 0 indicates a fake.

This creates a double feedback loop where the discriminator is in a feedback loop with the ground truth of the images and the generator is in a feedback loop with the discriminator.
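The steps and the double feedback loop above can be sketched as a toy training loop. To keep it small, the example replaces images with single numbers: the "real" data is drawn from a bell curve centered on 4, the generator is a straight line applied to random noise, and the discriminator is a one-input logistic classifier. All of these choices, along with the names, learning rate and non-saturating generator update, are assumptions for illustration. Toy setups like this can oscillate rather than settle, so the sketch makes no claim about final values.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def generate(z, g):
    """Generator: turns a random number z into a fake sample."""
    return g["a"] * z + g["b"]

def discriminate(x, d):
    """Discriminator: probability (0 to 1) that x is real."""
    return sigmoid(d["w"] * x + d["c"])

def train_step(g, d, real_batch, noise_batch, lr=0.05):
    """Update both parameter dictionaries in place for one batch."""
    n = len(real_batch)
    fake_batch = [generate(z, g) for z in noise_batch]

    # Feedback loop 1: discriminator vs. ground truth.
    # Gradient ascent on log D(real) + log(1 - D(fake)).
    grad_w = grad_c = 0.0
    for x, f in zip(real_batch, fake_batch):
        p_real, p_fake = discriminate(x, d), discriminate(f, d)
        grad_w += (1.0 - p_real) * x - p_fake * f
        grad_c += (1.0 - p_real) - p_fake
    d["w"] += lr * grad_w / n
    d["c"] += lr * grad_c / n

    # Feedback loop 2: generator vs. discriminator.
    # Gradient ascent on log D(fake), backpropagated through the discriminator.
    grad_a = grad_b = 0.0
    for z in noise_batch:
        p_fake = discriminate(generate(z, g), d)
        common = (1.0 - p_fake) * d["w"]
        grad_a += common * z
        grad_b += common
    g["a"] += lr * grad_a / n
    g["b"] += lr * grad_b / n

def train(steps=200, batch_size=16, seed=0):
    rng = random.Random(seed)
    g = {"a": 1.0, "b": 0.0}   # generator starts far from the real data
    d = {"w": 0.0, "c": 0.0}   # discriminator starts undecided (outputs 0.5)
    for _ in range(steps):
        real_batch = [rng.gauss(4.0, 0.5) for _ in range(batch_size)]
        noise_batch = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        train_step(g, d, real_batch, noise_batch)
    return g, d

generator_params, discriminator_params = train()
```

The discriminator only ever learns from comparing real and generated samples, and the generator only ever learns from the discriminator's verdict, which mirrors the two feedback loops described above.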

Figure: How a GAN works.

Types of GANs

GANs come in a variety of forms and can be used for various tasks. The following are the most common GAN types:

Vanilla GAN. This is the simplest type of GAN. It consists of a generator and a discriminator, both built as straightforward multi-layer perceptrons, and its algorithm tries to optimize the underlying mathematical equation using stochastic gradient descent, a method of learning from a data set by going through one example, or a small batch of examples, at a time. The discriminator seeks to determine the likelihood that the input belongs to a particular class, while the generator captures the distribution of the data.
Conditional GAN. By applying class labels, this kind of GAN enables the conditioning of the network with new and specific information. As a result, during GAN training, the network receives the images with their actual labels, such as "rose," "sunflower" or "tulip" to help it learn how to distinguish between them.
Deep convolutional GAN. This GAN uses deep convolutional neural networks to generate high-resolution images that can be differentiated from real ones. Convolutions are a technique for drawing out important information from image data. They function particularly well with images, enabling the network to quickly absorb the essential details.
CycleGAN. This is a widely used GAN architecture that learns how to transform between images of various styles. For instance, a network can be taught how to alter an image from winter to summer or from an image of a horse to a zebra. One frequently cited application of this kind of image transformation is FaceApp, which alters human faces into various age groups.
StyleGAN. Researchers from Nvidia released StyleGAN in December 2018 and proposed significant improvements to the original generator architecture models. StyleGAN can produce photorealistic, high-quality photos of faces, and users can modify the model to alter the appearance of the images that are produced.
Super-resolution GAN. With this type of GAN, a low-resolution image can be changed into a more detailed one. Super-resolution GANs increase the image resolution by filling in blurry spots.
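To make the conditional GAN entry in the list above more concrete, the sketch below shows one common way to condition a generator: the class label is turned into a one-hot vector and appended to the random noise. The flower labels come from the example in the list; the function names and the noise size are hypothetical.

```python
import random

CLASSES = ["rose", "sunflower", "tulip"]  # labels from the example above

def one_hot(label):
    """Encode a class label as a list with a single 1.0."""
    return [1.0 if label == name else 0.0 for name in CLASSES]

def conditional_generator_input(label, noise_dim=4, rng=random):
    """Build the generator input: random noise followed by the label encoding."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(label)

vector = conditional_generator_input("tulip")
```

Because the label travels with the noise, the same trained generator can be asked for a specific class. In a conditional setup, the discriminator typically receives the label alongside the image as well.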

Popular use cases for GANs

GANs are becoming a popular ML model for online retail sales because of their ability to understand and recreate visual content with increasingly remarkable accuracy. They can be used for a variety of tasks, including anomaly detection, data augmentation, picture synthesis, and text-to-image and image-to-image translation.

Common use cases of GANs include the following:

Filling in images from an outline.
Generating a realistic image from text.
Producing photorealistic depictions of product prototypes.
Converting black and white imagery into color.
Translating image sketches or semantic images into photos, which is especially useful in the healthcare industry for diagnoses.

In video production, GANs can be used to perform the following:

Model patterns of human behavior and movement within a frame.
Predict subsequent video frames.
Create a deepfake.

Other use cases of GANs include text-to-speech for the generation of realistic speech sounds. Furthermore, GAN-based generative AI models can generate text for blogs, articles and product descriptions. These AI-generated texts can be used for a variety of purposes, including advertising, social media content, research and communication.

GAN examples

GANs are used to generate a wide range of data types, including images, music and text. The following are popular real-world examples of GANs:

Generating human faces. GANs can produce accurate representations of human faces. For example, StyleGAN2 from Nvidia can produce excellent, photorealistic images of people that don't exist. These pictures are so lifelike that many people believe they're actual individuals.
Developing new fashion designs. GANs can be used to create new fashion designs that reflect existing ones. For instance, clothing retailer H&M used GANs to create new apparel designs for its merchandise.
Creating realistic animal images. GANs can also generate realistic images of animals. For example, BigGAN, a GAN model developed by Google researchers, can produce high-quality images of animals such as birds and dogs.
Video game character creation. GANs can be used to create new characters for video games. For example, Nvidia created new characters using GANs for the well-known video game Final Fantasy XV.
Generating realistic three-dimensional (3D) objects. GANs are also capable of producing realistic 3D objects. For example, researchers at Massachusetts Institute of Technology have used GANs to create 3D models of chairs and other furniture that appear to have been designed by people. These models can be applied to architectural visualization or video games.

This was last updated in March 2023
