
What is Data Augmentation? How Does Data Augmentation Work for Images?

Last Updated : 14 Apr, 2025

Data augmentation is a technique used to increase the diversity of a dataset without collecting new data. It works by applying various transformations to the existing data to create new, modified versions that help the model generalize better. In this article, we will learn more about data augmentation.

How Does Data Augmentation Work for Images?

Data augmentation for images works by applying various transformations to the original images. These transformations are chosen so that they preserve the original label of each image while creating augmented data for training. Some of these transformations are:

1. Geometric Transformations

Geometric transformations alter the spatial properties of an image. They include:

  • Rotation: Rotates the image by a given angle, such as 90° or 180°.
  • Flipping: Flips the image horizontally or vertically.
  • Scaling: Zooms in or out of the image.
  • Translation: Shifts the image along the x or y axis.
  • Shearing: Slants the shape of the image.
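As a rough illustration of the geometric transformations listed above, the sketch below implements rotation, flipping and translation with plain NumPy on a tiny 3x3 array. The helper names (rotate_90, flip_horizontal, flip_vertical, translate) are made up for this example and are not part of any library; scaling and shearing need interpolation and are left out.

```python
import numpy as np

# A tiny 3x3 single-channel "image" so each transform is easy to inspect.
image = np.arange(9, dtype=np.uint8).reshape(3, 3)


def rotate_90(img):
    """Rotate the image 90 degrees counter-clockwise."""
    return np.rot90(img)


def flip_horizontal(img):
    """Mirror the image left to right."""
    return img[:, ::-1]


def flip_vertical(img):
    """Mirror the image top to bottom."""
    return img[::-1, :]


def translate(img, dx, dy, fill=0):
    """Shift the image by dx columns and dy rows; exposed pixels get `fill`."""
    h, w = img.shape[:2]
    out = np.full_like(img, fill)
    # Source and destination windows that stay inside the image bounds.
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out


rotated = rotate_90(image)
mirrored = flip_horizontal(image)
shifted = translate(image, dx=1, dy=0)  # move one pixel to the right
```

None of these operations change what the image depicts, which is why the original label can be reused for each result.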

2. Color Space Augmentations

Color space augmentations modify the color properties of an image. These include:

  • Brightness Adjustment: Increases or decreases the brightness of the image.
  • Contrast Adjustment: Changes the contrast of the image.
  • Saturation Adjustment: Modifies the intensity of the colors in the image.
  • Hue Adjustment: Shifts the colors by changing the hue.
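The color adjustments above can be sketched with NumPy arithmetic on a uint8 RGB array. This is a simplified illustration: the function names are hypothetical, the grayscale value is taken as the plain channel mean (an assumption; libraries typically use weighted luminance), and hue shifting is omitted because it needs an RGB-to-HSV conversion.

```python
import numpy as np


def adjust_brightness(img, delta):
    """Add `delta` to every pixel and clip to the valid 0-255 range."""
    out = img.astype(np.float32) + delta
    return np.clip(out, 0, 255).astype(np.uint8)


def adjust_contrast(img, factor):
    """Scale each pixel's distance from the mean intensity by `factor`."""
    mean = img.mean()
    out = (img.astype(np.float32) - mean) * factor + mean
    return np.clip(out, 0, 255).astype(np.uint8)


def adjust_saturation(img, factor):
    """Blend each RGB pixel with its gray value; factor 0 gives grayscale."""
    pixels = img.astype(np.float32)
    # Assumption: gray is the unweighted mean of the three channels.
    gray = pixels.mean(axis=-1, keepdims=True)
    out = gray + (pixels - gray) * factor
    return np.clip(out, 0, 255).astype(np.uint8)


# A single RGB pixel makes the arithmetic easy to follow.
pixel = np.array([[[100, 150, 200]]], dtype=np.uint8)
brighter = adjust_brightness(pixel, 100)
more_contrast = adjust_contrast(pixel, 2.0)
desaturated = adjust_saturation(pixel, 0.0)
```

Clipping matters here: without it, values above 255 would wrap around when cast back to uint8.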

3. Kernel Filters

Kernel filters apply convolutional operations to enhance or suppress specific features in the image. These include:

  • Blurring: Applying Gaussian blur to smooth the image.
  • Sharpening: Enhancing the edges to make the image sharper.
  • Edge Detection: Highlighting the edges in the image using filters like Sobel or Laplacian.
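To make the idea of a kernel filter concrete, here is a minimal sketch that slides a 3x3 kernel over a grayscale image using NumPy. The apply_kernel helper and the kernel constants are names chosen for this example; the blur kernel is the common 3x3 approximation of a Gaussian, and edge pixels are handled by repeating the border, which is only one of several possible padding choices.

```python
import numpy as np


def apply_kernel(img, kernel):
    """Apply a small 2-D kernel to a grayscale image (border pixels repeated)."""
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(img.astype(np.float32),
                    ((pad_h, pad_h), (pad_w, pad_w)), mode="edge")
    out = np.zeros(img.shape, dtype=np.float32)
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    # This is cross-correlation; for the symmetric kernels below it gives
    # the same result as a true (flipped) convolution.
    for dy in range(kh):
        for dx in range(kw):
            window = padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
            out += kernel[dy, dx] * window
    return np.clip(out, 0, 255).astype(np.uint8)


# 3x3 approximation of a Gaussian blur (weights sum to 1).
GAUSSIAN_BLUR = np.array([[1, 2, 1],
                          [2, 4, 2],
                          [1, 2, 1]], dtype=np.float32) / 16.0

# Sharpening: boost the center pixel relative to its neighbours.
SHARPEN = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]], dtype=np.float32)

# Laplacian edge detector (weights sum to 0, so flat regions map to 0).
LAPLACIAN = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=np.float32)

flat = np.full((4, 4), 100, dtype=np.uint8)
blurred = apply_kernel(flat, GAUSSIAN_BLUR)
edges = apply_kernel(flat, LAPLACIAN)
```

On a perfectly flat image the blur and sharpen kernels leave the pixels unchanged and the Laplacian returns zeros, since there are no edges to respond to.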

4. Random Erasing

Random erasing involves randomly masking out a rectangular region of the image. This helps the model become invariant to occlusions and improves its ability to handle missing parts of objects.
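A minimal NumPy sketch of random erasing might look like the following. The random_erase function and its max_fraction and fill parameters are assumptions made for this illustration; published variants also randomize the aspect ratio and may fill the region with noise instead of a constant.

```python
import numpy as np


def random_erase(img, rng, max_fraction=0.5, fill=0):
    """Return a copy of `img` with one random rectangle set to `fill`."""
    h, w = img.shape[:2]
    # Rectangle size: at least 1 pixel, at most `max_fraction` of each side.
    erase_h = rng.integers(1, max(2, int(h * max_fraction) + 1))
    erase_w = rng.integers(1, max(2, int(w * max_fraction) + 1))
    # Top-left corner chosen so the rectangle stays inside the image.
    top = rng.integers(0, h - erase_h + 1)
    left = rng.integers(0, w - erase_w + 1)
    out = img.copy()
    out[top:top + erase_h, left:left + erase_w] = fill
    return out


rng = np.random.default_rng(seed=0)
white = np.full((32, 32, 3), 255, dtype=np.uint8)  # stand-in for a real image
erased = random_erase(white, rng)
```

Passing the random generator in explicitly keeps the augmentation reproducible when a fixed seed is used.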

5. Combining Augmentations

Here, multiple augmentation techniques are combined to create more varied training data. For example, an image might be rotated, flipped and then have its brightness adjusted in a single augmentation pipeline.
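One simple way to build such a pipeline is to chain transform functions so that each receives the output of the previous one. The sketch below uses hypothetical helpers (compose, rotate_90, flip_horizontal, brighten) and fixed parameters to keep the result easy to follow; in practice each step would usually be applied with random parameters.

```python
import numpy as np


def compose(*transforms):
    """Chain transforms so each one receives the previous one's output."""
    def pipeline(img):
        for transform in transforms:
            img = transform(img)
        return img
    return pipeline


def rotate_90(img):
    """Rotate the image 90 degrees counter-clockwise."""
    return np.rot90(img)


def flip_horizontal(img):
    """Mirror the image left to right."""
    return img[:, ::-1]


def brighten(img, delta=40):
    """Add `delta` to every pixel, clipping to the 0-255 range."""
    return np.clip(img.astype(np.int16) + delta, 0, 255).astype(np.uint8)


# Rotate, then flip, then brighten, in that order.
augment = compose(rotate_90, flip_horizontal, brighten)

image = np.array([[0, 10],
                  [20, 250]], dtype=np.uint8)
augmented = augment(image)
```

The order of the steps matters in general: geometric steps rearrange pixels, while the brightness step changes their values.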

Implementing Data Augmentation in Python

Below is the step-by-step implementation of data augmentation:

1. Import the Necessary Libraries

Import the necessary libraries: NumPy, Matplotlib and TensorFlow.

Python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
import numpy as np

2. Define the ImageDataGenerator

Create an instance of ImageDataGenerator with specified augmentation parameters such as rotation, width shift, height shift, shear, zoom and horizontal flip.

Python
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

3. Load an Example Image

We will load an image from the CIFAR-10 dataset to use as an example for augmentation and display the original image using matplotlib.

Python
(imgs, labels), _ = tf.keras.datasets.cifar10.load_data()
img = imgs[0]

plt.figure(figsize=(6, 6))
plt.imshow(img.astype('uint8'))
plt.title("Original Image")
plt.axis('off')
plt.show()

Output:

[Image: Original Image]

4. Reshape the Image

Here we will reshape the image to include a batch dimension which is required by the flow method of ImageDataGenerator.

Python
img = img.reshape((1,) + img.shape)

5. Generate Augmented Images

We will use the flow method to generate batches of augmented images. Since flow yields batches indefinitely, we stop the loop once we have the number of augmented images we want, which is 4 in this case.

Python
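num_augmented = 4  # number of augmented images to generate
augmented_images = []
# flow() yields batches indefinitely, so stop once enough are collected
for batch in datagen.flow(img, batch_size=1):
    augmented_images.append(batch[0].astype('uint8'))
    if len(augmented_images) == num_augmented:
        break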

6. Display Augmented Images

We will display the generated augmented images in a matrix format using matplotlib subplots.

Python
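# One row with four panels, one per augmented image
fig, axes = plt.subplots(1, 4, figsize=(20, 5))
axes = axes.flatten()
# Use a separate loop variable so the batched `img` is not overwritten
for aug_img, ax in zip(augmented_images, axes):
    ax.imshow(aug_img)
    ax.axis('off')
plt.suptitle("Augmented Images")
plt.show()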

Output:

[Image: Augmented images generated by ImageDataGenerator]

We can see that we get 4 augmented versions of our original image. Data augmentation offers several benefits:

  • Improves Model Generalization: By exposing the model to a wider variety of data, it helps the model generalize better to unseen data.
  • Reduces Overfitting: It discourages the model from learning noise and memorizing the training data.
  • Enhances Robustness: It makes the model more robust to variations and distortions in real-world data.
  • Cost-Effective: It reduces the need for collecting and annotating large amounts of new data.

Tools and Libraries for Image Data Augmentation

Several tools and libraries provide image data augmentation:

  • TensorFlow: TensorFlow’s tf.image module provides functions for image transformations.
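  • Keras: Keras offers the ImageDataGenerator class for real-time data augmentation. The TensorFlow documentation marks this class as deprecated and recommends Keras preprocessing layers such as RandomFlip and RandomRotation for new code.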
  • PyTorch: PyTorch’s torchvision.transforms module includes a wide range of augmentation techniques.
  • Albumentations: A fast image augmentation library with a rich set of transformations.
  • imgaug: A flexible library for image augmentation with support for various augmentations.

Data augmentation is a technique for expanding and diversifying datasets, particularly in image processing. By applying various transformations to existing data, we can create new training examples that help improve model generalization, reduce overfitting and enhance robustness.

