What is Data Augmentation? How Does Data Augmentation Work for Images?
Data augmentation is a technique used to increase the diversity of a dataset without collecting new data. It works by applying various transformations to the existing data to create new, modified versions that help the model generalize better. In this article, we will learn more about data augmentation.
How Does Data Augmentation Work for Images?
Data augmentation for images works by applying various transformation techniques to the original images. These transformations are applied in a way that preserves the original label of the data while creating augmented data for training. Some of these transformations are:
1. Geometric Transformations
Geometric transformations alter the spatial properties of an image. These include:
- Rotation: Rotates the image by a certain angle, such as 90° or 180°.
- Flipping: Flips the image horizontally or vertically.
- Scaling: Zooms in or out of the image.
- Translation: Shifts the image along the x or y axis.
- Shearing: Slants the shape of the image.
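As a rough illustration of the geometric transformations listed above, the following NumPy-only sketch applies a 90° rotation, both flips and a translation to a tiny array standing in for an image. The helper `translate` and its choice to zero-fill the vacated pixels are assumptions made for this example, not part of any library. Scaling and shearing are left out because they need interpolation.

```python
import numpy as np

def translate(image, dx, dy):
    """Shift an image by dx pixels along x and dy along y, zero-filling the gap."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    if abs(dx) >= w or abs(dy) >= h:
        return out  # the content is shifted completely out of frame
    # Source and destination windows that stay inside the image bounds
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out

image = np.arange(9).reshape(3, 3)       # toy 3x3 "image"
rotated = np.rot90(image)                # rotate 90 degrees counterclockwise
flipped_h = np.fliplr(image)             # horizontal flip (mirror left-right)
flipped_v = np.flipud(image)             # vertical flip (mirror top-bottom)
shifted = translate(image, dx=1, dy=0)   # move content one pixel to the right
```

Because every operation only rearranges pixels, the label of the image stays the same.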
2. Color Space Augmentations
Color space augmentations modify the color properties of an image. These include:
- Brightness Adjustment: Increases or decreases the brightness of the image.
- Contrast Adjustment: Changes the contrast of the image.
- Saturation Adjustment: Modifies the intensity of colors in the image.
- Hue Adjustment: Shifts the colors by changing the hue.
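The color adjustments above can be sketched with plain NumPy arithmetic on 8-bit RGB arrays. The function names and the exact formulas below (an additive brightness offset, contrast scaled around the mean, saturation as a blend with a channel-average gray) are simplifying assumptions for illustration. Libraries may use different definitions, and hue shifting is omitted because it requires a color space conversion.

```python
import numpy as np

def adjust_brightness(image, delta):
    """Add delta to every pixel and clip to the valid uint8 range."""
    return np.clip(image.astype(np.float32) + delta, 0, 255).astype(np.uint8)

def adjust_contrast(image, factor):
    """Scale each pixel's distance from the mean intensity by factor."""
    img = image.astype(np.float32)
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0, 255).astype(np.uint8)

def adjust_saturation(image, factor):
    """Blend each RGB pixel with its gray level; factor 0 gives grayscale."""
    img = image.astype(np.float32)
    gray = img.mean(axis=-1, keepdims=True)  # simple channel average as gray
    return np.clip(gray + (img - gray) * factor, 0, 255).astype(np.uint8)

pixel = np.array([[[100, 150, 200]]], dtype=np.uint8)  # one RGB pixel
brighter = adjust_brightness(pixel, 100)
more_contrast = adjust_contrast(pixel, 2.0)
desaturated = adjust_saturation(pixel, 0.0)
```

Clipping matters here: without it, values above 255 would wrap around when cast back to `uint8`.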
3. Kernel Filters
Kernel filters apply convolutional operations to enhance or suppress specific features in the image. These include:
- Blurring: Applying Gaussian blur to smooth the image.
- Sharpening: Enhancing the edges to make the image sharper.
- Edge Detection: Highlighting the edges in the image using filters like Sobel or Laplacian.
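To show what applying a kernel means in practice, here is a minimal NumPy sketch that slides a 3×3 kernel over a grayscale image. The helper `apply_kernel` is hypothetical, and it computes a cross-correlation (no kernel flip), which matches convolution for the symmetric kernels used here. The kernel values are common textbook choices, assumed for illustration.

```python
import numpy as np

def apply_kernel(gray, kernel):
    """Filter a 2-D grayscale image with a small square kernel (edge-padded)."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(gray.astype(np.float32), pad, mode="edge")
    out = np.zeros(gray.shape, dtype=np.float32)
    # Accumulate each kernel tap as a weighted, shifted copy of the padded image
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * padded[i:i + gray.shape[0], j:j + gray.shape[1]]
    return out

# 3x3 approximation of a Gaussian blur (weights sum to 1)
GAUSSIAN_BLUR = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=np.float32) / 16
# Sharpening kernel: boosts the center relative to its neighbors
SHARPEN = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32)
# Laplacian kernel: responds only where intensity changes (edges)
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float32)

flat = np.full((5, 5), 10, dtype=np.uint8)   # image with no edges at all
blurred = apply_kernel(flat, GAUSSIAN_BLUR)
sharpened = apply_kernel(flat, SHARPEN)
edges = apply_kernel(flat, LAPLACIAN)
```

On a perfectly flat image the blur and sharpen kernels leave the values unchanged and the Laplacian returns zeros, which is a quick sanity check that the weights are correct.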
4. Random Erasing
Random erasing involves randomly masking out a rectangular region of the image. This helps the model become invariant to occlusions and improves its ability to handle missing parts of objects.
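A simplified version of random erasing can be written in a few lines of NumPy. The function `random_erase`, its parameters `max_fraction` and `fill_value`, and the way the rectangle size is sampled are assumptions for this sketch. Published variants sample the erased area and aspect ratio differently and often fill with random values.

```python
import numpy as np

def random_erase(image, rng, max_fraction=0.5, fill_value=0):
    """Return a copy of image with one random rectangle overwritten by fill_value."""
    h, w = image.shape[:2]
    # Rectangle size: at least 1 pixel, at most max_fraction of each side
    eh = rng.integers(1, max(2, int(h * max_fraction) + 1))
    ew = rng.integers(1, max(2, int(w * max_fraction) + 1))
    # Top-left corner chosen so the rectangle fits inside the image
    top = rng.integers(0, h - eh + 1)
    left = rng.integers(0, w - ew + 1)
    out = image.copy()
    out[top:top + eh, left:left + ew] = fill_value
    return out

rng = np.random.default_rng(0)                        # seeded for reproducibility
white = np.full((32, 32, 3), 255, dtype=np.uint8)     # stand-in image
erased = random_erase(white, rng)
```

The function works on a copy, so the original image is left untouched and can be augmented again with a different rectangle.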
5. Combining Augmentations
Here, multiple augmentation techniques are combined to create more varied training data. For example, an image might be rotated, flipped and then have its brightness adjusted in a single augmentation pipeline.
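The rotate, flip and brighten example can be sketched as a small pipeline of functions. The helpers `compose` and `brighten` are hypothetical names introduced for this illustration. Real pipelines usually apply each step with some probability, whereas this one is deterministic to keep it short.

```python
import numpy as np

def compose(*steps):
    """Chain augmentation functions so each one receives the previous output."""
    def pipeline(image):
        for step in steps:
            image = step(image)
        return image
    return pipeline

def brighten(image, delta=40):
    """Add delta to every pixel, clipping to the uint8 range."""
    return np.clip(image.astype(np.int16) + delta, 0, 255).astype(np.uint8)

# Rotate 90 degrees, flip horizontally, then brighten
augment = compose(np.rot90, np.fliplr, brighten)

image = np.array([[10, 20], [30, 250]], dtype=np.uint8)
result = augment(image)
```

Keeping each transformation as a separate function makes it easy to reorder steps or swap one out.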
Implementing Data Augmentation in Python
Below is the step-by-step implementation of data augmentation:
1. Import the Necessary Libraries
Import the necessary libraries: NumPy, Matplotlib and TensorFlow.
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
import numpy as np
2. Define the ImageDataGenerator
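Create an instance of ImageDataGenerator with specified augmentation parameters such as rotation, width shift, height shift, shear, zoom and horizontal flip. Note that the TensorFlow documentation marks ImageDataGenerator as deprecated and recommends Keras preprocessing layers for new code.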
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)
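To make these ranges concrete, the sketch below draws one random set of transform values following the documented meaning of each argument: rotation in degrees within ±40, shifts as a fraction of the image size within ±0.2, shear within ±0.2 and zoom between 0.8 and 1.2. The function `sample_params` is hypothetical and is not the library's internal code. For example, the sketch uses one zoom factor, while the library samples zoom separately per axis.

```python
import numpy as np

def sample_params(rng, height, width):
    """Draw one random set of transform values for the ranges configured above."""
    return {
        "rotation_deg": rng.uniform(-40, 40),           # rotation_range=40
        "shift_x_px": rng.uniform(-0.2, 0.2) * width,   # width_shift_range=0.2
        "shift_y_px": rng.uniform(-0.2, 0.2) * height,  # height_shift_range=0.2
        "shear": rng.uniform(-0.2, 0.2),                # shear_range=0.2
        "zoom": rng.uniform(0.8, 1.2),                  # zoom_range=0.2 -> [0.8, 1.2]
        "flip_horizontal": bool(rng.random() < 0.5),    # horizontal_flip=True
    }

rng = np.random.default_rng(0)
params = sample_params(rng, height=32, width=32)  # CIFAR-10 images are 32x32
```

Each call produces different values, which is why every augmented image from the same original looks different.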
3. Load an Example Image
We will load an image from the CIFAR-10 dataset to use as an example for augmentation and display the original image using matplotlib.
(imgs, labels), _ = tf.keras.datasets.cifar10.load_data()
img = imgs[0]
plt.figure(figsize=(6, 6))
plt.imshow(img.astype('uint8'))
plt.title("Original Image")
plt.axis('off')
plt.show()
Output:

4. Reshape the Image
Here we will reshape the image to include a batch dimension, which is required by the flow method of ImageDataGenerator.
img = img.reshape((1,) + img.shape)
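The effect of this reshape is easiest to see on the array shapes. The sketch below uses a zero-filled array as a stand-in for one 32×32 RGB CIFAR-10 image and shows two equivalent NumPy alternatives for adding the batch dimension.

```python
import numpy as np

image = np.zeros((32, 32, 3), dtype=np.uint8)   # stand-in for one CIFAR-10 image

batch_a = image.reshape((1,) + image.shape)     # approach used above
batch_b = np.expand_dims(image, axis=0)         # equivalent, more explicit
batch_c = image[np.newaxis, ...]                # equivalent indexing form

print(image.shape, batch_a.shape)
```

All three produce an array of shape `(1, 32, 32, 3)`, that is, a batch containing a single image.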
5. Generate Augmented Images
We will use the flow method to generate batches of augmented images. We will stop after the number of augmented images we want, which is 4 in this case.
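i = 0
augmented_images = []
# flow() yields batches indefinitely, so we stop manually after 4 images
for batch in datagen.flow(img, batch_size=1):
    # Each batch holds a single image; take it and convert back to uint8
    augmented_images.append(batch[0].astype('uint8'))
    i += 1
    if i >= 4:
        break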
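Because the generator never ends on its own, an alternative to the manual counter is `itertools.islice`, which takes a fixed number of items from any iterator. The sketch below demonstrates the pattern with `endless_batches`, a hypothetical stand-in for `datagen.flow` that only applies random horizontal flips, so that it runs without TensorFlow.

```python
import itertools
import numpy as np

def endless_batches(image_batch, rng):
    """Hypothetical stand-in for datagen.flow: yields randomly flipped batches forever."""
    while True:
        batch = image_batch.astype(np.float32)
        if rng.random() < 0.5:
            batch = batch[:, :, ::-1, :]  # flip along the width axis
        yield batch

rng = np.random.default_rng(0)
image_batch = np.zeros((1, 32, 32, 3), dtype=np.uint8)  # batch with one image

# Take exactly 4 batches, then stop consuming the infinite generator
augmented = [
    batch[0].astype("uint8")
    for batch in itertools.islice(endless_batches(image_batch, rng), 4)
]
```

The same `islice` call should work with the real generator in place of the stand-in, since it only relies on the iterator protocol.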
6. Display Augmented Images
We will display the generated augmented images in a single row (a 1×4 grid) using matplotlib subplots.
fig, axes = plt.subplots(1, 4, figsize=(20, 5))
axes = axes.flatten()
for img, ax in zip(augmented_images, axes):
    ax.imshow(img)
    ax.axis('off')
plt.suptitle("Augmented Images")
plt.show()
Output:

We can see that we get 4 augmented versions of our original image. Data augmentation offers the following benefits:
- Improves Model Generalization: By exposing the model to a wider variety of data, it learns to generalize better to unseen data.
- Reduces Overfitting: It discourages the model from learning noise and memorizing the training data.
- Enhances Robustness: It makes the model more robust to variations and distortions in real-world data.
- Cost-Effective: It reduces the need for collecting and annotating large amounts of new data.
Tools and Libraries for Image Data Augmentation
Several tools and libraries provide image data augmentation:
- TensorFlow: TensorFlow's tf.image module provides functions for image transformations.
- Keras: Keras offers the ImageDataGenerator class for real-time data augmentation.
- PyTorch: PyTorch's torchvision.transforms module includes a wide range of augmentation techniques.
- Albumentations: A fast image augmentation library with a rich set of transformations.
- imgaug: A flexible library for image augmentation with support for various augmentations.
Data augmentation is a technique for expanding and diversifying datasets, particularly in image processing. By applying various transformations to existing data, we can create new training examples that help improve model generalization, reduce overfitting and enhance robustness.