This repo implements a deep learning pipeline for classifying environmental sounds from the ESC-50 dataset.

JoelDeonDsouza/Auto_CNN

Repository files navigation

🚀 Auto_CNN

This project implements a deep learning pipeline for classifying environmental sounds from the ESC-50 dataset. It features a custom convolutional neural network (AutoCNN) built with PyTorch, using residual blocks for improved learning. The solution includes data augmentation, training on GPU using Modal, and a scalable inference API.

🔧 Tech Stack

ML/Backend:

  • PyTorch
  • Torchaudio
  • Modal (GPU-accelerated training & inference)
  • FastAPI (for serving inference endpoints)

Client (optional visualization layer):

  • Next.js
  • Tailwind CSS
  • TypeScript
  • Shadcn UI

🎯 Key Features

✅ Custom CNN architecture with residual connections
✅ ESC-50 dataset ingestion and preprocessing
✅ Data augmentation with Mixup and spectrogram masking
✅ Fully managed training on GPU using Modal
✅ Inference API with real-time audio classification
✅ Intermediate feature map visualization for debugging and interpretability
✅ Example endpoint for testing predictions from WAV files
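
The two augmentations named above can be sketched as follows. This is a minimal illustration, not the code in `train.py`: the function names, the `alpha` value, and the mask widths are assumptions, and the input is assumed to be a batch of spectrograms shaped `(batch, channels, n_mels, n_frames)` with one-hot labels.

```python
import torch


def mixup(x, y_onehot, alpha=0.2):
    """Blend each example with a randomly chosen partner from the same batch."""
    # alpha is an assumed hyperparameter; lam is drawn from Beta(alpha, alpha)
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed


def mask_spectrogram(spec, max_freq=8, max_time=16):
    """Zero one random frequency band and one random time band."""
    spec = spec.clone()
    n_mels, n_frames = spec.shape[-2], spec.shape[-1]
    # Frequency mask: a band of up to max_freq mel bins
    f = int(torch.randint(0, max_freq + 1, (1,)))
    f0 = int(torch.randint(0, n_mels - f + 1, (1,)))
    spec[..., f0:f0 + f, :] = 0.0
    # Time mask: a band of up to max_time frames
    t = int(torch.randint(0, max_time + 1, (1,)))
    t0 = int(torch.randint(0, n_frames - t + 1, (1,)))
    spec[..., :, t0:t0 + t] = 0.0
    return spec
```

Torchaudio also ships `torchaudio.transforms.FrequencyMasking` and `torchaudio.transforms.TimeMasking`, which may be the more convenient choice inside a transform pipeline.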

📦 Environment Variables (.env)

# Modal API Key
NEXT_PUBLIC_MODAL_API=

📁 Dataset

This project uses the ESC-50 dataset, which contains 50 environmental sound categories. The training pipeline automatically downloads and prepares the dataset during Modal app initialization.
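
ESC-50 ships a metadata file (`meta/esc50.csv`) that assigns every clip a class index and one of five predefined folds. A rough sketch of a fold-based train/validation split is shown here; the helper name `split_by_fold` and the choice of fold 5 for validation are assumptions, not necessarily what this repository does, and the sample rows are illustrative stand-ins in the same column layout.

```python
import csv
import io


def split_by_fold(meta_csv_text, val_fold=5):
    """Return (train_rows, val_rows) as (filename, target) pairs, holding out one fold."""
    train, val = [], []
    for row in csv.DictReader(io.StringIO(meta_csv_text)):
        item = (row["filename"], int(row["target"]))
        (val if int(row["fold"]) == val_fold else train).append(item)
    return train, val


# Illustrative rows only; the real file lists 2,000 clips
sample = """filename,fold,target,category,esc10,src_file,take
clip-a.wav,1,0,dog,True,000001,A
clip-b.wav,5,0,dog,True,000002,A
"""
train_rows, val_rows = split_by_fold(sample)
```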

⚙️ Installation

# Clone the repo
git clone https://github.com/JoelDeonDsouza/Auto_CNN.git
cd Auto_CNN

# Install dependencies
pip install -r requirements.txt

🚀 Training on Modal

You can launch a training job on Modal with GPU acceleration.

modal run train.py

The trained model will be saved in a Modal-managed volume.

🔍 Inference

Deploy the inference API:

modal deploy main.py

Test an inference request locally:

modal run main.py

🔊 Example Audio Test

Put your WAV files in the audio-tests/ directory. An example (chirpingBirds.wav) is included for testing.
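
If you have no recording at hand, a test clip can be synthesized with the standard library. The sketch below also shows one plausible way to package WAV bytes for an HTTP request; the `"audio_data"` JSON key and the base64 encoding are hypothetical, so check `main.py` for the request format the endpoint actually expects.

```python
import base64
import io
import json
import math
import struct
import wave


def make_test_tone(seconds=1.0, rate=44100, freq=440.0):
    """Synthesize a mono 16-bit sine tone and return it as WAV bytes."""
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


def build_payload(wav_bytes):
    """Wrap WAV bytes in a JSON body; the "audio_data" key is hypothetical."""
    return json.dumps({"audio_data": base64.b64encode(wav_bytes).decode("ascii")})
```

Writing the result of `make_test_tone()` to a file under `audio-tests/` gives a second input alongside `chirpingBirds.wav`.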

🛠️ Key Components

  • AutoCNN: Custom CNN model with residual blocks
  • ESC50Dataset: PyTorch Dataset class for ESC-50
  • train.py: Training loop with augmentation, optimizer, and TensorBoard logging
  • main.py: FastAPI-powered inference endpoint
  • modal: Manages GPU workloads and deploys endpoints

🔬 Model Architecture

The AutoCNN model follows a ResNet-inspired structure with four convolutional stages, followed by global pooling and a linear classifier.
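
The description above can be sketched in PyTorch as follows. This is an illustration of the general pattern, not the repository's AutoCNN: the class names, channel widths, one block per stage, and the single-channel spectrogram input are all assumptions.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection around them."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # Project the input with a 1x1 convolution when its shape changes
        self.shortcut = nn.Identity()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))


class SketchCNN(nn.Module):
    """Four residual stages, global average pooling, linear classifier."""

    def __init__(self, num_classes=50, widths=(32, 64, 128, 256)):
        super().__init__()
        stages, in_ch = [], 1  # assumes a 1-channel spectrogram input
        for i, w in enumerate(widths):
            stages.append(ResidualBlock(in_ch, w, stride=1 if i == 0 else 2))
            in_ch = w
        self.stages = nn.Sequential(*stages)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_ch, num_classes)

    def forward(self, x):
        x = self.stages(x)
        x = self.pool(x).flatten(1)
        return self.fc(x)
```

Because of the global pooling layer, a model built this way accepts spectrograms of varying length; a batch shaped `(batch, 1, n_mels, n_frames)` yields logits shaped `(batch, 50)`.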
