Learning new tasks without new data.
This project explores how machine learning models can themselves be generated by other models, without direct task-specific supervision. The core idea is unconventional but powerful: using generative models to create new classifiers by fusing existing ones.
Specifically, we leverage CycleGANs to merge the learned feature representations of two independently trained Convolutional Neural Networks (CNNs):
- One CNN trained to recognize cats
- One CNN trained to recognize the color black
By translating and combining their feature spaces, we generate a new CNN capable of detecting black cats, without ever being trained on a single black cat image.
Result:
The generated model achieves 88% classification accuracy on black cat detection, despite zero direct exposure to black cat data.
This work opens new directions for:
- Learning under data scarcity
- Automated model generation
- Knowledge transfer beyond traditional fine-tuning
- CycleGAN-based Model Fusion: uses CycleGANs to translate and align feature kernels between CNNs trained on unrelated domains.
- Generated CNNs (Zero-shot Task Creation): constructs a task-specific classifier purely from pre-trained models.
- Feature Space Validation: employs UMAP, K-Means, and DBSCAN to analyze and validate learned representations.
- Unsupervised Generalization: demonstrates black cat recognition without labeled black cat data.
Instead of training a model on data, we train a model on other models:

- Train two CNNs on separate concepts:
  - Object: cat
  - Attribute: black
- Extract convolutional kernels from both networks (see the extraction sketch after this list).
- Train CycleGANs to translate kernels between these feature domains.
- Initialize a new CNN using the CycleGAN-generated kernels.
- Evaluate whether this synthesized CNN can recognize black cats.

It can.
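As a concrete illustration of the kernel-extraction step, here is a minimal sketch assuming PyTorch models built from `nn.Conv2d` layers; treating each (output-channel, input-channel) 5×5 slice as one CycleGAN training sample is an assumption consistent with the per-layer kernel counts in the dataset table below, not necessarily the repo's exact procedure.

```python
import torch
import torch.nn as nn

def extract_kernels(model: nn.Module) -> list[torch.Tensor]:
    """Collect every 5x5 filter slice from the model's Conv2d layers.

    Each (out_channel, in_channel) pair contributes one 5x5 kernel, which
    becomes one training sample for the kernel-space CycleGAN.
    (Hypothetical helper; the repo's actual extraction code may differ.)
    """
    kernels = []
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            w = module.weight.detach().cpu()             # (out_ch, in_ch, 5, 5)
            kernels.extend(w.reshape(-1, *w.shape[2:]))  # flatten to single 5x5 filters
    return kernels

# Usage: kernels_cat = extract_kernels(cat_cnn); kernels_black = extract_kernels(black_cnn)
```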
| Dataset | Samples |
|---|---|
| Black / Random Images | 1,826 (1,745 black, 81 random) |
| Cat / Random Images | 30,405 (29,843 cats, 562 random) |
| Kernel Sets for CycleGAN | 4,498 per convolutional layer |
- 2 Convolutional layers
- Kernel size: 5×5
- Activation: ReLU
- Max-pooling layers
- Trained independently on separate domains (see the sketch after this list)
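A minimal PyTorch sketch of a source CNN matching this description; the channel widths, 64×64 input size, and linear classifier head are illustrative assumptions rather than the repo's exact values.

```python
import torch
import torch.nn as nn

class SourceCNN(nn.Module):
    """Two-layer CNN of the kind trained separately on 'cat' and on 'black'.

    Only the 5x5 kernels, ReLU activations, and max-pooling come from the
    description above; channel widths (16, 32) and the 64x64 RGB input are
    assumptions made for illustration.
    """
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)  # assumes 64x64 inputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))
```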
- Generator-discriminator architecture
- Learns kernel-space translation, not image translation
- Operates directly on convolutional filters (see the sketch after this list)
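A minimal sketch of one translation direction of such a kernel-space CycleGAN, assuming each training sample is a single 5×5 filter treated as a 1-channel map; the use of small MLPs and their layer sizes are assumptions, not the repo's architecture.

```python
import torch
import torch.nn as nn

class KernelGenerator(nn.Module):
    """Maps a 5x5 kernel from domain A (e.g. 'cat' filters) to domain B ('black' filters)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                      # (N, 1, 5, 5) -> (N, 25)
            nn.Linear(25, 128), nn.ReLU(),
            nn.Linear(128, 25),
        )
    def forward(self, k: torch.Tensor) -> torch.Tensor:
        return self.net(k).view(-1, 1, 5, 5)

class KernelDiscriminator(nn.Module):
    """Scores whether a 5x5 kernel looks like it came from the target domain."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(25, 64), nn.ReLU(), nn.Linear(64, 1))
    def forward(self, k: torch.Tensor) -> torch.Tensor:
        return self.net(k)

# A full CycleGAN pairs G_ab/G_ba with D_a/D_b and adds a cycle-consistency
# loss ||G_ba(G_ab(k)) - k||_1 on top of the adversarial losses.
```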
- Initialized entirely using CycleGAN-generated kernels
- No gradient updates using black cat images (see the sketch after this list)
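A minimal sketch of this initialization step, reusing the hypothetical `SourceCNN` from the earlier sketch; the ordering convention for the generated kernels is an assumption.

```python
import torch

@torch.no_grad()
def build_generated_cnn(generated_kernels: list[torch.Tensor]) -> "SourceCNN":
    """Copy CycleGAN-generated 5x5 kernels into a new, untrained CNN.

    `generated_kernels` holds one (5, 5) tensor per (out_ch, in_ch) slot,
    in the order the Conv2d layers consume them. No backprop ever touches
    black cat images; the weights come entirely from the generators.
    """
    model = SourceCNN()                        # same architecture as the source CNNs
    it = iter(generated_kernels)
    for module in model.modules():
        if isinstance(module, torch.nn.Conv2d):
            w = module.weight                  # (out_ch, in_ch, 5, 5)
            for o in range(w.shape[0]):
                for i in range(w.shape[1]):
                    w[o, i].copy_(next(it))
    model.eval()                               # evaluation only, no fine-tuning
    return model
```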
- Accuracy
- Precision & Recall
- Cluster Entropy
- Cluster Purity
- Cosine Similarity
- UMAP Visualization (see the metrics sketch after this list)
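A minimal sketch of the clustering-based metrics, assuming feature vectors extracted from the generated CNN are clustered with K-Means; the purity and entropy computations follow the standard definitions, and `n_clusters=2` plus integer class labels are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def cluster_purity_and_entropy(features: np.ndarray, labels: np.ndarray, n_clusters: int = 2):
    """Cluster CNN features with K-Means, then score agreement with true labels.

    Purity: fraction of samples assigned to their cluster's majority class.
    Entropy: label entropy inside each cluster, weighted by cluster size.
    `labels` is assumed to be an array of integer class ids.
    """
    assignments = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    purity, entropy = 0.0, 0.0
    for c in range(n_clusters):
        members = labels[assignments == c]
        if len(members) == 0:
            continue
        weight = len(members) / len(labels)
        counts = np.bincount(members) / len(members)
        purity += counts.max() * weight
        entropy += -(counts[counts > 0] * np.log2(counts[counts > 0])).sum() * weight
    return purity, entropy

# Cosine similarity between mean feature vectors of two sets, e.g. source vs. generated CNN:
# sim = cosine_similarity(feat_a.mean(0, keepdims=True), feat_b.mean(0, keepdims=True))
```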
- The generated CNN successfully clusters black cat images
- UMAP projections show clear separation of semantic concepts
- Cosine similarity confirms meaningful feature alignment
- Demonstrates unsupervised semantic composition

The model learns “black AND cat” without ever seeing a black cat.
Traditional ML assumes:
New task → new labeled data
This project challenges that assumption by showing:
- Tasks can be composed
- Models can be generated, not trained
- Generative models can operate in parameter space, not just data space
This has implications for:
- Low-resource domains
- Privacy-sensitive data
- Automated ML systems
- Foundation model composition
- Hyperparameter optimization for clustering and feature fusion
- Alternative feature-space similarity metrics
- Semantic-aware end-to-end pipelines
- Scaling to deeper CNNs and transformers
- Multi-attribute model composition
- GPU with ≥ 6 GB VRAM (recommended)
- Python 3.11
- PyTorch 2.5
- torchvision
- numpy
- scikit-learn
- seaborn
- matplotlib
- tqdm
Install dependencies:
`pip install -r requirements.txt`

### License

MIT
Generative models, representation learning, and non-traditional ML paradigms.
“Why train on more data when you can train on more models?”