20,009 questions
-1
votes
1
answer
46
views
YOLOv8 custom training loop using v8DetectionLoss fails to converge on custom dataset (7 classes) [closed]
I am trying to implement a custom training loop for object detection using YOLOv8 (Ultralytics) and PyTorch. My goal is to fine-tune a pre-trained yolov8n.pt model on the Aquarium dataset, which ...
-2
votes
0
answers
39
views
Resource specifications or requirements of vision-language models (LLMs specialized for processing images) [closed]
I’m having difficulty finding the hardware resource specifications for different LLMs and VLMs. The leaderboard at https://huggingface.co/spaces/opencompass/open_vlm_leaderboard includes ...
Advice
1
vote
3
replies
76
views
Python library recommendation for the implementation of a neural network modification algorithm
I want to implement in Python some algorithms from a paper that allow a pre-trained neural network to be modified (by adding or removing neurons or layers) while preserving (in theory) the outputs of ...
Advice
0
votes
0
replies
39
views
Large Kernel in ConvNets
I want to find a convolutional network with a large kernel (larger than 5x5 or 7x7). I want to perform kernel analysis, and to do this I need to convert the model to the ONNX format. I found ...
1
vote
1
answer
130
views
Torch Conv2d results in both dimensions convolved
The input to my convolution has shape (50, 1, 7617, 10). Here, 7617 is the number of rows (word vectors) and 10 is the number of columns (words). I want to convolve column-wise and obtain (2631, 1, 7617, 1), 1 ...
0
votes
0
answers
298
views
Installation error while installing GroundingDino
I am trying to install GroundingDino as instructed in the README file of its official GitHub repo, but I get the error below:
Obtaining file:///home/kgupta/workspace/Synthetic_Data_gen/...
0
votes
1
answer
126
views
Why does an LSTM PyTorch model yield constant values?
I am training an LSTM model with data from yfinance. The process is fairly standard: I get the data with yf.download(ticker=ticker), where ticker='AAPL', and apply df.rolling(30, min_periods=1) to smooth the ...
0
votes
1
answer
124
views
Preventing GPU memory leak due to a custom neural network layer
I am using the MixStyle methodology for domain adaptation, and it involves using a custom layer that is inserted after every encoder stage. However, it is causing VRAM to grow linearly, which causes ...
-3
votes
1
answer
97
views
Can I visualize a neural network’s loss landscape to see if it’s stuck in a bad minimum? Any code example for this? [closed]
So, I’m trying to understand why sometimes neural networks get stuck during training. I heard people talk about ‘local minima’ and ‘saddle points,’ but I can’t really picture them. I want to actually ...
0
votes
0
answers
78
views
KFold cross-validation in Keras: model not resetting between folds (MobileNet backbone)
I am trying to perform KFold cross-validation on a Keras model. The first fold runs exactly as expected, but from the second fold onwards the model doesn’t seem to reset. The training behaves ...
2
votes
0
answers
169
views
TensorFlow/Keras model accumulates system and GPU RAM during training
I am training a model with TensorFlow 2.19.0/Keras 3.10.0. During training, I monitor nvidia-smi and top, and both the system RAM and the GPU RAM increase during the training period ...
0
votes
1
answer
89
views
Differentiable weight setting in Flax NNX
I'm doing some experiments with Flax NNX (not Linen!).
What I'm trying to do is compute the weights of a network using another network:
A hypernetwork receives some input parameters W and outputs a ...
3
votes
1
answer
122
views
Neural network built from scratch using NumPy isn't learning
I'm building a neural network from scratch using only Python and NumPy. It's meant for classifying the MNIST data set. I got everything to work, but the network isn't really learning: at epoch 0 it's ...
0
votes
1
answer
35
views
Model with ResNet blocks stuck at low accuracy
I am trying to implement classification of ECG segments from the PTB-XL database (https://physionet.org/content/ptb-xl/1.0.3/). The architecture of the model I am using is:
import torch
import torch....
0
votes
0
answers
66
views
Building an NN from scratch: why does my NN not memorize a small sample of training data? It ends up outputting the class distribution
No matter which input I give it after training, it still outputs the class distribution, whereas if I remove the hidden layer and use a single-layer NN, it works much better.
I know the proper ...