Questions tagged [deep-learning]

For questions related to deep learning, which refers to a subset of machine learning methods based on artificial neural networks (ANNs) with multiple hidden layers. The adjective deep thus refers to the number of layers of the ANNs. The expression deep learning was apparently introduced (although not in the context of machine learning or ANNs) in 1986 by Rina Dechter in the paper "Learning while searching in constraint-satisfaction-problems".

2,006 questions

0 votes

0 answers

24 views

Rethinking My Deep-Research Agent Workflow — Should We Move Beyond Static Trees? [closed]

I’m reevaluating a deep-research workflow I built earlier and would love some advice. My previous design used a static tree workflow (fixed width/depth, node = search → extract → summarize → generate ...

Gosh Li

asked Nov 28 at 15:46

0 votes

1 answer

43 views

What is the best Python library for implementing neural network modification algorithms? [closed]

I want to implement in Python some algorithms from a paper that allow for a pre-trained neural network to be modified (adding or removing neurons or layers) conserving (theoretically) the outputs of ...

Rubén Sales Castellar

asked Nov 23 at 21:17

1 vote

1 answer

59 views

Plateau in performance of DQN snake AI [closed]

I'm currently making an AI to play snake using DQN and have run into a performance plateau. Here is the information about the architecture of the model. Network's design: I use CNN + MLP for both ...

Hào Võ

asked Oct 17 at 5:52

3 votes

2 answers

100 views

If large language models don’t reason symbolically, how can they still follow logical chains in text?

Transformers don’t use formal logic, yet models like GPT can handle multi-step reasoning questions. What mechanisms inside the network allow this kind of emergent logic without explicit symbolic ...

Anushka_Grace

asked Oct 5 at 17:53

0 votes

0 answers

18 views

Preventing GPU memory leak due to a custom neural network layer

I am using the MixStyle methodology for domain adaptation and it involves using a custom layer which is inserted after every encoder stage. However, it is causing VRAM to grow linearly, which causes ...

Vedant Dalimkar

asked Sep 29 at 16:00

-1 votes

1 answer

61 views

How are very deep models designed?

How did we discover the architecture of state-of-the-art large language models?

Alex

asked Sep 3 at 21:29

0 votes

1 answer

87 views

What would be the impact on AI or humans if we could train 10s of trillion parameter models? [closed]

The largest models humans have trained have roughly $1$-$2$ trillion parameters (per @alberto's comment), while training models with $10$s of trillions of parameters is on the horizon. However these ...

Justaperson

asked Sep 1 at 15:55

0 votes

0 answers

41 views

Image Segmentation Labeling Void-type Defects

I have a question regarding labeling images for defect detection using semantic segmentation. If the defect is a negative-space type (something that should be there but not there), how should the ...

Patrick Joel Tirta

asked Sep 1 at 8:12

-1 votes

1 answer

98 views

Advantage of large k in training

In machine learning we multiply $n\times k$ and $k\times m$ matrices. I found sources where $k$ is on the order of $10000$ to $20000$. What does $k$ represent and is there any advantage potentially of ...

Justaperson

asked Aug 29 at 13:18

0 votes

1 answer

87 views

Is the way Bishop's "Deep Learning" includes bias in the NN equation correct?

I have been studying neural networks in Bishop's "Deep learning - Foundations and concepts" and came across this equation: $$ y_k(\mathbf{x}, \mathbf{w}) = f\left(\sum_{j=0}^Mw_{kj}^{(2)} h ...

niccolo_zanieri

asked Aug 22 at 10:18

1 vote

1 answer

90 views

Reconciling SWA with Fisher Information

I've seen several claims that deep learning MLE points in "flatter loss regions" improve generalization to holdout data. Most notably I've seen such claims e.g. in SWA, but also in some ...

profPlum

asked Aug 18 at 18:22

0 votes

0 answers

46 views

How do I train neural network to predict face preference?

The title is probably very broad, but I have several very specific questions. I'm a relative novice to deep learning and neural networks, although I have a good background in mathematics and CS in general. ...

Orion's Belt

asked Jul 22 at 8:39

1 vote

1 answer

103 views

DQN is not learning in Atari Pong environment and I can't figure out where I'm messing up

I'm trying to implement the findings from this DeepMind DQN paper (2015) from scratch in PyTorch using the Atari Pong environment. I've tested my Deep Q-Network on a simple test environment, where ...

Rohan Patel

asked Jul 21 at 21:18

1 vote

0 answers

64 views

Optimal matrix multiplication - impact and applications

Let $A$ be integer matrix of size $n\times t$ and $B$ be integer matrix of size $t\times m$. Let max entry in absolute value be of $b$ bits in $A,B$. If we can multiply $A,B$ in say $\leq100(n+m)tb(\...

Justaperson

asked Jul 19 at 18:07

0 votes

0 answers

28 views

Bayesian Neural networks with Quantile Huber loss

In many social science applications, we often face a lack of data and non-linear relationships between variables. I am wondering whether anyone has come across any papers or discussions about Bayesian ...

dragonforce

asked Jul 4 at 8:36
