Questions tagged [machine-learning]
Theoretical questions about Machine learning, especially Computational Learning Theory, including Algorithmic Learning Theory, PAC learning, and Bayesian Inference
373 questions
-4
votes
0
answers
43
views
Is Shared Mutual Information a Fundamental Constraint for DSIC Reference-Free Evaluation Mechanisms?
The TVD-MI mechanism (Robertson & Koyejo, 2025) achieves dominant-strategy incentive compatibility (DSIC) for reference-free evaluation by exploiting the Data Processing Inequality (DPI) and ...
2
votes
0
answers
33
views
Is there a reason to only use one step of gradient descent when test-time training transformers for in-context learning?
I'm aware of the fact that transformers with a single linear self-attention layer and no MLP layer learn to implement one step of gradient descent on a least-squares linear regression objective. I'm ...
3
votes
0
answers
65
views
Learning boolean assignment from satisfied formulas
I'm interested in a class of learning problems which seem "dual" to the setup of learning boolean functions from satisfying assignments, as it is studied (in various forms) by Denis and De ...
1
vote
0
answers
58
views
Can tree-based models learn polynomial boundaries easily?
Assume that we have a binary classification problem with some features $x_1, \dots, x_f$ where the corresponding class is given by
$$
c(x_1, \dots, x_f) = \begin{cases} 0 & P(x_1, \dots, x_f) < ...
3
votes
1
answer
298
views
What is example of a hypermatrix that is not a tensor?
The semantics surrounding the word tensor are quite controversial in math, physics, and computer science. In Professor Lek-Heng Lim's write-up, Tensors in Computations, it is shown that separability, ...
2
votes
0
answers
192
views
What Are the Most Important Results in Machine Learning Theory in the First Quarter of the 21st Century?
The 21st century is already a quarter complete. Over the past 25 years, machine learning has made tremendous progress. It is quite easy to find surveys summarizing the most important results in ...
0
votes
2
answers
192
views
Invention of novel NN architecture
How does one come up with novel neural network architectures? How does one "validate" a neural network architecture?
I believe this is an extremely broad question but I'm looking for ...
0
votes
0
answers
89
views
Agnostic PAC learning - equivalent definitions?
Learning model:
Domain set: $\mathcal{X}$,
Label set: $\{0,1\}$,
Hypothesis class: a set $\mathcal{H}$ of binary hypotheses $h:\mathcal{X}\to\{0,1\}$
Data-labels generating distribution: $\mathcal{D}$...
0
votes
0
answers
42
views
Sample complexity of covariance matrix estimation of a Gaussian random variable (with explicit constants)
I'm seeking an explicit bound for the number of samples required to estimate the covariance matrix of a Gaussian distribution. In https://arxiv.org/pdf/1011.3027v7 (end of page 31), the following ...
1
vote
0
answers
87
views
ML Gradient Descent Convergence Theorem
Does there exist a generalization of this theorem, by Yurii Nesterov in Introductory Lectures on Convex Optimization (2004) which relaxes the assumption that the loss function is convex and shows that ...
1
vote
0
answers
46
views
PAC-learning description of (quantum) hypothesis class containing randomness
I was wondering how to correctly describe the following hypothesis class mathematically correctly:
Say I have a quantum circuit which I postprocess by feeding its results into a neural network. How ...
2
votes
0
answers
97
views
Learning a boolean function using decision tree with small number of queries
I am working on a problem and I am looking to solve the following subproblem : Given a "restrictive" blackbox access to boolean function $\phi$, output a "small-sized" CNF that ...
1
vote
0
answers
39
views
Agnosting Learning Algorithm for Squared Loss Regression and Conditional Density Estimation
In the lecture notes titled "Foundations of Reinforcement Learning and Interactive Decision Making" by Foster and Rakhlin, it is mentioned in the Proposition 1 that there exists an algorithm ...
2
votes
1
answer
141
views
The true meaning of sampling from a distribution in the context of active learning
I would like to understand intuitively what it means to sample from a distribution $\mathcal{D}$. It may sound like a dumb question, but I can't find an answer anywhere, a colleague recommended ...
1
vote
0
answers
60
views
Constructing a DFA with $n$ states for which $L*$ needs $n$ equivalence queries
I'm working on constructing deterministic finite automata (DFAs) with a specific learning complexity when using the L* algorithm developed by Dana Angluin. My goal is to create a DFA of size ( n ) ...