Newest 'machine-learning' Questions - Theoretical Computer Science Stack Exchange

-4 votes

0 answers

43 views

Is Shared Mutual Information a Fundamental Constraint for DSIC Reference-Free Evaluation Mechanisms?

The TVD-MI mechanism (Robertson & Koyejo, 2025) achieves dominant-strategy incentive compatibility (DSIC) for reference-free evaluation by exploiting the Data Processing Inequality (DPI) and ...

Charlie Parker

254

asked Nov 24 at 6:20

2 votes

0 answers

33 views

Is there a reason to only use one step of gradient descent when test-time training transformers for in-context learning?

I'm aware of the fact that transformers with a single linear self-attention layer and no MLP layer learn to implement one step of gradient descent on a least-squares linear regression objective. I'm ...

Max Masterton

21

asked Jul 23 at 16:41

3 votes

0 answers

65 views

Learning boolean assignment from satisfied formulas

I'm interested in a class of learning problems which seem "dual" to the setup of learning boolean functions from satisfying assignments, as it is studied (in various forms) by Denis and De ...

sd234

623

asked Mar 19 at 23:30

1 vote

0 answers

58 views

Can tree-based models learn polynomial boundaries easily?

Assume that we have a binary classification problem with some features $x_1, \dots, x_f$ where the corresponding class is given by $$ c(x_1, \dots, x_f) = \begin{cases} 0 & P(x_1, \dots, x_f) < ...

Seewoo Lee

111

asked Mar 1 at 0:48

3 votes

1 answer

298 views

What is example of a hypermatrix that is not a tensor?

The semantics surrounding the word tensor are quite controversial in math, physics, and computer science. In Professor Lek-Heng Lim's write-up, Tensors in Computations, it is shown that separability, ...

YoungProbopass

31

asked Jan 12 at 3:34

2 votes

0 answers

192 views

What Are the Most Important Results in Machine Learning Theory in the First Quarter of the 21st Century?

The 21st century is already a quarter complete. Over the past 25 years, machine learning has made tremendous progress. It is quite easy to find surveys summarizing the most important results in ...

Rowan

151

asked Jan 10 at 7:06

0 votes

2 answers

192 views

Invention of novel NN architecture

How does one come up with novel neural network architectures? How does one "validate" a neural network architecture? I believe this is an extremely broad question but I'm looking for ...

rhoyrboat

11

asked Jan 2 at 19:44

0 votes

0 answers

89 views

Agnostic PAC learning - equivalent definitions?

Learning model: Domain set: $\mathcal{X}$, Label set: $\{0,1\}$, Hypothesis class: a set $\mathcal{H}$ of binary hypotheses $h:\mathcal{X}\to\{0,1\}$ Data-labels generating distribution: $\mathcal{D}$...

VK88

1

asked Dec 11, 2024 at 11:41

0 votes

0 answers

42 views

Sample complexity of covariance matrix estimation of a Gaussian random variable (with explicit constants)

I'm seeking an explicit bound for the number of samples required to estimate the covariance matrix of a Gaussian distribution. In https://arxiv.org/pdf/1011.3027v7 (end of page 31), the following ...

Dante Perez

1

asked Dec 3, 2024 at 21:09

1 vote

0 answers

87 views

ML Gradient Descent Convergence Theorem

Does there exist a generalization of this theorem, by Yurii Nesterov in Introductory Lectures on Convex Optimization (2004) which relaxes the assumption that the loss function is convex and shows that ...

Kyle

11

asked Sep 17, 2024 at 16:37

1 vote

0 answers

46 views

PAC-learning description of (quantum) hypothesis class containing randomness

I was wondering how to correctly describe the following hypothesis class mathematically correctly: Say I have a quantum circuit which I postprocess by feeding its results into a neural network. How ...

Taleofwoe

11

asked Aug 8, 2024 at 19:44

2 votes

0 answers

97 views

Learning a boolean function using decision tree with small number of queries

I am working on a problem and I am looking to solve the following subproblem : Given a "restrictive" blackbox access to boolean function $\phi$, output a "small-sized" CNF that ...

AlternatingGroupoid

21

asked Aug 4, 2024 at 6:12

1 vote

0 answers

39 views

Agnosting Learning Algorithm for Squared Loss Regression and Conditional Density Estimation

In the lecture notes titled "Foundations of Reinforcement Learning and Interactive Decision Making" by Foster and Rakhlin, it is mentioned in the Proposition 1 that there exists an algorithm ...

atul ganju

153

asked Aug 1, 2024 at 3:07

2 votes

1 answer

141 views

The true meaning of sampling from a distribution in the context of active learning

I would like to understand intuitively what it means to sample from a distribution $\mathcal{D}$. It may sound like a dumb question, but I can't find an answer anywhere, a colleague recommended ...

Guesttilunderstandingnature

23

asked May 3, 2024 at 22:17

1 vote

0 answers

60 views

Constructing a DFA with $n$ states for which $L*$ needs $n$ equivalence queries

I'm working on constructing deterministic finite automata (DFAs) with a specific learning complexity when using the L* algorithm developed by Dana Angluin. My goal is to create a DFA of size ( n ) ...

Coping Forever

21

asked Mar 5, 2024 at 11:26

Stack Exchange Network

Questions tagged [machine-learning]

Is Shared Mutual Information a Fundamental Constraint for DSIC Reference-Free Evaluation Mechanisms?

Is there a reason to only use one step of gradient descent when test-time training transformers for in-context learning?

Learning boolean assignment from satisfied formulas

Can tree-based models learn polynomial boundaries easily?

What is example of a hypermatrix that is not a tensor?

What Are the Most Important Results in Machine Learning Theory in the First Quarter of the 21st Century?

Invention of novel NN architecture

Agnostic PAC learning - equivalent definitions?

Sample complexity of covariance matrix estimation of a Gaussian random variable (with explicit constants)

ML Gradient Descent Convergence Theorem

PAC-learning description of (quantum) hypothesis class containing randomness

Learning a boolean function using decision tree with small number of queries

Agnosting Learning Algorithm for Squared Loss Regression and Conditional Density Estimation

The true meaning of sampling from a distribution in the context of active learning

Constructing a DFA with $n$ states for which $L*$ needs $n$ equivalence queries

Hot Network Questions