Finding the tightest (smallest) triangle that fits all points

I need to design an algorithm that, given a set of points in the Euclidean plane, returns the tightest (smallest) origin-centered upright equilateral triangle that contains all of the given points, so that if I then input some new point, the algorithm returns $+$ if the point is inside the triangle and $-$ if not. Someone has suggested that I go over all the possible points and find the point with …
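A minimal sketch of how I would implement this, assuming NumPy and using the edge normals $u, w, v$ that appear in the triangle question further down this page (the function names here are just illustrative):

```python
import numpy as np

# Outward edge normals of an origin-centered upright equilateral triangle,
# taken from the related homework question below (an assumption about orientation).
U = np.array([np.sqrt(3) / 2, -0.5])
W = np.array([-np.sqrt(3) / 2, -0.5])
V = np.array([0.0, 1.0])
NORMALS = np.stack([U, W, V])          # shape (3, 2)

def fit_tightest_triangle(points):
    """Smallest r such that every point x satisfies x . n <= r for all three normals n."""
    points = np.asarray(points, dtype=float)
    return float(np.max(points @ NORMALS.T))

def classify(point, r):
    """Return '+' if the point lies inside the triangle of size r, '-' otherwise."""
    return '+' if np.all(np.asarray(point, dtype=float) @ NORMALS.T <= r) else '-'

# Example usage
r = fit_tightest_triangle([(0.1, 0.2), (-0.3, 0.0), (0.0, 0.5)])
print(r, classify((0.0, 0.1), r), classify((5.0, 5.0), r))
```

Fitting is a single pass over the points, so it runs in $O(n)$ time.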
Category: Data Science

PAC Learnability - Notation

The following is from the textbook Understanding Machine Learning: From Theory to Algorithms. Definition of PAC learnability: A hypothesis class $\mathcal H$ is PAC learnable if there exist a function $m_{\mathcal H} : (0, 1)^2 \rightarrow \mathbb{N}$ and a learning algorithm with the following property: for every $\epsilon, \delta \in (0, 1)$, for every distribution $D$ over $X$, and for every labeling function $f : X \rightarrow \{0,1\}$, if the realizable assumption holds with respect to $\mathcal H, D, f$, then when running the learning …
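For reference, the conclusion that the quoted definition trails off into is, as I recall it from that book (so worth checking against your copy): when the algorithm is run on $m \ge m_{\mathcal H}(\epsilon, \delta)$ i.i.d. examples drawn from $D$ and labeled by $f$, it returns a hypothesis $h$ such that $$\Pr_{S \sim D^m}\!\left[\, L_{(D,f)}(h) \le \epsilon \,\right] \ge 1-\delta .$$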
Category: Data Science

Learner Algorithm Time & Sample Complexity

Let $X=\mathbb{R}^{2}$. Let $u=\left(\frac{\sqrt{3}}{2},-\frac{1}{2}\right)$, $w=\left(-\frac{\sqrt{3}}{2},-\frac{1}{2}\right)$, $v=\left(0,1\right)$ and $C=H=\left\{h\left(r\right)=\left\{\left(x_{1},x_{2}\right) \mid \left(x_{1},x_{2}\right)\cdot u\le r,\ \left(x_{1},x_{2}\right)\cdot w\le r,\ \left(x_{1},x_{2}\right)\cdot v\le r\right\}\right\}$ for $r>0$, the set of all origin-centered upright equilateral triangles. Describe a learning algorithm $L$ that learns $C$ using $H$. State the time and sample complexity of your algorithm and prove it. I was faced with this question in a homework assignment and I'm a bit confused. My solution is: Let $D$ be our dataset. Learner algorithm: maxDistance …
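For what it's worth, a sketch of the bound one might aim for (my own reasoning, not part of the assignment): the class is parameterized by a single threshold $r$, and the natural ERM learner returns the smallest consistent triangle, $$\hat r = \max_{(x,+) \in S} \max\{x\cdot u,\ x\cdot w,\ x\cdot v\},$$ which takes $O(m)$ time on $m$ examples. Because the triangles are nested in $r$, an argument analogous to the axis-aligned-rectangle example suggests that in the realizable case $$m(\epsilon,\delta) = O\!\left(\frac{1}{\epsilon}\log\frac{1}{\delta}\right)$$ examples suffice, though the constants and the full proof would need to be worked out.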
Category: Data Science

VC Dimension of a Countably Infinite Class

I know that there are many examples of classes whose VC dimension is finite/infinite even though the size of the class is uncountably infinite. However, I could not work out whether the VC dimension of a countably infinite class is always finite. (I feel that its size will be "smaller" than the size of the power set of an arbitrarily large set.) Any help on this is appreciated.
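One concrete family that seems relevant here (my own example, so worth double-checking): the class of indicators of finite subsets of the naturals, $$\mathcal H = \{\mathbb 1_A : A \subseteq \mathbb N,\ A \text{ finite}\},$$ is countably infinite, yet it shatters every finite subset of $\mathbb N$, since any labeling of a finite set is realized by the indicator of its positively labeled points; so a countable class can have infinite VC dimension.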
Category: Data Science

Where does the "deep learning needs big data" rule come from

When reading about deep learning I often come across the rule that deep learning is only effective when you have large amounts of data at your disposal. These statements are generally accompanied by a figure; the example I have in mind (taken from https://hackernoon.com/%EF%B8%8F-big-challenge-in-deep-learning-training-data-31a88b97b282 ) is attributed to a 'famous slide from Andrew Ng'. Does anyone know what this figure is actually based upon? Is there any research that backs up this claim?
Category: Data Science

Proving that a Hypothesis Class is not PAC-Learnable

I was wondering how one can show that a class of classifiers $H$ is not PAC-learnable (in the realizable case) without using VC dimensions in the argument. I know how to show PAC-learnability through the PAC requirements, but I'm not sure how to show that a class is not PAC-learnable. Thanks
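For reference, one way to frame such a proof is via the logical negation of the definition (the exact quantifiers should be checked against your course's version): exhibit some $\epsilon, \delta \in (0,1)$ such that for every learning algorithm $A$ and every sample size $m$, there exist a distribution $D$ and a labeling function $f$ realizable by $H$ with $$\Pr_{S \sim D^m}\big[\, L_{(D,f)}(A(S)) > \epsilon \,\big] > \delta .$$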
Topic: pac-learning
Category: Data Science

Disproving or proving claim that if VCdim is "n" then it is possible that a set of smaller size is not shattered

Today in the lecture the lecturer said something I found peculiar, and it made me quite uncomfortable when I heard it: he claimed that if the maximal VC dimension of some hypothesis class is $n\in\mathbb N$, then it is possible that there is some $i<n$ such that no subset $C$ of size $i$ is shattered. Is his claim true? I thought that we can take some subset of size $i$, for every $i\in [n]$, of the set $C^*$ which satisfies …
Category: Data Science

Why does PAC learning focus on learnability of the hypothesis class and not the target function?

The definition of PAC learning is roughly: an algorithm is a PAC learning algorithm if, given enough data, for any target function, it asymptotically does as well as it possibly could given the functions it is capable of representing. This definition seems sort of unambitious. In reality, I care more about approximating the target function well in an absolute sense, not just approximating it as well as my hypothesis class can muster. By some kind of no-free-lunch principle, it's probably …
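As a point of reference, the "as well as it possibly could" clause is usually formalized (in the agnostic setting) as the requirement that, with probability at least $1-\delta$ over the sample, $$L_D\big(A(S)\big) \le \min_{h \in H} L_D(h) + \epsilon,$$ which is relative to the best hypothesis in $H$ rather than to the target function itself; that relativity seems to be exactly what the question is getting at.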
Topic: pac-learning
Category: Data Science

Why is a lower bound necessary in proofs of VC-dimensions for various examples of hypotheses?

In the book "Foundations of Machine Learning" there are examples of proving the VC dimensions for various hypotheses, e.g., for axis-aligned rectangles, convex polygons, sine functions, hyperplanes, etc. All proofs first derive a lower bound, and then show an upper bound. However, why not just derive the upper bound since the definition of VC dimension only cares about the "largest" set that can be shattered by hypothesis set $\mathcal{H}$? Since all examples ends up with a lower bound matching the …
Category: Data Science

A question on realizable sample complexity

I came across the following exercise, and I just can't seem to crack it: Let $l$ be some loss function such that $l \leq 1$. Let $H$ be some hypothesis class, and let $A$ be a learning algorithm. show that: $m^{\text{stat, r}}_H (\epsilon) = O\left(m^{\text{stat, r}}_H (\epsilon/2, 1/2)\cdot \log(1/\epsilon) + \frac{\log(1/\epsilon)}{\epsilon^2}\right)$ Where $m^{\text{stat, r}}_H (\epsilon)$ is the minimal number $m$ such that for any realizable distribution over training examples $D$ we have that: $$\mathbb{E}_{S \sim D^m}\left[ l_D(A(S)) \right]\leq \epsilon$$ And …
Category: Data Science

Are decision tree algorithms linear or nonlinear

Recently a friend of mine was asked in an interview whether decision tree algorithms are linear or nonlinear. I tried to look for answers to this question but couldn't find any satisfactory explanation. Can anyone answer this and explain the solution? Also, what are some other examples of nonlinear machine learning algorithms?
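A quick way to probe this empirically (a sketch using scikit-learn; the XOR-style dataset is just an illustrative choice): fit a tree and a linear model on labels that no single straight line can separate and compare.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# XOR-style data: the classes are not linearly separable.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

tree = DecisionTreeClassifier(max_depth=4).fit(X, y)
lin = LogisticRegression().fit(X, y)

# The tree's axis-aligned splits carve out a piecewise-constant (nonlinear)
# decision boundary, while logistic regression is restricted to one hyperplane.
print("tree accuracy:  ", tree.score(X, y))
print("linear accuracy:", lin.score(X, y))
```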
Category: Data Science

A trick used in Rademacher complexity related Theorem

I am currently working through the proof of Theorem 3.1 in the book "Foundations of Machine Learning" (page 35, first edition), and there is a key trick used in the proof (equations 3.10 and 3.11): $$\begin{align*} &E_{S,S'}\left[\sup_{g \in \mathcal{G}}\frac{1}{m}\sum_{i=1}^{m} \left(g(z'_i)-g(z_i)\right)\right]=E_{\boldsymbol{\sigma},S,S'}\left[\sup_{g \in \mathcal{G}}\frac{1}{m}\sum_{i=1}^{m} \sigma_i\left(g(z'_i)-g(z_i)\right)\right] \\ &\text{where } {\Bbb P}(\sigma_i=1)={\Bbb P}(\sigma_i=-1)=\frac{1}{2}. \end{align*}$$ It is also shown on page 8 of the lecture PDF at this link: https://cs.nyu.edu/~mohri/mls/lecture_3.pdf This is possible because $z_i$ and $z'_i$ can be swapped. My question is, why can we …
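One way I would sketch the justification (worth checking against the book's own argument): for any fixed sign vector $\sigma \in \{-1,+1\}^m$, flipping $\sigma_i = -1$ amounts to exchanging the i.i.d. pair $(z_i, z'_i)$, which leaves the joint distribution of the sample unchanged, so $$E_{S,S'}\left[\sup_{g \in \mathcal G}\frac{1}{m}\sum_{i=1}^{m}\sigma_i\big(g(z'_i)-g(z_i)\big)\right] = E_{S,S'}\left[\sup_{g \in \mathcal G}\frac{1}{m}\sum_{i=1}^{m}\big(g(z'_i)-g(z_i)\big)\right]$$ for every fixed $\sigma$; averaging over a uniformly random $\sigma$ then gives equations 3.10 and 3.11.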
Category: Data Science

Generalization bound (single hypothesis) in "Foundations of Machine Learning"

I have a question about Corollary 2.2 (generalization bound, single hypothesis) in the book "Foundations of Machine Learning", Mohri et al., 2012. Equation 2.17 seems to hold only when $\hat{R}_S(h)<R(h)$ in equation 2.16, because of the absolute value. Why is this not stated in the corollary? Am I missing something important? Thank you very much for reading this question.
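For context, the two statements being referred to are, as far as I remember them (please verify against the book): equation 2.16 is the two-sided Hoeffding bound and equation 2.17 its one-sided consequence, $$\Pr\!\left[\,\big|R(h)-\hat R_S(h)\big| \le \sqrt{\tfrac{\log(2/\delta)}{2m}}\,\right] \ge 1-\delta, \qquad R(h) \le \hat R_S(h) + \sqrt{\tfrac{\log(2/\delta)}{2m}} .$$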
Category: Data Science

Meaning of Instance Space and Concept Class, (PAC Learnable)

I'm studying probably approximately correct (PAC) learning, and I don't understand what an instance space and a concept are. I have seen that Wikipedia https://en.wikipedia.org/wiki/Probably_approximately_correct_learning provides various examples, but it still feels rather abstract. Could you provide an intuitive definition and some tangible examples?
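To make the terms concrete, one standard toy setup (my own illustration, not taken from the Wikipedia page): the instance space is the set of all possible inputs, e.g. $X = \mathbb R^2$, the points of the plane; a concept is a single subset of $X$, or equivalently a function $c : X \to \{0,1\}$, e.g. one particular axis-aligned rectangle labeling the points inside it as $1$; and the concept class $C$ is the family of all such rectangles.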
Topic: pac-learning
Category: Data Science

Intuition behind Occam's Learner Algorithm using VC-Dimension

So I'm learning about Occam's learner algorithm and PAC learning, where, for a given hypothesis space $H$, if we want a model/hypothesis $h$ with true error $\mathrm{error}_D(h) \leq \epsilon$, with probability $(1-\delta)$ for a given $\delta$, we need to train it on $m$ examples, with $m$ defined as: $$ m > \frac{1}{2\epsilon^2}\left(\log(|H|)+\log\left(\frac{1}{\delta}\right)\right)$$ Now, I'm looking for some way to explain the terms of the equation in very simple terms to gain some …
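A quick worked example, plugging illustrative numbers into the bound exactly as quoted above and taking natural logs (the specific values are arbitrary): with $|H| = 1000$, $\epsilon = 0.1$, and $\delta = 0.05$, $$ m > \frac{1}{2(0.1)^2}\big(\log(1000) + \log(20)\big) \approx 50\,(6.91 + 3.00) \approx 495,$$ so roughly 500 examples; note that $m$ grows only logarithmically in $|H|$ and $1/\delta$ but quadratically in $1/\epsilon$.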
Category: Data Science

Generalization Error Definition

I was reading about the PAC framework and came across the definition of generalization error. The book defines it as follows: given a hypothesis $h \in H$, a target concept $c \in C$, and an underlying distribution $D$, the generalization error or risk of $h$ is defined by $$R(h) = \Pr_{x \sim D}\big[h(x) \neq c(x)\big].$$ The generalization error of a hypothesis is not directly accessible to the learner since both the distribution $D$ and the target concept $c$ are unknown. However, the learner can measure the empirical error of a …
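A tiny simulation of the distinction (my own illustration; the distribution $D$, concept $c$, and hypothesis $h$ below are arbitrary choices): the empirical error is computable from a sample, while the generalization error is only knowable here because we simulated $D$ and $c$ ourselves.

```python
import numpy as np

rng = np.random.default_rng(0)

def c(x):   # target concept (unknown to a real learner)
    return (x > 0.5).astype(int)

def h(x):   # some fixed hypothesis
    return (x > 0.6).astype(int)

# Empirical error: average disagreement on an i.i.d. sample from D = Uniform(0, 1).
sample = rng.uniform(0, 1, size=200)
empirical_error = np.mean(h(sample) != c(sample))

# Generalization error, approximated by a very large sample from the same D
# (for this D, c and h its exact value is P(0.5 < x <= 0.6) = 0.1).
population = rng.uniform(0, 1, size=10**6)
generalization_error = np.mean(h(population) != c(population))

print(empirical_error, generalization_error)
```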
Category: Data Science

What is PAC learning?

I have seen the explanation here, but I really cannot grasp it. In this framework, the learner receives samples and must select a generalization function (called the hypothesis) from a certain class of possible functions. The goal is that, with high probability (the "probably" part), the selected function will have low generalization error. Actually we do that in every machine learning situation, and we do the latter part to avoid over-fitting. Why do we call it PAC-learning? I also have not gotten the …
Category: Data Science
