Learner Algorithm Time & Sample Complexity
Let $X=\mathbb{R}^{2}$. Let $u=\left(\frac{\sqrt{3}}{2},-\frac{1}{2}\right),\ w=\left(-\frac{\sqrt{3}}{2},-\frac{1}{2}\right),\ v=\left(0,1\right)$ and $C=H=\left\{h\left(r\right)=\left\{\left(x_{1},x_{2}\right)\mid\left(x_{1},x_{2}\right)\cdot u\le r,\ \left(x_{1},x_{2}\right)\cdot w\le r,\ \left(x_{1},x_{2}\right)\cdot v\le r\right\}\right\}$ for $r>0$, the set of all origin-centered upright equilateral triangles. Describe an algorithm $L$ that learns $C$ using $H$. State the time and sample complexity of your algorithm and prove it.
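To make the geometry concrete, the membership test for $h(r)$ follows directly from the three half-plane constraints. Here is a minimal Python sketch (the names `u`, `w`, `v`, and `in_triangle` are my own, not from the assignment):

```python
import math

# The three unit edge normals from the problem statement (120 degrees apart).
u = (math.sqrt(3) / 2, -0.5)
w = (-math.sqrt(3) / 2, -0.5)
v = (0.0, 1.0)

def in_triangle(x, r):
    """x is in h(r) iff its projection onto each edge normal is at most r."""
    return all(x[0] * n[0] + x[1] * n[1] <= r for n in (u, w, v))
```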
I was faced with this question in a homework assignment and I'm a bit confused. My solution is:
Let $D$ be our training set.
Learner algorithm:
r = 0
for all d∈D:
    if the label of d is +:
        r = max{r, d∘u, d∘w, d∘v}
return h such that ∀x∈X: h(x) = + if all of the following hold: x∘u≤r, x∘w≤r, x∘v≤r; else h(x) = −.
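Written out as runnable code, the learner is a single pass over the sample. This is a minimal Python sketch, assuming the training set is a list of `((x1, x2), label)` pairs with labels `'+'`/`'-'` (the function names are mine):

```python
import math

# Unit edge normals of the triangle family, as defined in the problem.
NORMALS = ((math.sqrt(3) / 2, -0.5), (-math.sqrt(3) / 2, -0.5), (0.0, 1.0))

def dot(x, n):
    return x[0] * n[0] + x[1] * n[1]

def learn_triangle(sample):
    """Tightest-fit learner: return the classifier of the smallest h(r)
    containing every positive example (r = 0 if there are none)."""
    r = max(
        (dot(x, n) for x, label in sample if label == '+' for n in NORMALS),
        default=0.0,
    )
    return lambda x: '+' if all(dot(x, n) <= r for n in NORMALS) else '-'

# Example usage on a toy sample:
h = learn_triangle([((0.0, 0.5), '+'), ((0.0, 2.0), '-'), ((0.3, -0.2), '+')])
print(h((0.0, 0.0)))  # '+': the origin lies inside every learned triangle
```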
Now clearly the time complexity is $O(m)$, $m$ being the size of the dataset; the sample complexity is where I get confused. My explanation is:
Let $c$ be the true concept, and let $h$ be the hypothesis we are learning using $L$.
We wish to find the minimum number of training examples that guarantees, with probability at least $1-\delta$, that our $h$ will have error at most $\epsilon$. Since $L$ returns the smallest triangle containing all positive examples, we always have $h\subseteq c$, so errors can only come from points of $c$ that $h$ labels negative. For $i=1,2,3$, let $B_i$ be the strip obtained by sliding the $i$-th edge of $c$ inward until the strip has probability mass exactly $\frac{\epsilon}{3}$ (if an edge cannot accumulate that much mass, it contributes even less error and can be ignored). If the sample hits every $B_i$, the error region $c\setminus h$ is contained in $B_1\cup B_2\cup B_3$, so the error of $h$ is at most $\epsilon$. The probability that a single example misses $B_i$ is at most $1-\frac{\epsilon}{3}$, so the probability that all $m$ examples miss $B_i$ is at most $(1-\frac{\epsilon}{3})^m$, and since there are three such $B_i$'s, by the union bound it suffices to require
$3\left(1-\frac{\epsilon}{3}\right)^{m}\le 3\exp\left(-\frac{m\epsilon}{3}\right)\le\delta$
$\exp\left(-\frac{m\epsilon}{3}\right)\le\frac{\delta}{3}$
$-\frac{m\epsilon}{3}\le\ln\left(\frac{\delta}{3}\right)$
$\frac{m\epsilon}{3}\ge\ln\left(\frac{3}{\delta}\right)$
$m\ge\frac{3}{\epsilon}\ln\left(\frac{3}{\delta}\right)$
Therefore, $m(\epsilon,\delta)=\left\lceil\frac{3}{\epsilon}\ln\left(\frac{3}{\delta}\right)\right\rceil$ examples suffice.
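As a sanity check, the bound is easy to evaluate numerically (a small Python helper; `sample_complexity` is my own name):

```python
import math

def sample_complexity(eps, delta):
    """Examples sufficient for error <= eps with probability >= 1 - delta,
    per the bound m >= (3 / eps) * ln(3 / delta) derived above."""
    return math.ceil(3.0 / eps * math.log(3.0 / delta))

print(sample_complexity(0.1, 0.05))  # 123
```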
We weren't shown a proper proof pattern for sample complexity anywhere in class, and I also failed to find anything useful online, so I feel like I'm just guessing.
I wanted to ask a couple things:
- How do you prove the sample complexity?
- How does one properly present a learning algorithm (is pseudocode like what I wrote considered fine?)
- Does my logic in the sample complexity proof work?
Thanks in advance. :)