VC dimension of hypothesis space of finite union of intervals

I have the following concept: $$C = \left\{\bigcup_{i=1}^{k}(a_i, b_i): a_i, b_i \in {\Bbb R}, a_i b_i, i=1,2,..,k\right\} $$ and was wondering how to determine the VC dimension of C?

Topic vc-theory classification machine-learning

Category Data Science


VC dimension is defined for a hypothesis space $H$, e.g. a set of binary classifiers $C \rightarrow \{0, 1\}$. For example, hypothesis space $$H=\{{\Bbb 1}_{x \le \theta}: \theta \in {\Bbb R}\}$$ has VC dimension $1$, because for any $C=\{a<b\}$, it does not contain a classifier that gives $\{a \rightarrow 0, b\rightarrow 1\}$.

For example, a classifier from $H$ would be $f(x)={\Bbb 1}_{x \le a}$ that gives $\{a \rightarrow 1, b \rightarrow 0\}$.

From C to H

As you have illustrated in the comments, we can build a hypothesis space $H$ from $C$ as follows: $$H=\left\{{\Bbb 1}_{x \in C}: C = \left\{\bigcup_{i=1}^{k}(a_i, b_i): a_i, b_i \in {\Bbb R}, a_i < b_i, i=1,2,..,k\right\}\right\}$$ Meaning, each classifier in $H$ is a union of $k$ intervals that labels a point inside the union as $1$ and outside as $0$.

VC dimension of this $H$ is $2k$:

  1. For VC $\geq 2k$: Let $A$ be an arbitrary set , and $A \rightarrow \{0, 1\}$ be an arbitrary labeling. By going from minimum to maximum member of $A$, we can cover all adjacent $1$s with one interval, and only need to use another interval when there is a $0$ barrier. Therefore, we need $k$ intervals to cover $k$ isolated regions of $1$s. Furthermore, a set with $2k$ members has at most $k$ isolated $1$s (since to have $k+1$ isolated $1$s there should be $k$ $0$ barriers in-between), and thus, needs at most $k$ intervals.

  2. For VC $< 2k+1$ by contradiction: for any ordered set $A_{2k+1}=\{a_1<...<a_{2k+1}\}$, there is labeling $a_k \rightarrow 1_{\text{k odd}}$, i.e. $\{a_1 \rightarrow 1, a_2 \rightarrow 0,...,a_{2k+1} \rightarrow 1\}$ with $k+1$ isolated $1$s which cannot be covered with $k$ intervals.

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.