Why is a lower bound necessary in proofs of the VC dimension for various hypothesis sets?
In the book "Foundations of Machine Learning" there are examples of proving the VC dimension of various hypothesis sets, e.g., axis-aligned rectangles, convex polygons, sine functions, hyperplanes, etc.
All of these proofs first derive a lower bound and then show an upper bound. However, why not derive only the upper bound, since the definition of VC dimension only cares about the "largest" set that can be shattered by the hypothesis set $\mathcal{H}$? Since all the examples end up with a lower bound matching the upper bound, is the lower bound just a useful target to aim for when trying to show the upper bound?
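To make the two steps concrete, here is a minimal sketch (my own, not taken from the book) for axis-aligned rectangles in the plane. The function name `is_shattered` and the two point sets are hypothetical choices for illustration. It relies on the observation that a labeling is realizable by an axis-aligned rectangle exactly when the bounding box of the positive points contains no negative point. Note that it only checks the particular point sets it is given.

```python
from itertools import combinations


def is_shattered(points):
    """Return True if axis-aligned rectangles realize every labeling of `points`.

    Assumes the points are distinct 2D tuples. The all-negative labeling is
    always realizable (use a rectangle away from all points), so it is skipped.
    """
    n = len(points)
    for r in range(1, n + 1):
        for positives in combinations(range(n), r):
            xs = [points[i][0] for i in positives]
            ys = [points[i][1] for i in positives]
            lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
            # The labeling fails if any negative point lies in the bounding
            # box of the positive points.
            for j in range(n):
                if j in positives:
                    continue
                x, y = points[j]
                if lo_x <= x <= hi_x and lo_y <= y <= hi_y:
                    return False
    return True


# Four points in a diamond: one specific set of size 4.
diamond = [(0, 1), (1, 0), (0, -1), (-1, 0)]
# The same set plus its center: one specific set of size 5.
diamond_with_center = diamond + [(0, 0)]

print(is_shattered(diamond))              # True
print(is_shattered(diamond_with_center))  # False
```

The first call checks one set of size 4, which is the kind of witness a lower-bound step exhibits. The second call only shows that this particular set of size 5 is not shattered; it says nothing about other sets of size 5.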
Reference: page 41 of the PDF version of the book, https://pdfs.semanticscholar.org/e923/9469aba4bccf3e36d1c27894721e8dbefc44.pdf
Topic vc-theory pac-learning machine-learning
Category Data Science