Normalizing the final weights vector in the upper bound on the Perceptron's convergence
The convergence theorem for the "simple" perceptron states that:
$$k\leqslant \left ( \frac{R\left \| \bar{\theta} \right \|}{\gamma } \right )^{2}$$
where $k$ is the number of iterations in which the weights are updated, $R$ is the maximum distance of a sample from the origin, $\bar{\theta}$ is the final weights vector, and $\gamma$ is the smallest distance from the hyperplane defined by $\bar{\theta}$ to a sample (i.e., the margin of the hyperplane).
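As a sanity check on the bound, here is a minimal numerical sketch (assuming NumPy; the toy dataset, the random seed, and the bias-free perceptron are arbitrary illustrative choices). It counts the updates $k$ and evaluates the bound with $\gamma$ taken as the functional margin $\min_i y_i \langle \bar{\theta}, x_i \rangle$ of the final weights:

```python
# A minimal sketch, assuming NumPy; the data and setup are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# Linearly separable toy data: labels are the sign of <w_true, x>.
w_true = np.array([2.0, -1.0])
X = rng.uniform(-1.0, 1.0, size=(50, 2))
y = np.sign(X @ w_true)

theta = np.zeros(2)
k = 0  # number of updates (mistakes)
converged = False
while not converged:
    converged = True
    for x_i, y_i in zip(X, y):
        if y_i * (theta @ x_i) <= 0:  # misclassified (or on the boundary)
            theta += y_i * x_i        # perceptron update
            k += 1
            converged = False

R = np.linalg.norm(X, axis=1).max()  # max distance of a sample from the origin
gamma = (y * (X @ theta)).min()      # functional margin of the final theta
bound = (R * np.linalg.norm(theta) / gamma) ** 2
print(f"k = {k}, bound = {bound:.2f}, k <= bound: {k <= bound}")
```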
Many books implicitly assume that $\left \| \bar{\theta} \right \|$ is equal to 1. But why do they normalize it?
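To make the quantity in question concrete, here is a short standalone sketch (again assuming NumPy; the data and $\bar{\theta}$ are arbitrary) of how the bound behaves when $\bar{\theta}$ is rescaled: the functional margin $\gamma$ scales linearly with $\bar{\theta}$, so the ratio $\left \| \bar{\theta} \right \| / \gamma$ is unchanged.

```python
# Standalone sketch, assuming NumPy; theta and the data are arbitrary.
import numpy as np

X = np.array([[1.0, 2.0], [-2.0, 1.0], [0.5, -1.0]])
y = np.array([1.0, 1.0, -1.0])
theta = np.array([0.3, 0.8])  # any vector that separates this toy set

R = np.linalg.norm(X, axis=1).max()
for c in (1.0, 0.1, 100.0):   # rescale theta by an arbitrary positive factor
    t = c * theta
    gamma = (y * (X @ t)).min()                     # scales linearly with c
    print(c, (R * np.linalg.norm(t) / gamma) ** 2)  # same value each time
```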
Topic: convergence, perceptron, machine-learning
Category: Data Science