Proof of Correctness of Perceptron Training Rule

The Perceptron Training Rule is essentially stochastic gradient descent applied to learning the coefficients of a hyperplane (which acts as a decision boundary) for binary classification of data points (instances).
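
To make the rule I am referring to concrete, here is a minimal Python sketch of it, assuming labels in {-1, +1} and a bias term folded into the weights by appending a constant feature; the function name and parameters are my own illustration, not taken from any particular library.

    import numpy as np

    def perceptron_train(X, y, learning_rate=0.1, max_epochs=1000):
        """Perceptron training rule: w <- w + eta * (t - o) * x, applied per example.

        X: (n_samples, n_features) array; y: labels in {-1, +1}.
        """
        X = np.hstack([X, np.ones((X.shape[0], 1))])  # fold the bias into the weights
        w = np.zeros(X.shape[1])
        for _ in range(max_epochs):
            mistakes = 0
            for x_i, t in zip(X, y):
                o = 1 if np.dot(w, x_i) >= 0 else -1   # thresholded (sign) output
                if o != t:
                    w += learning_rate * (t - o) * x_i  # update only on misclassified examples
                    mistakes += 1
            if mistakes == 0:  # every training example is classified correctly
                return w
        return w

    # Tiny linearly separable example (AND-like data, labels in {-1, +1}).
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([-1, -1, -1, 1])
    print(perceptron_train(X, y))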

I have read that this stochastic gradient descent procedure can be proven to converge to weights (coefficients) of a hyperplane decision boundary that correctly classifies all the training examples, given the following:

  1. The training examples are linearly separable.
  2. A sufficiently small learning rate is used.

Could anyone please prove the above?
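
In case it helps frame what I am asking for, here is one common formal statement of the claim (Novikoff's perceptron convergence theorem); the symbols R and gamma below are my own notation, not from the text I was reading.

    % One standard formal statement of the claim to be proved.
    % Assumptions (my notation): inputs x_i with \|x_i\| \le R, labels t_i in {-1,+1},
    % and a separating unit weight vector w^* with margin \gamma > 0.
    \[
    \text{Assume } \|x_i\| \le R \text{ and } \exists\, w^*,\ \|w^*\| = 1,\ \gamma > 0
    \text{ such that } t_i\,(w^* \cdot x_i) \ge \gamma \ \text{ for all } i.
    \]
    \[
    \text{Then the perceptron training rule, started from } w = 0, \text{ makes at most }
    \left(\tfrac{R}{\gamma}\right)^{2} \text{ mistakes (weight updates) before it classifies every training example correctly.}
    \]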

Topic: perceptron, neural-network, machine-learning

Category: Data Science
