Intuitively, and speaking very loosely, what h_t (or rather h_t-1) should represent is the amount remembered from the previous steps. Since you are just starting and want to remember everything, a very intuitive choice is to set its initial value to 1. Then you can see in the equations that the multiplication with h_t-1 makes no difference at the first step.
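
To make the point above concrete, here is a minimal sketch in plain Python. It assumes the equations contain an element-wise product of some gate vector with h_t-1 (as in a GRU-style update); the gate values and the helper name `elementwise_product` are made up for illustration and do not come from the original equations.

```python
def elementwise_product(gate, hidden):
    """Multiply a gate vector with a hidden-state vector, element by element."""
    if len(gate) != len(hidden):
        raise ValueError("gate and hidden state must have the same length")
    return [g * h for g, h in zip(gate, hidden)]


# Hypothetical gate activations at the very first step (illustrative values).
gate = [0.2, 0.7, 0.9]

# Initial hidden state set to all ones, as suggested above.
h_prev_ones = [1.0, 1.0, 1.0]
gated = elementwise_product(gate, h_prev_ones)
print(gated == gate)  # True: multiplying by 1 leaves the gate term unchanged

# For contrast, an all-zero initial state wipes the term out entirely.
h_prev_zeros = [0.0, 0.0, 0.0]
print(elementwise_product(gate, h_prev_zeros))  # [0.0, 0.0, 0.0]
```

With ones, the product passes the other factor through untouched, which is the "makes no difference in the beginning" behaviour. Whether that is the right choice for a given model depends on where h_t-1 appears in its equations.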
