Two definitions of the DCG measure
I wanted to check the definition of the Discounted Cumulative Gain (DCG) measure in the original paper (Jarvelin), and it seems to differ from the one given in the later literature (Wang). Originally, for documents at ranks $r = 1, \ldots, p$, $\text{DCG}_p$ is defined as $$\text{DCG}_p = \sum\limits_{r=1}^{b-1} G_r + \sum\limits_{r=b}^{p}\frac{G_r}{\log_b r},$$ where $G_r$ is the relevance (or gain) of the document at rank $r$. So the measure depends on the logarithm base $b$. For ranks below $b$, i.e. $r<b$, gains are not penalized. If $b=2$, then we can write: $$\text{DCG}_p = G_1 + \sum\limits_{r=2}^{p}\frac{G_r}{\log_2 r}.$$ This does not look the same as the one given on Wikipedia, where the argument of the logarithm is shifted by $1$: $$\text{DCG}_p = G_1 + \sum\limits_{r=2}^{p}\frac{G_r}{\log_2(r+1)}.$$
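To make the difference concrete, here is a small Python sketch of the two formulas as I read them. The function names `dcg_original` and `dcg_shifted` and the gain vector are made up for illustration; they do not come from either paper.

```python
import math


def dcg_original(gains, b=2):
    """DCG with an unshifted log_b(r) discount (hypothetical helper).

    Ranks r < b are added undiscounted; ranks r >= b are divided by log_b(r).
    """
    total = 0.0
    for r, g in enumerate(gains, start=1):
        total += g if r < b else g / math.log(r, b)
    return total


def dcg_shifted(gains):
    """DCG with the shifted log_2(r + 1) discount (hypothetical helper).

    At r = 1 the discount is log_2(2) = 1, so G_1 is added unchanged.
    """
    total = 0.0
    for r, g in enumerate(gains, start=1):
        total += g / math.log2(r + 1)
    return total


# Illustrative gains for ranks 1..6 (made-up values).
gains = [3, 2, 3, 0, 1, 2]
print("unshifted log_2(r):   ", round(dcg_original(gains, b=2), 3))
print("shifted log_2(r + 1): ", round(dcg_shifted(gains), 3))
```

With $b=2$ the two versions already disagree at rank 2: the unshifted form divides $G_2$ by $\log_2 2 = 1$, so it is not discounted, while the shifted form divides it by $\log_2 3 \approx 1.585$.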
Where does this change come from? Why do others use a different metric?