R: Rates of change from an initial value
I have a collection of csvs and must produce yearly rates of change per group within each csv, as well as a rate of change compared to the initial value.
I am using the function below to calculate yearly rates of change, and it works fine through my loop.
func - function(x, n=1) {
c(rep(NA, n), diff(x, n) / head(x, -1*n)*100)
}
df$RateOfChange - ave(df$Data, factor(df$Group), FUN=func)
Year Group Data RateOfChange
2010 1a 3.5 NA
2011 1a 4 14.29
2012 1a 3.8 -5.00
2013 1a 4.1 7.89
2014 1a 4 -2.44
2010 1b 2 NA
2011 1b 2.3 15.00
2012 1b 2.4 4.35
2013 1b 3 25.00
2014 1b 1.8 -40.00
2009 2 10 NA
2010 2 12 20.00
2011 2 13 8.33
2012 2 15 15.38
2013 2 14 -6.67
2014 2 11 -21.43
However, I also would like to calculate the percentage change in each year compared to the initial value, like so:
Year Group Data BaselineChange
2010 1a 3.5 NA
2011 1a 4 14.29
2012 1a 3.8 8.57
2013 1a 4.1 17.14
2014 1a 4 14.29
2010 1b 2 NA
2011 1b 2.3 15.00
2012 1b 2.4 20.00
2013 1b 3 50.00
2014 1b 1.8 -10.00
2009 2 10 NA
2010 2 12 20.00
2011 2 13 30.00
2012 2 15 50.00
2013 2 14 40.00
2014 2 11 10.00
Surely there is a way to do this. It'd be easy enough in Excel for a single csv, but I have several hundred, each with multiple groups. I have tried dplyr::diff
and dplyr::lag
, but I can't get the pipeline to abide by the group variables in the way that other questions posted on this site do.
Category Data Science