Fill missing values (NA) in several columns, independently of each other, using the imputeTS package (in particular, the na_kalman function)

A friend of mine has recently started working in RStudio and wants to fill the NA values in different columns using the above-mentioned function. Since he intends to run a time series analysis on every column, what would be the correct approach?



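If replacing each NA with the mean of its column is good enough, base R can do it without extra packages (for example, with colMeans). Note that this is plain mean imputation, not the Kalman-based imputation of na_kalman. Let's say you have a data frame df whose columns are all numeric.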
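To try the snippets that follow, a small stand-in for df can be built by hand. The column names and values here are made up purely for illustration.

```r
# Hypothetical toy data: two numeric columns with a few missing values.
df <- data.frame(
  a = c(1, 2, NA, 4, 5),
  b = c(10, NA, 30, NA, 50)
)

# Count the NAs per column before imputing.
colSums(is.na(df))
```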

1) If you want to replace the NAs per column one by one, you could try this:

df[] <- lapply(df, function(x) ifelse(is.na(x), mean(x, na.rm = TRUE), x))  # df[] keeps df a data frame; sapply would return a matrix

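2) If you want to replace all NAs in one go (this requires every column to be numeric, because colMeans is used), you could try this:

df[is.na(df)] <- rep(colMeans(df, na.rm = TRUE), each = nrow(df))[is.na(df)]  # column means expanded to the shape of df, then picked at the NA positions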
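The snippets above do mean imputation. For the na_kalman function named in the question, a minimal sketch could look like the following. It assumes the imputeTS package is installed, that every column is numeric, and that each column has at least three non-missing values. The data frame, its column names, and the name df_imputed are hypothetical.

```r
# Sketch only: assumes imputeTS is installed (install.packages("imputeTS"))
# and that every column of the data frame is numeric.
library(imputeTS)

# Hypothetical example data: two unrelated numeric series with gaps.
df <- data.frame(
  temp  = c(12.1, 12.4, NA, 13.0, 13.6, 14.1, NA, NA, 15.2, 15.8, 16.1, 16.9),
  sales = c(200, 212, 225, NA, 251, 260, 274, 281, NA, 305, 318, 330)
)

# Impute each column on its own, so no column influences another.
# Assigning into df_imputed[] keeps the data frame structure.
df_imputed <- df
df_imputed[] <- lapply(df, na_kalman)

# Check whether any NA is left after imputation.
anyNA(df_imputed)
```

Because lapply handles one column at a time, each series is modelled independently, which matches the requirement in the question. If a column has a known seasonal period, it may help to convert it to a ts object with a suitable frequency before imputing, so that the default structural model can take the seasonality into account. Whether that helps depends on the data.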
