Calculation of relative standard deviation with a custom function in R

I have measured concentrations of elements in a number of samples. Each concentration is an average of three measurements. Also the standard deviation of these measurements is recorded. I tried to calculate the relative standard deviation with a custom function in R but something is wrong here...

library(tidyverse)

data <- tribble(
  ~Sample, ~Cu_conc, ~Fe_conc, ~K_conc, ~Mg_conc, ~Mn_conc, ~Cu_std, ~Fe_std, ~K_std, ~Mg_std, ~Mn_std,
  "A", 104.126, 0.729185, 283.741, 21.348, 440.639, 0.783757, 0.00637, 1.544, 0.100056, 4.586,
  "B", 32.409, 0.782756, 451.802, 43.196, 727.316, 0.774423, 0.014793, 0.473762, 0.007142, 0.231931,
  "C", 73.447, 1.959, 243.566, 15.113, 201.526, 0.856306, 0.082993, 1.428, 0.175292, 2.529,
  "D", 125.114, 1.5, 273.146, 34.369, 328.96, 1.429, 0.010748, 0.109602, 0.112713, 2.553,
  "E", 212.173, 3.045, 163.773, 24.257, 1448.46, 1.302, 0.015061, 0.729027, 0.153371, 6.866,
  "F", 185.085, 2.776, 176.943, 24.902, 1254.8, 0.0915, 0.062706, 0.252296, 0.009758, 2.233,
  "G", 40.643, 1.192, 87.437, 10.387, 299.003, 0.419244, 0.004575, 0.349594, 0.035921, 0.28355,
  "H", 38.938, 1.014, 84.263, 10.651, 150.795, 0.210011, 0.005417, 0.540937, 0.003111, 2.863,
  "I", 35.066, 0.6763, 153.529, 11.861, 405.314, 0.706683, 0.011766, 0.110662, 0.059892, 3.1,
  "J", 54.571, 1.152, 91.632, 15.625, 213.258, 0.161998, 0.001985, 0.490803, 0.003677, 0.297692
)

# analysed elements
elements <- c("Cu", "Fe", "K", "Mg", "Mn")

# function: relative standard deviation
relstdev <- function(conc, stdev) {
  ifelse(is.na(conc) | is.na(stdev) | conc == 0, NA,
         ifelse(stdev >= conc, 100,
                abs(100 * stdev / conc)
         )
  )
}

# calculation
for (e in elements) {
  data <- data %>% mutate("{e}_rsd" := relstdev("{e}_conc", "{e}_stdev"))
}

When the program is run I get new columns for each element with rsd but the values are all 100: not what I expected!

data %>% select(ends_with("_rsd"))

# A tibble: 10 x 5
   Cu_rsd Fe_rsd K_rsd Mg_rsd Mn_rsd
    <dbl>  <dbl> <dbl>  <dbl>  <dbl>
 1    100    100   100    100    100
 2    100    100   100    100    100
 3    100    100   100    100    100
 4    100    100   100    100    100
 5    100    100   100    100    100
 6    100    100   100    100    100
 7    100    100   100    100    100
 8    100    100   100    100    100
 9    100    100   100    100    100
10    100    100   100    100    100

This implies that the condition stdev >= conc in the relstdev function must have been true for every row. This is not the case:

data$Cu_std >= data$Cu_conc

[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Can someone help me please?



I find these sorts of operations much easier to do if the input data is in a long rather than a wide format; this has the additional advantage of being faster on larger datasets. I first transform the dataset so that each row is one sample for one specific element, and the columns hold the concentration and standard deviation for that combination. I then use the same logic with dplyr::mutate() to create the new rsd column.

data %>%
  gather(type, value, -Sample) %>%
  separate(type, c("element", "type"), sep="_") %>%
  spread(type, value) %>%
  mutate(
    rsd = ifelse(
      is.na(conc) | is.na(std) | conc == 0,
      NA,
      ifelse(
        std >= conc,
        100,
        abs(100 * std / conc)
      )
    )
  )

# A tibble: 50 x 5
   Sample element    conc     std    rsd
   <chr>  <chr>     <dbl>   <dbl>  <dbl>
 1 A      Cu      104.    0.784   0.753 
 2 A      Fe        0.729 0.00637 0.874 
 3 A      K       284.    1.54    0.544 
 4 A      Mg       21.3   0.100   0.469 
 5 A      Mn      441.    4.59    1.04  
 6 B      Cu       32.4   0.774   2.39  
 7 B      Fe        0.783 0.0148  1.89  
 8 B      K       452.    0.474   0.105 
 9 B      Mg       43.2   0.00714 0.0165
10 B      Mn      727.    0.232   0.0319
# … with 40 more rows
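
As for why the original loop returns 100 everywhere: the glue-style "{e}_..." syntax is only interpolated on the left-hand side of :=, so relstdev() most likely received the literal strings "{e}_conc" and "{e}_stdev" rather than columns. Comparing those two strings is alphabetical and comes out TRUE, and the single 100 is then recycled down the column. Note also that the columns are named *_std, not *_stdev. If you would rather keep the wide format, a minimal sketch of one possible fix is below. It uses only two samples and two elements from the question to stay short, and the helper names conc_col, std_col, rsd_col and string_test are my own.

```r
library(dplyr)
library(tibble)

# Two samples and two elements from the question's data, to keep the sketch short
data <- tribble(
  ~Sample, ~Cu_conc, ~Fe_conc, ~Cu_std, ~Fe_std,
  "A", 104.126, 0.729185, 0.783757, 0.00637,
  "B", 32.409, 0.782756, 0.774423, 0.014793
)

elements <- c("Cu", "Fe")

# Same function as in the question
relstdev <- function(conc, stdev) {
  ifelse(is.na(conc) | is.na(stdev) | conc == 0, NA,
         ifelse(stdev >= conc, 100, abs(100 * stdev / conc)))
}

# Diagnosis: this compares two strings, not two columns
string_test <- "{e}_stdev" >= "{e}_conc"

# One possible fix: build the column names as strings, then look the
# columns up with the .data pronoun
for (e in elements) {
  conc_col <- paste0(e, "_conc")
  std_col  <- paste0(e, "_std")   # the data uses _std, not _stdev
  rsd_col  <- paste0(e, "_rsd")
  data <- data %>%
    mutate(!!rsd_col := relstdev(.data[[conc_col]], .data[[std_col]]))
}

data %>% select(ends_with("_rsd"))
```

With the columns looked up through .data[[...]], the comparison inside relstdev() runs on the numeric values row by row, so the rsd columns should match the values in the long-format output above.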

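On newer versions of tidyr (1.0.0 or later), gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshaping can probably be written in one step with the special ".value" name, which splits each column name at the underscore and keeps the suffix (conc, std) as its own column. The sketch below again uses a cut-down version of the question's data, and the object names long and wide are my own. The last step, which goes back to the wide layout, is optional.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Two samples and two elements from the question's data, to keep the sketch short
data <- tribble(
  ~Sample, ~Cu_conc, ~Fe_conc, ~Cu_std, ~Fe_std,
  "A", 104.126, 0.729185, 0.783757, 0.00637,
  "B", 32.409, 0.782756, 0.774423, 0.014793
)

# "Cu_conc" is split at "_": the first part goes into `element`,
# the second part (conc / std) becomes the name of a value column
long <- data %>%
  pivot_longer(-Sample,
               names_to = c("element", ".value"),
               names_sep = "_") %>%
  mutate(
    rsd = ifelse(is.na(conc) | is.na(std) | conc == 0, NA,
                 ifelse(std >= conc, 100, abs(100 * std / conc)))
  )

# Optional: return to one row per sample with Cu_conc, Cu_std, Cu_rsd, ...
wide <- long %>%
  pivot_wider(names_from = element,
              values_from = c(conc, std, rsd),
              names_glue = "{element}_{.value}")
```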