Alberto Martín Izquierdo February 2016

element in R dataframe good practice

Accessing the ith element in the column bar in the dataframe foo in R can be done in two different ways:

foo[i,"bar"]

and

foo$bar[i].

Is there any difference between them? If so, which one should be used in terms of efficiency, readability, etc.?

Apologies if this has already been asked, but the [] and $ characters are very hard to search for.

Answers


Alex February 2016

I tend to think this is an opinion-based question, and therefore inappropriate for SO. But since you ask about speed considerations, I won't flag it as such. Note: there are more than the two methods you describe for indexing:

data(mtcars)
library(microbenchmark)

# Time three ways of extracting the 12th element of the disp column
microbenchmark(opt_a = mtcars$disp[12],
               opt_b = mtcars[12, "disp"],
               opt_c = mtcars[["disp"]][12])

Unit: microseconds
  expr   min      lq     mean  median     uq     max neval cld
 opt_a 5.322  6.4620  8.34029  6.8425  7.603  56.640   100  a 
 opt_b 9.503 10.0735 15.41463 10.6435 11.024 354.285   100   b
 opt_c 4.181  4.9420  7.77386  5.3220  6.082  84.009   100  a 

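Going by the medians, foo$bar[i] (about 6.8 microseconds) is noticeably faster than foo[i, "bar"] (about 10.6 microseconds), but it is not the fastest alternative: foo[["bar"]][i] comes in at about 5.3 microseconds. The differences amount to a few microseconds per call.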
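
Speed aside, the question also asks whether the forms differ at all. The following is a rough sketch using the same mtcars data, not part of the timings above. The variable names (i, val_a and so on) are only illustrative, and it assumes a plain base R data.frame, since other classes such as tibbles may behave differently. It checks that the forms return the same value and shows one behavioural difference: $ accepts a unique prefix of a column name, while the bracket forms require the full name.

```r
data(mtcars)

i <- 12  # same row index as in the benchmark

# The three forms from the benchmark, plus the two-index [[ form
val_a <- mtcars$disp[i]
val_b <- mtcars[i, "disp"]
val_c <- mtcars[["disp"]][i]
val_d <- mtcars[[i, "disp"]]

# All four should extract the same single number
same_value <- identical(val_a, val_b) &&
  identical(val_a, val_c) &&
  identical(val_a, val_d)

# `$` partially matches column names: "dis" is a unique prefix of "disp"
partial <- mtcars$dis[i]

# `[[` matches exactly by default, so an abbreviated name yields NULL
exact <- mtcars[["dis"]]

# `[` with an unknown column name signals an error instead
err <- tryCatch(mtcars[i, "dis"], error = function(e) conditionMessage(e))
```

Because of the partial matching, a typo in a column name can go unnoticed with $, whereas the bracket forms either return NULL or stop with an error. That may be worth weighing against the small speed differences when choosing a style.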

Post Status

Asked in February 2016
Viewed 1,811 times
Voted 8
Answered 1 times
