KillerSnail February 2016

grouping and summing up dummy vars from caret R

I have data like this

dataset = data.frame(id = c(1,2,1,4,5,6), class = c('a', 'a', 'b', 'a', 'b', 'b') )

I want to convert it into dummy variables, but caret's dummyVars doesn't collapse by id: it returns the same number of rows as the input. How do I group it so that id 1 has both the a and b variables set to 1?

dummies <- caret::dummyVars(id ~ . , data=dataset)
predict(dummies, newdata = dataset)

Answers


Pekka February 2016

In this case, use the dcast function from data.table:

library(data.table)

setDT(dataset)

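dataset[, dummy := 1]    # helper column to aggregate over
d2 = dcast(dataset, id ~ class, value.var = 'dummy', fun.aggregate = length)    # count rows per id and class
d2[is.na(d2)] = 0    # safeguard: replace any missing cells with 0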

Note that this solution returns the number of a's and b's found for each id. If you need only 1 or 0, change fun.aggregate to, for example:

fun.aggregate = function(x) as.integer(length(x) >0)
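
For reference, here is a minimal end-to-end sketch of the 0/1 variant, assuming the example data from the question. The explicit fill = 0L is an addition of mine so that id/class combinations with no rows are clearly set to 0; it is not part of the original answer.

```r
library(data.table)

# Example data from the question
dataset <- data.table(id = c(1, 2, 1, 4, 5, 6),
                      class = c("a", "a", "b", "a", "b", "b"))

# Helper column to aggregate over, as in the answer above
dataset[, dummy := 1L]

# One row per id, one column per class; a cell is 1 if the id has at
# least one row of that class, otherwise 0
d2 <- dcast(dataset, id ~ class, value.var = "dummy",
            fun.aggregate = function(x) as.integer(length(x) > 0),
            fill = 0L)

print(d2)
```

With this data, id 1 should end up with both a and b set to 1, and every other id with exactly one of them.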

dummyVars works row-wise, so the value in id does not affect its output; that is why it returns one row per input row.
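
If you would rather keep a row-wise encoding and collapse it afterwards, something along these lines might work. This is only a sketch: it uses base R's model.matrix as a stand-in for the row-wise output of dummyVars (an assumption made to keep the example dependency-free), then takes the maximum of each indicator per id.

```r
# Example data from the question
dataset <- data.frame(id = c(1, 2, 1, 4, 5, 6),
                      class = c("a", "a", "b", "a", "b", "b"))

# Row-wise indicators: one row per input row, one column per class
# (model.matrix is used here as a stand-in for dummyVars + predict)
rowwise <- model.matrix(~ class - 1, data = dataset)

# Collapse to one row per id; max() gives 1 if any row of that id
# had the class, otherwise 0
collapsed <- aggregate(as.data.frame(rowwise),
                       by = list(id = dataset$id),
                       FUN = max)

print(collapsed)
```

Replacing max with sum would give per-class counts instead of 0/1 flags.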

Post Status

Asked in February 2016
Viewed 3,562 times
Voted 5
Answered 1 times
