Sandy2511 February 2016

Count of comma-separated values in R

I have a column named subcat_id in which the values are stored as comma-separated lists. I need to count the number of values in each list and store the counts in a new column. The lists also contain Null values, which I want to get rid of.

Example

I would like to store the counts in the n column.

Answers


akrun February 2016

We can try

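 # Remove 'Null', then doubled/leading/trailing commas; the count is the
 # number of remaining commas plus one.
 nchar(gsub('[^,]+', '', gsub(',(?=,)|(^,|,$)', '', 
      gsub('(Null){1,}', '', df1$subcat_id), perl=TRUE)))+1L
 #[1] 6 4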

Or

library(stringr)
str_count(df1$subcat_id, '[0-9.]+')
#[1] 6 4

data

 df1 <- data.frame(subcat_id = c('1,2,3,15,16,78', 
        '1,2,3,15,Null,Null'), stringsAsFactors=FALSE)
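
If stringr is not available, the same digit-run count can be sketched in base R and written straight into the n column the question asks for. This is only an illustration of the idea above; the helper name count_ids is hypothetical and not part of the original answer.

```r
# Sample data from the answer above.
df1 <- data.frame(subcat_id = c('1,2,3,15,16,78',
                                '1,2,3,15,Null,Null'),
                  stringsAsFactors = FALSE)

# Hypothetical helper: count the runs of digits (or dots) in each string.
count_ids <- function(x) {
  matches <- regmatches(x, gregexpr('[0-9.]+', x))
  lengths(matches)
}

# Store the counts in a new column n.
df1$n <- count_ids(df1$subcat_id)
df1$n
#[1] 6 4
```

Like str_count, this counts numeric tokens rather than commas, so Null entries are simply never matched.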


Matthew February 2016

You can do

sapply(strsplit(subcat_id,","),FUN=function(x){length(x[x!="Null"])})

strsplit(subcat_id,",") returns a list with one element per entry of subcat_id, each holding that entry's values split on commas. sapply then applies the given function to every element of that list and returns a vector of the results.

The function itself keeps only the entries that are not the string "Null" and returns the length of what remains.

For example, if we have

subcat_id <- c("1,2,3","23,Null,4")

Then running the above code returns c(3,2) (the Null in the second entry is not counted), which you can assign to your column.
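
To see what each stage does, the one-liner can be unpacked into separate steps. This is a sketch on the same example vector; the intermediate names parts, kept and n are only for illustration. The second entry contains one Null, so only two of its three pieces are counted.

```r
subcat_id <- c("1,2,3", "23,Null,4")

# Step 1: split each string on commas; the result is a list of character vectors.
parts <- strsplit(subcat_id, ",")
# parts[[2]] is c("23", "Null", "4")

# Step 2: drop the "Null" placeholders from each list element.
kept <- lapply(parts, function(x) x[x != "Null"])

# Step 3: count what is left in each element.
n <- lengths(kept)
n
#[1] 3 2
```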


If the column comes from a data frame, it may have been read in as a factor, in which case strsplit throws the error "non-character argument". To fix this, force the column to a character vector with as.character, changing the command to

sapply(strsplit(as.character(frame$subcat_id),","),FUN=function(x){length(x[x!="Null"])})
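
Putting it together, here is a minimal sketch that stores the counts in a new column n. The data frame below is made up for illustration, and subcat_id is deliberately created as a factor to show why the conversion matters.

```r
# Hypothetical data frame; subcat_id is stored as a factor on purpose.
frame <- data.frame(subcat_id = factor(c("1,2,3", "23,Null,4")))

# strsplit() rejects factors, so convert to character first,
# then count the entries that are not "Null".
frame$n <- sapply(strsplit(as.character(frame$subcat_id), ","),
                  function(x) sum(x != "Null"))
frame$n
#[1] 3 2
```

Here sum(x != "Null") is used as a shorthand for length(x[x != "Null"]); both give the same count when no entry is NA.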

