Developers Planet

Sandy2511 February 2016

Count of comma-separated values in R

I have a column named subcat_id in which the values are stored as comma-separated lists. I need to count the number of values in each list and store the counts in a new column. The lists also contain Null values that I want to get rid of, so they should not be counted.


I would like to store the counts in the n column.


akrun February 2016

We can try

 nchar(gsub('[^,]+', '', gsub(',(?=,)|(^,|,$)', '', 
      gsub('(Null){1,}', '', df1$subcat_id), perl=TRUE)))+1L
 #[1] 6 4


Or, using str_count from the stringr package

 library(stringr)
 str_count(df1$subcat_id, '[0-9.]+')
 #[1] 6 4


Data used above:

 df1 <- data.frame(subcat_id = c('1,2,3,15,16,78', 
        '1,2,3,15,Null,Null'), stringsAsFactors=FALSE)

Matthew February 2016

You can do this with strsplit and sapply.


strsplit(subcat_id,",") will return a list of each item in subcat_id split on commas. sapply will apply the specified function to each item in this list and return us a vector of the results.

Finally, the function that we apply will take just the non-null entries in each list item and count the resulting sublist.
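A minimal sketch of the approach just described, assuming the input is a character vector and that missing entries appear as the literal string "Null". The helper name count_non_null is hypothetical and only used for illustration:

```r
# Hypothetical helper: count the non-"Null" entries in each comma-separated string.
count_non_null <- function(x) {
  # strsplit returns a list with one character vector per input string
  parts <- strsplit(x, ",", fixed = TRUE)
  # for each list item, keep the entries that are not "Null" and count them
  sapply(parts, function(p) length(p[p != "Null"]))
}

count_non_null(c("1,2,3,15,16,78", "1,2,3,15,Null,Null"))
#[1] 6 4
```

The result is an integer vector with one count per input string, so it can be assigned directly to a new column.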

For example, if we have

subcat_id <- c("1,2,3","23,Null,4")

Then running the above code returns c(3,2), since the Null entry in the second string is not counted. You can assign this result to your column.

If running this from a dataframe, it is possible that the character column has been interpreted as a factor, in which case the error non-character argument will be thrown. To fix this, we need to force interpretation as a character vector with the as.character function, changing the command to
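A minimal sketch of that change, assuming a hypothetical data frame df2 whose subcat_id column was read in as a factor, with the counts stored in the n column from the question:

```r
# Hypothetical example data; stringsAsFactors = TRUE forces a factor column
df2 <- data.frame(subcat_id = c("1,2,3", "23,Null,4"),
                  stringsAsFactors = TRUE)

# as.character converts the factor back to strings before splitting
df2$n <- sapply(strsplit(as.character(df2$subcat_id), ","),
                function(p) length(p[p != "Null"]))

df2$n
#[1] 3 2
```

Since R 4.0.0, data.frame() defaults to stringsAsFactors = FALSE, so the conversion is mostly needed with older R versions or with data that was explicitly read in as factors. Calling as.character on a column that is already character is harmless.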


Post Status

Asked in February 2016
Viewed 2,476 times
Voted 9
Answered 2 times

