I need to efficiently parse one column of my data frame (a url string)
by calling a function (strsplit) on it, e.g.:
url <- c("www.google.com/nir1/nir2/nir3/index.asp")
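For illustration, here is what the split looks like on that example. This is only a sketch and assumes "/" is the separator, which the example url suggests but which is not stated elsewhere:

```r
url <- c("www.google.com/nir1/nir2/nir3/index.asp")

# strsplit() returns a list with one character vector per input string;
# [[1]] extracts the pieces of the first (and only) url.
# fixed = TRUE treats "/" as a literal string rather than a regex.
parts <- strsplit(url, "/", fixed = TRUE)[[1]]

# parts is now c("www.google.com", "nir1", "nir2", "nir3", "index.asp")
```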
My data frame, spark.data.url.clean, includes a url column and a classes column.
This data frame has 100k rows, and I don't want to loop/iterate over it, parse each url separately, and write the results to a new data frame.
What I do want is to create a new 6-column data frame:
df.result <- data.frame(fullurl = character(), baseurl = character(), firstlevel = character(), secondlevel = character(), thirdlevel = character(), classification = character())
then call one of the "apply" family functions over spark.data.url.clean$url and write the results to the new data frame df.result, such that the first column (fullurl) is populated with the relevant spark.data.url.clean$url, the 2nd to 5th columns are populated with the results of applying strsplit (taking only the first, 2nd, 3rd and 4th elements of the resulting vector and putting them in the 2nd, 3rd, 4th and 5th columns of df.result), and finally spark.data.url.clean$classes goes into the last column of the new data frame.
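To make the intended result concrete, the steps described above could be sketched roughly as follows. This is not a definitive solution: the two-row spark.data.url.clean and its classes values are hypothetical stand-ins for the real 100k-row data, "/" is assumed to be the separator, and the last column is spelled classification here.

```r
# Hypothetical stand-in for spark.data.url.clean (the real one has ~100k rows)
spark.data.url.clean <- data.frame(
  url     = c("www.google.com/nir1/nir2/nir3/index.asp",
              "www.example.com/a"),
  classes = c("search", "other"),   # made-up class labels
  stringsAsFactors = FALSE
)

# Split every url in one vectorized call; the result is a list of
# character vectors, one per row
parts <- strsplit(spark.data.url.clean$url, "/", fixed = TRUE)

# Keep the first four pieces of each url. Indexing past the end of a
# vector yields NA, so short urls are padded with NA. vapply() returns
# a 4 x n matrix, which t() turns into n x 4.
first4 <- t(vapply(parts, function(p) p[1:4], character(4)))

# Assemble the 6-column result without any explicit loop
df.result <- data.frame(
  fullurl        = spark.data.url.clean$url,
  baseurl        = first4[, 1],
  firstlevel     = first4[, 2],
  secondlevel    = first4[, 3],
  thirdlevel     = first4[, 4],
  classification = spark.data.url.clean$classes,
  stringsAsFactors = FALSE
)
```

In this sketch, urls with fewer than four pieces get NA in the missing level columns, and pieces beyond the fourth (such as index.asp) are dropped.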
Sorry for the complexity, and let me know if anything needs further clarification.