knb February 2016

How to import a CSV file with a multiline /* C-style block-comment header */ into R

I am looking for a clever way to import a CSV file with a specific header format into R. The header is a multiline C-style block comment, followed by a one-line column-name header, followed by the data. It looks like this:

/* DATA DESCRIPTION:
key1: value1
key2: values
[...variable number of key-value pairs, may be nested...]
License:    Creative Commons Attribution 3.0 Unported (CC-BY)
Size:   174 data points
*/
Date/time start Date/time end   
2008-06-01T00:00:00 2008-06-30T23:30:00 
2008-07-01T00:00:00 2008-07-31T23:30:00

For a one-off task it's fine to count the header lines by hand (here n = 47) and skip them:

filelist <- read.delim(infile, skip = 47, stringsAsFactors = FALSE, header = TRUE)

...but I am looking for a more generic way to read this in.

(I don't think this is a duplicate question. The closest answer I have found here is this one from 2010.)

Answers


pluke February 2016

Try this. For a file called test.csv:

/*comment


*/
var,cond,value
data,data,data
data,data,data
data,data,data
data,data,data

Code:

path <- paste(folder, "test.csv", sep = "")
con <- file(path, open = "r")
lines <- readLines(con)
close(con) # release the connection once the lines are read
start <- match("*/", lines) # row index of the line that is exactly "*/" (the closing comment)
results <- read.csv(path, header = TRUE, sep = ",", skip = start)

Returns:

   var cond value
1 data data  data
2 data data  data
3 data data  data
4 data data  data
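
The same idea can be wrapped in a small function. This is only a minimal sketch, assuming the closing */ sits at the end of its own line (possibly followed by whitespace) and that the data is tab-separated, as in the question. The name read_after_block_comment is made up for illustration and does not come from any package.

```r
# Hypothetical helper: read a delimited file whose data follows a /* ... */ header.
read_after_block_comment <- function(path, sep = "\t", ...) {
  lines <- readLines(path, warn = FALSE)
  # Index of the first line that ends with "*/", ignoring trailing whitespace.
  end <- grep("\\*/[[:space:]]*$", lines)[1]
  # No closing marker found: assume there is no comment header to skip.
  if (is.na(end)) end <- 0L
  read.table(path, header = TRUE, sep = sep, skip = end,
             comment.char = "",          # do not treat '#' in the data as a comment
             check.names = FALSE,        # keep names such as "Date/time start" unchanged
             stringsAsFactors = FALSE, ...)
}
```

For the comma-separated test.csv above, the call would be read_after_block_comment(paste(folder, "test.csv", sep = ""), sep = ",").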
