user2884505 February 2016

Excel - Query for CSV missing columns

So I'm using Office 2016's new Query feature to get data from various CSV API endpoints. I managed to get it to work with Yahoo Finance's but I can't seem to get Morningstar's to work. When I visit this link in my browser, I download a full CSV with all the proper data:

Link: http://financials.morningstar.com/ajax/ReportProcess4CSV.html?t=MSFT&reportType=is&period=12&dataType=A&order=asc&columnYear=10&number=3

However, when I use the query feature, it only loads the first row. I'm guessing it's because only the first column in the first row in the CSV has data, causing it to ignore the rest of the columns. Does anyone know how to ignore that first row and begin importing the CSV from the second?

let
    Source = Csv.Document(Web.Contents("http://financials.morningstar.com/ajax/ReportProcess4CSV.html?t=MSFT&reportType=is&period=12&dataType=A&order=asc&columnYear=10&number=3"),[Delimiter=",",Encoding=1252])
in
    Source

Thanks in advance!

Edit: In Csv.Document, it looks like there's an extraValues parameter; maybe that might help?

Answers


Paul Ogilvie February 2016

see https://msdn.microsoft.com/en-us/library/mt260840.aspx?f=255&MSPPError=-2147217396:

let

Source = Csv.Document(File.Contents("C:\Projects\Examples\SalesForceContacts.txt"),
    [Delimiter=",",Encoding=1252]),
    #"First Row as Header" = Table.PromoteHeaders(Source),
    ...

But if the header row is corrupt, this may not work.


Alejandro Lopez-Lago - MSFT February 2016

The extra values parameter can retrieve your extra rows (using ExtraValues.List), but the shape of the table will be difficult to work with.

Here's one way to get the values:

let Source = Table.FromColumns({Lines.FromBinary((Web.Contents("http://financials.morningstar.com/ajax/ReportProcess4CSV.html?t=MSFT&reportType=is&period=12&dataType=A&order=asc&columnYear=10&number=3")),null,null,1252)}), #"Removed Top Rows" = Table.Skip(Source,1), #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Top Rows","Column1",Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),{"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5", "Column1.6", "Column1.7"}), #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}, {"Column1.3", type text}, {"Column1.4", type text}, {"Column1.5", type text}, {"Column1.6", type text}, {"Column1.7", type text}}), #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type") in #"Promoted Headers"

This query treats the CSV file as a text document of lines. It removes the first line, which is some header statement that causes the parser to think that the CSV file has only one column. It then splits the lines into columns based on comma placement and then makes the first row the column names.

Post Status

Asked in February 2016
Viewed 3,643 times
Voted 5
Answered 2 times

Search




Leave an answer