Importing Google Spreadsheets into Windows Azure Data Explorer

Hi! This blog will take you through the steps of importing a Google spreadsheet into Microsoft Azure Data Explorer. You could then play with this data by using data from the Windows Data Market. I think I love the Data Explorer so much because it allows a nice, easy format for mixing up data from different sources.  This activity takes the form of two steps: ensuring that the Google spreadsheet is published, and then importing it into Data Explorer.

The Google spreadsheet came from the fantastic Guardian Datablog, and it focuses on the New Years Honours list for 2012. In case you’re not British or from the Commonwealth, and wondering what I’m blathering on about, the New Years Honours List is a quaint British tradition which recognises outstanding achievement to people who serve their communities.  The original Guardian commentary can be found here.

The Google spreadsheet obviously belongs to them, so I needed to take a copy of it, and publish it to my own Google account. To do this, you click on ‘File’ and then ‘Publish to the web’ It is very straightforward to do this, but if you need an image, click here.

You then need to make sure that you publish the spreadsheet as a CSV format. This is quicker and easier for importing. You can see an example of this below, or if you need the original image, you can find it here:

ii Google Publish to the web

The other item to note is that you should just select one sheet, and not ‘All Sheets’. In doing so, you’re making the data easier to import. Here, the sheet is called ‘Full List’.

Once you’ve selected the sheet, you should copy the link that appears in the box. You’ll need this in order to import the data. I didn’t import this file as ‘web content’ – instead, I did ‘Import File’ and then copied the link into the ‘Open’ Dialog box. This imported the file as text. You then get the following options:

a Open as CSV

Upon importing the data as a table, you might find that you get the following error message:

The CSV input has rows with varying numbers of columns, and the first row does not have enough columns to specify the input width for all rows. Specify a value for the ColumnCount option to prescribe the number of columns to include in the output. Here is an example of the image here:

b Open as CSV error

If this is the case, then it is perhaps easier to import the data as text in order to examine more clearly to see if there are issues. To do this, just open the ‘Text’ box at the top left hand side, or right-click and select ‘Open as Text’. Here is an example:

c open as Text

Once you’ve converted it to text, the Data Explorer screen will appear as follows. You can click on the image or click here for a larger image:

d result when opened to text

Now, it is a bit clearer to see the status of the data since we can see the text. The next step is to try and re-import it as a table.  You can do this by clicking ‘Table’ at the top left hand side. You’ll now get the following screen:

e Comma delimiter settings and skip first line

And another thing…

If you’re sharp eyed, you’ll have noticed an anomaly on line 10. Why does Herbert Douglas have the title ‘Charity’? Shouldn’t this be in ‘For Services to’ column? If you look back at the first picture of this blog, you’ll see that the same thing happens in the Google spreadsheet; since it matches the source, then this isn’t an issue with the data import, but the raw data. You can see a snapshot of the raw data below and if you need a larger image, click here.

i Google publish to the web

Don’t be fooled by the nice clean appearance of the Data Explorer – there are lots of customisation and nice things that can be done with the Data Explorer, so let’s look forward to more fun in future blogs!

Don’t you just love data? I do! I hope that this helps.

Leave a Reply