Obtaining Sample Data Sources

One of my readers, Misbah, wrote recently to ask about obtaining sample data sets. I’ve done some research on this – Misbah, this post is dedicated to you and I wish you all the best in your studies. Below, I discuss some of the problems in obtaining data sets, along with some resolutions.

There are different issues in obtaining data, which are summarised here:

Confidentiality – in my career, I’ve come across some amazing data sets. However, out of respect for the confidentiality and sensitivity of the data, I never share it.
Accuracy – it can be difficult to obtain data that has been rigorously collected.
Data types – psychological data can be more difficult to obtain than true ratio data. For example, a question such as ‘on a scale of 1 to 5, how happy are you today?’ cannot produce true ratio data; the answers are simply more of a label.

How is it possible to get some reliable, free data sets that are easy to use and free from confidentiality constraints? Well, here are some resources which I like to use for sample sets:

The Guardian Datastore – this has plenty of sets of sample data on everything from security, war and MPs’ expenses to fun things such as chocolate sales. Some of the sample Tableau images on this blog have used data from this source.

The London Datastore – this has plenty of London-focused data sets.

Good old Excel also has the RAND and RANDBETWEEN functions. These are volatile functions, so they recalculate whenever the worksheet recalculates, and they can be used to produce a random set of data in a spreadsheet.
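If you would rather generate the random data outside Excel, the same idea can be sketched in Python. This is only a minimal illustration, not a recommended method: the function names, the column names and the 1 to 100 range are all made up for the example. Python's random.random() plays the role of RAND (a decimal from 0 up to, but not including, 1) and random.randint() plays the role of RANDBETWEEN (a whole number, inclusive at both ends).

```python
import csv
import io
import random


def make_sample_rows(n_rows, seed=None):
    """Return n_rows of made-up data as (id, fraction, units) tuples.

    The column meanings are hypothetical; they exist only for illustration.
    """
    # A fixed seed gives repeatable data, unlike Excel's volatile functions.
    rng = random.Random(seed)
    rows = []
    for row_id in range(1, n_rows + 1):
        fraction = rng.random()      # comparable to RAND(): 0 <= value < 1
        units = rng.randint(1, 100)  # comparable to RANDBETWEEN(1, 100)
        rows.append((row_id, fraction, units))
    return rows


def rows_to_csv(rows):
    """Format the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["id", "fraction", "units"])
    writer.writerows(rows)
    return buffer.getvalue()


sample_rows = make_sample_rows(5, seed=42)
print(rows_to_csv(sample_rows))
```

The CSV text can be saved to a file and opened in Excel or Tableau. Passing a seed means the same rows come back on every run, which is handy if you want a visualisation to stay stable while you work on it.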

Another place to look for data is the Tableau Public website. Unlike the Google Data Explorer, Tableau allows you to download the data as well.

A final place to look is Swivel, which describes itself as a YouTube for data.

I hope that this helps you to get some sample data for the visualisations.

