What’s wrong with CRISP-DM, and is there an alternative?

Many people, including myself, have discussed CRISP-DM in detail. However, I didn't feel totally comfortable with it, for a number of reasons which I list below. Having raised a problem, I needed to find a solution, and that's where the Microsoft Team Data Science Process comes in. Read on for more detail! What …

Note to Self: A roundup of the latest Azure blog posts and whitepapers on PolyBase, network security, cloud services, Hadoop and Virtual Machines

Here is a roundup of Azure blogs and whitepapers which I will be reading this month. This is the latest as at June 2014, and there is a focus on cloud security in the latest whitepapers, which you can find below.
· PolyBase in APS - Yet another SQL over Hadoop solution?
· Desktop virtualization deployment …

Data Visualisation with Hadoop, Hive, Power BI and Excel 2013 – Slides from SQLPass Summit and SQLSaturday Bulgaria

I presented this session at SQLPass Summit 2013 and at SQLSaturday Bulgaria. The session covers some data visualisation theory and an overview of Big Data, and finishes with the Microsoft distribution of Hadoop. I will try to record the demo as part of a PASS Business Intelligence Virtual Chapter online webinar at some point, so please watch …

Hadoop Summit Europe 2014 Call for Abstracts is now open

If you are interested in registering, please click here. Good luck! The Call for Abstracts for the EMEA Hadoop Summit is now officially open. For your reference, the closing date is 31st October 2013. Who should submit? If you are a developer, architect, administrator, data analyst, data scientist, IT or …

Eating the elephant, one bite at a time: Loading data using Hive

In the previous 'Eating the elephant' blogs, we've talked about tables and their implementation. Now, we will look at one of the ways to get data into a table. There are different ways to do this, but here we will look only at getting data into an external Hive table using HiveQL, which is the …

Eating the elephant, one bite at a time: Partitioning in SQL Server 2012 and in Hive

Hive and SQL Server both let you partition your tables, but their features differ slightly. This blog will highlight some of the main differences for you to look out for. What problems does partitioning solve? The problem arises due to the size of the tables. Therefore, the simplest solution is to divide the table …

Eating the elephant one bite at a time: dropping databases

In the last post, you learned how simple it is to create a database using Hive. The command is very similar to the way that we do this in SQL Server. As discussed, the underlying technology works differently, but ultimately they achieve the same end: the database is created. Now that you've created your database, how do you …

Eating the elephant one bite at a time: creating a database using Hive

Following on from the first part in my Hadoop series for the Microsoft Business Intelligence professional, we move to the next stage, where we look at the simplicity of creating a database. Before we move forward, ensure that the Hortonworks Sandbox is up and running. Our objective today is to create a new database called 'IncomeInequality'. …

Eating the elephant one bite at a time: Some tips in setting up the Hortonworks Sandbox VM

In this blog post series, we will look at how the Business Intelligence professional can learn to use Big Data as a source. This series is mainly aimed at Microsoft and Tableau professionals, but everyone is welcome. Creighton Abrams once said "When eating an elephant, take one bite at a time" and this is how …