Eating the elephant one bite at a time: dropping databases

In the last post, you learned how simple it is to create a database using Hive. The command is very similar to the way that we do this in SQL Server. As discussed, the underlying technology works differently, but ultimately they achieve the same end; database created.

Now that you’ve created your database, how do you drop them again?  Fortunately, this is very simple to know how to do, if you’re already a SQL Server  aficionado.

Here is an example:

DROP DATABASE IncomeInequality;

If you want to drop the database without  error messages if the database doesn’t exist, then you can use:

DROP DATABASE IF EXISTS IncomeInequality;
 

Dropping the database in Hive isn’t straightforward, however. In SQL Server, when you drop a database, it removes all of the database and the disk files used by the database. On the other hand, Hive will not allow you to drop a database if it contains tables. To get around this default behaviour, you have to add the CASCADE command to the command.

DROP DATABASE IF EXISTS IncomeInequality CASCADE;

Using this command will delete the tables first, and then drop the database.  Then, like SQL Server, its directory is deleted too.

Once you’ve executed the command, you should double-check that the database has gone:

SHOW DATABASES; 

You can then see the result in Hue (more on this later! Let’s keep it small bites of the elephant for now!)

ShowDatabasesResult

Here, we have dropped the database and we only have the default left.

In Hive, we will look at dropping tables in the next post. This is more complex than it first seems.
I hope that helps!
Jen


Leave a Reply