Dashboard week is a test: a test of mental fitness, tenacity, working when you’re tired, and learning how to work fast. The week is partly designed to see how far I can push the team. In the end, it’s essentially preparation for life as a consultant. The first four days have all required pretty intensive data prep. For day 5, I’ve done the prep for them.
Dashboard week day 5 is about daily temperature readings around the world going back to the 1800s. The project came to me as an email from Ken Black.
I’ve been doing a 3-year study on the topic of global warming (https://3danim8.wordpress.com/climate-change-quantified/). This series of articles potentially has value to you and the Data School students.
Ken provided all of the data on this post as a set of zip files. Because it was a massive amount of data, I knew I’d need EXASOL to store it for analysis in Tableau. I started by building a workflow in Alteryx to import a few rows of the data in order to create the table in EXASOL with the correct data types. From there, I truncated the table and ran an import statement inside of EXAPLUS to bring all of the U.S. data into the table.
TIP: If you need to import flat files into EXASOL, zip each of them first. This dramatically speeds up the process. I was able to import 127 million records in under 6 minutes!
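If you have a folder full of flat files, zipping each one by hand gets old quickly. Here’s a minimal Python sketch of that step, assuming the files are CSVs sitting in a single folder. The function name `zip_each_file` and the `*.csv` pattern are placeholders of mine, not anything EXASOL requires.

```python
import zipfile
from pathlib import Path


def zip_each_file(src_dir, pattern="*.csv"):
    """Create one .zip archive next to every flat file matching `pattern`.

    Returns the list of archive paths that were written.
    """
    archives = []
    for path in sorted(Path(src_dir).glob(pattern)):
        # e.g. temps_us.csv -> temps_us.csv.zip (one file per archive)
        archive = path.with_name(path.name + ".zip")
        with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            # Store only the file name, not the full directory path
            zf.write(path, arcname=path.name)
        archives.append(archive)
    return archives
```

Each archive holds exactly one file, so every import still maps to a single flat file.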
Next, I imported the data from all of the other countries and a few minutes later I had a table of 176 million records for the team to play with. This data now all sits on EXASOL and is accessible to anyone who has registered for a login. If you don’t have a login, you can register here. This particular table is called DAILY_TEMPS and resides in the Makeover Monday schema.
Yesterday, I had a play with it myself and wanted to build something similar to this radial diagram by Antti Lipponen. Here’s my version of the average yearly maximum temperature by state from 1900-2016. The bar length represents the average maximum temperature and the color represents the difference from the 1961-1990 average temperature.
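For anyone who wants to reproduce the two measures outside of Tableau, here’s a rough Python sketch of the calculation. It assumes each reading is a `(state, year, max_temp)` tuple, which is a made-up shape and not the actual DAILY_TEMPS column layout. It also assumes the baseline is the mean of the yearly averages from 1961 to 1990, which is only one reasonable way to define it.

```python
from collections import defaultdict
from statistics import mean


def yearly_avg_max(readings):
    """Average the daily maximum temperature per (state, year)."""
    buckets = defaultdict(list)
    for state, year, max_temp in readings:
        buckets[(state, year)].append(max_temp)
    return {key: mean(values) for key, values in buckets.items()}


def anomalies(yearly, base_start=1961, base_end=1990):
    """Difference between each yearly average and the state's baseline.

    The baseline is the mean of that state's yearly averages within
    base_start..base_end (inclusive). States with no data in the
    baseline period are skipped.
    """
    base_values = defaultdict(list)
    for (state, year), avg in yearly.items():
        if base_start <= year <= base_end:
            base_values[state].append(avg)
    baseline = {state: mean(vals) for state, vals in base_values.items()}
    return {
        (state, year): avg - baseline[state]
        for (state, year), avg in yearly.items()
        if state in baseline
    }
```

In the chart, the yearly average would drive the bar length and the anomaly would drive the color.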
The task for DS6 today is to create a visualisation with this data. That’s it!