Dashboard Week - Day 4: Californian University Attendance Data 1893-1945

by Ross Easton

Today came with an added challenge: we were tasked with doing the whole project in Tableau version 8. I went into the day not knowing exactly what would be missing from this version, as I only began using Tableau comparatively recently. By the end I was extremely grateful for all the features available to us in the modern releases, because the limitations of the software stopped me from building what I had set out to build.

The day started well as the data was easy to find and was even easier to download. So I was straight into data prep after only a few minutes – a nice change from yesterday.

The data required some simple cleansing but nothing overly complicated – again a nice change from yesterday as it allowed me to focus on what I wanted to visualise from an early point in the day.

Nice easy data prep

My first encounter with the limitations of this version of Tableau came at this point, virtually the second I opened it. I had output my data from Alteryx as a .hyper file... and sure enough Tableau was unable to connect to it. This was easily solved, however, as I simply output the data as a .tde file instead.

First hurdle defeated

I then encountered a strange date issue in Tableau. The date field I wanted to use was in a numeric format, so I changed it to a date. This field contained data from 1893 through until 1945, but when I added it to the view it only displayed data from one year.

According to Tableau this data is only for 1905…

But then when I simply left the field in a numeric format it worked, although the line chart looked exactly the same.

Somehow not a date but actually a date
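One plausible explanation (an assumption, not something confirmed against Tableau 8) is that changing a numeric field to a date treats each number as a count of days from a base date around 1900, rather than as a calendar year. The Python sketch below illustrates that arithmetic; the base date and the function names are hypothetical stand-ins, not Tableau's actual internals.

```python
from datetime import date, timedelta

# ASSUMPTION: the numeric-to-date conversion counts days from a base date
# close to 1 January 1900. The exact base date is a guess; shifting it by
# a day or two does not change the conclusion.
ASSUMED_EPOCH = date(1900, 1, 1)


def year_as_day_count(year_number: int) -> date:
    """Misreading: treat the year number as 'days since the base date'."""
    return ASSUMED_EPOCH + timedelta(days=year_number)


def year_as_calendar_year(year_number: int) -> date:
    """Intended reading: treat the number as a calendar year."""
    return date(year_number, 1, 1)


# Every year in the data, read as a day count, and the calendar years
# those dates end up in.
misread_years = {year_as_day_count(y).year for y in range(1893, 1946)}
print(misread_years)

# The intended reading keeps each year distinct.
print(year_as_calendar_year(1893), year_as_calendar_year(1945))
```

Under that assumption, 1,893 to 1,945 days is a little over five years after the base date, so every value lands in 1905. That would match what the view showed, and it would also explain why leaving the field as a plain number behaved correctly.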

Next I spent some time altering the aliases of degree subjects using the added information from the ‘information’ field. It was possible to identify many, but not all, of the subjects in this way. Anything I had doubts about changing I left in its abbreviated form. I also created some groups on the subject and campus fields at this point.

It proved to be a very interesting dataset, and with a bit of time to burn it was fun to make charts just to see what was in the data. The subject and gender splits stood out in particular, but what I decided I really wanted to visualise (which also turned out to be a huge mistake) was the spatial data: where each college was drawing its students from within California. There were some records for non-Californian students, but these had nulls in the lat and long fields and so weren’t very useful.

What I particularly wanted to do was construct a line between each student’s hometown and the college they went to and make a flow map. Knowing for sure that the MAKELINE function would not be in a version of Tableau this early, I decided to go back into Alteryx and do all of the spatial work there, to make things as easy as possible in Tableau.

Creating points for my campus locations

I created points for all my campus group locations and for all my hometown locations, and then joined them using my grouped campus fields. I then used a formula (below) to create a line between the two points. Everything seemed to be going swimmingly.

My line-making formula
My completed second flow
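For anyone without Alteryx to hand, the same idea of one line per hometown and campus pair can be sketched in Python. This is not the formula I used in Alteryx, just a minimal illustration that builds a WKT LINESTRING from two points. The names Point and make_line_wkt are hypothetical, and the coordinates in the usage example are rough illustrative values rather than rows from the dataset.

```python
from typing import NamedTuple, Optional


class Point(NamedTuple):
    """A location as latitude/longitude; either value may be missing."""
    lat: Optional[float]
    lon: Optional[float]


def make_line_wkt(home: Point, campus: Point) -> Optional[str]:
    """Build a WKT LINESTRING from a hometown to a campus.

    Returns None when any coordinate is missing, which mirrors the
    records that had nulls in the lat and long fields.
    WKT expects 'longitude latitude' order for each vertex.
    """
    if None in (home.lat, home.lon, campus.lat, campus.lon):
        return None
    return f"LINESTRING ({home.lon} {home.lat}, {campus.lon} {campus.lat})"


# Illustrative coordinates only (approximate, not taken from the data).
hometown = Point(lat=38.58, lon=-121.49)
campus = Point(lat=37.87, lon=-122.26)
print(make_line_wkt(hometown, campus))

# A record with missing coordinates produces no line.
print(make_line_wkt(Point(lat=None, lon=None), campus))
```

Returning None for incomplete rows keeps the join output clean, since a line with a missing end point cannot be drawn anyway.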

It was at this point that everything came crashing down. It turns out that, no matter how many different file types you try, you are not going to get those spatial objects into a version of Tableau this early. I probably should have kept it simple, but I was tempted by the opportunity to do some off-the-cuff spatial work and never even checked whether it would work... What a fool.

Probably should have seen this coming

I did quickly throw it into the most recent version of Tableau just to see if I had done it right, and happily it appeared I had (so watch this space, as I will probably re-viz this).

Making myself feel better in more recent Tableau

From here I had to reassess and approach the problem slightly differently. It was a bit late to change direction completely, so I made what I could with the software. All in all, some of the more aesthetic parts of the design were frustrating to do, but using the older version wasn’t the end of the world. You could still build a wide variety of chart types; you just had to approach things in a simpler manner or be prepared to bang your head against the wall.

The results of the day:

Circles wish they were as good as lines

Interactive version: https://public.tableau.com/profile/ross.easton#!/vizhome/CalifornianUniversityAttendance1893-1945/Dashboard1