Friday Project - Supplement application viz with additional data

by Nils Macher

Each Friday, our coaches ask us to present a project that relates to what we have learned over the course of the week. This week, our task was to supplement our application visualization with additional data. The task helped us revisit the major topics of the week: data preparation and joins in Alteryx.

The topic of my original viz is the political landscape in Uganda:

The map on the left side of the dashboard shows Uganda’s districts, colored by the party that holds the majority of parliament members in each district. The mostly yellow map shows that the NRM (National Resistance Movement) dominates the political landscape in Uganda.

The right part of the dashboard shows the members of the 10th Ugandan parliament. By hovering over the icons, the viewer can retrieve the name, district and party of the respective parliament member. The icons are double-encoded (color and symbol) to show the gender distribution of the parliament. About 1/3 of Uganda’s parliament members are women.

The additional data I gathered for this viz is GDP per capita in US$ and GDP in million US$. By enriching the original viz with this data, I can show which districts are relatively wealthy or poor. I might also draw some careful assumptions about which party appeals to higher or lower income groups.

The data set I found which provided the GDP data I needed is here: https://pardee.du.edu/sites/default/files/Es

Because I only found this PDF document, I had to convert the data into a format that can be analysed in Tableau. I copied and pasted the tables into a text file and then transformed this data into an Excel file, which can then be joined in Tableau.

I came up with the following workflow to prepare my data:

The workflow works as follows:

  1. I input the text file.
  2. The row containing all of the tables’ information is split on the space delimiter.
  3. Then all unnecessary info is filtered out (page numbers, recurring headers).
  4. The data is grouped. Since every 3 rows belong together (district name, GDP per capita, GDP in millions), these rows need to be transformed into a single row with 3 columns. The first step is to use the Record ID tool to assign each row a number starting from zero. In the next step, I divide each Record ID by 3 and return the result as an integer, rounded down. This sorts the rows into groups of three, e.g. 0/3 = 0, 1/3 = 0, 2/3 = 0. Hence, the first 3 rows all share the group ID 0.
  5. Now I can use the Transpose tool and concatenate the rows of each group by that ID. This gives me all the info for a district in one row.
  6. In the next step, I split each row into columns on the hyphen delimiter, which I used to concatenate the rows earlier.
  7. The final result is:
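
For readers without Alteryx, the grouping logic of steps 4 to 6 can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow itself: the function name `group_in_threes` is hypothetical, and the sample rows are made-up placeholders, not values from the GDP data set.

```python
# Hypothetical sample rows, as they might look after steps 1 to 3.
# District names and numbers are placeholders, not real data.
rows = [
    "District A", "100", "50",
    "District B", "200", "75",
]


def group_in_threes(rows, delimiter="-"):
    """Mimic steps 4 to 6: group every 3 rows, concatenate, then split."""
    groups = {}
    for record_id, value in enumerate(rows):  # Record ID starting at zero
        group_id = record_id // 3             # integer division: 0, 0, 0, 1, 1, 1, ...
        groups.setdefault(group_id, []).append(value)

    # Concatenate the rows of each group with the delimiter (step 5) ...
    concatenated = [delimiter.join(values) for values in groups.values()]
    # ... and split them back into three columns (step 6).
    return [tuple(line.split(delimiter, 2)) for line in concatenated]


table = group_in_threes(rows)
for district, gdp_per_capita, gdp_millions in table:
    print(district, gdp_per_capita, gdp_millions)
```

One caveat of this approach: if a district name itself contained a hyphen, splitting on the hyphen would cut the name apart, so a delimiter that never occurs in the data would be the safer choice.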

After joining the additional data on the district field in Tableau, I now have all the data needed to create the visualization I want:

The additional GDP data is visualized in the scatter plot on the far left side. Each point represents a district. Hovering over a data point highlights the respective district on the map, and vice versa. The parameter above the map lets you switch between the two measures, GDP per capita and GDP in millions.

One observation from the scatter plot is that all districts which fell to independent parliament members can be considered poor, as all of them sit in the bottom corner of the scatter plot. So one might assume that the politics of independent parliament members are more appealing to lower income groups.

To confirm this hypothesis, I would have to dig deeper into the data or maybe even find additional data sets.