DSNY 5 - Dashboard Week Day 4

For the last day of Dashboard Week, Michael asked each DSNY 5 member to rebuild their initial application to The Data School. The only requirement was to use the original raw data. I found this task really interesting because it brought back good memories of the time when I worked on that initial application.

To get to the final version of my initial application (Image 1), I revised my dashboard 4 or 5 times. Thanks to the feedback from Samuel Shurmer, Collin Smith and Michael McFadden, I improved my Tableau skills and applied them to the dashboard. I chose COVID-19 because it was a hot topic at that time and many relatives were affected by it. In some countries, the COVID-19 situation was still very complicated and severe. Therefore, I wanted to know which factors were linked to a high COVID-19 death rate, so that everyone could understand them and protect themselves.

My initial application dashboard on Tableau Public: https://public.tableau.com/views/ProjectCOVID19inMexico2020-2021/Revised_COVID_Dashboard?:language=en-US&:display_count=n&:origin=viz_share_link

Image 1: Initial Application to The Data School

This project was a challenge for me because I needed to build a better dashboard in a short time (5 hours). The process was quite similar to the previous Dashboard Week days. It includes:

1/ Review the data dictionary
2/ Clean/ Prepare data in Alteryx
3/ Sketch a draft on Excalidraw
4/ Build visualization on Tableau Desktop
5/ Test and Publish on Tableau Public

Now, are you ready to go back to the past with me? Let's get started!


1/ Review the data dictionary

I searched for COVID-19 in Mexico on Kaggle. I chose that dataset because it records many diseases that could affect the COVID-19 death rate, so I could explore which diseases were most likely to lead to a COVID-19 death. At that time, I used Python with the pandas library to clean, prepare and explore the data, so I wrote a blog on Kaggle to explore the data and prepare it for visualization in Tableau.

My Kaggle blog: https://www.kaggle.com/code/leluudev/covid-19-in-mexico-from-2020-2021

If you click on the Input menu at the top, it shows the data dictionary of the dataset (Image 2). There are over 1 million rows and 21 fields in the dataset. In some fields, the values 97, 98 or 99 mean the data is missing. In the Boolean fields, 1 means yes and 2 means no.

Image 2: Data Dictionary
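To make the coding scheme above concrete, here is a minimal pandas sketch. The column names (DIABETES, ICU) and the sample rows are assumptions for illustration only, not the real schema of the dataset.

```python
import pandas as pd

# Hypothetical sample rows; the column names are assumptions, not the real schema.
sample = pd.DataFrame({
    "DIABETES": [1, 2, 98, 2],
    "ICU": [97, 2, 1, 99],
})

# Codes that the data dictionary describes as missing data.
MISSING_CODES = [97, 98, 99]

# Count how many rows in each field carry one of the missing-data codes.
missing_counts = sample.isin(MISSING_CODES).sum()

# Decode Boolean fields: 1 -> True (yes), 2 -> False (no).
# Any other value (such as 97, 98 or 99) becomes NaN.
decoded = sample.apply(lambda col: col.map({1: True, 2: False}))

print(missing_counts)
print(decoded)
```

A quick count like this shows how much of each field is actually usable before any cleaning decision is made.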

2/ Clean/ Prepare data in Alteryx

After understanding the data, I cleaned and prepared it in Alteryx. If you scroll down the Kaggle page, you can see how I cleaned and prepared the data with Python. However, Alteryx makes the process easier. In Alteryx, I created a new field that flags whether a case is a death or not, based on the Date Died field. Some fields contain the values 97, 98 or 99; I replaced those values with 0. The last step was outputting the data.

Image 3: Alteryx workflow to clean and prepare data
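For readers who do not use Alteryx, the same two cleaning steps could be sketched in pandas roughly as follows. This is not the workflow in Image 3. The column names and the "9999-99-99" placeholder for patients who did not die are assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows; column names and the placeholder date are assumptions.
raw = pd.DataFrame({
    "DATE_DIED": ["03/05/2020", "9999-99-99", "9999-99-99"],
    "DIABETES": [1, 98, 2],
    "ICU": [97, 2, 99],
})

NOT_DIED = "9999-99-99"        # assumed placeholder meaning "no date of death"
MISSING_CODES = [97, 98, 99]   # codes described as missing data

clean = raw.copy()

# Step 1: new field that is True when a real date of death is recorded.
clean["IS_DEATH"] = clean["DATE_DIED"] != NOT_DIED

# Step 2: replace the missing-data codes with 0 in the coded fields only.
coded_cols = ["DIABETES", "ICU"]
clean[coded_cols] = clean[coded_cols].replace(MISSING_CODES, 0)

print(clean)
```

Limiting the replacement to the coded fields avoids touching other numeric fields where 97, 98 or 99 could be real values.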

On my Kaggle blog, I used the Pearson correlation coefficient to measure the correlation between each pair of fields (Image 4). A value between 0 and 1 means a positive correlation: when one variable increases, the other tends to increase too. A value of 0 means there is no linear correlation between the 2 variables. A value between 0 and -1 means a negative correlation: when one variable increases, the other tends to decrease.

Image 4: Heat map to show correlation between each field
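As a small illustration of how Pearson correlation behaves, the sketch below uses made-up numbers (not the COVID-19 data) and the pandas `DataFrame.corr` method. The column names are hypothetical.

```python
import pandas as pd

# Made-up values chosen so the relationships are obvious.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "same_direction": [2, 4, 6, 8, 10],      # rises whenever x rises
    "opposite_direction": [10, 8, 6, 4, 2],  # falls whenever x rises
})

# Pairwise Pearson correlation between every pair of columns.
corr = df.corr(method="pearson")

print(corr.round(2))
```

In this toy example, x and same_direction are perfectly positively correlated, while x and opposite_direction are perfectly negatively correlated. Real data almost always falls somewhere in between, which is what the colors in a heat map show.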

Based on that heat map, I wrote down some insights for myself (Image 5). They were very helpful when I was deciding which topic to focus on and explore further.

Image 5: Insights from the heat map

Back to the current task: I had the output data from Alteryx. The next step was to choose a topic and tell a story to the audience based on the insights I had.


3/ Sketch a draft on Excalidraw

The challenge of this task was making a dashboard better than the initial application. What I could change in the revised dashboard was the story it tells. Therefore, I planned to compare the death cases between patients who were treated in hospital (Hospitalization) and patients who returned home (Home-Returned).

Image 6: Sketch on Excalidraw

At the top, I have BANs to show an overview: total cases, total death cases, average age of death cases, hospitalization cases and home-returned cases. Below them, a line chart compares the number of death cases between those 2 patient types. In the next section, I wanted to see whether age and gender affect the number of death cases. In the final section, the user can explore which diseases were linked to a high COVID-19 death rate. I also have a container that shows the description of a disease when the user clicks on its name.


4/ Build visualization on Tableau Desktop

Once I had the ideas, I started building the visualization in Tableau Desktop. I kept the BANs the same as in the Excalidraw sketch. For the line chart, I compared the Hospitalization and Home-Returned death cases over time, from Jan 2020 to May 2021. The highest number of COVID-19 death cases was recorded in June 2020, and most death cases came from Hospitalization cases (Image 7).

Image 7: First part of the dashboard
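The line chart is built on a monthly count of death cases per patient type. Tableau does this aggregation for you, but a rough pandas equivalent might look like the sketch below. The column names, the date format and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical death records; column names and date format are assumptions.
deaths = pd.DataFrame({
    "DATE_DIED": ["2020-06-03", "2020-06-21", "2020-07-02", "2020-06-15"],
    "PATIENT_TYPE": [
        "Hospitalization", "Hospitalization", "Home-Returned", "Hospitalization",
    ],
})

# Truncate each date to its month.
deaths["MONTH"] = pd.to_datetime(deaths["DATE_DIED"]).dt.to_period("M")

# One row per month, one column per patient type, values are death counts.
monthly = (
    deaths.groupby(["MONTH", "PATIENT_TYPE"])
    .size()
    .unstack(fill_value=0)
)

print(monthly)
```

Each column of the result corresponds to one line in the chart.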

In the next section, I compared the Hospitalization and Home-Returned death cases by age and gender (Image 8). Based on the butterfly chart, most Hospitalization cases were patients aged 45 to 69, while most Home-Returned cases were patients aged 25 to 49. Hospitalization had far more death cases than Home-Returned (a difference of thousands of cases). Among Hospitalization cases, more males died than females. Thank you to Valerija for helping me fix the axis range with a reference line. It would be hard to compare the two sides if the axis ranges in the same chart were not the same.

Image 8: Butterfly chart for comparing age and gender of each type
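A butterfly chart like this needs counts per age band and gender for each patient type. As a rough sketch of that aggregation outside Tableau, the pandas code below groups ages into 5-year bands. The band width, the column names and the sample rows are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical death records; column names and values are assumptions.
deaths = pd.DataFrame({
    "AGE": [47, 52, 68, 30, 45, 61],
    "GENDER": ["Male", "Female", "Male", "Male", "Female", "Male"],
    "PATIENT_TYPE": [
        "Hospitalization", "Hospitalization", "Hospitalization",
        "Home-Returned", "Home-Returned", "Hospitalization",
    ],
})

# Assumed 5-year age bands: 0-4, 5-9, ..., 115-119 (right edge excluded).
bins = list(range(0, 125, 5))
labels = [f"{lo}-{lo + 4}" for lo in bins[:-1]]
deaths["AGE_BAND"] = pd.cut(deaths["AGE"], bins=bins, labels=labels, right=False)

# Count cases per patient type and age band, with one column per gender.
counts = (
    deaths.groupby(["PATIENT_TYPE", "AGE_BAND", "GENDER"], observed=True)
    .size()
    .unstack("GENDER", fill_value=0)
)

print(counts)
```

Plotting both sides from the same table with one shared maximum keeps the two halves of the chart on the same scale, which is the issue the reference line solves in Tableau.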

In the next section, I compared which diseases were linked to the most death cases (Image 9). The user can click on a disease on the left side to see its description and a bar chart comparing death cases between the Hospitalization and Home-Returned types. Diabetes and Pneumonia have the highest numbers of death cases.

Image 9: How other diseases affected the number of COVID-19 death cases

In the last section, I compared the number of death cases admitted and not admitted to the ICU for each disease. Most death cases in Mexico during that time were not admitted to the ICU. That may also be one reason why the number of death cases was high.

Image 10: Admitted and Unadmitted cases to ICU

Those are all the vizzes that I built in Tableau Desktop for this project.


5/ Test and Publish on Tableau Public

The last step before publishing the dashboard on Tableau Public is testing. I need to check the numbers, fonts, sizes, formatting and the interactive actions in the dashboard. One thing that I usually forget is the tooltip. Some places don't need one; however, where the audience needs more detail, a tooltip is the right place to provide it.

Image 11: Full Dashboard for the Dashboard Week Day 4

Dashboard Week Day 4 on Tableau Public: https://public.tableau.com/views/Dashboard_Week_Day4/COVID19_Mexico_DB4?:language=en-US&:display_count=n&:origin=viz_share_link


Thank you for reading my Dashboard Week Day 4 blog. This is the last blog in the Dashboard Week series. December 1st, 2023 is also the last training day for the DSNY 5 team. It is time to look back at what I have learned at The Data School. I am very grateful to all the coaches who supported and trained the team during our 4 months at The Data School. 4 months is neither too long nor too short, but I have only good memories of The Data School. The 4 months of training have ended, but the journey with data continues with more interesting challenges ahead.

See you in the next blog!

Author:
Le Luu