Dashboard Week 3 - Web-scraping with Diabetes Data

Wednesday brought in a bit of web-scraping, which is a really powerful way of accessing information from the web. We covered web-scraping a couple times during training with PGB, where we accessed IMDB data to set up a data source with the top 250 movies. Web-scraping in Alteryx (and in general) usually involves 3 steps:

For today's project, we took a look at NCD Risk Factor Collaboration data, and I focused specifically on the data pertaining to diabetes. To scrape the required data (available through the individual countries tab), I first needed a list of countries. I created it by inspecting the page and then editing the section with all the countries as HTML so I could copy it, as the list was stored in a JavaScript pop-up.

Once I had copied this list, I added it as a text input in Alteryx and parsed it out in order to get a clean list of countries:
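For anyone working outside Alteryx, the same parsing step can be sketched in Python. This is only an illustration: the markup of the copied list is an assumption (a `<ul>` of `<li>` items), and `CountryListParser`, `parse_countries` and `sample_html` are hypothetical names, not part of my workflow.

```python
from html.parser import HTMLParser


class CountryListParser(HTMLParser):
    """Collect the visible text of every <li> element in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self._in_item = False
        self._buffer = []
        self.countries = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_item:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._in_item:
            # Collapse line breaks and repeated spaces into single spaces.
            name = " ".join("".join(self._buffer).split())
            if name:
                self.countries.append(name)
            self._in_item = False


def parse_countries(html_fragment):
    """Return the country names found in the fragment, in page order."""
    parser = CountryListParser()
    parser.feed(html_fragment)
    parser.close()
    return parser.countries


# Assumed markup standing in for the HTML copied from the pop-up list.
sample_html = """
<ul>
  <li><a href="#">Afghanistan</a></li>
  <li><a href="#">  Albania </a></li>
  <li><a href="#">Bosnia and
      Herzegovina</a></li>
</ul>
"""

countries = parse_countries(sample_html)
print(countries)
```

If the real list uses different tags, only the `"li"` checks would need to change.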

I then accessed the URL of the page in order to request the data for each specific country:
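As a rough Python equivalent of this request step, the sketch below builds one URL per country and downloads the response. The URL pattern is a placeholder, not the real address of the NCD Risk Factor Collaboration site, and `URL_TEMPLATE`, `build_country_url` and `fetch_country` are hypothetical names.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder pattern (assumption): replace with the real per-country URL.
URL_TEMPLATE = "https://example.org/data/diabetes/{country}.csv"


def build_country_url(country, template=URL_TEMPLATE):
    """Insert the URL-encoded country name into the template."""
    return template.format(country=quote(country))


def fetch_country(country, timeout=30):
    """Download the raw response body for one country as text."""
    with urlopen(build_country_url(country), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Build the request URLs without sending anything yet.
urls = [build_country_url(c) for c in ["Albania", "Bosnia and Herzegovina"]]
print(urls)
```

Encoding the name with `quote` matters for countries with spaces or special characters; whether the real site expects names, codes or something else in the URL is something to check in the browser first.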

I then did a bit of reshaping and added some population data to enrich the data set. Finally, I output the data to a .hyper file so I could begin visualizing it.
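To give an idea of what the reshaping and the population join might look like in code, here is a small Python sketch. It assumes the scraped table is wide (one column per year) and that population is keyed by country and year; the field names and all the numbers are made-up placeholders, not values from the real data.

```python
def wide_to_long(rows, id_field, value_name):
    """Turn one column per year into one row per (country, year)."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_field:
                continue
            long_rows.append(
                {id_field: row[id_field], "year": int(key), value_name: value}
            )
    return long_rows


def join_population(long_rows, population):
    """Left join: attach population, or None when there is no match."""
    joined = []
    for row in long_rows:
        key = (row["country"], row["year"])
        joined.append({**row, "population": population.get(key)})
    return joined


# Placeholder values for illustration only.
wide = [
    {"country": "A", "2013": 0.1, "2014": 0.2},
    {"country": "B", "2013": 0.3, "2014": 0.4},
]
population = {("A", 2013): 1000, ("A", 2014): 1100, ("B", 2013): 2000}

tidy = join_population(wide_to_long(wide, "country", "prevalence"), population)
for record in tidy:
    print(record)
```

Keeping the join as a left join means countries without a population match stay in the output with an empty value, which makes gaps easy to spot later in the dashboard.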

For the dashboarding part, I wanted to work on my dashboard design and incorporate some new ideas. I decided to try setting up my dashboard as a 'Patient Form', where each patient would be a country and you could look at their status in terms of diabetes. The outcome is as follows:

It's a simple dashboard, but it gives a clear view of the diabetes status of individual countries!

Author:
Garth Turner