DSNY 5 - Dashboard Week Day 2

On the second day of the dashboard week, we were assigned to work with cheese on https://www.cheese.com/. It means that we have to do web-scraping to extract the data from that web page. This is an interesting assignment because I really love using web-scraping and API to extract the data. In this blog, I am going to share what I have done on this assignment. The process is mostly the same as what I did on the dashboard week's first day. It includes:

1/ Do web-scraping data in Alteryx
2/ Clean/ prepare data in Alteryx
3/ Sketch a draft on Excalidraw
4/ Build visualization on Tableau Desktop
5/ Test and publish the dashboard on Tableau Public

Image 1: www.cheese.com page

1/ Web-scraping in Alteryx

Web scraping and API is one of my favorite topics. During training at The Data School, Valerija taught us how to do web scraping to get the data from the web page. We used to extract the data from the web page and then build the dashboard from that data. The assignment for the second day of the dashboard week is a similar one but a different page.

Image 2: All kinds of cheese on www.cheese.com

When I scrolled down to the end of the page, I recognized that there are 1871 kinds of cheese. They are displayed in 4 columns and 5 rows (20 types of cheese for each page). There are 94 total pages. Therefore, my goal was to get information on all 1871 different cheeses in Alteryx.

a) Getting the name of all cheeses in 94 pages

Image 3: Getting the name of all 1871 cheeses

Image 3 is the workflow in which I got the names of 1871 different cheeses in 94 pages. I already knew there are 94 pages, so I created a page field from 1 to 94 and then assigned it to the original URL (www.cheese.com) as a parameter. Then, I downloaded the data and prepared the data to get only the name of each cheese.

b) Download data from each cheese

Image 4: Downloading data from each different cheese

After I got the name of each cheese, I created a new URL by using the original URL with the name of each cheese. For example: for the data about Gouda cheese, I need to extract the data from www.cheese.com/gouda. Now, I got all the fields from the page.


2/ Clean/ Prepare data in Alteryx

After I got the data from web-scraping the page, I needed to clean and prepare the data to keep only the necessary fields.

Image 5: Example of the data of Pecorino Romano cheese

For example, image 5 shows the data on Pecorino Romano cheese. The fields include: made from, Country of origin, region, family, type,... However, I only keep some necessary fields: made from which milk, country, type, texture, rind, color, flavour, aroma, and vegetarian.

This is a difficult challenge because there are some hyperlinks in Image 5 and each dimension could have multiple values.

Image 6: Many values in each dimension

For example: In image 6, for one sentence there are many hyperlinks for cow, goat, and sheep milk. Therefore, my goal is to get only the word without a hyperlink tag (<a href=>).

Image 7: Prepare data for each dimension

To do that, I used many RegEx tools to extract the data from the paragraph tag <p> and remove the hyperlink tag. In the result window, each cheese has multiple values and is separated by a comma. (Image 7)

Image 8: Clean the data for other columns

I also did the same for other dimensions for each cheese. For the country of origin field, one cheese could come from many countries. Therefore, I also need to use RegEx tools to clean and get rid of the hyperlink tag. Finally, I use the Union tool and Cross Tab tool to combine all data in each branch together and transform the data. (Image 9)

Image 9: Output the data after cleaning and preparing

After I got the prepared data, it was time to sketch my idea on Excalidraw.


3/ Sketch a draft on Excalidraw

After I cleaned and prepared the data in Alteryx, I got all the fields that I needed, including Cheese Name, Countries, Region, Flavor, Color, Made from kind of milk, Aroma, Type (soft, hard,...), Texture, .... I was planning to tell a story to let the user explore each type of cheese.

Image 10: First part of the sketch
Image 11: Second part of the sketch

At the beginning of the sketch (Image 10), I show how many kinds of cheese are in the dataset, how many cheeses are vegetarian, the number of flavors, and the total countries the cheeses are made from. Then show the top countries that most cheeses come from. In the next section, I show what kind of milk cheeses are made from by showing the number of cheeses for each type. If the user clicks on one of the BANs, it will filter the rest of the dashboard.

In the next section, I show the top type (soft, semi-soft, hard, semi-hard, ...) and texture (smooth, creamy, ...) for a specific milk cheese that the user selected above. Then, show the detailed table including cheese name, flavor, type, texture, producers, and country,... (Image 12)


4/ Build Visualization in Tableau Desktop

After I got the idea from the sketch, now I can build the visualization and dashboard in Tableau Desktop.

Image 13: Showing BANs and comparing the number of cheeses by country

From the top of the dashboard, I show some information about the cheese dataset. Then compare the number of cheeses by country. (Image 13)

Image 14: Compare different kinds of milk that cheeses are made from

In the next section, I compare different kinds of milk that cheeses are made from. Most cheeses are made from Cow's milk. However, some cheeses are a combination of many kinds of milk. Below the BANs is the symbol of each animal. The user can click on each symbol to filter the next sections.

Image 15: Compare different types and textures of each milk type

After choosing the milk type, in the next section, I can compare the type and texture of that milk. The user can click on the bar of the type and texture to filter the detailed table.

Image 16: Showing the detailed table for that milk type

In the last section, the detailed table will show all the information about the cheese including Cheese Name, Flavour, Aroma, Type, Texture, Colour, Producers, and Country, and distinguish whether cheese for vegetarian or not. There is a button on the top right of this section (Image 16). The user can choose the aroma, color, and country from that settings pane.


5/ Test and publish the dashboard on Tableau Public

Before publishing the dashboard on Tableau Public, I have to check the formatting of the dashboard to make sure it's consistent. Then, I also check the interactivity of each visualization. If I apply the filter action in visualization, I need to check the interactivity works for that viz. If everything is ok, I will post the dashboard on Tableau Public.

Image 17: Final Dashboard for Dashboard Week - Day 2

My Tableau Public Dashboard link is here: https://public.tableau.com/views/cheese_dashboard_DB2/Cheese_DB2?:language=en-US&publish=yes&:display_count=n&:origin=viz_share_link


Thank you for reading my blog about what I have done on the second day of the dashboard week. I hope to meet you again in the next blog.

Author:
Le Luu
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2025 The Information Lab