We've previously learnt how to web scrape in Alteryx, and in all honesty it took me a little while to get my head round it.
Today, we had a lesson on Power BI and I'm pretty amazed by the software's ability to web scrape.
We looked at the webpage of IMDb's Top 250 movies. Here's how it looked:
![](https://www.thedataschool.co.uk/content/images/2022/02/image-82.png)
With only a few clicks we were able to pull this data into Power BI.
Have a look below:
![](https://www.thedataschool.co.uk/content/images/2022/02/Power-BI-web-scrape.gif)
1) Choose 'Get Data' and select 'Web'
2) Insert the URL and click 'OK'
3) Select the data from the options provided.
Yep, that's it!
If your data is clean then you can load it directly into Power BI, but this table did require a bit of cleaning, so let's clean...
![](https://www.thedataschool.co.uk/content/images/2022/02/Power-BI-clean.gif)
Select your table and click 'Transform Data'. A window (the Power Query Editor) will open up, which reminded me a little of Excel.
![](https://www.thedataschool.co.uk/content/images/2022/02/Power-BI-cleaning.gif)
We can use the 'Split Column' option in the toolbar to split up our Rank, Title and Year. We're also given an 'Applied Steps' list on the right-hand side, a pretty cool feature as you can easily look back through any transformations you've made. When you select one of the transformations, you can also see how your data looked at that particular stage.
![](https://www.thedataschool.co.uk/content/images/2022/02/image-83.png)
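If it helps to see the Rank, Title and Year split written out, here's a rough Python sketch of the same idea. It isn't what Power BI runs behind the scenes. It assumes the combined column looks like `1. The Shawshank Redemption (1994)`, and the function name `split_rank_title_year` and the pattern are just made up for illustration.

```python
import re

# Assumed shape of the combined cell: "<rank>. <title> (<year>)".
# This pattern is an illustration only, not Power BI's own logic.
ROW_PATTERN = re.compile(r"^\s*(\d+)\.\s+(.*?)\s*\((\d{4})\)\s*$")


def split_rank_title_year(text):
    """Split one combined cell into a (rank, title, year) tuple."""
    match = ROW_PATTERN.match(text)
    if match is None:
        # Fail loudly if a row doesn't follow the assumed format.
        raise ValueError(f"Unexpected row format: {text!r}")
    rank, title, year = match.groups()
    return int(rank), title, int(year)


# Sample rows written in the assumed format.
rows = [
    "1. The Shawshank Redemption (1994)",
    "2. The Godfather (1972)",
]

for row in rows:
    print(split_rank_title_year(row))
```

Each step here (grab the rank, grab the title, grab the year) is roughly what ends up as a separate entry in 'Applied Steps' when you do the split by clicking through the toolbar.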
I won't go through all the cleaning stages, but using some of the other features of Power BI it was pretty quick and easy to get the table in the format we needed to start building our analysis.
I must say, I'm a fan of using Power BI for web scraping (sorry Alteryx fans), but then again it may not always be this simple!