Today we were challenged to web scrape the parliamentary rules database. We were NOT allowed to download the ready-made raw data from the website; instead we had to create our own raw dataset using Alteryx!
Web scraping is not my strong point, as it can involve A LOT of Regex. So the first challenge I set myself today was to use as much Regex as possible (and, if possible, only Regex!) in the hope of improving!
Here is my finished Alteryx workflow:
![](https://www.thedataschool.co.uk/content/images/wordpress/2019/10/Workflow-for-parliament-1024x347.png)
The difficulty I had with this workflow was giving each clause within each law a different ID number. In the end I managed to accomplish this using a rather complex Multi-Row Formula!
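To give a feel for the idea behind numbering clauses within each law, here is a minimal Python sketch. It is not the formula from my workflow, just a simplified illustration of the same "look at the previous row" logic. The function name `assign_clause_ids`, the `(law, clause)` row layout, and the sample data are all made up for the example, and it assumes the rows are already sorted so that each law's clauses sit next to each other.

```python
def assign_clause_ids(rows):
    """Number clauses 1, 2, 3... within each law.

    `rows` is a list of (law, clause_text) tuples, assumed to be ordered
    so that all clauses of the same law are contiguous (hypothetical layout).
    """
    result = []
    prev_law = None   # law seen on the previous row (None before the first row)
    clause_id = 0     # running counter within the current law

    for law, clause in rows:
        # Same law as the row above: keep counting. New law: restart at 1.
        clause_id = clause_id + 1 if law == prev_law else 1
        result.append((law, clause_id, clause))
        prev_law = law

    return result


# Made-up sample data, purely for illustration
rows = [
    ("Act A", "first clause"),
    ("Act A", "second clause"),
    ("Act B", "first clause"),
]

for law, clause_id, clause in assign_clause_ids(rows):
    print(law, clause_id, clause)
```

In Alteryx terms, this roughly corresponds to a Multi-Row Formula that compares the law on the current row with the law on the row above, and either adds one to the previous ID or resets it to 1.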
For my dashboard I went with something a bit different this time and kept things quite minimalist. I decided to look at how laws passed in the House of Commons have changed during wartime. Here is the final result:
![](https://www.thedataschool.co.uk/content/images/wordpress/2019/10/snipped-dash-1024x615.png)
Overall, I’m pretty happy with the result!