Second assignment: retrieve the data for every year and all participants in the London Marathon using the website below.
https://www.virginmoneylondonmarathon.com/en-gb/event-info/race-results/
We had to collect all historical data for all participants that had the same initials as ours. My initials had to be SE; for some reason Kriebel said I could not do SM.
After searching for participants with my initials in 2018, the following URL shows up in my web browser.
I am going to focus on the last few characters of the URL: year=2018&page=1.
I can see that following the first = in the URL, 2018 is entered, which means that when I scrape the website using Alteryx, I will have to change that value to cover all years since 1981.
After the second = in the URL, 1 is entered, which means I will also have to change that value to cover all the other pages when I scrape the website in Alteryx.
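For readers who think in code, the two values being varied are ordinary query-string parameters. Below is a minimal Python sketch of reading them; the host and path in `url` are a made-up placeholder (only the trailing `year=2018&page=1` comes from this post), so treat it as an illustration rather than the real results address.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL: only the "year=2018&page=1" part is taken from the post.
url = "https://example.com/results?year=2018&page=1"

# parse_qs returns a dict that maps each parameter name to a list of values.
params = parse_qs(urlsplit(url).query)

year = int(params["year"][0])  # value after the first "="
page = int(params["page"][0])  # value after the second "="

print(year, page)
```

These are the two values that the workflow replaces with every year and page combination.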
To do this, add a Text Input tool to the Alteryx canvas and enter the URL mentioned above. Then insert a Generate Rows tool after it. See Image 1 for what to enter in the tool. This will generate rows for one new column (titled Year) with the values 1981 to 2018.
data:image/s3,"s3://crabby-images/f2894/f2894c184530676694fc21549402330de5aa72ec" alt=""
After that, add another Generate Rows tool to the workflow; see Image 2 for what to enter in it. This will generate rows for page numbers up to 38 (an arbitrary choice) for each Year entry.
data:image/s3,"s3://crabby-images/a68ac/a68acf95b3de58084fec8e3018724427ebaab075" alt=""
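Taken together, the two Generate Rows tools produce one row for every year and page combination. Here is a rough Python equivalent of that step. It assumes pages start at 1 (as in the URL above) and uses the same arbitrary cap of 38 pages; the names `FIRST_YEAR`, `LAST_YEAR` and `MAX_PAGE` are mine, not Alteryx settings.

```python
from itertools import product

FIRST_YEAR, LAST_YEAR = 1981, 2018  # year range used in the first Generate Rows tool
MAX_PAGE = 38                       # arbitrary page cap used in the second tool

# One row per (Year, Page) pair, like chaining the two Generate Rows tools.
rows = [
    {"Year": year, "Page": page}
    for year, page in product(
        range(FIRST_YEAR, LAST_YEAR + 1),  # 1981..2018 inclusive
        range(1, MAX_PAGE + 1),            # 1..38 inclusive (assumed start at 1)
    )
]

print(len(rows))  # 38 years * 38 pages = 1444 rows
```

Many of these rows will point at pages that do not exist for a given year, so whatever downloads the pages needs to tolerate empty results.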
Add a Select tool and change the Year and Page column types to V_WString (see Image 3), so that they can be combined with the URL text in the next steps.
data:image/s3,"s3://crabby-images/cd225/cd2252b19d2ec3d48e2f9d1696523a9717ad2e46" alt=""
Attach a Formula tool and replace the last four string characters (i.e. 2018) with a space as shown in Image 4.
data:image/s3,"s3://crabby-images/65da0/65da058d44a89ce2dc63bff70b4679fefdcd9799" alt=""
Lastly, insert another Formula tool, add a new column titled ‘update’, and fill it in as shown in Image 5.
data:image/s3,"s3://crabby-images/8914b/8914b6d4a8b47bc0330be150986c10621a9b5b18" alt=""
data:image/s3,"s3://crabby-images/c97ad/c97adc12a70c3acc1a9c2be78bb9a2275203b724" alt=""
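The idea behind the Select and Formula steps is to turn Year and Page into text and splice them into the URL, one URL per row. The exact Alteryx expressions are in the images, so the Python below is only a sketch of the same idea under my own assumptions: `BASE_URL` is a made-up placeholder for everything before the query string, and `build_url` is a hypothetical helper, not something from Alteryx.

```python
# Hypothetical placeholder for the part of the URL before the query string.
BASE_URL = "https://example.com/results"


def build_url(year: int, page: int) -> str:
    """Return a results URL for one year and page (illustrative helper)."""
    # str() plays the role of the Select tool's change to a string type.
    return BASE_URL + "?year=" + str(year) + "&page=" + str(page)


# One URL per (year, page) row, like the 'update' column in the workflow.
urls = [
    build_url(year, page)
    for year in range(1981, 2019)  # 1981..2018 inclusive
    for page in range(1, 39)       # 1..38, the arbitrary page cap
]

print(urls[0])
print(urls[-1])
```

Each string in `urls` corresponds to one value of the ‘update’ column, ready to be passed to whatever downloads the pages.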