This Friday, our challenge to complete Preppin' Data challenges on Snowflake using SQL rather than using Tableau Prep as the challenge intended. As Carl gave us this challenges, I assumed abandoning Tableau Prep was not blasphemy.
I attempted challenge 2021 Week 01 as my second challenge of the day:
1) Split the 'Store-Bike' field into 'Store' and 'Bike'
The first challenge was using the right split function. Using SPLIT(), I realised I get one column back with the two values in it (see fig.1).
![](https://www.thedataschool.co.uk/content/images/2021/06/image-95.png)
To solve this I used SPLIT_PART() to only return one of the parts at a time (see fig.2).
![](https://www.thedataschool.co.uk/content/images/2021/06/image-96.png)
2) Clean up the 'Bike' field to leave just three values in the 'Bike' field (Mountain, Gravel, Road)
![](https://www.thedataschool.co.uk/content/images/2021/06/Group-1.png)
I didn't have the neatest solution but noticed a simple IF statement based on the length of the Bike values will do the trick in cleaning misspellings. Fig.3 shows the results.
3) Create two different cuts of the date field: 'quarter' and 'day of month'
This was very similar to Tableau. I just used the DAY() and Quarter() functions (see fig.4).
![](https://www.thedataschool.co.uk/content/images/2021/06/image-98.png)
4) Remove the first 10 orders as they are test values
A simple line of WHERE based on the ORDER ID which, in this dataset, is nice and chronological.
![](https://www.thedataschool.co.uk/content/images/2021/06/Group-2--1-.png)
5) Output a csv
You can see my final SQL code in fig.5 and the results table in fig.6. Note that I've ordered the code so the results view is in the same order as the output example on the challenge page.
![](https://www.thedataschool.co.uk/content/images/2021/06/image-102.png)
![](https://www.thedataschool.co.uk/content/images/2021/06/image-105.png)
If I had more time I would have liked to use other functions to use my solution more robust. For example, while the CASE function does well in this case, it wouldn't be a good solution when dealing with real misspellings.
Also, I ended up putting the "Bike" SPLIT_PART() code inside the LEN() of the "Bike Type" code to get a cleaner results table. I didn't do this for Fig.5 as to prevent confusion.