Last week I wrote a blog post on completing the Preppin' Data 2021 WK01 challenge in SQL. This week I challenged myself to complete it in Python. I am uploading my solutions for each Preppin' Data challenge to GitHub. Feel free to check them out here: https://github.com/Dan-Booth-Data/PreppinData/tree/main
As with last week, I am starting with this dataset:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-249.png)
I loaded the data using the read_csv() function from the pandas library.
![](https://www.thedataschool.co.uk/content/images/2025/01/image-252.png)
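For anyone who prefers copyable text, here is a minimal sketch of the loading step. The column names and the two sample rows are my assumptions standing in for the real challenge file, and an in-memory string is used so the snippet runs without the CSV on disk.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the challenge's CSV file;
# the column names are assumed, not copied from the real input.
sample_csv = io.StringIO(
    "Order ID,Customer Age,Bike Value,Existing Customer?,Date,Store - Bike\n"
    "1,34,1200,Yes,01/01/2021,London - Mountain\n"
    "2,45,850,No,15/04/2021,York - Rood\n"
)

# read_csv accepts a file path or any file-like object
df = pd.read_csv(sample_csv)
print(df.head())
```

With the real file, the same call would take the path to the CSV instead of the in-memory string.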
Task 1:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-250.png)
To do this I used the str.split() method, splitting on every hyphen that has a space on either side (' - ').
![](https://www.thedataschool.co.uk/content/images/2025/01/image-253.png)
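As a rough sketch of this split, assuming the combined column is called 'Store - Bike' and the new columns are named 'Store' and 'Bike' (the sample values are made up):

```python
import pandas as pd

# Made-up sample values, including misspelt bike names
df = pd.DataFrame(
    {"Store - Bike": ["London - Mountain", "York - Rood", "Leeds - Gravle"]}
)

# Split on the ' - ' delimiter; n=1 limits it to a single split and
# expand=True returns one column per part
df[["Store", "Bike"]] = df["Store - Bike"].str.split(" - ", n=1, expand=True)
print(df)
```

Including the spaces in the delimiter means neither new column needs trimming afterwards.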
Task 2:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-254.png)
I decided the easiest way to achieve this was to correct the names based on their first letter. To do this I made a dictionary mapping each first letter to the correct bike name, which I could then apply to the first letter of each value.
![](https://www.thedataschool.co.uk/content/images/2025/01/image-255.png)
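One possible way to write the dictionary approach is sketched below. The dictionary name bike_names and the misspelt sample values are my own illustrations, and it assumes every bike name starts with M, G or R.

```python
import pandas as pd

# Made-up misspellings of the three bike types
df = pd.DataFrame({"Bike": ["Mountain", "Rood", "Gravle", "mountaen", "Rowd"]})

# Hypothetical lookup from first letter to the corrected name
bike_names = {"M": "Mountain", "G": "Gravel", "R": "Road"}

# Take the first character, upper-case it, then look it up in the dictionary
df["Bike"] = df["Bike"].str[0].str.upper().map(bike_names)
print(df["Bike"].unique())
```

Any value whose first letter is not in the dictionary becomes NaN, which makes unexpected names easy to spot.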
Task 3:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-256.png)
To complete this task, I first needed to convert the date column to a datetime datatype. I could then use the built-in properties of the pandas .dt accessor to extract the quarter and day of month.
![](https://www.thedataschool.co.uk/content/images/2025/01/image-257.png)
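A minimal sketch of the conversion and extraction follows. It assumes the column is called 'Date' and holds day-first strings such as 15/04/2021; the format string would need adjusting if the real data is laid out differently.

```python
import pandas as pd

# Made-up dates in an assumed day/month/year layout
df = pd.DataFrame({"Date": ["01/01/2021", "15/04/2021", "30/11/2021"]})

# Convert the strings to datetimes using an explicit format
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# The .dt accessor exposes date parts as properties
df["Quarter"] = df["Date"].dt.quarter
df["Day of Month"] = df["Date"].dt.day
print(df)
```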
Task 4:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-258.png)
To achieve this I filtered out the first 10 Order IDs.
![](https://www.thedataschool.co.uk/content/images/2025/01/image-259.png)
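The filter could be sketched like this, assuming 'Order ID' is a numeric column that starts at 1, so the first 10 orders are simply IDs 1 to 10:

```python
import pandas as pd

# Made-up orders numbered 1 to 15
df = pd.DataFrame({"Order ID": range(1, 16)})

# Keep only rows whose Order ID is greater than 10
df = df[df["Order ID"] > 10].reset_index(drop=True)
print(df)
```

Filtering on the ID value rather than on row position still works if the rows arrive in a different order.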
Final Output:
Lastly, I removed any columns that were no longer needed.
![](https://www.thedataschool.co.uk/content/images/2025/01/image-260.png)
![](https://www.thedataschool.co.uk/content/images/2025/01/image-261.png)
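As a small sketch of that final tidy-up, assuming the columns no longer needed are the original combined 'Store - Bike' column and the raw 'Date' column (the single sample row is made up):

```python
import pandas as pd

# Made-up row showing the table after the earlier steps
df = pd.DataFrame(
    {
        "Order ID": [11],
        "Store - Bike": ["London - Mountain"],
        "Date": ["01/01/2021"],
        "Store": ["London"],
        "Bike": ["Mountain"],
        "Quarter": [1],
        "Day of Month": [1],
    }
)

# Drop the assumed leftover columns by name
df = df.drop(columns=["Store - Bike", "Date"])
print(df.columns.tolist())
```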