New Year, new prepping techniques... a 2025 personal development task of mine is to get my SQL up to scratch and so I thought I would begin by taking on some preppin data challenges (https://preppindata.blogspot.com/). These are an excellent resource for those looking to work on their data prepping chops.
CHALLENGE: 2024:Week 1 (https://preppindata.blogspot.com/2024/01/2024-week-1-prep-airs-flow-card.html)
This challenge involves manipulating the following dataset:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-93.png)
in the following ways:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-94.png)
So let us begin!
Step 1: Input the Data
First things first, I got into Snowflake and loaded in the dataset.
![](https://www.thedataschool.co.uk/content/images/2025/01/image-95.png)
A promising start.
Step 2: Split the flight details to form Date, Flight Number, From, To, Class, and Price fields
Split_Part on // will get us most of the way here:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-96.png)
so now we just need to separate 'From and To'...
Which requires a handy CTE to place our last query within,
![](https://www.thedataschool.co.uk/content/images/2025/01/image-97.png)
and allow us to split the From and To column in the 'actual' query. Success!
Step 2: Convert Date to a Date format. Convert Price to a decimal value.
This should be simple enough. For the date conversion we should just need TO_DATE(), and for making Price a Decimal we will use TO_DECIMAL(). Both we'll handle inside our CTE.
![](https://www.thedataschool.co.uk/content/images/2025/01/image-99.png)
Step 3: Change the Flow Card field to Yes / No values instead of 1 / 0
I vote for a nested replace statement:
![](https://www.thedataschool.co.uk/content/images/2025/01/image-103.png)
Step 4: Create two tables, one for Flow Card holders and one for non-Flow Card holders
This we can do with a simple WHERE filter. Repeat for Flow Card? = 'Yes' and Flow Card? = 'No'
![](https://www.thedataschool.co.uk/content/images/2025/01/image-107.png)
![](https://www.thedataschool.co.uk/content/images/2025/01/image-109.png)
and there you have it! Here is the complete code for solving PreppinData 2024 Week 1 in SQL:
WITH "SplitCTE" AS (
SELECT
SPLIT_PART("Flight Details", '//', 3) AS "From and To",
TO_DATE(SPLIT_PART("Flight Details", '//', 1),'YYYY-MM-DD') AS "Date",
SPLIT_PART("Flight Details", '//', 2) AS "Flight Number",
SPLIT_PART("Flight Details", '//', 4) AS "Class",
TO_DECIMAL(SPLIT_PART("Flight Details", '//', 5)) AS "Price",
REPLACE(REPLACE("Flow Card?", '1', 'Yes'), '0', 'No') AS "Flow Card?",
"Bags Checked",
"Meal Type"
FROM "2024Wk1"
)
SELECT
"Date",
"Flight Number",
SPLIT_PART("From and To", '-', 1) AS "From",
SPLIT_PART("From and To", '-', 2) AS "To",
"Class",
"Price",
"Flow Card?",
"Bags Checked",
"Meal Type"
FROM "SplitCTE"
WHERE "Flow Card?" = 'Yes'
;
WITH "SplitCTE" AS (
SELECT
SPLIT_PART("Flight Details", '//', 3) AS "From and To",
TO_DATE(SPLIT_PART("Flight Details", '//', 1),'YYYY-MM-DD') AS "Date",
SPLIT_PART("Flight Details", '//', 2) AS "Flight Number",
SPLIT_PART("Flight Details", '//', 4) AS "Class",
TO_DECIMAL(SPLIT_PART("Flight Details", '//', 5)) AS "Price",
REPLACE(REPLACE("Flow Card?", '1', 'Yes'), '0', 'No') AS "Flow Card?",
"Bags Checked",
"Meal Type"
FROM "2024Wk1"
)
SELECT
"Date",
"Flight Number",
SPLIT_PART("From and To", '-', 1) AS "From",
SPLIT_PART("From and To", '-', 2) AS "To",
"Class",
"Price",
"Flow Card?",
"Bags Checked",
"Meal Type"
FROM "SplitCTE"
WHERE "Flow Card?" = 'No'
;