SQL and Snowflake #7: lateral split-to-rows, CTEs, filters

When learning a new skill, it is always important to practice, practice and practice! Thus, I have been using the Preppin' Data Challenges to hone my skills in SQL and Snowflake. These are weekly exercises to help you learn and develop data preparation skills. The challenges are designed for learning Tableau Prep but they can be completed in R, Python, SQL, DBT, EasyMorph and many other tools.

To start, I will be completing challenge 2023- Week 7. This challenge involves looking bank transactions for a fiction a bank. They want to flag potentially fraudulent transactions such as:

  • Being more than £1,000 in value
  • Excluding cancelled transactions
  • Platinum Bank Accounts have different rules for identifying fraudulent transactions so we will exclude them from our analysis

The link to the challenge is: https://preppindata.blogspot.com/2023/02/2023-week-7-flagging-fraudulent.html

Using select * , we can look at all the tables:

First looking at the account information table, we need to:

    • Ensure there is one row per Account Holder ID, Joint accounts will have 2 Account Holders, we want a row for each of them
    • Make sure there are no null values in the Account Holder ID

First we select the fields we need to create the table. The SPLIT_TO_TABLE syntax will split that column into rows based on the ", "
", LATERAL" is joining to the result of SPLIT_TO_TABLE . The ", " indicates an INNER JOIN, whilst LATERAL will let you join on that SPLIT_TO_TABLE.

We use WHERE to remove the nulls

We can now combine the two account tables using CTEs! The first CTE labeled t1 has the Account Holders info. We need to make sure the phone numbers start with 07, thus we concatenate a 0 on the contact_number field using ||.

The second CTE (t2) has the code in the picture above. We can now join these tables on the account holder id.

Next we need to join both transaction tables, using transaction id.

The joined account table is put into a CTE (t5), the joined transaction table is also put in a CTE (t6).

The final code looks like this. We join t5 and t6 based on account number and account_to (account number) respectively. Now we can add our filters to flag suspicious transactions.

  • Being more than £1,000 in value
  • No cancelled transactions
  • No Platinum Bank Accounts

And that's it! This challenge was great for practicing CTEs, SPLIT_TO_TABLE and the niche LATERAL join!

Author:
Harvey Joyce
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2024 The Information Lab