SQL Advanced Queries

In this blog post I will walk through some advanced SQL queries. Here's the entity relationship diagram for the database I'll be using:

The table 'Orders' is at the center, and links to 'OrderDetails', 'Customers', and 'Employees'. Notice that 'OrderDetails', 'Customers', and 'Employees' do not link to each other directly, but must be linked through 'Orders' which acts as central "hub". 'Shippers' is not linked to rest of database.

1) Get a list with all the orders displaying each column and calculate an extra TimeDelta column displaying the between OrderDate and ShippedDate, ordered by ascending TimeDelta:

JULIANDAY() converts date to float value representing number of days passed since Nov 24, 4714 BC. This allows us to do arithmatic on dates and calculate elapsed time , like subtracting 'OrderDate' from 'ShippedDate' to calculate number of days it took for each order to ship.

2) Return a list of all orders (order_id, customer.contact_name, employee.firstname) ordered by order_id:

Multiple joins in one query: 'Orders' to 'Customers' and 'Orders' to 'Employee's. This enables us to combine columns from all 3 tables. We must link 'Customers' -> 'Orders' -> 'Employees' to match who sold what to whom, because 'Customers' and 'Employees' tables are not linked directly.

3) return the total amount spent per customer ordered by ascending total amount (to 2 decimal places):

'FROM OrderDetails AS details' assigns alias 'details' to table 'OrderDetails'. This allows shorter referencing to the table. Now can use 'details.OrderID' instead of 'OrderDetails.OrderID') This alias only exists for duration of the query.

JOIN 'Customers -> 'Orders' -> 'details' tables so customers can be linked to purchases.

'GROUP BY ContactName' groups customers together to allow performing aggregate SUM() function for each customer..

'SUM(details.UnitPrice * details.Quantity) AS 'cumulative_amount' creates new column aggregating total spent per customer.

4) return employee who has the most sales:

This query starts the same as the as last query, but this time linking 'Employees' 'Orders' -> 'details' tables so that employees can be linked to purchases.

'GROUP BY Emploees.ID' groups employees together to allow performing aggregate SUM() function for each employee.

'GROUP BY ContactName' groups customers together to allow performing aggregate SUM() function for each customer.

'SUM(details.UnitPrice * details.Quantity) AS 'cumulative_amount' creates new column aggregating total sold per employ.

5) Rank the orders of each customer according to the order date:

RANK() OVER(...) is a type of window function. Window functions are similar to aggregate functions, but instead of using 'GROUP BY' to reduce rows and return one aggregate value per group, they use 'PARTITION BY' to group but *not* reduce rows, and perform calculations on them, which are based on their relation to other rows in the same partition In this case partitioning rows by 'CustomerID', ordering rows within each partition by 'OrderDate', and creating a new column 'OrderRank' which ranks the order in each partition according to order date (1 = first order, 2 = second order, 3 = third order, etc.)

Author:
Alan Grunberg
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2024 The Information Lab