In the first part of the calculation in SQL and Tableau, I compared the calculations how to get the percent of total, the difference from the previous and next row. You can take a look at the first part here. Continue to that first part, in this blog, I am going to compare the calculations in:
- Difference from the first/last row
- Rank
- Running total
- Moving Calculation
Difference from the first/last row
In the previous blog, I mentioned the difference from the previous/ next row. You can choose the First/ Last option in the "Relative to" dropdown menu to find the difference between the first/last row and other rows.
In Snowflake, to find the First/ Last value, I use FIRST_VALUE, LAST_VALUE or NTH_VALUE functions. They are in the Window function category.
SELECT DATE_TRUNC('month',order_date) as month_order_date,
ROUND(SUM(SALES),2) as sum_sales,
sum_sales - FIRST_VALUE(sum_sales) OVER (ORDER BY month_order_date) as different_first_row
FROM superstore
GROUP BY month_order_date
ORDER BY month_order_date;
Same as the difference in the previous/next row, I use the DATE_TRUNC function to return the month of the order date and find the sum of sales (when using an aggregate function, must have a GROUP BY clause at the end).
Then, I use the FIRST_VALUE function to find the first row of the sum of sales orders by the month of order date in ascending. To find the difference, the whole FIRST_VALUE function will be subtracted by the sum of sales.
You can apply the same logic with the LAST_VALUE function to find the difference between the last row with other rows.
Rank
In Tableau, there are 4 types of RANK (Competition, Modified Competition, Dense, and Unique). You can get more details about RANK in Tableau here.
- Competition (1, 2, 2, 4): if writing in the calculated field, the syntax would be RANK(expression, ['asc' | 'desc']). The row with the same value will be assigned the same rank as the first instance of the value, not get the consecutive rank.
- Modified Competition (1, 3, 3, 4): the syntax is RANK_MODIFIED(expression, ['asc' | 'desc']). The row with the same value will share the same rank with the last instance of the value and not the consecutive rank.
- Dense (1, 2, 2, 3): the syntax is RANK_DENSE(expression, ['asc' | 'desc']). The row with the same value will share the same rank with the first instance of the value, and no rank number is skipped (consecutive rank).
- Unique (1, 2, 3, 4): the syntax is RANK_UNIQUE(expression, ['asc' | 'desc']). In this option, each row has a unique rank.
In Snowflake, Rank function is in Window function category.
- Competition Rank: the syntax is RANK() OVER (PARTITION BY [<expr 1>] ORDER BY <expr 2> [ASC | DESC]). The result returns the same as the competition rank in Tableau.
SELECT DATE_TRUNC('month',order_date) as month_order_date,
ROUND(SUM(SALES),2) as sum_sales,
RANK() OVER (ORDER BY sum_sales DESC)
FROM superstore
GROUP BY month_order_date
ORDER BY month_order_date;
- Dense rank: the syntax is DENSE_RANK() OVER (PARTITION BY [<expr 1>] ORDER BY <expr 2> [ASC | DESC]). The result returns the same as the result in dense rank in Tableau.
- Unique rank: the syntax is ROW_NUMBER() OVER (PARTITION BY [<expr 1>] ORDER BY <expr 2> [ASC | DESC]). For the unique rank, in SQL, we use the ROW_NUMBER() to get the unique rank for each row.
There is no function in SQL returns the result same as the Modified Competition rank in Tableau.
There is another function in SQL called NTILE() to divide the dataset into a bucket from 1 to a specified number. The syntax is NTILE(<constant number>) OVER (PARTITION BY [<expr 1>] ORDER BY <expr 2> [ASC | DESC]).
Running Total
In Tableau, to return the running total, select Running Total from the Calculation Type and choose the aggregate function (SUM, AVG, MIN, MAX). In this example, I chose SUM. If writing in the calculation, it would be: RUNNING_SUM(SUM([Sales])).
In Snowflake, to return the running total value, you can use the Window function with the aggregation functions (SUM, AVG, MIN, MAX). The syntax is Aggregation function(<expr1>) OVER ([PARTITION BY <expr2>] ORDER BY <expr3>).
For example, to find the running total of the sum of sales in SQL, I will write the SQL query as:
SELECT DATE_TRUNC('month',order_date) as month_order_date,
ROUND(SUM(SALES),2) as sum_sales,
SUM(sum_sales) OVER (ORDER BY month_order_date) as running_total
FROM superstore
GROUP BY month_order_date
ORDER BY month_order_date;
Use the aggregate function SUM for the sum of sales, Order by the month of order date (month_order_date) in ascending order. It will return the running_total. You can change the aggregate function to AVG, MIN, MAX for running average or running min/max.
Moving Calculation (or Rolling Calculation)
In Tableau, to get the Moving Sum or Rolling Sum value, change the Calculation Type to Moving Calculation. Then, choose the aggregate function (SUM, AVG, MIN, MAX). After that, specify the previous values, and next values and check the box current value if including the current row.
For example: I want to find the rolling sum in 3 months. It means every 3 months, return the total of sales. The steps to do in Tableau:
- Right-click on the Sum(Sales) field, and choose Add Table Calculation
- Select Moving Calculation from the Calculation Type dropdown menu
- Choose Sum, previous values = 2, next values =0, and check the current value. Be careful when you specify these values. I selected the previous values as 2 and check current value box, so the total row is 3
- Close the Table Calculation window
The result:
The total sum of sales of Jan 2021, Feb 2021, and Mar 2021 = 14,518.055 + 4,519.892 + 56,933.909 = 75,971.856
The total sum of sales of Feb 2021, Mar 2021, Apr 2021 = 4,519.892 + 56,933.909 + 28,295.345 = 89,749.146
In Snowflake, to calculate the running sum of sales in 3 months, you can use the Window function.
SELECT DATE_TRUNC('month',order_date) as month_order_date,
ROUND(SUM(SALES),2) as sum_sales,
SUM(sum_sales) OVER (ORDER BY month_order_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) as moving_sum
FROM superstore
GROUP BY month_order_date
ORDER BY month_order_date;
Use the window function for the Sum of sales with aggregate function SUM outside, order by the month of order date in ascending order. Then, add the keyword ROWS to let Snowflake calculate from where to where. In this case, I want the rolling sum in 3 months. So I set ROWS BETWEEN 2 PRECEDING and CURRENT ROW. It means from the previous 2 rows until the current row.
Then, I get the result as Fig. 9:
In this blog, I compared the calculations between SQL and Tableau by going through some use cases: Difference from the first/ last value, rank, running total, and moving calculation. In Tableau, you can apply the dynamic calculation from the view by using a parameter to change the constant value (For example: the number of previous rows).
I hope my blog about calculations in SQL and Tableau is helpful to you if you are working in both SQL and Tableau or if you are planning to work on either platform.
Thank you and see you soon in the next blog!