SQL & Alteryx

Last month, there was a Data School New York x Fullstack Academy DATA Meet & Greet. It was a lively meeting, in which current students of the bootcamp program I just graduated from visited my company. I suggested a SQL quiz during the Alteryx demonstration to test their memory of SQL joins and emphasize how easy Alteryx is to use. Instead of writing a SQL query joining tables on related columns, an Alteryx user can drag the Join tool into the canvas and select the related columns.

My intention was not to disparage SQL; I love SQL and my understanding of SQL supports my Alteryx learning. For example, I was able to understand the Join tool in Alteryx due to my knowledge of SQL joins. Therefore, I will compare the SQL order of commands (SFWGHO) to Alteryx tools in this blog post.


SELECT = Select tool

The Select tool selects columns of interest and through this tool, we can rename (or alias) these columns.

FROM = Input tool

The Input tool connects to a file or database to provide data. After inputting data, we can work from the connected file or database.

WHERE = Filter tool

The Filter tool subsets the data into two: one where a given condition is met and the other where it is not.

GROUP BY = Summarize tool

The Summarize tool in Alteryx groups data based on specified column(s) and aggregates based on specified column(s).


This tool is similar to a SQL GROUP BY statement, but it can also perform an aggregation, such as COUNT() or SUM(), in a SQL SELECT statement.

HAVING = Filter tool

Yes, the Filter tool appears twice here.

In SQL, the HAVING clause is used with aggregate values. After aggregation via the Summarize tool, a Filter tool can filter the data after having met a given condition.

ORDER BY command = Sort tool

The Sort tool in Alteryx orders your data based on specified column(s) in ascending or descending order.


To conclude, I would like to show a SQL query and a similar Alteryx workflow. I am starting with the classic superstore dataset and I want to find the sum of sales by subcategories having a sum of sales greater than $50,000 in the East region, sorted by the sum of sales in descending order.

The SQL query is:


On the other hand, the same result can be generated in Alteryx like so:


One thing I like about Alteryx is there are many ways to complete a task. Meanwhile, SQL does not have the same level of flexibility. Though to be fair, Alteryx is software with over 270 tools and SQL is a language to query data. They’re not in the same playing field.

Nevertheless, I look forward to drawing more connections between SQL and Alteryx, as I learn additional Alteryx tools during my training. Will there be a second blog post on this topic? Maybe.

;

Author:
Elaine Yuan
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2024 The Information Lab