What is Alteryx?

If you're like me, you hadn't heard of Alteryx before the data school. So what exactly is it? Here's a good analogy: if Tableau is a shiny flat screen TV, then Alteryx is the bundle of cables hidden behind the stand. Alright... maybe that doesn't give Alteryx the credit it deserves.

All jokes aside, Alteryx is a powerful ETL (Extract, Transform and Load) tool. And it can be used to do exactly that: extract, transform and load data.

More specifically, here's an example of what we used Alteryx for in our third day of Alteryx training. We were presented with some source sales data shown below.

Source data.

This essentially shows quarterly regional sales for a business. A common business report could require you to find the 3 month moving average. Here we would technically be doing the 3 quarter moving average, but the ideas remain the same. To solve this issue we use the Multi-Field Formula Tool.

Alteryx tool.

Using this tool, we can reference rows above or below the row we are calculating for. Consider the following formula:

( [Sales] + [Row-1:Sales] + [Row-2:Sales] ) / 3

Imagine using this formula to go through source data table in order, row by row, and entering the answers into a new column. For the first and second rows you may expect an error, since there aren't any rows above the first one. However, for all other rows, you should be able to see that we calculate a 3 quarter moving average. Here's how we get around the issue for the first and second rows.

Configuration pane for Multi-Row Formula tool.

In the configuration pane for the Multi-Row Formula tool shown above, there is an picklist for "Values for Rows that don't Exist". One option allows you to "Set to Values of Closest Valid Row", which escapes the error we would otherwise get. There is also a "Group By" section which allows you to specify when the calculation restarts. Here we are telling Alteryx to partition the data by region - so our 3 quarter moving average wont mix values between different regions.

Here's our output.

Output data.

I'm sure you can imagine doing this manually for a dataset of 10,000+ rows can be very tedious. This is just one basic use case, but hopefully this gives you an idea of what Alteryx can do.

Author:
Alban Halili
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2024 The Information Lab