Alteryx Basics: Fixing Bad Headers

by Peter Silvester

It is not uncommon to come across data sets that will make it hard or impossible when it comes to analysing them and will need reshaping. One of those instances is when there are multiple dimensions being used as headers for a single column in the data (see the before image below). If we want to import this to Tableau and filter by the dimensions we will need to reshape the data which is where Alteryx comes in very useful.

before2

Before

after

After

 

 

 

 

 

 

 

 

 

1. Input Data

As always when using Alteryx we must start with a data input. It is important in this instance to make sure that the ‘First Row Contains Data’ option is selected so that we can create our own headers instead of them being automatically populated from our first row.

input-data

 

2. Sample and Summarise

In this example we want to merge the top two rows and make those into our header but if we tried to summarise the rows now it would apply the summary to every row in the data set. To isolate the rows we want create a sample of the first 2 and a second sample set to skip the first 2 rows that captures the rest of the data through. Setting the summarise tool to concatenate every field in the first sample will now successfully merge only the rows that were set as the headers.

summarise

 

3. Union

The samples can be put back together using the Union tool. Because the two samples will have different headers the configuration should be set to Auto Config by Position otherwise they will not line up correctly.

union

 

4. Dynamic Rename

The headers can now be fixed with the Dynamic Rename tool. Setting the Rename Mode to ‘Take Field Names from First Row of Data’ will replace the headers with the row that was created by summarising the sample.

dyn-rename

 

5. Transpose

Optimise the data for Tableau with the Transpose tool to make it tall and thin. Select the fields that you wish to keep as dimensions so the

transpose

 

6. Text to Columns and Tidying Up

The final stage of the process is splitting the column so that the two headers that we started with are represented by separate columns in the data and can be used independently. The new headers and columns that we created can be edited and tidied up using the Select tool and the data set can be browsed or output to a new data file.

text-to-col

Below is the full workflow.

example-workflow

 

 

Avatar

Peter Silvester