How to import files with different schema using a Batch Macro

Why we can’t use the wildcard input: this only works if all the files are of the same type, and have the same schema. Any files with schema that do not match the file that is first loaded in, will be skipped and a warning will show.

Similarly, although you could manually union the separate files together, this is usually time consuming if you’re working with multiple files e.g. one for each trading region.

What is a Batch Macro?

An Alteryx flow that runs one time for each record in a field, which is determined using a control parameter. The field chosen in the control parameter interface determines how many times the batch macro will run. The batch macro then unions all of its outputs together.

Pre-requisites:

If you have never created a macro before, you need to set up a macro repository where you will keep all your macros. Firstly, create an empty folder named ‘Macro’ somewhere on your computer, then navigate to Options > User Settings > Macros.

Add a file path by clicking the ‘+’ icon and navigate to the folder you just created. The newly created folder should then appear in your toolbar, if it doesn’t it should show up once you save a macro into it.

The Data: for this example, I will be using three different forms of generic superstore data:

The macro flow:

Input one of your files using the INPUT DATA tool: it doesn’t matter which one you choose.
Drag a CONTROL PARAMETER from the Interface section of the tool bar: connect the black Q anchor of the control parameter to the input tool’s lightning bolt anchor. An action tool will automatically appear.
Configure the action tool: this only requires you to change the drop down to Update Input data tool

Configure the Control parameter: enter the name Full Path - the reason why we need to call it this will be clearer later

Drag a MACRO OUTPUT tool and connect it to your input: configure it as below:

Whilst configuring the macro output navigate to the interface designer pane on the left - if you can’t see this, go to View > check Interface Designer.

Navigate to the Settings cog and change the Output Mode to auto-configure by Name (you’ll notice this looks a lot like the Union tool configuration window); this unions all the files together, regardless of difference in schema - any columns not present in some of the files will be filled with NULLS.
Save the Macro to your repository - you should now be able to see it in your Macro folder on the far right of your toolbar.

Using Directory to locate your files:

The directory tool returns a list of the files in a specified directory
In a new flow, drag in a DIRECTORY tool from the In/Out section: locate the folder where all your files are located, the specification is the pattern to match within that directory. Here, the directory will import a list of all files ending .csv in my folder Different Schema

This imports the Full Path - the field we will enter into our control parameter to determine the amount of times the batch macro runs. As there are three file paths, the macro will run three times.

Insert your newly created batch macro after the directory: connect the directory to the macro’s upside down ? anchor. Inside the batch macro's configuration window, choose your full path field:

And you’re done! Add a browse to see your final input, here we have records from Austria, Belgium and Denmark stacked on top of each other, as the Belgium file didn’t have a Quantity field, it has been filled with NULLS.

The best part? You never have to do all of this again - as the batch macro is now saved in your repository, you can just drag it in whenever you’re struggling with differing schema.

I hope this helps and happy inputting!

Author:

Vivien Ho

View Profile