Getting the Most Out of Your Data Preparation Tools
I've spent a fair amount of my time as a data analyst getting data into a more workable format. As one of those weirdos who enjoys fiddling with Alteryx for weeks on end, I'm pleased to have been able to spend so much time learning the tool. I've also had to learn how to rebuild an Alteryx workflow in Tableau Prep - an ask that can sometimes be difficult, or even impossible. Knowing when to use which tool can save valuable time and produce a more stable, long-lasting data ecosystem. So, how do you choose?
1) Is this for me or for a client?
Who is going to be using this flow? If it's for personal use, you can comfortably choose whichever tool you prefer. If it's for a client, you have to factor in whether they have access to Alteryx at all. While Alteryx is the more powerful tool, it loses out to Tableau Prep in one major way - Prep comes bundled with Tableau (it is included with the Creator license), so a client who already uses Tableau very likely has it. In many cases, the problem begins and ends with this one question. You can't use something the client doesn't have.
2) How significantly am I going to be transforming the data?
Assuming you do have the choice, whether you're doing work for a client or for yourself, you now have to look at the data in its raw form and determine just how much manipulation it needs. Alteryx is a powerful tool, but if all you need to do is add a totals column, remove duplicate rows, and sort everything alphabetically, Prep can handle it with ease. There's no need to bring artillery to a knife fight. On the other hand, if you expect to be doing a lot of complex transformations, especially anything involving regex, geospatial values, multiple joins, or calculations across multiple fields, you may find Alteryx is better equipped for the job.
3) How do I keep track of what's happening to the data?
In cases where you feel Alteryx and Prep could serve you equally well, consider which one keeps you most organized and effective while working. Tableau Prep puts a particular focus on minimizing the number of steps and on showing exactly what's going on in great detail. It brings as many transformations into the Clean step as possible, which makes it easier to see everything at once. Every step also features a "Changes" section, where everything that's been done to the data is listed in order of execution. You can see at a glance what you've done and whether a change is throwing an error, and you can even move changes up and down in the order. Every step also shows exactly what its output is going to look like at all times, updating every time you make a change.
Alteryx is much more segmented, especially in very long and complicated flows. Each change is split out into its own tool as much as possible, and you will have to go through each individual transformation and add Browse tools if you want to see the full effect on the data.
If you find it easier to look at the whole of the work you've done at once, and need to see immediately how every change is going to affect the data, Prep may be better suited to you. On the other hand, Alteryx favors those who get overwhelmed or confused when everything is bundled into one step, and who are less concerned with seeing what each change is doing in the moment.
Ultimately, the factors behind this choice form a hierarchy. At the top are absolute limitations, such as whether or not the client has access to the tech. Then come the functional needs, where you have to consider what you want your output to look like and how easily Alteryx or Prep would get you there. And at the bottom are personal preferences. Do you like Alteryx or Prep more? Which feels more intuitive? Which leads to you making fewer mistakes? Starting on the right foot will make it much easier to accomplish your goals, so it's important to step back and look at all the factors before diving into hours and hours of work.
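If it helps to see that hierarchy written down, here is a rough sketch in Python. Everything in it (the Project class, its three fields, and the choose_tool function) is made up for illustration and has nothing to do with how Alteryx or Prep actually work. Treat it as a thinking aid under those assumptions, not a rule.

```python
from dataclasses import dataclass


@dataclass
class Project:
    # Hypothetical fields, invented for this sketch.
    alteryx_available: bool       # does whoever will run the flow have Alteryx?
    needs_heavy_transforms: bool  # regex, geospatial, many joins, multi-field calcs
    preferred_tool: str           # "Alteryx" or "Tableau Prep"


def choose_tool(project: Project) -> str:
    # 1) Absolute limitations come first: no access, no Alteryx.
    if not project.alteryx_available:
        return "Tableau Prep"
    # 2) Functional needs: heavy transformation work leans toward Alteryx.
    if project.needs_heavy_transforms:
        return "Alteryx"
    # 3) Otherwise both would do the job, so personal preference decides.
    return project.preferred_tool
```

For example, choose_tool(Project(alteryx_available=False, needs_heavy_transforms=True, preferred_tool="Alteryx")) still returns "Tableau Prep", because the access question outranks everything below it.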