This blog has been in the bank for a bit, but I was using Alteryx today and had difficulty reading in a single Excel file with multiple sheets and unioning the sheets with a new field that documented the sheet name. In Tableau Prep this was really straightforward in the input data step:
![](https://www.thedataschool.co.uk/content/images/2023/02/image-173.png)
In Alteryx I was running into difficulties querying the correct sheets.
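For anyone who wants to see the same union outside of either tool, here is a minimal sketch in pandas. It is not how Prep or Alteryx do it internally; the function name `union_sheets`, the column name `Sheet Name`, and the sample data are all made up for illustration.

```python
import pandas as pd


def union_sheets(sheets: dict[str, pd.DataFrame], name_col: str = "Sheet Name") -> pd.DataFrame:
    """Stack every sheet into one table, tagging each row with its sheet name."""
    tagged = [frame.assign(**{name_col: name}) for name, frame in sheets.items()]
    # ignore_index gives the combined table a fresh 0..n-1 index;
    # columns missing from a sheet are filled with NaN.
    return pd.concat(tagged, ignore_index=True)


# With a real workbook, sheet_name=None makes read_excel return a dict of
# {sheet name: DataFrame}. The path below is hypothetical:
# sheets = pd.read_excel("sales.xlsx", sheet_name=None)

# Small in-memory stand-in for two sheets, so the sketch runs without a file.
sheets = {
    "January": pd.DataFrame({"id": [1, 2], "sales": [10, 20]}),
    "February": pd.DataFrame({"id": [3], "sales": [30]}),
}
combined = union_sheets(sheets)
print(combined)
```

Every row in `combined` carries the name of the sheet it came from, which is the extra field I was after.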
I really liked that the interface showed a summary distribution of the data. It is efficient to have this built into the interface, and it gives a very clear sense of the distribution before making any cleaning decisions.
![](https://www.thedataschool.co.uk/content/images/2023/02/image-170.png)
The group by spelling feature is again an effective means of streamlining the basic processes of cleaning data. From a consulting standpoint it has strong demo capabilities: it takes little time to demonstrate, is simple, and leaves a strong impression on those who are less data savvy.
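To give a feel for what grouping by spelling involves, here is a rough sketch using Python's standard `difflib`. This is only an illustration of the idea, not Tableau Prep's actual algorithm; the function name `group_by_spelling`, the 0.8 similarity cutoff, and the sample values are assumptions of mine.

```python
from difflib import SequenceMatcher


def group_by_spelling(values, cutoff=0.8):
    """Greedily group values whose trimmed, lower-cased spellings are similar."""
    groups = {}  # representative value -> list of members
    for value in values:
        key = value.strip().lower()
        for rep in groups:
            # ratio() is 1.0 for identical strings and 0.0 for nothing in common
            if SequenceMatcher(None, key, rep.strip().lower()).ratio() >= cutoff:
                groups[rep].append(value)
                break
        else:
            # no existing group was close enough, so start a new one
            groups[value] = [value]
    return groups


cities = ["London", "Londn", "Paris", "london ", "Pariss"]
grouped = group_by_spelling(cities)
print(grouped)
```

The first spelling seen becomes the label for its group, so the order of the input matters in this simple version.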
I also really liked the descriptive bar chart that displays when you try to join data in Tableau Prep. It visualizes the different join conditions whilst also demonstrating their implications for the dimensions of the resulting dataset: how many cases are used and how many are excluded under the current conditions.
![](https://www.thedataschool.co.uk/content/images/2023/02/image-172.png)
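The numbers behind a chart like this can be reproduced in pandas with the `indicator` option of `merge`. The sketch below uses invented `orders` and `customers` tables and an assumed join key `customer_id`; it only mirrors the included/excluded counts, not the chart itself.

```python
import pandas as pd

orders = pd.DataFrame({"customer_id": [1, 2, 3, 4], "amount": [50, 20, 75, 10]})
customers = pd.DataFrame({"customer_id": [2, 3, 5], "name": ["Ana", "Ben", "Cy"]})

# An outer join with indicator=True adds a "_merge" column that records
# whether each row matched on both sides or came from one side only.
audit = orders.merge(customers, on="customer_id", how="outer", indicator=True)
counts = audit["_merge"].value_counts()

matched = int(counts["both"])           # rows an inner join would keep
left_only = int(counts["left_only"])    # orders with no matching customer
right_only = int(counts["right_only"])  # customers with no matching order
print(f"included: {matched}, excluded: {left_only} left, {right_only} right")
```

Checking these counts before committing to a join type is the same sanity check the Prep chart gives you at a glance.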
I did not like trying to remove a duplicate field; there did not seem to be an equally intuitive method for doing this. Going through the aggregate function and specifying a condition for which entry to keep among cases with the same ID makes a degree of sense on reflection, but I was expecting a case exclusion option given how simple other procedures were in the software.
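The logic of "pick one entry per ID based on a condition" is perhaps easier to see written out. Here is a small pandas sketch, assuming the condition is "keep the most recently updated row"; the column names `id`, `updated` and `status` and the data are hypothetical.

```python
import pandas as pd

records = pd.DataFrame({
    "id": [1, 1, 2, 3, 3],
    "updated": pd.to_datetime(
        ["2023-01-01", "2023-02-01", "2023-01-15", "2023-01-10", "2023-01-05"]
    ),
    "status": ["old", "new", "only", "new", "old"],
})

# Assumed rule: for each id, keep the row with the latest "updated" date.
latest = (
    records.sort_values("updated")                # oldest first, newest last
    .drop_duplicates(subset="id", keep="last")    # keep the newest row per id
    .sort_values("id")
    .reset_index(drop=True)
)
print(latest)
```

Swapping the sort column or `keep="first"` changes which entry survives, which is essentially the decision the aggregate step asks you to make.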
Ending with a quality of life reminder for myself and any other Prep beginners: when working collaboratively, it is useful to save the flow as a “packaged” flow (.tflx) so that the input data is included in the file shared with collaborators. Otherwise they might encounter an error message where the software cannot identify the input data.