For Starters: Essential definitions
Granularity. Granularity is how ‘high-definition’ your data is in. If the data is more summarising, you loose granularity, the more specific, you have a lower level of granularity.
Data field: Column
Record: row
Main Course: What you can do with the meat of your data?
Merge:
Bringing your data together. If you want to merge different data fields into one data field (like an IBAN number or username), use a calculated field and simply add the data field names to each other…
IBAN: [Country Code] + [Check Digits] + [Bank Code] + [Sort Code] + [Account number]
If you merge your data by merging fields (using ctrl), the data fields will overlay each other. It will look like one of your data fields has gone missing. Look out for that if you do it by mistake!
Aggregation:
When you aggregate data, you are reducing the granularity of your data because you are losing some information. This can be a great tool if your dataset has too much going on! Think of this statement before making an aggregation:
'Aggregated field' BY the 'group field'.
The group field(s) is the one(s) ‘in-charge’. It is this data field that controls the granularity of your dataset going forward. This will be a categorical data type.
You want to aggregate when you are summarising or accumulating datasets e.g, changing sales by product to sun of sales. Here you aggregate ‘sales’ BY ‘product’.
Pivot:
You can pivot from ‘columns to rows’ or ‘rows to columns’. Remember to change the default depending on the type of pivot you are asking Prep to do.
Pivoting is very helpful if you want to change the axes of your table, or to pull out information to focus on, so that future analysis becomes easier.
Join:
Combining two different datasets. There is normally a shared data field (or more) in both datasets, that can act as a point of reference to make sure they join properly. This is an inner join.
Union:
New tables/ information can overlay on an existing data set. Ideally, the new data you are adding needs to have the same data fields, for a seamless addition.
Dessert: The sweet essentials
Calculated fields:
IF()
Based on specified criteria, if the criteria is present, THEN you can assign it a specific ‘name’ ELSE you can give it a different ‘name’. End.
ROUND()
Wrap your calculated field in this function to round the numeric data. Need to specify to how many decimal places (,1).
DATEDIFF()
Very handy to work out the number of days between two dates. E.g. the shipping time of a product: [shipping date] - [order date]
Don't forget the TIP: My lessons learnt from this week.
Tableau Prep should be used for transforming/ preparing your dataset only. Don’t try to start analysis because you will loose too much granularity and won’t be able to analyse anything in Tableau Desktop.
Plan what question(s) you will want to ask of the data/ consider the user’s needs. Please refer to my blog ‘Put Yourself in the User’s Shoes.’ Once I have cleaned up the dataset, my next thought is how am I (the client) going to use this dataset? What data fields do I need? What numeric data do I need to have ready for analysis?
And with that, your data should be transformed.