While it is useful to have relationships in Tableau, it is important to understand how the joins work and how it can be optimized for performance.
By default, relationships are set to Cardinality: many-to-many and Referential Integrity: Some records match and will select a join type based on the fields used.
Cardinality refers to how unique the data in a field or multiple fields are. Fields with low cardinality (low uniqueness) refers to those that have repeated data such as in a long table where each row represents a review for a book. A book can have multiple reviews, hence, there can potentially be 3 rows referring to the same book ID but each row contains a column that holds a different review.
If the related column/field has high cardinality (high uniqueness), then the column would have values that are unique in each row. For example, in a table with book description where each row is a specific book, then the column containing the Book ID would have high cardinality.
The options for cardinality are:
- One to One (1:1)
- One value of the related field is related to at most one value in the other table
- One to Many (1:m) or Many to One (m:1)
- One value in a table can relate to multiple rows in the other table
- Many to Many (m:m)
- Multiple values in the shared field can relate to multiple fields (the default setting)
Cardinality differs such that it helps Tableau determine whether to aggregate table data before or after joining the data during the building of a view in a sheet.
- With Many, Tableau will aggregate the data before joining (m:m)
- With One, the data will be aggregated after joining
- However, if your data is not unique and One is selected, then duplicate aggregate values would be shown in the view
- This would be the case for 1:1, m:1 and 1:m
- These options are more performative, however, you must be 100% certain that your data is unique. In most cases, using the default option of m:m is always the safer option
Referential Integrity is used to determine what join type is used to get the dimension values for a measure during an analysis.
- Some Records Match is used when some values do not match in the other table or if you are unsure. In this case, an outer join is used.
- All Records Match is used when you are sure that every value matches a value in the other table. In this case, fewer and simpler joins are used that optimizes queries through inner joins. Tableau does not join null keys.
It is recommended to use the default many to many and some records match to avoid any potential issues.
Photo by Alina Grubnyak on Unsplash