DS23 started their 3rd week with an introduction to Tableau Prep Builder and one of the new things I found useful was grouping values. To demonstrate this function, I will be using Preppin' Data 2019 Week 5 as an example.
The third requirement of the challenge was to find out how the customer contacted the company. Looking at the Notes field (figure a.), we can see that this information is the first word of each string: the customers either called or emailed.
The FIND() function here is used to find the location of the first space character in the string, this is then used by the LEFT() function to return only the first word. Now we have the methods of contact but the problem is that they're all spelt differently. Here's where grouping values by fuzzy match can come in handy. Tableau Prep Builder finds and groups values that match and replaces them with the value that occurs the most within the group.
The figure above shows us the different types of grouping we can do:
- Manual selection: Allows you to manually edit the groupings
- Pronunciation: Values that sound alike are grouped
- Common characters: Values that have common letters or numbers are grouped
- Spelling: Text values that are spelled alike are grouped
When grouping values by Spelling or Pronunciation, the results can be changed by moving the threshold slider to adjust the strictness of the grouping parameters. The slider affects the number of values included in a group and the number of groups created.
Moving the slider:
- Left: Stricter groupings, fewer matches, less groups
- Right: Looser groupings, more matches, more groups
In our case, using the settings above (figure d): Spelling with the threshold setting at 2, we are able to get the two groups that we want. Now we are sure that this column contains only the values "Called" or "Email".
Hopefully this has given you a nice and quick overview of the fuzzy match grouping function in Tableau Prep Builder!
Thumbnail used is from Providence Doucet on Unsplash.