When working with a dataset that has gaps in sequential data, you may need to generate new rows to fill in the gaps in the data. With Tableau Prep 2021.3.1, we now have a new feature: a new step called new rows. This feature offers great flexibility in generating missing rows for numeric values or date fields to reduce our work and avoid complex calculations.
What does it do and how does it work?
Let’s have a look at an example to showcase how to use this feature. Let’s say we are working with a sales dataset that only has records for the days when sales occur, and you need to have a record for each calendar day. The “New Rows” step will fill in the gaps for the missing days with zeros.
Now let’s try it in Tableau Prep. First I will connect to the dataset. I also added a clean step to have a better look at the data. So we see there are 6 rows with order dates and sales amounts. We see there are no records for the days between the 3rd of January and the 8th of January because there were no sales recorded. Since we want the missing day records, we will use the new step “New Rows” by clicking on the plus sign and selecting New Rows.
The first thing we need to specify is whether the gap fill will take the range of values from a single field or use the values of two fields to define the range. In this example, we need to add rows from one field "Order Date". After specifying a single field and the name of the field to be used, the interface offers the possibility to configure it. In this example, Tableau Prep was intuitive with the default settings and already did what we wanted but I will go over and explain each setting.
The first setting uses the minimum and maximum values of the reference field, in this case, Order Date. The next setting allows us to choose whether to append the new row to the existing field or create a new field for the original data. We will choose to update the existing field since we want to add the missing dates to our data.
When creating new rows with date fields, you will be asked for the date increment to add. For dates, the increment can be specified in days, weeks, or months. In our case, we want to add one day. The last thing remaining is to determine what values should the new rows have. We can add nulls to each new data row for all data fields or null or zeros or copy from the previous row. In our example, we want to add zeros for the days when there were no sales.
In the data grid we can see the dates and the final data table with the new rows. Of course, there are many more uses for this feature than what I showed here but I just wanted to show a simple example to understand this new feature.