Difference between Null, Empty and Missing Data

So while doing some practice questions for the tableau certified data analyst exam I came across a question where the distinction between null, empty and missing data came into play. Up until this point admittedly I haven't really been making a distinction between these 3 and while researching this distinction I thought it would make a good blog so here we are.  

I should make a disclaimer at this point that I only looked into the difference between these 3 in the context of tableau, in other contexts they may be treated differently.

So what is the difference between these 3 types of data?

Null data refers to a lack of a value or a missing value in a specific field or column. This can happen due to various reasons such as a data entry error, a missing field in the original data source, or a field that was not applicable to a certain record.

This occurs when the underlying data set contains records but the measure value(s) is NULL or blank. In the example below, "NULL and Missing Examples.xlsx", the project "Brosnya" is NULL for "In-Progress" and "Completed".

Missing data, on the other hand, refers to the absence of a row which is implied to still exist. This occurs when there are no records in the underlying data set. In the example data (Above), the project "Hibagon" is missing "In-Progress" and "Completed"

Empty data, also known as a "zero value," refers to a value that is explicitly entered as a zero or a blank space, rather than being null or missing. Do not confuse the blank space in empty data with a blank space in Null data.

In Null data a blank space is nothing - no data type

In Empty Data a blank space is not nothing it is still something i.e. the blank is still registered as a data type (string).

It's important to note that Tableau treats null, missing, and empty data differently in calculations and visualizations. For example, a calculation that uses the COUNTD function will not include null or missing values, while a calculation using the COUNT function will include empty values.

It is also important to note that Null or Missing Data would not have a data type (e.g. string, integer etc.) and no functions can be used on them e.g. you cant sum them or check there length with a LEN() function.

In contrast to this, empty does have a data type either an integer if a zero was entered or a string if left blank. Therefore, functions can be used on them.

Hopefully I've managed to convey the distinction between these 3 in a way that is understandable. Don't worry if you don't understand it at first it does take a bit of reading/ re-reading before the concept starts to land.

Author:
Shahbaz Khan
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2024 The Information Lab