Hello reader, it’s Day 3 of learning with The Data School and I wanted to touch on how to identify a dataset that is out of date.
A great start is to check the publication date of the file to see how recently it was created or updated. Checking the file’s properties may also reveal additional information, such as the created and last modified dates. Checking for updates to the data file will give you a good idea of its history of alterations.
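If you would rather check the file’s properties in code than by right-clicking, here is a minimal Python sketch of the idea. The function names and the 365-day threshold are my own assumptions for illustration, not a standard. Bear in mind that the last modified timestamp on a downloaded file often reflects when you downloaded it rather than when the data was published, so treat this as one clue among several.

```python
from datetime import datetime, timezone
from pathlib import Path


def file_age_in_days(path, now=None):
    """Return the number of whole days since the file was last modified."""
    # st_mtime is the file system's "last modified" timestamp (seconds since epoch).
    modified = datetime.fromtimestamp(Path(path).stat().st_mtime, tz=timezone.utc)
    # Allow the caller to pass a fixed "now" so the result is reproducible.
    now = now or datetime.now(timezone.utc)
    return (now - modified).days


def looks_stale(path, max_age_days=365, now=None):
    """Flag the file as possibly out of date (threshold is an assumed default)."""
    return file_age_in_days(path, now) > max_age_days
```

For example, `looks_stale("netflix_titles.csv", max_age_days=180)` would return `True` if that (hypothetical) file was last modified more than 180 days ago.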
Another approach is to visit the source where you found the data. For my Application Viz I found a dataset on Netflix from Kaggle which was out of date. However, I was able to go directly to Netflix’s data repository to supplement my data with the titles that I was looking for. From the data’s source you may also be able to contact the database’s administrator as a shortcut to finding out how current the data is.
As an addendum to the above, when downloading a dataset be sure to read any comments by the uploader or other users, which may help clarify how recent the data is. Perhaps the uploader has left a comment warning prospective users looking to download the data that it is out of date.
It’s also important to note that the methods of assessing data freshness can vary depending on the type of information and the context of its use.