How do we get from data in storage to using it?
In the first week of the Data School, we were introduced to a concept called ETL (Extract, Transform, Load), a process that describes how we get a dataset to a point where we can analyse it.
There are 3 main steps to this process:
Extract - Pull data from its source, typically a database or a flat file.
Transform - Reshape / aggregate / clean your data into a form that is usable for analytics.
Load - Upload / load the cleaned dataset to an accessible place, so that analysts and anyone else who needs it can use it!
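To make the three steps a bit more concrete, here is a minimal Python sketch of one possible ETL run. Everything in it is an assumption made up for illustration: the sample sales data, the function names (extract, transform, load, run_etl) and the sales_by_region table are hypothetical, and an in-memory SQLite database stands in for the "accessible place". Real pipelines usually rely on dedicated tools, so treat this as a toy example rather than the way it has to be done.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a flat file (made up for this example).
RAW_CSV = """region,sales
North,100
north ,50
South,70
,20
"""


def extract(source):
    """Extract: pull the rows out of the flat file."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: clean region names, drop rows with no region, total the sales."""
    totals = {}
    for row in rows:
        region = row["region"].strip().title()
        if not region:
            continue  # skip rows we cannot assign to a region
        totals[region] = totals.get(region, 0) + int(row["sales"])
    return sorted(totals.items())


def load(records, connection):
    """Load: write the cleaned data to a table that analysts can query."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region "
        "(region TEXT PRIMARY KEY, total INTEGER)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO sales_by_region VALUES (?, ?)", records
    )
    connection.commit()


def run_etl(connection):
    """Run the three steps in order: extract, then transform, then load."""
    rows = extract(io.StringIO(RAW_CSV))
    load(transform(rows), connection)


conn = sqlite3.connect(":memory:")
run_etl(conn)
print(conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall())
# prints [('North', 150), ('South', 70)]
```

Keeping each step in its own function mirrors the list above: you can swap the source, the cleaning rules or the destination without touching the other two steps.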
Below is a comic doodle that helps me remember what each step of this process requires! Hope you find this useful as well!

