Mining for Goods!

How do we get from data in storage to using it?

In the first week of the Data School, we were introduced to a concept called ETL, a process that describes how we get a dataset to the point where we can analyse it.

There are three main steps to this process:

Extract - Pull data from where it is stored, typically a database or a flat file.

Transform - Reshape, aggregate or clean your data into a form that is usable for analytics.

Load - Upload the cleaned dataset to an accessible place, so that analysts and anyone else who needs it can use it!
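To make the three steps a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny sales dataset, the function names extract, transform and load, and the use of an in-memory SQLite database as the "accessible place". A real pipeline would read from an actual file or database and would likely use dedicated tooling.

```python
import csv
import io
import sqlite3

# Hypothetical flat file contents; in practice this would be a file on disk.
RAW_CSV = """region,sales
north,100
south, 250
north,50
south,
"""


def extract(text):
    """Extract: pull the rows out of the flat file."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no sales figure, then total sales per region."""
    totals = {}
    for row in rows:
        sales = row["sales"].strip()
        if not sales:  # clean: skip rows where the sales value is missing
            continue
        region = row["region"].strip()
        totals[region] = totals.get(region, 0) + int(sales)  # aggregate
    return totals


def load(totals, conn):
    """Load: write the cleaned table somewhere analysts can query it."""
    conn.execute("CREATE TABLE sales_by_region (region TEXT, total INTEGER)")
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", totals.items())
    conn.commit()


# Assumed destination: an in-memory SQLite database standing in for a warehouse.
etl_conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), etl_conn)

etl_result = etl_conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
print(etl_result)
```

Keeping each step in its own function mirrors the way the process is described: each one does a single job, and the output of one is the input of the next.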

The order of these steps isn't fixed, so you could end up extracting, loading and then transforming (often called ELT). But what is done in each step tends to stay the same!
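As a rough illustration of the extract, load, then transform ordering, the Python sketch below lands some raw rows in a database first and only cleans and aggregates them afterwards, using SQL. The sample rows, the table names raw_sales and sales_by_region, and the in-memory SQLite database are all assumptions made for this example.

```python
import sqlite3

# Hypothetical raw rows, as they might arrive from a source system.
raw_rows = [("north", "100"), ("south", " 250"), ("north", "50"), ("south", "")]

# Assumed destination: an in-memory SQLite database.
elt_conn = sqlite3.connect(":memory:")

# Extract + Load: land the data untouched in a staging table.
elt_conn.execute("CREATE TABLE raw_sales (region TEXT, sales TEXT)")
elt_conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", raw_rows)

# Transform: clean and aggregate inside the database, after loading.
elt_conn.execute(
    """
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(CAST(TRIM(sales) AS INTEGER)) AS total
    FROM raw_sales
    WHERE TRIM(sales) <> ''
    GROUP BY region
    """
)

elt_result = elt_conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
print(elt_result)
```

The work done in each step is the same as before (pull, clean and aggregate, store); only the place where the transformation happens has moved.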

Below is a comic doodle that helps me remember what each step of this process requires! Hope you find this useful as well!

Michael has been tasked with creating the world's biggest diamond so that Tony can make the world's biggest ring.
Author: Arushi Pant