![](https://www.thedataschool.co.uk/content/images/2023/02/shoeib-abolhassani-ukDEbYnyDsU-unsplash-1.jpg)
Data analysis often requires combining data from multiple sources. For example, you may have student grades in one dataset and student names in another. To analyze the data effectively, you need to be able to see the grades and names data together. This is where joining datasets comes into play.
Joining datasets in Tableau Prep is a simple and effective way to combine two or more datasets into a single data source. It allows you to gain a better understanding of your data and simplify your subsequent treatment and analysis.
In this article, we'll explore the basics of joining datasets in Tableau Prep.
The following are the steps to join datasets in Tableau Prep:
Load the datasets into Tableau Prep: You can either connect to your data sources or import your data from a file. Make sure to choose the right type of data (CSV, EXCEL, JSON, XML...)!
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-step-1.png)
Select the datasets you want to join: You can select the datasets you want to join by clicking on the checkbox next to the data source name.
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-step-2.3.png)
Add a clean step to both your Datasets: This step is not compulsory but it is very helpful to understand your datasets better and start looking for a common column to join them on.
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-2.5.png)
You will now be able to visualize simple histograms of your dataset fields and get a better understanding of what data is at your disposal! In this example, we can see that the common column on which we will join our datasets will likely be the student id column present in both datasets.
Student Grades dataset:
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-3.1.png)
Student Names dataset:
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-3.2.png)
Join the two datasets: Simply drag one of the two datasets onto the other one and drop it on the 'Join' box that appears.
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-6.png)
Add the joining clause: If the common columns you are using to join your two datasets share the same name, the join will automatically will be automatic and you will be able to skip this step. Alternatively, you might get the following error message. Do not panic!
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-7.png)
Simply click on 'Add' under 'Applied Join Clauses' and choose the two columns you will be joining your datasets on.
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-step-5.png)
In this case, we will choose to join on 'id' and 'Student ID'. Congratulation! The error message disappears and your two datasets officially become one.
Choosing the Join Type: Now that you have joined your two datasets, Tableau Prep gives you the option to choose the type of join you want to use. By default, the join type is set to 'inner', but you could choose to apply other join types. To do so, simply click on the Venn Diagram below 'Join Type: Inner' (ex: to change the join type from 'inner' to 'left', click inside the orange circle). The join type choice changes how the data is combined and what data is included in the output.
![](https://www.thedataschool.co.uk/content/images/2023/02/Screenshot-step-6.png)
Here's a brief overview of the different join types in Tableau Prep:
- Inner Join: This join type returns only the rows that have matching values in both sources. In other words, only the rows where the join condition is true are included in the output.
- Left Join: This join type returns all the rows from the left (or first) source and only the matching rows from the right (or second) source. If there are no matching values in the right source, the result will contain NULL values.
- Right Join: This join type returns all the rows from the right (or second) source and only the matching rows from the left (or first) source. If there are no matching values in the left source, the result will contain NULL values.
- Full Outer Join: This join type returns all the rows from both sources, whether or not there are matching values. If there are no matching values, the result will contain NULL values.
Ultimately, each join type has its use cases, and the choice of which join type to use depends on the data you are working with and the results you want to achieve.