This blog post will talk you through an example of:
- Where to start with an API call
- How to retrieve data from an API
- How to prepare the data to use
APIs vary in the level of authentication they require, and in how detailed their accompanying documentation is.
Luckily, in this example, the documentation gives us some clear and helpful information to get us started! See this link: https://rickandmortyapi.com/documentation
In this case, we are using a REST API. Just below the title, the documentation gives us the Base URL: the root of the API, to which we can add endpoints to access specific information.
INPUT
Step 1
In Alteryx Designer, we will begin by putting a Text Input tool on the canvas, inserting the Base URL into a cell, and changing the column header to ‘url’.
In this instance, we are trying to retrieve data (a GET request), and this is indicated in the documentation. Even more helpfully, we are told what type of data to expect (JSON), and the black box shows us exactly what this API call will return. Our three endpoints are clearly stated as Characters, Locations and Episodes, and we can see the respective URL for each.
Step 2
The Download Tool.
This tool can be found by typing its name into the search box (alternatively, look on the toolbar under the Developer tab). We need this tool to bring data back from a location on the internet into Alteryx.
When we configure the tool, we will automatically start on the Basic tab, where we should find our one field (url) already selected in the dropdown menu. We don’t need to change anything: by default, “Encode URL Text” should be checked and String should be selected under “Output to a Field”.
(Our GET request will be set as the default, which you can find as the HTTP Action under the Payload tab).
When we run this, the message brought back in the “DownloadHeaders” field tells us whether the call has worked.
We are hoping for “HTTP/1.1 200 OK”. If not, go back and check that the previous steps have been followed.
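If it helps to see what the Text Input and Download tools are doing behind the scenes, here is a minimal Python sketch of the same GET request (using the requests library; this is an illustration, not part of the Alteryx workflow):

```python
import requests

# The Base URL from the documentation -- the same value we typed into the Text Input tool.
url = "https://rickandmortyapi.com/api"

response = requests.get(url)

# The Download tool reports this status line in its "DownloadHeaders" field.
if response.status_code == 200:
    print("HTTP/1.1 200 OK -- the call worked")
else:
    print(f"Call failed with status {response.status_code}")
```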
Step 3
We were told in the documentation that our API call would bring back JSON data, so our next step is to connect a JSON Parse tool to our Download. This can also be found either by searching or under the Developer tab on the toolbar.
The only configuration we need to do is to select the “DownloadData” field from our dropdown menu.
When we run this, we should find two new fields, JSON_Name and JSON_ValueString.
We’ve now received exactly the results the documentation told us to expect!
We have a different endpoint URL (in JSON_ValueString) for each of the available resources.
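As a rough Python analogy, the JSON Parse tool is flattening the response body into name/value pairs, something like this sketch:

```python
import json
import requests

body = requests.get("https://rickandmortyapi.com/api").text
data = json.loads(body)

# JSON Parse turns each key/value pair into a row of JSON_Name / JSON_ValueString:
#   characters -> https://rickandmortyapi.com/api/character
#   locations  -> https://rickandmortyapi.com/api/location
#   episodes   -> https://rickandmortyapi.com/api/episode
for name, value in data.items():
    print(name, "->", value)
```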
At this point, we can repeat our process with the Download tool and then the JSON Parse tool, but this time we’re passing these three URLs through.
I used a Select tool to rename JSON_ValueString as “url”, and then connected the Download tool and JSON Parse tool.
NB. Make sure the “url” field is selected for the Download and “DownloadData” is selected for the JSON Parse.
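In Python terms, this second pass amounts to fetching and parsing each of the three endpoint URLs in turn, roughly like this sketch:

```python
import requests

# The three URLs returned by the Base URL call.
endpoint_urls = [
    "https://rickandmortyapi.com/api/character",
    "https://rickandmortyapi.com/api/location",
    "https://rickandmortyapi.com/api/episode",
]

for url in endpoint_urls:
    page = requests.get(url).json()
    # Each response contains an "info" block and a "results" list,
    # matching the JSON_Name values we see after the JSON Parse tool.
    print(url, "->", list(page.keys()))
```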
DATA PREP
We now have data for Characters, Episodes and Locations; however, the documentation has a Pagination section. This means there is potentially more than one page of data for each endpoint, so we need to find a way to bring back all the pages.
I decided to use just Episodes and Characters, so at this point I used a Filter tool to separate the data into two streams, one for Characters and one for Episodes. I then performed the same steps on each stream until just before joining them.
When we look at the data we have brought back, we can see some extra information alongside our results. The two most helpful rows tell us that there are 3 pages in total and give us the URL for the next page.
- Filter the data to keep the rows where JSON_Name contains “info.”
- Text to Columns to split JSON_Name into 2 columns, using a . delimiter
- Select to rename JSON_Name2 as “Header”
- Crosstab with Header as the Column Headers and JSON_ValueString as the Values; tick Concatenate
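Those four steps boil down to reading the info block out of each response. A minimal Python sketch:

```python
import requests

page = requests.get("https://rickandmortyapi.com/api/episode").json()

info = page["info"]
# The two values our crosstab leaves us with:
total_pages = info["pages"]  # e.g. 3 for episodes
next_url = info["next"]      # URL of the next page (None on the last page)
print(total_pages, next_url)
```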
Now, we want to create an API call for every page.
- Generate Rows: leave everything as default, but set our new “pages” field as the upper limit
- Select tool to change the data type of “RowCount” to String (otherwise the next step will produce an error)
- Formula tool to create a URL for every page of data
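Under the hood, the Generate Rows and Formula tools are building one URL per page. The documentation tells us the page number is passed as a ?page= query parameter, so the Python equivalent is roughly:

```python
total_pages = 3  # the "pages" value we extracted above

base = "https://rickandmortyapi.com/api/episode"
page_urls = [f"{base}?page={n}" for n in range(1, total_pages + 1)]
# ['...?page=1', '...?page=2', '...?page=3']
print(page_urls)
```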
Now we can send an API call for every page!
Repeat our Download process again, followed by the JSON Parse.
At this point, we just want to use our results data.
- Filter to keep rows where JSON_Name contains “results.”
- Regex tool used on JSON_Name to parse out the header and the group number within the text. Expression: results\.(\d+)\.(.*)
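Here is the same expression in Python, just to show what the two capture groups pull out (the example rows are illustrative):

```python
import re

# Example JSON_Name values produced by the JSON Parse tool.
names = ["results.0.id", "results.0.name", "results.1.id"]

for name in names:
    match = re.match(r"results\.(\d+)\.(.*)", name)
    if match:
        group_number, header = match.groups()
        print(group_number, header)  # 0 id / 0 name / 1 id
```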
We then want to Crosstab our data so that our headers create new columns; however, I found that the group number restarted on each page, which resulted in multiple values ending up in the same cells.
To avoid this, I used a Formula tool to create a combination of page number and group number as the new grouping ID.
- Crosstab: select Header as the Column Headers and JSON_ValueString as the Values, and tick the new grouping ID under “Group data by” and Concatenate as the Aggregation
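In pandas terms, the page-and-group fix plus the Crosstab look roughly like this sketch (the column names mirror the Alteryx fields, “page” is the page number carried through from the Generate Rows step, and the values are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "page":             [1, 1, 2, 2],
    "group":            ["0", "0", "0", "0"],
    "Header":           ["id", "name", "id", "name"],
    "JSON_ValueString": ["1", "Pilot", "11", "Ricksy Business"],
})

# Group numbers restart on every page, so combine page and group
# into a grouping ID that is unique across pages before pivoting.
df["group_id"] = df["page"].astype(str) + "_" + df["group"]

wide = df.pivot(index="group_id", columns="Header", values="JSON_ValueString")
print(wide)
```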
PREPARE FOR JOIN
This is when we want to structure our different streams of data slightly differently, which will allow us to join them.
With the Episode data, I used a Transpose tool with all the characters as the Data columns, and I kept everything apart from “url” in the Key columns.
This gives us a row for each character within an episode.
I then used Regex to parse the Character ID from the end of the Value column.
This ID can then be used in the join.
- Join - Character ID = id
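Put together, the Transpose, the Regex parse and the Join look something like this pandas sketch (the frames and values are illustrative):

```python
import pandas as pd

# One row per character URL per episode -- the shape the Transpose gives us.
episode_chars = pd.DataFrame({
    "episode": ["Pilot", "Pilot"],
    "Value":   ["https://rickandmortyapi.com/api/character/1",
                "https://rickandmortyapi.com/api/character/2"],
})

# Parse the Character ID from the end of the URL.
episode_chars["Character ID"] = episode_chars["Value"].str.extract(r"(\d+)$", expand=False)

characters = pd.DataFrame({"id": ["1", "2"],
                           "name": ["Rick Sanchez", "Morty Smith"]})

# Join - Character ID = id
joined = episode_chars.merge(characters, left_on="Character ID", right_on="id")
print(joined)
```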
At this point, you can remove any unwanted columns.
I kept 11 columns and ended up with 1,037 rows.
And there you have it! Enjoy visualizing your Rick and Morty data!!