In some global projects, the user could come from many countries worldwide. Or the data owner must collect the data from everywhere in the world but the result should be in one consistent language. The language could be a barrier. In that case, this blog could help you translate the text into any language you want.
In this blog, I will share how to build a batch macro to help you translate a text column.
- Set up the environment
- Apply Python tool in Alteryx
- Build a Batch Macro
- Test the workflow
Are you ready? But don't open Alteryx now. Let's get started! :)
1/ Set up the environment
If you are using Windows OS, open the Command Prompt (open the Terminal if you are using MAC OS). Make sure that you already installed Python. You can check it by typing the command:
python --version.
If you see the result return Python 3.##.#, then your computer installed Python already (Fig. 1).
Then, you need to install the deep-translator package on the computer. You can install that package by typing:
pip install deep-translator
It will take a few seconds to install the package. You can check the documentation page from deep-translator here.
Now, everything is ready. Let's switch to Alteryx Designer.
2/ Apply Python tool in Alteryx
Make sure that you do this step because you can only run the Python tool with the Administrator role. In the search box, type Alteryx Designer, right-click on the icon, and choose Run as administrator.
The Alteryx Designer software opens, and I drag the Text Input tool to the Canvas. Then, I put the name for the field in Text Input. You can put any data in each row. (Fig. 3)
Then, I drag the Formula tool to connect to the Text Input tool in the canvas. In this Formula tool, I created a new column called "lang_code" where I will put the list of the language code that I want to translate the text to. You can check the language code from Google here.
For example: in Fig. 4, I want to translate my text into Spanish. The language code for Spanish language is "es". (Remember the quotation mark for the text)
From the Search box on the top right corner in Alteryx, I type Python. It will show the Python tool. Then, drag that Python tool to connect to the Formula tool in Canvas. (Fig. 5) Then run the workflow.
After running the workflow, it will transfer the data from the previous tool to the Python tool. (Fig.6)
In the Configuration window of the Python tool, I delete the "#" key before the line Package.installPackages(['pandas','numpy']) because I need to install the package for using the function from Pandas and Deep-Translator later.
I deleted the NumPy package because I didn't need to use it. I only need the Pandas package and Deep-Translator. So the code looks like in Fig. 7. Then hold and press Shift+Enter to run that code.
Next, I need to import the function that I want to use.
There are 3 lines I need to import:
- from ayx import Alteryx (This library will help us to input the data and output the data from the Python tool)
- from deep_translator import GoogleTranslator (We will need the GoogleTranslator function from the deep_translator package to translate the text)
- import pandas as pd (Import the pandas package to output as a DataFrame)
Then, hold and press Shift+Enter to run the code.
Now, let's transfer the data from the previous tool into the Python tool and print it out here.
In the next line, I type:
input_text = Alteryx.read("#1")
The line above will use the Alteryx package with the read function. It will read the data from the number 1 connection. Alteryx marked "#1" when you connect the Formula tool with the Python tool. Then, I assign the data in the "input_text" variable.
df = pd.DataFrame(input_text)
I store the input_text data inside a Pandas DataFrame and I called the variable is df. You can print out df to check the result.
Note that if you see the SUCCESS notification after Alteryx.read but there is no data when you print out df, then you should run the workflow again. Then, click on the Python tool, and choose Cell > Run All. You will see your dataset from df.
Next, I need to define a function to help me translate the text of each row.
In the next row, I define the function translate_text_deep:
def translate_text_deep(row):
try:
if not isinstance(row['lang_code'],str):
raise ValueError(f"Invalid Language Code: {row['lang_code']}")
translator = GoogleTranslator(source='auto', target=row['lang_code'])
return translator.translate(row['Text Input'])
except Exception as e:
return f"Translator Error: {e}"
The try will run the code to translate the text while the except will catch the error and print it out. (Be careful with indentation, Python is very sensitive with it)
In the try part:
- If the lang_code is not a string, then print out the error for Invalid Language Code
- Set the parameter for the Google Translator function. The source parameter will auto-detect the language from the text. The target parameter will decide the language that needs to be translated.
- Call the translate function to translate each row in the Text Input column and return.
Then run the code. Next, I will apply that function to my data frame df.
I apply the translate_text_deep function into my data frame df. Then, assign the result in a new column in df called "translated_output". Finally, I output the data frame df at the output anchor number 1. (Fig. 11)
Congratulations!!! You already translated the text!
But we are not done yet! I want to build a batch macro, so you can share it with anyone on the team to use. They can put any text column in and set the language code they want to translate to.
3/ Build a Batch Macro
If you are new to Alteryx or you want to review all types of Macro, you can read the 2 blogs I wrote before here (part 1, part 2).
First, I will copy my text input into a new worksheet. Then, right-click on it and choose Convert To Macro Input.
Then, I copy the rest of the tool from the previous workflow into this new worksheet. Also, convert the Browse tool to Macro Output. (Fig. 14)
Now, I want the user to list all languages in a column and my macro will translate them all and union at the output.
I click on the Interface palette tool. Then drag the Control Parameter tool into Canvas and connect the Q anchor from the Control Parameter to connect to the thunderbolt Z of the Formula tool. Automatically, the Action tool will be connected between those tools.
Then, click on the Action tool to replace the language code. In the Configuration window, Select Update Value (Default) > expand the Formula - Formula Fields - Formula Field > click on @expression - value row. Then check on the check box Replace a specific string. Remove the quotation mark around es (whenever we put the language code in, it will keep the quotation mark there). (Fig. 16)
Open the Interface Designer in Alteryx by going to View > Interface Designer. The Interface Designer window opens, and then I click on the Setting tool (the gear icon). You can change any icon for the Batch Macro you want. In the Output Mode, I choose Auto Configure by Name (Wait Until All Iterations Run). That option will auto-configure the Name and union of the data in the same column name. (Fig. 17)
Run the workflow again. Then, I save the macro by holding and pressing Ctrl+S (or go to File> Save). You will see that the file type is .yxmc (format type of the macro). (Fig. 18)
You got a Batch Macro!!! You can share it with everyone now. But hold on, let's test it first :)
4/ Test the workflow
I create 2 new tables: Language Code and Text Input. In the Language Code table, I will list all language codes I want to translate under a column named lang_code. In the Text Input Table, I list all the text under a column named Text Input (You see in Fig. 19, I put the text in different languages).
Then, in the canvas, I right-click, choose Insert > Macro > Select the Macro you just saved.
Now, we only need to connect the 2 tables to the batch macro. The language code table will connect to the Output anchor (up side down question mark). The Text Input table will connect to the blank input anchor. (Fig. 20)
Note that each table's field name should match the field name in the Python code (Text Input and lang_code). If not, the batch macro cannot recognize the column. However, you can put a Dynamic Rename in the batch macro to rename the field. By doing that, the user can put any column name but it will rename to the column name you set.
When you click on the macro, in the Configuration window, don't change anything in the Group By tab. Click on the Questions tab, and choose the field lang_code. By doing this, the macro will recognize this is a Control Parameter, it will iterate each row (which is a different language code) and input in the macro.
Then, run the workflow to get the result. Bump!!! All the text is translated!
You will see that each different language text will be union in the same output. You can use the Cross Tab tool to transpose data from rows into columns.
This blog is quite long. In this blog, I shared how to set up the environment, apply the Python tool to translate the text, build a batch macro, and test the workflow. This batch macro is good for you to translate the short text in each row. If the text is too long in each row or there are so many rows in the text table, it would take a long time to translate.
With this batch macro, you can share it with everyone in the team. They don't have to build the batch macro again. In case you cannot run Python on the server, you can translate the text while preparing data by doing this way.
I hope my blog is helpful to you.
Enjoy reading! See you soon in the next blog!