Build a Batch Macro to Translate a Text Column

In some global projects, the user could come from many countries worldwide. Or the data owner must collect the data from everywhere in the world but the result should be in one consistent language. The language could be a barrier. In that case, this blog could help you translate the text into any language you want.

In this blog, I will share how to build a batch macro to help you translate a text column.

  • Set up the environment
  • Apply Python tool in Alteryx
  • Build a Batch Macro
  • Test the workflow

Are you ready? But don't open Alteryx now. Let's get started! :)


1/ Set up the environment

If you are using Windows OS, open the Command Prompt (open the Terminal if you are using MAC OS). Make sure that you already installed Python. You can check it by typing the command:

python --version.

If you see the result return Python 3.##.#, then your computer installed Python already (Fig. 1).

Fig. 1: Check the Python version in the computer system

Then, you need to install the deep-translator package on the computer. You can install that package by typing:

pip install deep-translator

It will take a few seconds to install the package. You can check the documentation page from deep-translator here.

Now, everything is ready. Let's switch to Alteryx Designer.


2/ Apply Python tool in Alteryx

Make sure that you do this step because you can only run the Python tool with the Administrator role. In the search box, type Alteryx Designer, right-click on the icon, and choose Run as administrator.

Fig. 2: Run Alteryx Designer as the Administrator role

The Alteryx Designer software opens, and I drag the Text Input tool to the Canvas. Then, I put the name for the field in Text Input. You can put any data in each row. (Fig. 3)

Fig. 3: Drag the Text Input tool to Canvas and input data

Then, I drag the Formula tool to connect to the Text Input tool in the canvas. In this Formula tool, I created a new column called "lang_code" where I will put the list of the language code that I want to translate the text to. You can check the language code from Google here.

For example: in Fig. 4, I want to translate my text into Spanish. The language code for Spanish language is "es". (Remember the quotation mark for the text)

Fig.4: Create a lang_code column with the language code want to translate to

From the Search box on the top right corner in Alteryx, I type Python. It will show the Python tool. Then, drag that Python tool to connect to the Formula tool in Canvas. (Fig. 5) Then run the workflow.

Fig. 5: Drag the Python tool from the Search box to Canvas

After running the workflow, it will transfer the data from the previous tool to the Python tool. (Fig.6)

Fig. 6: Run the workflow after connecting the Python tool to the Formula tool

In the Configuration window of the Python tool, I delete the "#" key before the line Package.installPackages(['pandas','numpy']) because I need to install the package for using the function from Pandas and Deep-Translator later.

Fig. 7: Install Python packages

I deleted the NumPy package because I didn't need to use it. I only need the Pandas package and Deep-Translator. So the code looks like in Fig. 7. Then hold and press Shift+Enter to run that code.

Next, I need to import the function that I want to use.

Fig. 8: Import the function from the library

There are 3 lines I need to import:

  • from ayx import Alteryx (This library will help us to input the data and output the data from the Python tool)
  • from deep_translator import GoogleTranslator (We will need the GoogleTranslator function from the deep_translator package to translate the text)
  • import pandas as pd (Import the pandas package to output as a DataFrame)

Then, hold and press Shift+Enter to run the code.

Now, let's transfer the data from the previous tool into the Python tool and print it out here.

Fig. 9: Transfer the data from the previous tool into the Python tool

In the next line, I type:

input_text = Alteryx.read("#1")

The line above will use the Alteryx package with the read function. It will read the data from the number 1 connection. Alteryx marked "#1" when you connect the Formula tool with the Python tool. Then, I assign the data in the "input_text" variable.

df = pd.DataFrame(input_text)

I store the input_text data inside a Pandas DataFrame and I called the variable is df. You can print out df to check the result.

Note that if you see the SUCCESS notification after Alteryx.read but there is no data when you print out df, then you should run the workflow again. Then, click on the Python tool, and choose Cell > Run All. You will see your dataset from df.

Next, I need to define a function to help me translate the text of each row.

In the next row, I define the function translate_text_deep:

def translate_text_deep(row):
    try:
        if not isinstance(row['lang_code'],str):
            raise ValueError(f"Invalid Language Code: {row['lang_code']}")
            
        translator = GoogleTranslator(source='auto', target=row['lang_code'])
        return translator.translate(row['Text Input'])
    
    except Exception as e:
        return f"Translator Error: {e}"

The try will run the code to translate the text while the except will catch the error and print it out. (Be careful with indentation, Python is very sensitive with it)

In the try part:

  • If the lang_code is not a string, then print out the error for Invalid Language Code
  • Set the parameter for the Google Translator function. The source parameter will auto-detect the language from the text. The target parameter will decide the language that needs to be translated.
  • Call the translate function to translate each row in the Text Input column and return.
Fig. 10: Defind the translate_text_deep function

Then run the code. Next, I will apply that function to my data frame df.

Fig. 11: Apply the function to data frame df and output

I apply the translate_text_deep function into my data frame df. Then, assign the result in a new column in df called "translated_output". Finally, I output the data frame df at the output anchor number 1. (Fig. 11)

Fig. 12: The result at output anchor #1

Congratulations!!! You already translated the text!

But we are not done yet! I want to build a batch macro, so you can share it with anyone on the team to use. They can put any text column in and set the language code they want to translate to.


3/ Build a Batch Macro

If you are new to Alteryx or you want to review all types of Macro, you can read the 2 blogs I wrote before here (part 1, part 2).

First, I will copy my text input into a new worksheet. Then, right-click on it and choose Convert To Macro Input.

Fig. 13: Convert the Text Input into Macro Input

Then, I copy the rest of the tool from the previous workflow into this new worksheet. Also, convert the Browse tool to Macro Output. (Fig. 14)

Fig. 14: Copy the workflow from the previous workflow and convert input/output to Macro Input/output

Now, I want the user to list all languages in a column and my macro will translate them all and union at the output.

I click on the Interface palette tool. Then drag the Control Parameter tool into Canvas and connect the Q anchor from the Control Parameter to connect to the thunderbolt Z of the Formula tool. Automatically, the Action tool will be connected between those tools.

Fig. 15: Connect the Control Parameter to the Formula tool

Then, click on the Action tool to replace the language code. In the Configuration window, Select Update Value (Default) > expand the Formula - Formula Fields - Formula Field > click on @expression - value row. Then check on the check box Replace a specific string. Remove the quotation mark around es (whenever we put the language code in, it will keep the quotation mark there). (Fig. 16)

Fig. 16: Update the language code from the Action tool

Open the Interface Designer in Alteryx by going to View > Interface Designer. The Interface Designer window opens, and then I click on the Setting tool (the gear icon). You can change any icon for the Batch Macro you want. In the Output Mode, I choose Auto Configure by Name (Wait Until All Iterations Run). That option will auto-configure the Name and union of the data in the same column name. (Fig. 17)

Fig. 17: Configure the options in the Interface Designer window

Run the workflow again. Then, I save the macro by holding and pressing Ctrl+S (or go to File> Save). You will see that the file type is .yxmc (format type of the macro). (Fig. 18)

Fig. 18: Save the Batch Macro (.yxmc)

You got a Batch Macro!!! You can share it with everyone now. But hold on, let's test it first :)


4/ Test the workflow

I create 2 new tables: Language Code and Text Input. In the Language Code table, I will list all language codes I want to translate under a column named lang_code. In the Text Input Table, I list all the text under a column named Text Input (You see in Fig. 19, I put the text in different languages).

Then, in the canvas, I right-click, choose Insert > Macro > Select the Macro you just saved.

Fig. 19: Set up the workflow

Now, we only need to connect the 2 tables to the batch macro. The language code table will connect to the Output anchor (up side down question mark). The Text Input table will connect to the blank input anchor. (Fig. 20)

Note that each table's field name should match the field name in the Python code (Text Input and lang_code). If not, the batch macro cannot recognize the column. However, you can put a Dynamic Rename in the batch macro to rename the field. By doing that, the user can put any column name but it will rename to the column name you set.

Fig. 20: Click on the Macro to configure

When you click on the macro, in the Configuration window, don't change anything in the Group By tab. Click on the Questions tab, and choose the field lang_code. By doing this, the macro will recognize this is a Control Parameter, it will iterate each row (which is a different language code) and input in the macro.

Fig. 21: Connect the Input to the Batch Macro and run the workflow

Then, run the workflow to get the result. Bump!!! All the text is translated!

You will see that each different language text will be union in the same output. You can use the Cross Tab tool to transpose data from rows into columns.


This blog is quite long. In this blog, I shared how to set up the environment, apply the Python tool to translate the text, build a batch macro, and test the workflow. This batch macro is good for you to translate the short text in each row. If the text is too long in each row or there are so many rows in the text table, it would take a long time to translate.

With this batch macro, you can share it with everyone in the team. They don't have to build the batch macro again. In case you cannot run Python on the server, you can translate the text while preparing data by doing this way.

I hope my blog is helpful to you.

Enjoy reading! See you soon in the next blog!

Author:
Le Luu
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2024 The Information Lab