New Kid on the Doc(ument), Part 1 – Alteryx

Handoff vs. Hands-off

I started the VNGLOOKUP blog to figure out how to explain the more technical things I’ve been learning to myself in ways that I would understand and, more importantly, remember (and to share my puns with the world, of course). But that’s just one reason to document your work. It’s also a way to look back on what you’ve accomplished, for both yourself and others, whether they be other members of your team, your organization as a whole, or potential employers (to be clear, I’m referring to personal projects for the latter; don’t share anything you aren’t supposed to share!).

Creating documentation for others is especially important, whether you’re tracking developments while the project is underway or writing for whoever will work on it after you; it makes for more effective team collaboration and a smoother handoff. Moreover, when applicable, it’s just as important to document why you chose to do something a particular way as it is to document what you did. Something I’ve been learning to be comfortable with is that there’s always more than one way to solve a problem, each with its own pros and cons. (I’m decent at coming up with “one” solution but have difficulty coming up with other ways to accomplish the same result, so I like seeing other people’s approaches to understand how they think about problems and learn how else I could approach a problem myself.)

How to Document an Alteryx Workflow

For me, Alteryx workflows, even my own, take a lot of time to understand when a little time has passed, so to jog my memory more quickly, I’ve started documenting as I go, stopping after every one or two major steps completed to go back and annotate each individual tool. I’ve already posted a screenshot of my workflow for the remake of my HIPAA breaches dashboard in a previous blog post, but I’m showing it again here to call out specific things I like to do when documenting my workflow:

When annotating the individual tools, I write out what each tool is doing, balancing description and conciseness. I want to include just enough information that I can tell what the tool is doing without having to check the configuration pane for every single tool (which would defeat the purpose of these annotations), while keeping the annotation short enough to understand at a glance. If I still need more information than that, that’s when I’ll check the configuration pane (and perhaps reword the annotation, if it’s not clear enough).

Below is an example:

These tools are all involved in preparing the population estimate data I got from the U.S. Census, which I used for normalization when I was analyzing how HIPAA breaches affect individual states. Take the Select Tool, the third tool from the left. At first glance, you would know that the Select Tool is changing some columns, but which ones, and how? You’d have to click into the configuration pane to see the changes, and you might still forget what was changed after clicking out of the tool (this happens to me a lot). If you click on the tool, then on the little tag on the very left of the configuration pane, and add an annotation, you won’t need to open the configuration every time.

As you can see from the screenshots, I’ve also grouped the tools into steps or sections after annotating. Annotating the individual tools first helps you group them afterward, as it’ll refresh your memory of the tools’ purposes and how they serve the bigger goal. You can also color-code the sections so that all inputs and outputs are consistently associated with one color, data cleaning with another, joins with yet another, and so on. (Just make sure to include a legend with the meanings of the colors for easy reference.)

You can use either the Comment tool or the Tool Container to do your color-coding; each has its pros and cons. I like Comment boxes because you can make the font size bigger and control the size of the box as well. On the downside, moving a section means manually selecting all the tools and the Comment box itself so that they move together, and you need to right-click on the Comment box and send it to the back so that the tools are actually visible. Tool Containers have the benefit of letting you turn off that part of the workflow if needed (e.g., you may not want to call an API too often if you’re testing other parts of the workflow, lest you hit the API rate limit), and dragging a whole group of tools to another spot on the canvas is much easier. However, you can’t really make the text bigger, other than by zooming into the canvas, and you can’t customize the size of the Tool Container itself; it depends on how many tools are in the container.

Compare:

Want the best of both worlds? Put the Comment inside the Tool Container! As long as you’ve colored in one of the two, you can leave the other as the default color; keep it consistent, but don’t make it so complicated that you get turned off from documenting altogether.

Either way, keep the descriptions helpful but concise, summarizing the overarching step you accomplished with all those tools (though for more complicated logical expressions, I tend to write a little more to explain everything that’s happening, both to check my understanding and to help whoever’s looking at it next). For example, this step cleans up the data so that each row represents one state’s population for one particular year (before, each row was one state, with the 2010 population, 2011 population, etc. each in its own column; I wanted the populations in their own rows).
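If it helps to see that reshaping step outside of Alteryx, here is a rough Python sketch of the same idea. It isn’t how the workflow itself does it (that all happens with Alteryx tools), and the function name, field names, and numbers are all made up for illustration; they are not real Census estimates.

```python
def wide_to_long(rows, id_field, key_name="year", value_name="population"):
    """Turn one row per state (one column per year) into one row per state-year.

    Hypothetical helper for illustration only; not part of the Alteryx workflow.
    """
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the state name is carried along, not reshaped
            long_rows.append(
                {id_field: row[id_field], key_name: int(column), value_name: value}
            )
    return long_rows


# Placeholder data: invented states and populations.
wide = [
    {"state": "State A", "2010": 100, "2011": 110},
    {"state": "State B", "2010": 200, "2011": 190},
]

long_rows = wide_to_long(wide, id_field="state")
for record in long_rows:
    print(record)
```

Each input row fans out into one output row per year column, which is the “populations in their own rows” shape described above. That’s also the kind of one-line summary I’d put in the section’s description.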

Finally, it’s time to give your workflow a title and description at the top of the canvas. Even though these are the first details you’ll see when you open up the workflow, I like to do this as the last part of my documentation. This is because by then, I’ll have thoroughly reviewed every bit of my workflow and figured out how to balance description with conciseness. Thus, I’ll be better able to summarize the purpose of the workflow well enough that whoever looks at it next will be able to understand what it’s doing that much more quickly.

As they go over each of the colored sections, they can always get more details or clarification from the individual tools’ annotations and configuration panes, but they won’t necessarily have to, since you’ve already put in the time to set them up for success. It’s a tradeoff of people’s time: a workflow that isn’t as well documented will take you less time to make, but the next person may need more time to understand what each individual tool is doing before understanding the whole workflow’s purpose; the more time you take to document it, the less time the next person needs to spend understanding it. Moreover, the process of documenting will increase your own understanding of the workflow, not just the next person’s. Not documenting a workflow (or not documenting it well) may keep you in the dark about what exactly it’s doing, too, because you never had the opportunity to reflect on the steps you took and the tools you chose.

Depending on how complex your workflow is, your description may, of course, be longer, but here’s mine:

That’s the workflow bit done; if your organization uses something like Notion, that’s where you’d write… well, something like this blog post! Alteryx is great for processing and preparing data, but longform documentation should be stored elsewhere. I’ll be back next time to talk about how you can document your Tableau workbook :)

Author:

Vivian Ng
