I recently worked on a project where I had to collect data from a bunch of Excel files and push it all into an Azure database using Alteryx. Straightforward, right?
Not quite. Some of those Excel files weren’t just holding data — they were full of formulas referencing other Excel files (e.g. VLOOKUPs across multiple workbooks). In theory, I could just connect Alteryx directly to those source files and replicate the logic.
But here’s the catch: I was dealing with over 270 Excel files, many with custom macros and formulas that only update when the workbook is opened and closed. And as you probably know, Alteryx can only read the values as they were when the Excel file was last saved, not the live formulas.
So how do you actually extract those underlying formulas to understand what’s happening behind the scenes? One solution is to convert .xlsx files to .zip and extract the underlying XML code.
Why You Might Need to Extract Excel Formulas
When you’re auditing or migrating data from Excel to a proper database, it’s often not enough to just read the final values — you also need to know how those values were calculated.
This is especially true when:
- The Excel files use linked formulas (e.g. VLOOKUPs, INDIRECT references, or external connections).
- You’re trying to replicate logic in Alteryx, SQL, or dbt.
- You’re working with macros or volatile formulas that don’t refresh unless Excel opens the workbook.
The Problem with Alteryx + Excel Formulas
Alteryx can easily read Excel data, but it doesn’t “calculate” formulas. It just reads whatever value was stored the last time the file was saved.
That means:
- If the workbook wasn’t opened and refreshed recently, your data might be stale.
- If cells rely on other workbooks, Alteryx won’t resolve those links.
- There’s no way to directly “see” the formula behind a value using the normal Alteryx Excel Input tool.
The Workaround: Treat Excel Like a Zip File
Here’s the fun part: modern Excel files (.xlsx) are actually just zipped collections of XML files.
That means you can rename any .xlsx file to .zip and peek inside to see the structure, including the underlying formulas.
Convert .xlsx to .zip:
- Make a copy of your Excel file.
- Rename it from filename.xlsx to filename.zip.

- Extract the .zip file — you’ll see a bunch of folders. Inside the xl/worksheets folder, you will see files like sheet1.xml. Each sheet has its own .xml file.
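If you’d rather not rename 270 copies by hand, the same peek can be scripted. This is a minimal Python sketch, not part of the Alteryx workflow: the standard-library zipfile module opens an .xlsx directly, and the helper name list_worksheet_parts is made up for illustration.

```python
import zipfile


def list_worksheet_parts(xlsx_path):
    """Return the worksheet XML parts found inside an .xlsx file."""
    # zipfile reads the workbook as-is, so no copy or rename to .zip is needed.
    with zipfile.ZipFile(xlsx_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.startswith("xl/worksheets/") and name.endswith(".xml")
        )

# Example call (filename.xlsx is a placeholder):
# list_worksheet_parts("filename.xlsx")
```

The filter on the xl/worksheets/ prefix and the .xml suffix skips the relationship files that sit in a _rels subfolder next to the sheets.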
Extract Formulas using Alteryx:
- Use an Input Data tool with the following config to read the XML data.

- Use the XML Parse tool with the following config to convert the XML data into a readable format.

- The 'f' column is the formula.
- The 'v' column is the cached value from the last save, which is what Alteryx normally reads.
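For a quick cross-check outside Alteryx, the same f and v elements can be pulled with a few lines of Python. Treat this as a rough sketch under a few assumptions: the function name extract_formulas is hypothetical, the default sheet path is just the first sheet, and the namespace is the one used by standard (non-strict) .xlsx files.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of standard SpreadsheetML worksheet parts (assumed, not strict-mode files).
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def extract_formulas(xlsx_path, sheet_part="xl/worksheets/sheet1.xml"):
    """Yield (cell_ref, formula, cached_value) for every cell that has an <f>."""
    with zipfile.ZipFile(xlsx_path) as archive:
        root = ET.fromstring(archive.read(sheet_part))
    for cell in root.iterfind(".//m:c", NS):
        formula = cell.find("m:f", NS)
        if formula is None:
            continue  # plain value, no formula behind it
        value = cell.find("m:v", NS)
        # formula.text can be None for cells that reuse a shared formula;
        # those carry only a reference to the cell that defines it.
        yield cell.get("r"), formula.text, (value.text if value is not None else None)
```

Each tuple pairs the cell reference with the formula text and the value Excel cached at the last save, which makes stale values easy to spot next to the logic that produced them.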

The Gotcha: Finding the Rest of the Cell Contents
So far, we’ve focused on pulling the formulas out of Excel — but what about the rest of the cell content? That information lives in a few different places inside the Excel file structure.
When you unzip a .xlsx, you’ll notice a folder structure like this:
/xl/
├── worksheets/
│ ├── sheet1.xml
│ ├── sheet2.xml
│ └── ...
├── sharedStrings.xml
├── styles.xml
└── workbook.xml
You can use the Alteryx workflow below to see how to connect your formulas to the rest of the cell contents from the sharedStrings.xml file.

Here’s what each of the different .xml files contains:
sharedStrings.xml
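This is where Excel stores the text typed into cells. Each distinct string is kept once, and cells point to it by position. Text returned by a formula is the exception: it sits in the cell’s own <v> with t="str".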
If a cell has t="s" in the XML, it means “shared string” — the value in <v> is just a reference number. You can look up that index in sharedStrings.xml to find the actual text.
For example:
<c r="A2" t="s"> <v>3</v></c>
means “the text in cell A2 is the fourth entry in sharedStrings.xml”. The index is zero-based, so 3 points to the fourth entry.
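The lookup itself is simple enough to sketch in Python. This is an illustration rather than the Alteryx workflow: load_shared_strings and read_cell_values are hypothetical helper names, and the code assumes the standard SpreadsheetML namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def load_shared_strings(archive):
    """Return the shared string table as a list, indexed from 0."""
    if "xl/sharedStrings.xml" not in archive.namelist():
        return []  # workbooks without text cells may omit the part
    root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    table = []
    for si in root.iterfind("m:si", NS):
        # Plain entries hold one <t>; rich text splits it across <r><t> runs.
        parts = si.findall("m:t", NS) + si.findall("m:r/m:t", NS)
        table.append("".join(t.text or "" for t in parts))
    return table


def read_cell_values(xlsx_path, sheet_part="xl/worksheets/sheet1.xml"):
    """Return {cell_ref: value}, resolving t="s" cells through the table."""
    with zipfile.ZipFile(xlsx_path) as archive:
        strings = load_shared_strings(archive)
        root = ET.fromstring(archive.read(sheet_part))
    values = {}
    for cell in root.iterfind(".//m:c", NS):
        v = cell.find("m:v", NS)
        if v is None:
            continue
        is_shared = cell.get("t") == "s"
        values[cell.get("r")] = strings[int(v.text)] if is_shared else v.text
    return values
```

Values come back as strings either way; numeric cells are left unconverted here so the output matches what is literally in the XML.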
styles.xml
This one handles the cell formatting — things like number formats, dates, and currencies. If you ever see s="2" in a cell tag, that’s a reference to a style definition here. You’ll only need this if you want the displayed format, not just the raw value.
workbook.xml
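Lists each worksheet’s display name along with a relationship ID (r:id). The matching entry in xl/_rels/workbook.xml.rels points that ID at the actual file, so the two together tell you which XML file maps to which worksheet name (sheet1.xml = “Sales Data”, for example). If you’re pulling data from multiple sheets, these files help you keep them straight.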
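To build that name-to-file mapping programmatically, something like the following Python sketch should work. It relies on the sheet names in workbook.xml being tied to files through relationship IDs that are resolved in xl/_rels/workbook.xml.rels; the function name sheet_name_to_part is hypothetical, and the namespaces are the standard (non-strict) ones.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"
NS = {"m": MAIN, "p": PKG_REL}


def sheet_name_to_part(xlsx_path):
    """Return {worksheet name: path of its XML part inside the archive}."""
    with zipfile.ZipFile(xlsx_path) as archive:
        workbook = ET.fromstring(archive.read("xl/workbook.xml"))
        rels = ET.fromstring(archive.read("xl/_rels/workbook.xml.rels"))
    # Relationship Id -> Target, e.g. "rId1" -> "worksheets/sheet1.xml"
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.iterfind("p:Relationship", NS)
    }
    mapping = {}
    for sheet in workbook.iterfind("m:sheets/m:sheet", NS):
        target = targets[sheet.get("{%s}id" % DOC_REL)]
        # Targets are normally relative to xl/; a leading "/" means package root.
        if target.startswith("/"):
            part = target.lstrip("/")
        else:
            part = posixpath.join("xl", target)
        mapping[sheet.get("name")] = part
    return mapping
```

Going through the relationship file matters because sheet numbering in file names does not have to follow tab order, especially after sheets have been deleted or reordered.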
