Yesterday I was doing Alteryx Weekly Challenge #37, which required me to parse XML-formatted text. Naturally, that calls for the XML Parse tool, which I had never tried before. A good chance to learn it! But I didn't even know what XML meant. So in this blog I will try to explain what XML is, using examples I adapted from w3schools and other internet resources.
What is XML?
XML stands for Extensible Markup Language. It is designed for storing and transporting data as text. Each XML element is a chunk of plain text wrapped in a pair of tags, and the tags have no built-in function: they are just names. Note that this looks similar to an HTML element. However, tags in HTML have predefined meanings that can change how the text is displayed, such as its color and size. A tag in an XML element is just a name indicating what the text is about. It does nothing by itself!
[Image: example of an XML element]
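To make the idea concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree` module. The tag name `title` and the text are my own illustrative choices, not necessarily what the original image showed.

```python
import xml.etree.ElementTree as ET

# A single XML element: plain text wrapped by an opening and a closing tag.
# The tag name "title" and the text are made-up examples.
xml_text = "<title>Harry Potter</title>"

element = ET.fromstring(xml_text)

print(element.tag)   # the name of the tag
print(element.text)  # the text wrapped by the tag
```

The parser only reports the tag name and the wrapped text. Nothing about the name `title` changes how the text is treated, which is the point: the tag is just a label.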
An XML element can contain either text or other elements, which lets it form a structure. The example below shows the title and author of a book. Can you see the <book> tag? It wraps other elements (contents) rather than text. By convention, the wrapped elements are indented to show that they belong to the wrapper, although the indentation is only there for readability.
[Image: a <book> element wrapping a title element and an author element]
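A nested element like this might look as follows. This is a sketch with assumed element names (`book`, `title`, `author`) and sample values of my own; Python's `xml.etree.ElementTree` is used only to show that the wrapper really contains the other elements.

```python
import xml.etree.ElementTree as ET

# <book> wraps two other elements instead of plain text.
# The element names and values are illustrative assumptions.
book_xml = """
<book>
    <title>Harry Potter</title>
    <author>J. K. Rowling</author>
</book>
"""

book = ET.fromstring(book_xml)

# Iterating over an element visits the elements it directly wraps.
child_tags = [child.tag for child in book]
print(child_tags)

# find() returns the first wrapped element with the given tag name.
print(book.find("title").text)
print(book.find("author").text)
```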
Tags can have attributes to add extra information. An attribute is written inside the opening tag as a name, an equals sign, and a quoted value. Let's say we want to indicate that the book is categorized as 'children'. We can use the format:
[Image: a <book> tag with a category attribute set to 'children']
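As a rough sketch, assuming the attribute is named `category` (my guess at the name, based on the description above), the element and a way to read the attribute in Python could look like this:

```python
import xml.etree.ElementTree as ET

# The attribute sits inside the opening tag as name="value".
# The attribute name "category" is an assumption for illustration.
book_xml = """
<book category="children">
    <title>Harry Potter</title>
    <author>J. K. Rowling</author>
</book>
"""

book = ET.fromstring(book_xml)

# Attributes are exposed as a dictionary on the element.
print(book.attrib)

# get() reads a single attribute by name.
category = book.get("category")
print(category)
```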
With this technique, we can build a more complicated structure. Below is a bookshelf with two books, each of which has its category information and details.
[Image: a bookshelf element containing two books, each with its category and details]
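Here is one way such a bookshelf could be written, together with a small Python sketch that flattens it into one row per book. The element names, the `category` attribute, and the book details are illustrative assumptions; the row-building step is my own addition to hint at what "parsing into columns" means, not a description of how Alteryx does it.

```python
import xml.etree.ElementTree as ET

# A bookshelf wrapping two books; each book has an attribute and details.
# All names and values are illustrative assumptions.
bookshelf_xml = """
<bookshelf>
    <book category="children">
        <title>Harry Potter</title>
        <author>J. K. Rowling</author>
    </book>
    <book category="web">
        <title>Learning XML</title>
        <author>Erik T. Ray</author>
    </book>
</bookshelf>
"""

bookshelf = ET.fromstring(bookshelf_xml)

# Build one dictionary (row) per <book>, with one key (column) per field.
rows = []
for book in bookshelf.findall("book"):
    rows.append({
        "category": book.get("category"),
        "title": book.find("title").text,
        "author": book.find("author").text,
    })

for row in rows:
    print(row)
```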
As you can see, XML can form a hierarchical (tree) structure, also known as an element tree. Every element tree has exactly one 'root' (the outermost wrapper). The elements it wraps are its 'children', and the tree may also contain a 'sub-child', a 'sub-sub-child' and so on when there are multiple levels of hierarchy.
[Image: diagram of an element tree]
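To see the levels of the tree, a short recursive walk can print each element indented by its depth. This is only a sketch: the helper `walk` is a hypothetical function I wrote for illustration, and the XML is a trimmed-down, assumed version of the bookshelf idea.

```python
import xml.etree.ElementTree as ET

# A small tree: root -> child -> sub-child (names are illustrative).
tree_xml = """
<bookshelf>
    <book category="children">
        <title>Harry Potter</title>
    </book>
</bookshelf>
"""


def walk(element, depth=0):
    """Yield (depth, tag) for an element and everything it wraps."""
    yield depth, element.tag
    for child in element:
        yield from walk(child, depth + 1)


root = ET.fromstring(tree_xml)
levels = list(walk(root))

# Print each tag indented by its depth in the tree.
for depth, tag in levels:
    print("    " * depth + tag)
```

Depth 0 is the root, depth 1 is a child, and depth 2 is a sub-child.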
So here it is: an introduction to XML. Next time I will show how to parse XML elements into columns in Alteryx.
Feel free to check out my other posts or connect with me on LinkedIn. Let's learn and improve together!
https://www.thedataschool.co.uk/blog/stanley-chan