Introduction of XML

Yesterday I was doing Alteryx weekly challenge #37, which required me to parse the XML formatted text. Obviously, it demands the usage of the XML parse tool, which I have never tried before! Good chance to learn it. But I don't even know what XML means! So I will try to explain what is XML in this blog, with the modified examples I made from w3schools and some internet resources.

What is XML?

XML stands for Extensible Markup Language (XML), which focuses on text storage and transmission. Each XML element is a chunk of pure text wrapped by tags, while tags have no function - They are just named tags. Noted that it is somehow similar to the HTML element. However, tags in HTML have a predefined meaning, which can change the display of the text, like color and size. Tags in XML element is just a name indicating what the text is about. They did nothings!

Most basic structure of an XML element

The XML element can either contain text or other contents to form a structure. The example below shows the title and author of a book. Can you see the <book> tag? It wrapped other elements (contents) rather than text. The wrapped elements should be indented to indicate they belong to the wrapper.

XML element can either contain text or other contents 

The tags can have attributes to add extra information. Let's say we want to indicate the book is categorized as 'children'. We can use the format:

XML tags with attributes

With this technique, we can build a more complicated structure. Below is a bookshelf with two books, each of which has its category information and details.

As you can see, XML can form a hierarchical/ tree structure, as known as an element tree.  Each element tree at least contains a 'root' (most outer wrapper) and a 'child' (the wrapped element), but it may contain a 'sub-child', a 'sub-sub-child' and so-on when there are multiple levels of hierarchy.

Structure of a XML element tree

So Here is it! The introduction of XML. Next time I will teach how to parse the XML elements into columns in Alteryx

Feel free to check out my other post or connect with me on Linkedin - learn and improve together!
https://www.thedataschool.co.uk/blog/stanley-chan

Author:
Stanley Chan
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2024 The Information Lab