Understanding XML: Structure, Namespaces, Attributes, and When to Use It in 2026
A developer's guide to XML's core concepts — elements vs attributes, namespaces, CDATA, schema validation, and the cases where XML is still the right choice.
XML is older than most frontend developers' careers and is still embedded in systems that process trillions of transactions per year. Understanding its structure — not just its syntax — is necessary for working with enterprise integrations, document formats, and legacy APIs.
The XML Data Model
XML is not just a serialisation format; it is also a data model. Specifications such as the XML Information Set (Infoset) and the DOM describe a document as a tree of nodes. In DOM terms, each node can be:
- **Element nodes** — the `<tag>` things you see
- **Attribute nodes** — key-value pairs on elements (`id="42"`)
- **Text nodes** — character data inside elements
- **CDATA sections** — character data where markup characters are not interpreted
- **Comment nodes** — `<!-- ... -->`
- **Processing instructions** — `<?xml-stylesheet type="text/css" href="style.css"?>`
- **Document node** — the root of the tree
This richer node type system is what allows XML to model documents (where mixed content — text interspersed with elements — is natural) as well as data. JSON's data model is simpler and maps to programming language types more directly.
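As a rough illustration of these node types, the sketch below parses a small, made-up document with Python's standard `xml.dom.minidom` and records the kind of every node it meets. The sample document, the `NODE_NAMES` table, and the `walk` helper are assumptions for this example. Attributes belong to their element rather than sitting among its children, so they are reported next to the element name.

```python
from xml.dom import minidom, Node

# Made-up sample: a processing instruction, a comment, and mixed content with CDATA.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="style.css"?>
<!-- a comment -->
<p id="42">Mixed <b>content</b> and <![CDATA[a < b]]></p>"""

NODE_NAMES = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.CDATA_SECTION_NODE: "cdata",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}


def walk(node, depth=0, out=None):
    """Collect (depth, kind, detail) for the node and all of its descendants."""
    out = [] if out is None else out
    kind = NODE_NAMES.get(node.nodeType, "other")
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes are not child nodes; read them from the element itself.
        detail = f"{node.nodeName} {dict(node.attributes.items())}"
    else:
        detail = node.nodeValue or ""
    out.append((depth, kind, detail))
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out


nodes = walk(minidom.parseString(DOC))
for depth, kind, detail in nodes:
    print("  " * depth + f"{kind}: {detail!r}")
```

The `<p>` element here is mixed content: its children alternate between text nodes and a `<b>` element, which is the document-style structure that has no direct JSON equivalent.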
Elements vs Attributes
The fundamental design decision in any XML schema: should this information be an element or an attribute?
As an element:
<book>
  <id>42</id>
  <title>Clean Code</title>
</book>
As an attribute:
<book id="42" title="Clean Code" />
These are guidelines, not rules:
- **Attributes** are for metadata, identifiers, and simple scalar values. They cannot contain structure, cannot repeat, cannot have sub-elements.
- **Elements** are for content, complex values, repeated items, and anything that might need to be extended.
A common pattern: attributes for identifiers and type information, elements for data content:
<product id="P001" type="digital">
  <name>Developer Handbook</name>
  <price currency="USD">49.99</price>
</product>
This choice affects how XML converts to JSON. Attributes typically become @-prefixed keys in JSON (the convention of many converters), or get merged into the same object as child elements.
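To make the `@`-prefix convention concrete, here is a minimal sketch of such a conversion using Python's standard `xml.etree.ElementTree`. The `element_to_dict` helper and the `#text` key are assumptions chosen for illustration, not the behaviour of any particular converter. The sketch also ignores mixed content (text that follows a child element).

```python
import json
import xml.etree.ElementTree as ET

PRODUCT = """<product id="P001" type="digital">
  <name>Developer Handbook</name>
  <price currency="USD">49.99</price>
</product>"""


def element_to_dict(elem):
    """Convert an element to a JSON-ready value using '@attr' and '#text' keys."""
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated child element name becomes a list of values.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text            # no attributes, no children: plain string
    if text:
        node["#text"] = text   # text next to attributes goes under '#text'
    return node


result = element_to_dict(ET.fromstring(PRODUCT))
print(json.dumps(result, indent=2))
```

Note how `<price currency="USD">49.99</price>` cannot become a plain number or string without losing the attribute. That is why the element-versus-attribute choice shows up in the shape of the JSON.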
Namespaces
XML namespaces solve the collision problem: when two XML vocabularies use the same element name, namespaces disambiguate them.
<root
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>My Document</dc:title>
  <a xlink:href="https://example.com">Link</a>
</root>
The xmlns: declarations bind a prefix to a namespace URI. The URI is a unique identifier (typically a URL, but not necessarily resolvable). The prefix (xlink, dc) is a local alias.
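A short sketch can show that the prefix really is only an alias. Python's standard `xml.etree.ElementTree` expands every prefixed name to `{namespace-uri}local-name`, so two documents that bind different prefixes to the same URI produce identical element names. The `meta` prefix and the variable names below are made up for the example.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
XLINK = "http://www.w3.org/1999/xlink"

# Same namespace URI, two different local prefixes ("dc" and the made-up "meta").
doc_a = f'<root xmlns:dc="{DC}"><dc:title>My Document</dc:title></root>'
doc_b = f'<root xmlns:meta="{DC}"><meta:title>My Document</meta:title></root>'

title_a = ET.fromstring(doc_a)[0]
title_b = ET.fromstring(doc_b)[0]
print(title_a.tag)                 # expanded name; the prefix is gone
same_name = title_a.tag == title_b.tag

# Namespaced attributes are expanded the same way.
link = ET.fromstring(f'<a xmlns:xlink="{XLINK}" xlink:href="https://example.com">Link</a>')
href = link.get(f"{{{XLINK}}}href")
print(same_name, href)
```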
SOAP envelopes use namespaces extensively:
<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ns="http://example.com/myservice">
  <soap:Body>
    <ns:GetUser>
      <ns:UserId>42</ns:UserId>
    </ns:GetUser>
  </soap:Body>
</soap:Envelope>
When converting XML with namespaces to JSON, converters typically either preserve namespace prefixes in key names (ns:GetUser → "ns:GetUser") or strip them (ns:GetUser → "GetUser"). The right choice depends on whether the downstream consumer needs to know the namespace.
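Both key-naming options can be sketched with Python's standard `xml.etree.ElementTree`, which stores names as `{uri}local`. The `NS` map, `local_name`, and `prefixed_name` below are hypothetical helpers written for this example. The prefixes in `NS` are chosen by the reader of the document and do not have to match the ones used inside it.

```python
import xml.etree.ElementTree as ET

SOAP = """<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ns="http://example.com/myservice">
  <soap:Body>
    <ns:GetUser>
      <ns:UserId>42</ns:UserId>
    </ns:GetUser>
  </soap:Body>
</soap:Envelope>"""

# Prefix-to-URI map used for lookups, plus the reverse map for rebuilding prefixes.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "ns": "http://example.com/myservice",
}
PREFIXES = {uri: prefix for prefix, uri in NS.items()}


def local_name(tag):
    """Strip the namespace: '{uri}GetUser' -> 'GetUser'."""
    return tag.rsplit("}", 1)[-1]


def prefixed_name(tag):
    """Keep the namespace as a prefix: '{uri}GetUser' -> 'ns:GetUser'."""
    if not tag.startswith("{"):
        return tag
    uri, local = tag[1:].split("}", 1)
    return f"{PREFIXES[uri]}:{local}"


envelope = ET.fromstring(SOAP)
get_user = envelope.find("soap:Body/ns:GetUser", NS)
user_id = get_user.find("ns:UserId", NS)

stripped = {local_name(get_user.tag): {local_name(user_id.tag): user_id.text}}
preserved = {prefixed_name(get_user.tag): {prefixed_name(user_id.tag): user_id.text}}
print(stripped)
print(preserved)
```

Stripping is convenient when only one vocabulary is in play. Preserving the prefix (or the full URI) is safer when two vocabularies could supply the same local name.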
CDATA Sections
CDATA sections allow you to include characters that would otherwise need to be escaped (like < and &) as literal text:
<script>
  <![CDATA[
    if (a < b && b > c) {
      alert("Hello");
    }
  ]]>
</script>
Without CDATA, `<` would need to be written as `&lt;` and `&` as `&amp;`. CDATA is common in XML files that embed HTML, JavaScript, or SQL.
When converting to JSON, CDATA content is treated as the text value of the element.
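A quick way to see this is to parse the same content written both ways. In the sketch below, which uses Python's standard `xml.etree.ElementTree` and made-up variable names, the CDATA version and the entity-escaped version yield the same text value, so a JSON converter sees no difference between them.

```python
import json
import xml.etree.ElementTree as ET

# Same script body, once inside CDATA and once with entity escapes.
with_cdata = ET.fromstring(
    '<script><![CDATA[if (a < b && b > c) { alert("Hello"); }]]></script>'
)
escaped = ET.fromstring(
    '<script>if (a &lt; b &amp;&amp; b &gt; c) { alert("Hello"); }</script>'
)

same_text = with_cdata.text == escaped.text
as_json = json.dumps({"script": with_cdata.text})
print(same_text)
print(as_json)
```

CDATA is therefore a convenience for whoever writes the XML, not extra information for whoever reads it.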
Schema Validation: DTD vs XSD vs RelaxNG
XML has three major schema languages:
**DTD (Document Type Definition)** — the oldest, limited to structural validation, cannot validate data types. Most XML validators support it.
**XSD (XML Schema Definition)** — the most widely used. Supports data types (integer, date, boolean, decimal), inheritance, complex type definitions. Required for SOAP/WS-* services.
**RELAX NG** — generally considered simpler to read and write than XSD, and available in both an XML syntax and a compact non-XML syntax, but less widely supported. Popular in open-source tooling.
When working with enterprise APIs (banks, insurance, government), you'll often receive an XSD and need to produce XML that validates against it. This is a different concern from formatting: a well-formed, cleanly formatted XML file can still be rejected by an XSD validator, for example when its elements appear in a different order than an `xs:sequence` constraint requires.
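Python's standard library has no XSD validator, so the sketch below is only a hand-rolled stand-in for the ordering rule that an `xs:sequence` enforces. It is not real schema validation. `EXPECTED_ORDER` and `matches_sequence` are hypothetical names, and the product documents are made up. The point is that whitespace and indentation do not affect the result, while element order does.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an xs:sequence: children must appear in this order.
EXPECTED_ORDER = ["name", "price"]


def matches_sequence(xml_text, expected=EXPECTED_ORDER):
    """Return True when the root's child element names equal the expected order."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == expected


compact = "<product><name>Developer Handbook</name><price>49.99</price></product>"
pretty = """<product>
  <name>Developer Handbook</name>
  <price>49.99</price>
</product>"""
reordered = "<product><price>49.99</price><name>Developer Handbook</name></product>"

for label, text in [("compact", compact), ("pretty", pretty), ("reordered", reordered)]:
    print(label, matches_sequence(text))
```

For actual XSD validation you would hand the schema file and the document to a schema-aware library or tool rather than checking order by hand.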
Well-Formed vs Valid
These are distinct concepts:
**Well-formed** — the XML follows basic XML syntax rules: one root element, all tags closed, attributes quoted, no illegal characters. A well-formed XML document can be parsed by any XML parser.
**Valid** — the XML conforms to a specific schema (DTD, XSD, or RelaxNG). A valid document is always well-formed; a well-formed document is not necessarily valid.
When DevConvert formats or converts XML, it requires well-formedness. Schema validation requires the schema file.
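Well-formedness can be checked with nothing but a parser. The sketch below uses Python's standard `xml.etree.ElementTree`. The `well_formedness_error` helper is a hypothetical name for this example and says nothing about how DevConvert performs its own check.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if the text parses as XML, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


samples = {
    "ok": "<book><title>Clean Code</title></book>",
    "unclosed tag": "<book><title>Clean Code</book>",
    "two roots": "<book/><book/>",
    "unquoted attribute": "<book id=42/>",
}

for label, text in samples.items():
    print(f"{label}: {well_formedness_error(text) or 'well-formed'}")
```

A document that passes this check may still be invalid against its schema. The parser has no knowledge of which elements or attributes the schema allows.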
When XML Is Still the Right Choice in 2026
- **SOAP/WS-* APIs** — financial services, insurance, telecoms, government services that predate REST
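- **EDI** — electronic data interchange in supply chain and healthcare, where XML message formats are used alongside or in place of the older X12 and EDIFACT syntaxes (which are not XML themselves)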
- **Office document processing** — reading or generating DOCX, XLSX, PPTX programmatically
- **SVG** — scalable vector graphics are XML; if you're generating or processing SVG programmatically, you work with XML
- **RSS/Atom** — if you're building or consuming feed readers or podcast tools
- **Android development** — layout files, string resources, AndroidManifest.xml
- **Java enterprise** — Spring configuration, Maven pom.xml, legacy application servers
For new APIs and data interchange between modern services: use JSON. For the cases above: XML is not a legacy mistake, it's the right tool for its domain.