Why DevOps Chose YAML Over JSON for Configuration
The historical and practical reasons YAML became the dominant configuration format for Kubernetes, Docker Compose, GitHub Actions, Ansible, and Helm — and the tradeoffs that come with it.
If you've worked with Kubernetes, Docker Compose, GitHub Actions, Ansible, Helm, CircleCI, Travis CI, or Serverless Framework, you've written YAML. A lot of YAML.
This isn't a coincidence. The DevOps ecosystem converged almost universally on YAML for configuration, despite YAML's known problems and JSON's simpler grammar. Understanding why helps you work with both effectively.
The Case Against JSON for Config Files
JSON is excellent for data interchange. It's strict, unambiguous, and supported everywhere. But it has three properties that make it painful for config files:
**No comments.** Config files aren't just data — they're documentation. You want to explain why a timeout is 30 seconds, why a replica count is 5, or why a specific environment variable is set. JSON gives you no mechanism for this.
**Verbosity.** JSON requires double quotes around every key and string value, commas between items, and curly braces for every object. For deeply nested config structures like a Kubernetes pod spec, this adds enormous visual noise.
**No multi-line strings.** Writing a multi-line shell script or a SQL query as a JSON string requires manual \n escaping. It's painful to write and painful to read.
Why YAML Won
YAML solved all three:
    # This timeout was increased from 10s after the K8s incident in Q3 2024
    timeout: 30s
    script: |
      #!/bin/bash
      echo "Starting deployment"
      kubectl apply -f ./manifests/
      kubectl rollout status deployment/my-app
The # comment explains a non-obvious value. The | block scalar lets you write a real multi-line script without escaping. No quotes on simple values. No commas.
For config files that humans write, review in pull requests, and maintain over years, YAML's readability is a significant advantage.
The Kubernetes Effect
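Kubernetes, which launched in 2014 and had become the dominant container orchestrator by about 2017, used YAML for its manifest files (the API also accepts JSON, but the documentation and examples are written in YAML). Because so much of the cloud-native ecosystem builds on Kubernetes, the tools around it had to speak Kubernetes YAML.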
Helm (the Kubernetes package manager) uses YAML. Kustomize uses YAML. ArgoCD uses YAML. Flux uses YAML. The entire cloud-native ecosystem standardised on YAML because Kubernetes did.
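Docker Compose, whose predecessor Fig predates Kubernetes, also used YAML, reinforcing the pattern. GitHub Actions settled on YAML for workflow files as well, joining CI services such as Travis CI and CircleCI that already used it, so YAML became the expected format for CI/CD pipelines too.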
Ansible's Contribution
Ansible, first released in 2012 and one of the most widely used configuration management tools by the mid-2010s, made a different choice from Puppet (its own declarative DSL) and Chef (a Ruby DSL): it used YAML for playbooks.
The reasoning: YAML is more approachable for sysadmins who know the systems but aren't software developers. A YAML playbook reads like a list of tasks; a Ruby DSL reads like code.
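To illustrate the "list of tasks" point, here is a minimal playbook sketch. The host group `webservers` is a hypothetical inventory name, and the play assumes a Debian-family target so that the `ansible.builtin.apt` module applies.

```yaml
# Hypothetical play: install nginx and make sure it is running
- name: Install and start nginx
  hosts: webservers          # assumed inventory group
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.apt:
        name: nginx
        state: present

    - name: Ensure nginx is running and enabled at boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```

Each task is a name plus a module call with parameters, which is what makes a playbook read more like a checklist than a program.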
This established YAML as the choice for infrastructure-as-code that non-developers might read or write — a pattern that Kubernetes inherited.
The YAML Tradeoffs
The DevOps ecosystem's YAML commitment comes with real costs:
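**Significant whitespace bugs.** Indentation is structure in YAML, so a misindented line can silently produce a different, still valid document. Tabs are not allowed for indentation at all, so a single stray tab typically causes a parse error. Tools like yamllint exist specifically to catch these issues.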
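A small sketch of how an indentation slip changes meaning without causing an error. The keys are modelled loosely on a container spec and are illustrative only; the two documents are separated by `---` so the fragment stays valid as a single file.

```yaml
# Intended: "resources" belongs to the container
containers:
  - name: web
    image: nginx
    resources:
      limits:
        memory: 256Mi
---
# "resources" indented too shallowly: still valid YAML, but it is now a
# top-level key next to "containers", and the container has no limits
containers:
  - name: web
    image: nginx
resources:
  limits:
    memory: 256Mi
```

Both documents parse. Only a schema check or a careful review would notice that the second one says something different.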
**The type inference problem.** YAML tries to be helpful by inferring types for unquoted values. Under YAML 1.1 rules, which many parsers still follow, yes is true and no is false; 3.0 is a float; and 1e2 is read as scientific notation by some parsers. This creates subtle bugs. The classic: setting a version to 1.10 and having it parsed as the float 1.1 instead of the string "1.10".
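The fragment below sketches how unquoted values are typically resolved by a parser that follows YAML 1.1 rules (PyYAML's default behaviour is one example). Parsers that implement the YAML 1.2 core schema treat `yes` and `NO` as plain strings, so the results depend on the parser. The key names are illustrative.

```yaml
# Unquoted scalars, as a YAML 1.1 style parser would typically resolve them
version: 1.10      # float 1.1 (the trailing zero is gone)
country: NO        # boolean false
enabled: yes       # boolean true
---
# Quoted scalars are always strings
version: "1.10"
country: "NO"
enabled: "yes"     # still a string; write true/false for an actual boolean
```

Quoting anything that looks like a number or a boolean but is meant as text is the usual defence.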
**YAML anchors and aliases.** These are YAML-specific features (& to define an anchor, * to reference it) that reduce duplication in configs. But they're not in JSON, which means tools that convert YAML → JSON must resolve them inline.
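A minimal sketch of an anchor and two aliases. The key names and values are hypothetical.

```yaml
# Define the shared mapping once and give it the anchor name "common_env"
common_env: &common_env
  LOG_LEVEL: info
  REGION: eu-west-1

web:
  env: *common_env     # alias: refers to the mapping defined above
worker:
  env: *common_env     # same mapping again, with no copy in the source
```

A YAML → JSON converter has no way to express the reference, so it would be expected to write the `LOG_LEVEL`/`REGION` mapping out in full under both `web.env` and `worker.env` (and once more under `common_env`).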
**Multiple valid syntaxes.** YAML has block style and flow style. Both are valid. A YAML file can mix them. This creates inconsistency in configs that are edited by multiple people.
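For example, the two documents below describe the same data, first in block style and then in flow style. The keys are illustrative.

```yaml
# Block style: structure comes from indentation
ports:
  - 80
  - 443
labels:
  app: web
  tier: frontend
---
# Flow style: structure comes from brackets and braces
ports: [80, 443]
labels: {app: web, tier: frontend}
```

Agreeing on one style per repository, and enforcing it with a linter, avoids diffs that change only the notation.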
JSON for Config: When It's Better
Despite YAML's dominance, JSON is sometimes the right choice for config:
**When your config is programmatically generated.** If a script or tool writes the config, JSON's strictness makes encoding errors visible. Generated YAML with incorrect indentation can still parse, just into the wrong structure; generated JSON with a syntax error fails immediately.
**When non-YAML tooling consumes it.** tsconfig.json, package.json, and .eslintrc.json are JSON because the JavaScript ecosystem reads JSON natively. Mixing in YAML would require additional parsers.
**When you don't need comments or multi-line strings.** If the config is simple and flat, JSON's simplicity can be an advantage.
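**When schema validation matters.** JSON Schema tooling is mature and widely supported, and YAML has no equally established schema language of its own; YAML configs are usually validated by parsing them and applying JSON Schema to the result. If you're building a tool that needs to validate user-provided config strictly, JSON + JSON Schema is the better-supported path.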
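As a rough illustration of what strict validation looks like, here is a hypothetical JSON Schema for a two-field config. It is written in YAML syntax only for consistency with the other examples; the same schema is more commonly stored as a .json file. The field names `timeout` and `replicas` and their constraints are assumptions made for the sketch.

```yaml
# Hypothetical JSON Schema (draft 2020-12) for a small config file
$schema: "https://json-schema.org/draft/2020-12/schema"
type: object
required: [timeout, replicas]
additionalProperties: false    # reject unknown or misspelled keys
properties:
  timeout:
    type: string
    pattern: "^[0-9]+s$"       # e.g. "30s"
  replicas:
    type: integer
    minimum: 1
```

A schema like this catches misplaced keys and wrongly inferred types in the parsed data, whichever format the config was written in.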
Converting Between Them
The practical reason to convert JSON → YAML: you have a JSON config from a tool or API and need to put it in a Kubernetes or GitHub Actions YAML file.
The practical reason to convert YAML → JSON: you're writing code that reads a YAML config and needs to pass it to a JSON-consuming API, or you want to inspect a YAML file's structure as JSON to run a JSON Schema validator against it.
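The data itself survives both conversions for the types the two formats share: strings, numbers, booleans, null, lists, and maps. What doesn't survive YAML → JSON is presentation: comments are dropped and anchors/aliases are expanded in place. A round trip back to YAML therefore gives you equivalent data, not the original file.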
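A sketch of what to expect from a YAML → JSON conversion, shown as two documents in one file. The second document is written in JSON syntax, which YAML 1.2 was designed to accept as valid YAML. The exact formatting a converter produces will vary; the values here are illustrative.

```yaml
# Input: YAML with a comment and a block scalar
timeout: 30s   # raised from 10s (this comment will not survive)
script: |
  echo "Starting deployment"
  kubectl apply -f ./manifests/
---
# Expected result of the conversion, in JSON syntax
{
  "timeout": "30s",
  "script": "echo \"Starting deployment\"\nkubectl apply -f ./manifests/\n"
}
```

The data is the same in both forms, but the comment is gone and the multi-line script has become a single escaped string, including the trailing newline that the `|` block scalar keeps by default.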