How to Convert YAML to JSON: Common Gotchas to Avoid
Why YAML and JSON Aren't as Interchangeable as They Look
YAML and JSON have a close relationship: YAML 1.2 is designed as a superset of JSON, so valid JSON is also valid YAML. That sounds reassuring until you start converting real-world YAML files and hit your first silent data corruption. The two formats were designed with different priorities. JSON was built for machines: strict syntax, no ambiguity, no comments. YAML was built for humans: indentation-based structure, multi-line strings, inline comments, and a type inference system that tries to guess what you mean.

That last feature is exactly where conversions go wrong. When a YAML parser reads the unquoted value yes, it may interpret it as the boolean true rather than the string 'yes' (YAML 1.1 parsers do this; YAML 1.2 parsers do not). When it reads an unquoted 1.0, it produces a float instead of a string. These aren't bugs; they're the spec working as intended. The problem is that JSON has no equivalent ambiguity. Once your YAML value becomes a boolean true in the parsed tree, the JSON output will faithfully write true, and the string you started with is gone.

If you're converting configuration files for a Kubernetes cluster, an OpenAPI spec, or a CI/CD pipeline definition, these silent type coercions can break downstream tooling without throwing a single error. Understanding the structural and semantic differences between the two formats is the only way to convert reliably.
The Fastest Way to Convert: Using CocoConvert
For straightforward conversions, the quickest path is to use a dedicated tool rather than writing a one-off script. CocoConvert's [YAML to JSON converter](/convert/yaml-to-json) handles the parsing and serialization for you, with output that is properly formatted and UTF-8 encoded by default. The workflow is simple: paste your YAML content into the input panel or upload a .yaml or .yml file, then click Convert. The result appears in the output panel and can be copied to the clipboard or downloaded as a .json file.

A few things are worth knowing about how CocoConvert handles the conversion. It uses YAML 1.2 parsing rules, which means the Norway Problem (where an unquoted NO was parsed as false under YAML 1.1) does not apply. Indentation errors in your source file will surface as a parse error with a line number rather than producing silently malformed output. Multi-document YAML files (files that contain multiple documents separated by '---' markers) are converted to a JSON array, with each document becoming one element. That behavior is intentional and matches what most developers expect, but it's worth knowing before you process a file that has a leading '---' separator and then wonder why your output is wrapped in an array.

One honest limitation: CocoConvert does not currently support YAML anchors and aliases that reference nodes across multiple documents in the same file. If your file uses cross-document anchors, you'll need to resolve them manually or use a local script before uploading.
YAML Type Coercion: The Gotchas That Bite Hardest
Type coercion is the single most common source of data loss during YAML-to-JSON conversion. Here are the specific cases you need to audit before converting any production file.

**Booleans from unexpected strings.** Under YAML 1.1 (still the version implemented by many widely used parsers, including PyYAML), the unquoted values yes, no, on, off, true, false, and their capitalized and uppercase variants are all parsed as booleans. The YAML 1.2 core schema restricts this to true and false. The result therefore depends on which parser reads the file: the same unquoted yes is a boolean to a 1.1 parser and the string 'yes' to a 1.2 parser. Always check your source file's origin, and quote any value that must stay a string.

**Octal integers.** Under YAML 1.1, the value 0755 is parsed as an octal integer, which is 493 in decimal. (The YAML 1.2 core schema writes octal as 0o755 and reads 0755 as the decimal integer 755.) This is a well-known trap in Kubernetes manifests where file permission modes are written as octal literals. After conversion, your JSON will contain a number such as 493, not the string '0755'. If that value feeds into a chmod call downstream, the resulting permissions can be wrong and the error will be silent.

**Floating-point edge cases.** YAML supports .inf, -.inf, and .nan as valid float values. JSON has no equivalent. CocoConvert will convert these to the strings 'Infinity', '-Infinity', and 'NaN' respectively, because there is no standards-compliant JSON representation for them. If your application is strict about types, this will require post-processing.

**Null representations.** YAML allows null, ~, and an empty value to all represent null. JSON only has null. The conversion itself is lossless here, but be aware that an empty YAML value (a key with nothing after the colon) will produce a JSON null, not an empty string.
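One way to run that audit before converting is a rough text scan for unquoted values that fall into the categories above. The sketch below is an assumption-heavy heuristic, not a YAML parser: it only looks at simple single-line `key: value` pairs, the function name `audit_yaml_text` is hypothetical, and it will miss values inside flow collections or block scalars.

```python
import re

# Unquoted scalars that YAML 1.1 parsers read as booleans (compared case-insensitively).
YAML11_BOOLS = {"yes", "no", "on", "off"}
# Leading-zero integers such as 0755, which YAML 1.1 reads as octal.
LEADING_ZERO_INT = re.compile(r"^0[0-7]+$")
# Non-finite floats, which JSON cannot represent.
NON_FINITE = re.compile(r"^[-+]?\.(inf|nan)$", re.IGNORECASE)
# Simple "key: value" lines (optionally a list item); trailing comments are dropped.
KEY_VALUE = re.compile(r"^\s*(?:-\s+)?[^#\s][^:]*:\s+(?P<value>[^#]+?)\s*(?:#.*)?$")


def audit_yaml_text(text):
    """Return (line_number, value, reason) for each risky unquoted scalar."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = KEY_VALUE.match(line)
        if not match:
            continue
        value = match.group("value")
        if value[0] in "\"'":
            continue  # quoted scalars always stay strings
        if value.lower() in YAML11_BOOLS:
            findings.append((number, value, "boolean under YAML 1.1"))
        elif LEADING_ZERO_INT.match(value):
            findings.append((number, value, "octal under YAML 1.1"))
        elif NON_FINITE.match(value):
            findings.append((number, value, "no JSON equivalent"))
    return findings
```

Calling `audit_yaml_text(open("config.yaml", encoding="utf-8").read())` returns a list you can review by hand; anything it flags is a candidate for quoting in the source file.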
Handling Multi-Line Strings and Comments
YAML has two multi-line string syntaxes that JSON cannot natively represent in the same way: literal block scalars (introduced with the '|' character) and folded block scalars (introduced with '>'). A literal block scalar preserves newlines exactly as written. A folded block scalar converts each single newline between lines to a space and turns a blank line into a single newline. Both map to a single JSON string, but the newline handling differs, and that matters if the string contains structured content like a shell script, a SQL query, or a certificate. For example, this YAML:

```yaml
script: |
  echo hello
  echo world
```

becomes this JSON:

```json
{"script": "echo hello\necho world\n"}
```

The trailing newline is preserved because a block scalar keeps one final newline by default. If you use the strip chomping indicator '|-', the trailing newline is removed. Getting this wrong can cause issues when the string is used as a shell script or passed to an API that is whitespace-sensitive.

Comments are a harder problem. YAML supports inline and standalone comments with the '#' character. JSON has no comment syntax at all. This means every comment in your YAML file is permanently discarded during conversion. If your YAML configuration file uses comments to document why a specific value is set (which is extremely common in infrastructure-as-code files), that documentation is gone from the JSON output. There is no workaround for this within the JSON format itself. Some teams address it by moving to JSONC (JSON with Comments) or by keeping the YAML as the source of truth and treating the JSON as a build artifact.
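To make the newline rules for block scalars concrete, here is a small model of literal and folded scalars in plain Python. It is a simplified sketch rather than a parser: the function names `literal`, `folded`, and `_chomp` are hypothetical, the input is assumed to be the already de-indented body of the scalar, and the special handling YAML gives to more-indented lines inside a folded scalar is ignored.

```python
import json
import re


def literal(body, chomp=""):
    """Model a '|' block scalar: line breaks inside the body are kept as written."""
    return _chomp(body, chomp)


def folded(body, chomp=""):
    """Model a '>' block scalar for lines at the same indentation.

    A single line break becomes a space; a run of n line breaks
    (that is, n - 1 blank lines) becomes n - 1 newlines.
    """
    stripped = body.rstrip("\n")
    trailing = body[len(stripped):]
    joined = re.sub(
        r"\n+",
        lambda m: " " if len(m.group()) == 1 else "\n" * (len(m.group()) - 1),
        stripped,
    )
    return _chomp(joined + trailing, chomp)


def _chomp(text, chomp):
    """Apply a chomping indicator: '' (clip, the default), '-' (strip), '+' (keep)."""
    stripped = text.rstrip("\n")
    if chomp == "-":
        return stripped
    if chomp == "+":
        return text
    # Clip: keep exactly one final line break if the body had one.
    return stripped + "\n" if text.endswith("\n") else stripped


body = "echo hello\necho world\n"
print(json.dumps({"script": literal(body)}))       # {"script": "echo hello\necho world\n"}
print(json.dumps({"script": literal(body, "-")}))  # {"script": "echo hello\necho world"}
print(json.dumps({"script": folded(body)}))        # {"script": "echo hello echo world\n"}
```

The folded result is the one to watch: a two-line shell script becomes a single line, which usually changes what it does.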
Anchors, Aliases, and Merge Keys
YAML's anchor and alias system is one of its most useful features for reducing repetition, and it's one of the trickiest things to handle correctly when converting to JSON. An anchor is defined with '&anchor-name' and referenced with '*anchor-name'. When the YAML parser processes the file, it resolves each alias to the anchored node as it builds the data tree. This means the JSON output will contain the fully expanded content with no reference to the original anchor. Consider this YAML:

```yaml
defaults: &defaults
  timeout: 30
  retries: 3
production:
  <<: *defaults
  host: prod.example.com
staging:
  <<: *defaults
  host: staging.example.com
```

The '<<' syntax is a YAML merge key. After conversion, the JSON output will be:

```json
{
  "defaults": {"timeout": 30, "retries": 3},
  "production": {"timeout": 30, "retries": 3, "host": "prod.example.com"},
  "staging": {"timeout": 30, "retries": 3, "host": "staging.example.com"}
}
```

The expansion is correct and complete. The downside is that any deduplication benefit from anchors is lost: if you have 50 services all inheriting from the same defaults anchor, the JSON will repeat those defaults 50 times. For machine consumption this is fine; for human readability or file size concerns it may matter.

Merge key support (the '<<' operator) is technically a YAML extension: it was defined as an optional type for YAML 1.1 and is not part of the YAML 1.2 core schema, so some strict parsers reject it. CocoConvert supports merge keys. If you're writing your own conversion script using a library like Python's PyYAML, load with yaml.full_load() or yaml.safe_load(), both of which support merge keys, and avoid yaml.load() without a Loader argument, which has been deprecated for security reasons since PyYAML 5.1 and raises an error as of PyYAML 6.0.
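The effect of a merge key can be modeled on plain Python dictionaries, which makes the expansion easier to reason about. This is an illustrative sketch only: the helper name `merge` is hypothetical, and the `timeout: 60` override on staging is an added assumption (it is not in the YAML above) included to show that keys written directly in a mapping take precedence over merged ones.

```python
import json


def merge(base, overrides):
    """Model '<<: *base': copy the anchored mapping, then apply the local keys."""
    return {**base, **overrides}


defaults = {"timeout": 30, "retries": 3}

config = {
    "defaults": defaults,
    "production": merge(defaults, {"host": "prod.example.com"}),
    # Hypothetical override: a local key wins over the merged default.
    "staging": merge(defaults, {"host": "staging.example.com", "timeout": 60}),
}

# Every mapping is written out in full; JSON has no way to reference "defaults".
print(json.dumps(config, indent=2))
```

Because each merged mapping is a separate copy, editing one service's values in the JSON afterward does not touch the others, which is exactly the deduplication that anchors provided and the JSON loses.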
Converting YAML to JSON Programmatically
When you need to convert files in bulk, integrate conversion into a build pipeline, or process YAML files programmatically, a command-line or scripted approach is more practical than a web tool. Here are the most reliable methods.

**Python (most portable option):**

```python
import json
import sys

import yaml  # PyYAML

# Convert the YAML file given as the first argument to JSON on stdout.
with open(sys.argv[1], "r", encoding="utf-8") as f:
    data = yaml.safe_load(f)
print(json.dumps(data, indent=2, ensure_ascii=False))
```

Use yaml.safe_load(), not yaml.load(). The safe variant disables Python-specific object deserialization that could execute arbitrary code in malicious YAML files. The ensure_ascii=False parameter preserves Unicode characters rather than escaping them to \uXXXX sequences.

**Node.js:**

```javascript
const fs = require('fs');
const yaml = require('js-yaml');

// Parse the YAML file given as the first argument and print it as JSON.
const data = yaml.load(fs.readFileSync(process.argv[2], 'utf8'));
console.log(JSON.stringify(data, null, 2));
```

js-yaml parses numbers according to YAML 1.2 from version 4.0 onward. If you're on an older version, check your package.json: versions below 4.0 parse numbers with YAML 1.1 rules (so 0755 is read as octal), and their load() function is not safe for untrusted input, so use safeLoad() there.

**yq (command-line tool):**

```bash
yq eval -o=json '.' input.yaml > output.json
```

yq (Mike Farah's Go implementation, version 4) is a purpose-built YAML processor that supports JSON output with a single flag. It handles multi-document files, anchors, and merge keys correctly. Install it via Homebrew on macOS ('brew install yq') or download the binary from the GitHub releases page for Linux and Windows.

For one-off conversions without installing anything locally, the [CocoConvert YAML to JSON tool](/convert/yaml-to-json) remains the fastest option.
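One gap worth closing in a Python conversion script: PyYAML turns .inf and .nan into Python floats and unquoted dates into date objects, and `json.dumps` either writes non-standard `Infinity`/`NaN` tokens or raises a TypeError on the dates. The sketch below shows one possible pre-processing step using only the standard library. The names `to_json_safe` and `dump_strict` are hypothetical, and the choice to emit the strings 'Infinity', '-Infinity', and 'NaN' simply mirrors the CocoConvert behavior described earlier; your application may prefer null or an error.

```python
import datetime
import json
import math


def to_json_safe(value):
    """Recursively replace values that strict JSON cannot represent."""
    if isinstance(value, float) and not math.isfinite(value):
        if math.isnan(value):
            return "NaN"
        return "Infinity" if value > 0 else "-Infinity"
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()  # e.g. a YAML timestamp parsed into a date
    if isinstance(value, dict):
        return {key: to_json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(item) for item in value]
    return value


def dump_strict(data):
    """Serialize to JSON; allow_nan=False raises ValueError if a non-finite float slips through."""
    return json.dumps(to_json_safe(data), indent=2, ensure_ascii=False, allow_nan=False)
```

In the script above, this would replace the final line with `print(dump_strict(data))`.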
Validating Your Output Before Using It
Converting a file without validating the output is how subtle bugs make it into production. A JSON file can be syntactically valid while still containing semantically wrong data; the type coercions described earlier are a good example. Here's a practical validation checklist.

**Syntax validation.** Run your output through a JSON linter. Most code editors (VS Code, JetBrains IDEs) will flag syntax errors automatically when you open a .json file. From the command line, Python's built-in json.tool is reliable: 'python3 -m json.tool output.json > /dev/null' exits with code 0 on valid JSON and prints the error location on failure.

**Schema validation.** If you have a JSON Schema for your target format (OpenAPI specs, AWS CloudFormation templates, and Kubernetes CRDs all publish schemas), validate against it using a tool like ajv-cli: 'ajv validate -s schema.json -d output.json'. This will catch type mismatches that a syntax check would miss.

**Diff against expected output.** If you're converting a file that has a known-good JSON equivalent, diff the two files after normalizing key order. The jq tool can sort keys deterministically: 'jq --sort-keys . output.json > normalized.json'. Key ordering in JSON objects is not semantically significant, but unsorted keys make diffs noisy.

**Check numeric types explicitly.** If your YAML file contained values like '1.0' or '0755' that might have been coerced, grep the JSON output for the expected strings. A quick 'grep -n "0755" output.json' will immediately tell you whether the octal literal survived as a string or was converted to an integer.

Taking five minutes to validate output before committing it to a repository or deploying it to a service is almost always faster than debugging a production incident caused by a boolean that should have been a string.
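When grep is too coarse, the type check can be scripted. The following sketch walks a parsed JSON document and reports the JSON type at every leaf, then compares those against a hand-written map of expectations. The function names `leaf_types` and `unexpected_types` and the `$.key[index]` path notation are assumptions made for illustration, not part of any standard tool.

```python
import json


def leaf_types(value, path="$"):
    """Yield (path, JSON type name) for every scalar in a parsed JSON document."""
    if isinstance(value, dict):
        for key, item in value.items():
            yield from leaf_types(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from leaf_types(item, f"{path}[{index}]")
    elif isinstance(value, bool):  # must come before int: bool is an int subclass
        yield path, "boolean"
    elif value is None:
        yield path, "null"
    elif isinstance(value, (int, float)):
        yield path, "number"
    else:
        yield path, "string"


def unexpected_types(document, expected):
    """Return {path: (wanted, actual)} for each expected path whose type differs."""
    actual = dict(leaf_types(document))
    return {
        path: (wanted, actual.get(path, "missing"))
        for path, wanted in expected.items()
        if actual.get(path) != wanted
    }


document = json.loads('{"country": false, "mode": 493, "name": "api"}')
expected = {"$.country": "string", "$.mode": "string", "$.name": "string"}
print(unexpected_types(document, expected))
```

Here the country code and the file mode are both reported as mismatches, which are the boolean and octal coercions the checklist is meant to catch.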