What Is Zstandard (zst)? Facebook's Compression Algorithm

2026-05-17 8 min read

The Short Version: What Zstandard Actually Is

Zstandard, usually shortened to zstd and identified by the .zst file extension, is a lossless data compression algorithm developed by Yann Collet at Facebook (now Meta) and released as open source in 2016. The Linux kernel has included it since version 4.14. Facebook uses it across its own infrastructure to compress everything from database snapshots to log files at massive scale. The format is documented in RFC 8878, an informational IETF specification.

At its core, zstd is a general-purpose compressor that combines LZ77-style dictionary matching with a modern entropy coder based on Asymmetric Numeral Systems (ANS). What that means in practice: it compresses and decompresses data very fast, usually without giving up much in final file size compared to slower algorithms.

A .zst file is a single compressed stream. You will also encounter .tar.zst, which is a tar archive compressed with zstd, the same concept as .tar.gz or .tar.bz2. If you downloaded a Linux package, a database backup, or a large dataset from a research repository recently, there is a reasonable chance it arrived as a .zst or .tar.zst file.
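A file extension can be wrong, so it sometimes helps to check the first bytes instead. The sketch below assumes the frame magic number 0xFD2FB528 given in RFC 8878, stored little-endian at the start of a standard frame. The function names are made up for illustration, and the check deliberately ignores skippable frames, which use a different magic value.

```python
import struct

# Magic number of a standard Zstandard frame (RFC 8878), stored little-endian,
# so the bytes on disk are 28 B5 2F FD.
ZSTD_MAGIC = 0xFD2FB528


def looks_like_zstd(header: bytes) -> bool:
    """Return True if the first four bytes match the Zstandard frame magic."""
    if len(header) < 4:
        return False
    (magic,) = struct.unpack("<I", header[:4])
    return magic == ZSTD_MAGIC


def file_looks_like_zstd(path: str) -> bool:
    """Read only the first four bytes of a file and test them."""
    with open(path, "rb") as f:
        return looks_like_zstd(f.read(4))
```

Calling `file_looks_like_zstd("backup.zst")` on a hypothetical file is a cheap sanity check before handing it to a decompressor. It says nothing about whether the rest of the stream is intact.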

How Zstandard Compares to gzip, bzip2, and xz

Compression tools are usually judged on three axes: compression ratio (how small the output gets), compression speed, and decompression speed. Zstandard was explicitly designed to beat gzip on all three simultaneously, which sounds implausible but largely holds up in benchmarks.

Facebook's own published benchmarks on the Silesia corpus, a standard test set used across the compression community, show zstd at its fastest standard level (level 1) achieving a compression ratio of roughly 2.884x at about 500 MB/s compression speed and over 1,600 MB/s decompression speed. zlib, the library behind gzip, at level 1 achieves around 2.743x at about 90 MB/s compression and 400 MB/s decompression. So at comparable settings zstd is several times faster in both directions and compresses slightly better.

Bzip2 generally gets a better ratio than gzip but is dramatically slower, often under 20 MB/s for compression. xz at its default settings compresses tighter still but can run at under 10 MB/s, making it impractical for anything time-sensitive.

The interesting wrinkle is that zstd has 22 standard compression levels, with level 3 as the command-line default. At level 1, it prioritizes speed above everything else, which is useful for real-time compression of network traffic. At the top of the range, levels 19 to 22, it starts to rival xz's ratio while still decompressing faster; levels 20 to 22 are the ultra range and need the `--ultra` flag because they use more memory. Most users and systems never leave the level 3 to 9 range. On Linux, you can specify the level directly: `zstd -9 myfile.tar` generally produces a smaller file than `zstd -3 myfile.tar`, at the cost of more CPU time during compression.
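The three axes above are easy to measure on your own data. The harness below is a minimal sketch that uses only the compressors in Python's standard library (zlib for gzip-style DEFLATE, bz2, and lzma for xz). The log-like sample input is invented for illustration, so the numbers it prints say nothing about the Silesia corpus or about zstd itself.

```python
import bz2
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip; report ratio and MB/s."""
    start = time.perf_counter()
    packed = compress(data)
    mid = time.perf_counter()
    restored = decompress(packed)
    end = time.perf_counter()
    assert restored == data  # lossless round trip
    megabytes = len(data) / 1e6
    return {
        "name": name,
        "ratio": len(data) / len(packed),
        "compress_mb_s": megabytes / max(mid - start, 1e-9),
        "decompress_mb_s": megabytes / max(end - mid, 1e-9),
    }


# Hypothetical sample input: roughly 1 MB of repetitive log-like text.
sample = b"".join(
    b"2024-01-01 12:00:%02d INFO request id=%d status=200\n" % (i % 60, i)
    for i in range(20_000)
)

results = [
    measure("zlib -6", lambda d: zlib.compress(d, 6), zlib.decompress, sample),
    measure("bz2 -9", lambda d: bz2.compress(d, 9), bz2.decompress, sample),
    measure("xz -6", lambda d: lzma.compress(d, preset=6), lzma.decompress, sample),
]

for r in results:
    print(f"{r['name']:8s} ratio={r['ratio']:.2f} "
          f"compress={r['compress_mb_s']:.0f} MB/s "
          f"decompress={r['decompress_mb_s']:.0f} MB/s")
```

To put zstd in the same table you would add one more `measure(...)` call backed by a zstd binding. That part is left out here because it depends on a third-party package being installed.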

Dictionary Training: The Feature Most People Skip

One of zstd's less-publicized but genuinely powerful features is dictionary compression. Standard compression works by finding repeated patterns within the file being compressed. For small files, say a 2 KB JSON payload, there simply is not enough data for the algorithm to build up a useful internal model of patterns, so the compression ratio is poor, and the compressed output can even end up larger than the original.

Dictionary training solves this. You feed zstd a representative sample of your data (hundreds or thousands of similar small files) and it generates a dictionary file that captures common patterns. Both the compressor and decompressor then reference this shared dictionary. Facebook reported using this technique to achieve 6x compression on small JSON blobs that would otherwise compress to nearly their original size.

In practice, you train a dictionary like this on the command line: `zstd --train /path/to/samples/* -o mydict.zst-dict`. Then compress with `zstd -D mydict.zst-dict smallfile.json`. The decompressor needs the same dictionary file, passed with the same `-D` option, which is the main operational constraint: you have to distribute or store the dictionary alongside your compressed data.

This feature is most relevant for database engineers, backend developers compressing API responses, and anyone dealing with large volumes of structurally similar small files. For typical end users compressing a folder of photos or documents, standard zstd without a dictionary is perfectly adequate.
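The shared-dictionary idea can be tried without installing anything, because zlib in Python's standard library accepts a preset dictionary through its `zdict` parameter. To be clear, this is a stand-in technique, not zstd: zlib takes raw sample bytes as the dictionary, whereas `zstd --train` builds an optimized one. The payload shape and helper names below are invented for illustration.

```python
import json
import zlib


def make_payload(i: int) -> bytes:
    """Hypothetical small JSON record, similar in shape to an API response."""
    record = {
        "user_id": i,
        "status": "active",
        "roles": ["reader"],
        "email": f"user{i}@example.com",
    }
    return json.dumps(record).encode()


# Stand-in for a trained dictionary: raw bytes from representative samples.
shared_dict = b"".join(make_payload(i) for i in range(20))


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    compressor = zlib.compressobj(level=6, zdict=zdict)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    # The decompressor must be given exactly the same dictionary bytes.
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


payload = make_payload(12345)
plain = zlib.compress(payload, 6)
with_dict = compress_with_dict(payload, shared_dict)

print(f"original={len(payload)} B, no dictionary={len(plain)} B, "
      f"with dictionary={len(with_dict)} B")
assert decompress_with_dict(with_dict, shared_dict) == payload
```

The operational constraint is the same as with zstd: without `shared_dict`, the compressed payload cannot be decoded, so the dictionary has to be versioned and shipped with the data.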

Where You Actually Encounter .zst Files

Zstandard has moved well beyond Facebook's internal infrastructure. Here are the concrete places you are likely to run into .zst files:

**Linux package managers.** Arch Linux switched its package format from .tar.xz to .tar.zst in 2020, citing dramatically faster installation times. Fedora had already moved its RPM payloads to zstd with Fedora 31. When you run `pacman -S` or `dnf install`, the packages being downloaded and unpacked are zstd-compressed.

**The Linux kernel itself.** zstd has been in the kernel since 4.14 for filesystems such as Btrfs and SquashFS, and since kernel 5.9 it is a supported compression format for the kernel image (bzImage) and initramfs. Some distributions now ship zstd-compressed kernels by default because boot times improve noticeably.

**Database and storage systems.** Facebook's RocksDB supports zstd natively. So does ClickHouse, a popular analytics database, where zstd is one of the recommended codecs for column compression. PostgreSQL 15 added zstd as an option for WAL compression and base backups.

**Large dataset downloads.** Many machine learning datasets on Hugging Face and academic repositories are now distributed as .zst or .tar.zst files. If you work with Common Crawl data, for example, you will encounter .warc.gz files, but newer datasets are increasingly distributed as .zst.

**Game assets and software distribution.** Mozilla uses zstd in Firefox's update mechanism. The zstd format is also used internally by some game engines for asset streaming.

For most of these cases, if you just need to open or extract the file on your own machine, your operating system's built-in tools or a utility like 7-Zip (version 24.01 and later can extract .zst) will handle it without any conversion needed.

Opening and Converting .zst Files Without the Command Line

Not everyone wants to install command-line tools or learn compression flags. If you received a .zst file and need the contents, there are a few routes depending on your platform.

**Windows:** The official 7-Zip release gained .zst extraction in version 24.01 (released in early 2024). Right-click the .zst file, choose '7-Zip > Extract Here', and you are done. If you have an older version of 7-Zip, update it; the interface is identical, only the underlying support was missing in earlier builds.

**macOS:** The Keka archiver supports .zst natively. The built-in Archive Utility does not as of macOS Sequoia. You can also install zstd via Homebrew (`brew install zstd`) and run `zstd -d file.zst` in Terminal.

**Linux:** Zstd is almost certainly already installed, or available in your package manager as the `zstd` package. `zstd -d file.zst` writes the decompressed `file` next to the archive and keeps the original .zst (add `--rm` to delete it afterwards). For .tar.zst files, `tar --use-compress-program=zstd -xf file.tar.zst` works on most systems, or simply `tar -I zstd -xf file.tar.zst`.

**Browser-based conversion:** This is where CocoConvert comes in. If you have a .zst file you need to decompress without installing anything locally, you can upload it to CocoConvert and extract the contents directly in your browser. This works well for single-stream .zst files of reasonable size. For very large archives (multi-gigabyte .tar.zst files) or files requiring a custom dictionary, a local tool will be more practical, because browser-based tools have upload size limits and cannot reference external dictionary files. CocoConvert handles the common case well, not every edge case.
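For the local route, the only real decision is whether the file is a plain .zst stream or a .tar.zst archive. The helper below is a small sketch that maps a filename to the matching command from this section. It only builds the argument list and does not run anything, and the function name is hypothetical.

```python
from typing import List


def extraction_command(filename: str) -> List[str]:
    """Return the command-line arguments that would extract the given file.

    Assumes GNU tar and the zstd CLI are on PATH; nothing is executed here.
    """
    lowered = filename.lower()
    if lowered.endswith(".tar.zst"):
        # A tar archive compressed with zstd: let tar call zstd for us.
        return ["tar", "-I", "zstd", "-xf", filename]
    if lowered.endswith(".zst"):
        # A single compressed stream: zstd writes the file without ".zst".
        return ["zstd", "-d", filename]
    raise ValueError(f"not a .zst or .tar.zst file: {filename}")


print(extraction_command("dataset.tar.zst"))
print(extraction_command("dump.sql.zst"))
```

The returned list can be passed to `subprocess.run(...)` if you want a script to perform the extraction. The check for `.tar.zst` has to come first, since that name also ends in `.zst`.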

Creating .zst Files: When It Makes Sense and When It Does Not

Zstandard is an excellent choice for compressing files you intend to share with technically sophisticated recipients, store in systems that already support it, or transfer over a network where decompression speed matters on the receiving end.

It is a less obvious choice if you are sending a compressed file to someone who will open it on a default Windows installation without any additional software. Zip remains the most universally supported format for that use case, since every modern operating system handles it natively with no extra tools. Gzip is universally understood on Linux and macOS. Zstd is gaining ground fast, but it is not yet at that level of ubiquity for casual file sharing.

For archiving personal files, zstd at level 9 or higher is worth considering if you want good compression without the painful compression times of xz. Compressing a 10 GB folder of mixed documents and code: xz might take 8 to 12 minutes; zstd at level 9 might take 90 seconds with a slightly larger output. Whether that tradeoff is acceptable depends entirely on your priorities.

To create a .zst file with CocoConvert, upload your source file, select .zst as the output format, and optionally choose a compression level if the interface exposes that setting. The default level will be appropriate for most purposes. Note that CocoConvert currently handles individual file compression to .zst; if you need to bundle multiple files into a .tar.zst archive, you would need to create the tar archive first locally, then compress it, or use a local tool like `tar -I zstd -cf output.tar.zst folder/`.
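The 10 GB comparison in this section is easier to judge as throughput. The arithmetic below simply converts the article's rough timings into MB/s, assuming 1 GB = 1000 MB. The timings themselves are illustrative estimates from the text, not measurements.

```python
def implied_throughput_mb_s(size_gb: float, seconds: float) -> float:
    """Average throughput in MB/s, using 1 GB = 1000 MB."""
    return size_gb * 1000 / seconds


SIZE_GB = 10
xz_best = implied_throughput_mb_s(SIZE_GB, 8 * 60)    # xz finishing in 8 minutes
xz_worst = implied_throughput_mb_s(SIZE_GB, 12 * 60)  # xz finishing in 12 minutes
zstd_l9 = implied_throughput_mb_s(SIZE_GB, 90)        # zstd -9 finishing in 90 s

print(f"xz:      {xz_worst:.1f} to {xz_best:.1f} MB/s")
print(f"zstd -9: {zstd_l9:.1f} MB/s")
print(f"speedup: {zstd_l9 / xz_best:.1f}x to {zstd_l9 / xz_worst:.1f}x")
# Prints:
# xz:      13.9 to 20.8 MB/s
# zstd -9: 111.1 MB/s
# speedup: 5.3x to 8.0x
```

Under those assumed timings, zstd at level 9 compresses roughly five to eight times faster than xz.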

The Honest Summary: Is Zstandard Worth Learning?

Zstandard is not a niche format that might fade away. It is embedded in the Linux kernel, used by major databases, adopted by package managers across multiple distributions, and maintained by a large open-source community with corporate backing. The published RFC specification means independent implementations can support it for the foreseeable future.

For developers and system administrators, understanding zstd is increasingly a baseline skill rather than an advanced one. Knowing the difference between compression levels, when dictionary training pays off, and how .zst relates to .tar.zst will save real time when working with modern infrastructure.

For everyday users, the main practical knowledge is simpler: if you encounter a .zst file, you now know it is a compressed file (not a video, not a document, just a container holding something else), and you have several straightforward ways to open it. Update 7-Zip on Windows, install Keka on macOS, or use CocoConvert if you prefer not to install anything.

The one area where zstd genuinely has not won yet is casual consumer file sharing. Until operating systems support .zst natively out of the box the way they support .zip, it will remain a format that requires at least a small amount of deliberate setup on the recipient's end. That is a real limitation worth acknowledging. For everything else (server workloads, package distribution, database compression, large dataset archiving) zstd has become the sensible default, and for good reason.
