Revolutionary Codec or Overhyped Experiment?

The Claims

I first encountered ADC on Reddit, when user u/Background-Can7563 posted this announcement on the r/compression subreddit:

ADC Codec - Version 0.80 released

The ADC (Advanced Differential Coding) Codec, Version 0.80, represents a significant evolution in low-bitrate, high-fidelity audio compression. It employs a complex time-domain approach combined with advanced frequency splitting and efficient entropy coding.

As a person who is very interested in lossy audio compression, I just had to check it out.

Subband Division (QMF Analysis)

The input audio signal is meticulously decomposed into 8 discrete Subbands using a tree-structured, octave-band QMF analysis filter bank. This process achieves two main goals:
Decorrelation: It separates the signal energy into different frequency bands, which are then processed independently.
Time-Frequency Resolution: It allows the codec to apply specific bit allocation and compression techniques tailored to the psychoacoustic properties of each frequency band.

Okay, so it is time-domain, but splits the audio into multiple subbands, and processes those differently. I see.
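
To make "splits the audio into multiple subbands" concrete, here is a minimal sketch of a tree-structured QMF split. ADC's actual filters are not published, so I am assuming the Haar pair (the shortest QMF pair there is) and a uniform two-level tree that yields four bands. The function names are mine, not ADC's.

```python
import math

def haar_analysis(x):
    """Split an even-length signal into a low band and a high band, each half as long."""
    s = math.sqrt(0.5)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def haar_synthesis(low, high):
    """Inverse of haar_analysis: interleave the two bands back into one signal."""
    s = math.sqrt(0.5)
    out = []
    for l, h in zip(low, high):
        out.append(s * (l + h))
        out.append(s * (l - h))
    return out

def split_four_bands(x):
    """Two-level tree: split once, then split both halves again (length must be a multiple of 4)."""
    low, high = haar_analysis(x)
    return haar_analysis(low) + haar_analysis(high)

def merge_four_bands(b0, b1, b2, b3):
    """Undo split_four_bands."""
    return haar_synthesis(haar_synthesis(b0, b1), haar_synthesis(b2, b3))

signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.5 * n) for n in range(16)]
bands = split_four_bands(signal)
rebuilt = merge_four_bands(*bands)

print("samples per band:", [len(b) for b in bands])
print("max round-trip error:", max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

Each band is still a sequence of time-domain samples, just at a quarter of the original rate, and the split by itself loses nothing. Real codecs use much longer filters than Haar to get sharper band separation.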

Advanced Differential Coding (DPCM)

Compression is achieved within each subband using Advanced Differential Coding (DPCM) techniques. This method exploits the redundancy (correlation) inherent in the audio signal, particularly the strong correlation between adjacent samples in the same subband.
A linear predictor estimates the value of the current sample based on past samples.
Only the prediction residual (the difference), which is much smaller than the original sample value, is quantized and encoded.
The use of adaptive or contextual prediction ensures that the predictor adapts dynamically to the varying characteristics of the audio signal, minimizing the residual error.

Okay, so it uses DPCM. Or rather, ADPCM since it's adaptive, I guess.
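
For readers who have not met DPCM before, here is a minimal sketch of the idea. It assumes the simplest possible setup: a first-order predictor (the previous reconstructed sample) and a fixed quantizer step. An adaptive variant would change the predictor or the step over time. None of this is ADC's actual predictor, which is not published.

```python
import math

def dpcm_encode(samples, step):
    """Closed-loop DPCM: quantize each sample's difference from the decoder's previous output."""
    codes = []
    prediction = 0.0
    for x in samples:
        residual = x - prediction          # what the predictor got wrong
        code = round(residual / step)      # uniform quantization of the residual
        codes.append(code)
        prediction += code * step          # track what the decoder will reconstruct
    return codes

def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the dequantized residuals."""
    out = []
    prediction = 0.0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out

STEP = 8.0
signal = [1000.0 * math.sin(2 * math.pi * n / 64) for n in range(256)]
codes = dpcm_encode(signal, STEP)
decoded = dpcm_decode(codes, STEP)

print("largest raw sample / step:", max(abs(x) for x in signal) / STEP)
print("largest residual code:", max(abs(c) for c in codes))
print("worst reconstruction error:", max(abs(a - b) for a, b in zip(signal, decoded)))
```

The residual codes span a much smaller range than the raw samples would at the same step size, which is where the savings come from. The reconstruction error stays within half a quantizer step because the encoder predicts from the decoder's output rather than from the original signal.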

Contextual Range Coding

Just range coding? No other details? Okay.

This is extremely... interesting, but also rather confusing. It is basically QMF + ADPCM + range coding. What's "a significant evolution" about it?

Fortunately, there is a link to its homepage.

I was hoping that it would answer all the questions I had, but instead, I was surprised:

A closed source project? In this day and age?
This contains nothing but marketing copy and hype.
It's patented at Zenodo? You can't file patents at Zenodo. It's not a patent office.

I had to take a deeper look.

The Marketing Copy

Wow. Reading the homepage gives off a lot of red flags. Claims about

"redefining audio compression"
"exceptional compression efficiency"
"perceptual quality"
"challenging [...] frequency-domain processing"
"advanced [...] time-domain compression"
"sophisticated filter bank architecture"
"state-of-the-art entropy coding"
"notable leap in performance".

That's a lot of claims for a new codec. Never mind that it's a time domain codec. And that's just the first paragraph of the homepage! But I mean, maybe we shouldn't dismiss something easily--what if those are all true, and I just haven't heard of it earlier?

This reminds me of that one image format FLIF, which (while it didn't make exaggerated claims like ADC) turned out to become one of the technologies in the lineage that led to the creation of JPEG XL. So yes, new codecs can become awesome codecs.

But ADC is a completely different thing. FLIF had charts, comparisons, standardization, specs, source code, examples, explanations. Even a proper LGPL v3 license. ADC has none of those, only vague and unverifiable claims.

Extraordinary Claims

Since it's a lot to tackle in one paragraph, let's take some of the most... outrageous claims and verify them one by one.

"Operates directly on the raw audio waveform"

Unlike codecs that map the entire signal to the frequency domain (e.g., MDCT-based codecs), ADC operates directly on the raw audio waveform, capitalizing on temporal correlation for extreme efficiency. This approach results in ultra-low latency and robustness against temporal artifacts often associated with block-based transforms.

So it operates directly on the raw audio waveform. Any person knowledgeable enough would understand it to mean exactly one thing: take the ADC (analog-to-digital converter) output as raw float or integer data, and store it in a file with or without lossless compression. That's it. So I guess that includes raw LPCM WAV, FLAC, and even WavPack.

However, we know that this is not the case. It applies QMF analysis, so it's no longer operating directly on the raw audio waveform. It's operating on the resulting QMF subbands, and that is not the raw audio waveform.

"Four-Band Filter Bank Processing"

ADC employs a highly optimized four-band filter bank to decompose the input signal.

Waveform Division: The incoming audio waveform is precisely split into four distinct subbands.

Perceptual Isolation: Each subband is then processed independently. This isolation allows the compressor to apply tailored, high-resolution processing to specific frequency ranges, optimizing the noise floor distribution without relying on complex psychoacoustic models across the entire spectrum.

So we can't really say anything about "highly optimized" but if you read this entire section, this is basically just explaining how QMF works.

However, at 4 subbands, each band occupies 5,512.5 Hz of bandwidth (assuming 44.1 kHz audio and equal-width bands), and that is far from specific. Even if you were to "improve" this to 8 subbands (*wink wink*), that is still 2,756.25 Hz per band. Again, extremely wide. So I can say that, yeah, the compressor can apply tailored processing to each frequency range, but I really can't see how bands this wide allow "high-resolution processing to specific frequency ranges" that would enable any meaningful "psychoacoustic model across the entire spectrum".
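
The arithmetic behind those numbers is short enough to write down. This sketch assumes 44.1 kHz audio and equal-width bands. Neither is confirmed by the homepage, and the helper name is mine.

```python
def uniform_band_width_hz(sample_rate_hz, bands):
    """Width of each band when 0 Hz..Nyquist is split into equal-width bands."""
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz / bands

SAMPLE_RATE_HZ = 44100  # assumption: CD-rate audio

for bands in (4, 8):
    print(bands, "bands:", uniform_band_width_hz(SAMPLE_RATE_HZ, bands), "Hz each")
# 4 bands: 5512.5 Hz each
# 8 bands: 2756.25 Hz each

# For scale: an AAC long block yields 1024 MDCT coefficients.
print("1024 bins:", uniform_band_width_hz(SAMPLE_RATE_HZ, 1024), "Hz spacing")
```

The last line is only there for scale: the bin spacing of a 1024-coefficient MDCT at the same sample rate comes out at roughly 21.5 Hz.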

"Advanced Contextual Coding"

The primary compression gain is achieved within the subbands using a novel, predictive entropy method:

Contextual Time-Domain Compression: Each subband's waveform is compressed using an Advanced Contextual Coding scheme. This algorithm learns and exploits the statistical dependencies between successive samples and neighboring subbands, resulting in highly efficient prediction and reduced residual energy.

High-Performance Range Coding: The final stage employs Range Coding (an evolution of Arithmetic Coding) to represent the predictive residuals. This powerful entropy encoder achieves near-optimal compression ratios, making ADC exceptionally efficient even at very low bitrates.

So basically, each subband is encoded with something similar to ADPCM. However, "Advanced Contextual Coding" is definitely something that needs checking. (Spoiler alert: it's unsupported. There is no clear evidence of a novel or existing algorithm of that kind.)

It also claims that "it learns [...] the statistical dependencies between [...] neighboring subbands" but I highly doubt it (spoiler alert: it doesn't).

Also, "High-performance range coding" is basically redundant. Not that range coding doesn't help, but it's similar to saying "sweet sugar" or "solid granite"--it's just... true. An inherent attribute. That doesn't really say anything.

"A New Era of Audio Quality"

Quality Over Perfect Reconstruction

The previous iteration of the ADC lineage was defined by its commitment to Perfect Reconstruction (PR) after the first encoding pass. ADC decisively moves beyond the PR constraint. This strategic decision—trading the strict mathematical PR property for perceptual optimization—is the source of ADC’s massive quality improvement.

Dude. This is a lossy codec. It's not something to brag about. This codec is expected to prefer perceptual audio quality over perfect reconstruction. Why was this codec even aiming for lossless reconstruction when it's supposed to be a lossy codec?

Performance Benchmark

ADC is engineered for the modern audio landscape, where perceptual quality is paramount:

Frequency-Domain Comparable: ADC’s performance on individual waveforms is often comparable to the most established frequency-domain codecs (e.g., LC3, AAC), challenging the notion that transforms are mandatory for high-fidelity compression.

Superior Transient Handling: By leveraging the time-domain filter bank, ADC exhibits inherent strengths in preserving fast temporal transients and minimizing pre-echo artifacts.

Support for joint stereo.

When was perceptual audio quality not paramount? It's been the standard metric for all lossy audio codecs, ever since the beginning. What is this even trying to say?

Also, did you notice that exact phrasing? "ADC's performance on individual waveforms". Hmm. Sounds like cherry-picking.

And this literally claims that ADC's performance is often comparable to AAC. Where are the data? The comparison tables? How much is "often"? "Comparable" at what bitrate? What metric?

Worst: "[...] challenging the notion that transforms are mandatory" is a strawman. Nobody argues that transforms are mandatory. I mean, FLAC doesn't use transforms, and yet it achieves high-fidelity (actually perfect reconstruction) compression. The argument this codec is pushing is that "MDCT bad, ADPCM not frequency-domain so it's good automatically" (even though it's using QMF which is in a way frequency domain but let's ignore that for now). But the industry settled on MDCT after decades of psychoacoustic research, because it's just superior for lossy compression purposes. Codecs do not use MDCT "just because it's what everyone else uses", but because it's the best choice for this purpose.

Besides, I have to make something extremely clear for all readers, in case you have also been swayed by this kind of marketing elsewhere.

MDCT is mathematically perfectly reconstructible. Yes. I said it. You can push random values through MDCT, and get the exact original values after doing IMDCT, given proper overlap-add (TDAC). Except for floating point inaccuracies, but that's literally negligible, inaudible, and more of a limitation of computers and not a limitation of math or MDCT.

Just because an audio waveform went through MDCT, doesn't mean it will automatically get its transients smeared. That's a huge, blatant lie. Neither do huge window sizes matter. It's not practical, but you can literally push a 5 minute song as one window through MDCT, do IMDCT on it again, and get practically the exact same audio back. No mathematical difference.

MDCT with proper TDAC results in perfect reconstruction
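
You don't have to take my word for it. Below is a minimal pure-Python sketch of an MDCT/IMDCT round trip with 50% overlap-add, assuming a sine window (which satisfies the Princen-Bradley condition) and a signal length that is a multiple of the hop size. It is a direct O(N^2) evaluation for clarity, not how a real codec would compute it, and the function names are mine.

```python
import math
import random

def sine_window(n_half):
    """Sine window of length 2*n_half; w[n]^2 + w[n + n_half]^2 == 1 (Princen-Bradley)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half)) for n in range(2 * n_half)]

def mdct(frame, window):
    """Windowed MDCT: 2*n_half input samples -> n_half coefficients."""
    n_half = len(frame) // 2
    return [
        sum(window[n] * frame[n]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]

def imdct(coeffs, window):
    """Windowed IMDCT: n_half coefficients -> 2*n_half aliased samples, ready for overlap-add."""
    n_half = len(coeffs)
    return [
        window[n] * (2.0 / n_half)
        * sum(coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
              for k in range(n_half))
        for n in range(2 * n_half)
    ]

def roundtrip(signal, n_half):
    """MDCT every frame at hop n_half, IMDCT, overlap-add; len(signal) must be a multiple of n_half."""
    window = sine_window(n_half)
    # Pad so the first and last real samples are also covered by two overlapping frames.
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        frame = padded[start:start + 2 * n_half]
        for n, v in enumerate(imdct(mdct(frame, window), window)):
            out[start + n] += v   # aliasing from adjacent frames cancels here (TDAC)
    return out[n_half:n_half + len(signal)]

random.seed(0)
signal = [random.uniform(-1.0, 1.0) for _ in range(64)]
rebuilt = roundtrip(signal, 8)
print("max round-trip error:", max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

The input is random noise, about the least "musical" signal there is, and the error that comes back is on the order of floating-point rounding.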

The truth is that it's quantization of MDCT coefficients that causes smearing of transients within the window. Not MDCT nor the window size. Modern codecs already fixed this issue by using various short window sizes (to contain the smearing to a very small window) and recognizing frames with transients so they don't quantize those frames too much. That's the truth.

So it's not MDCT that is bad. It's quantization. And guess what--ADC quantizes too.
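
Here is a small self-contained sketch of that point: a single click in silence, pushed through an MDCT round trip with and without coarse quantization of the coefficients, and with a long versus a short window. The sine window, the uniform quantizer, the step size of 0.25, and the window sizes are all my own illustrative choices, not taken from any real codec.

```python
import math

def codec_roundtrip(signal, n_half, step=None):
    """MDCT -> optional uniform quantization -> IMDCT with 50% overlap-add."""
    size = 2 * n_half
    window = [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]
    basis = [[math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
              for k in range(n_half)] for n in range(size)]
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - size + 1, n_half):
        coeffs = [sum(window[n] * padded[start + n] * basis[n][k] for n in range(size))
                  for k in range(n_half)]
        if step is not None:
            # This is the only lossy step in the whole chain.
            coeffs = [round(c / step) * step for c in coeffs]
        for n in range(size):
            out[start + n] += window[n] * (2.0 / n_half) * sum(
                coeffs[k] * basis[n][k] for k in range(n_half))
    return out[n_half:n_half + len(signal)]

CLICK_AT = 40
signal = [0.0] * 64
signal[CLICK_AT] = 1.0   # a single click in otherwise silent audio

exact_long = codec_roundtrip(signal, 16)              # long window, no quantization
coarse_long = codec_roundtrip(signal, 16, step=0.25)  # long window, coarse quantization
coarse_short = codec_roundtrip(signal, 4, step=0.25)  # short window, same quantization

def loudest_before(output, end):
    """Largest absolute sample value in output[:end]."""
    return max(abs(v) for v in output[:end])

print("long, unquantized, before click:", loudest_before(exact_long, CLICK_AT))
print("long, quantized, before index 36:", loudest_before(coarse_long, 36))
print("short, quantized, before index 36:", loudest_before(coarse_short, 36))
```

Without quantization, the samples before the click stay silent up to rounding error. With quantization and the long window, noise shows up well before the click. With the short window and the same quantizer, everything before index 36 stays silent, because no frame that contains the click reaches back that far.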

...and yeah. ADC supports joint stereo. That's revolutionary.

Criticism of the Claim is not Criticism of the Experiment

Look, I know that creating a lossy codec is not easy. It takes a lot of knowledge, effort, research, experimentation, and even after all that, the result might not be as good as you thought it would be.

I'm all for experimentation--every good idea begins with a weird experiment. Nothing wrong or embarrassing about that.

But what this codec does is make absurdly unbelievable, contradictory, and maybe even outright false claims, and it's an insult to all the people who actually create the codecs the world is running on top of. Such claims imply a burden of proof, and I don't see proof.

Roadmap

This article is the first of a series of articles focused on debunking ADC and its claims.

I have performed tests on the format's design, its audio artifacts, its bitrate behavior, created spectrogram comparisons, performed ABX listening tests, and even verified its GPL compliance. And the results I discovered were disturbing.

Article map

The following articles will be published over time. They will become links as each article is published.

Revolutionary Codec or Overhyped Experiment? <- you are here
"Audio DNA" and Other Things that Mean Nothing
Inside ADC: A Deep Analysis of the Codec
The Spectrograms Don't Lie: ODG and Where the Bits Actually Go
Bonus: The SIC Image Codec and the Same Pattern of Overclaiming

Closing

If the homepage alone raised this many red flags, just wait until we look at what's actually inside. The deeper you dig into ADC, the worse it gets. The evidence speaks for itself, and I intend to let it.