"Audio DNA" and Other Things That Mean Nothing

Spectrogram view of extreme chessboard aliasing in a linear sine sweep
Category: Article
Tags: ADC

This is part 2 of a series of articles about ADC. The previous article is here.

In the previous article, we discussed how I found out about the ADC codec and how its homepage presented it at the time. However, the website has since gone through multiple iterations (as if someone were asking an LLM to generate marketing copy endlessly).

Let's go through what happened after the events in the previous article.

My Reddit Post

I decided to give ADC v0.82 a scientific but unconventional test, using only spectrograms without ABX. I clearly indicated in the Reddit post that the test was done without ABX, so take it with a grain of salt, and all opinions are subjective.

But still, I think spectrograms are useful in showing how codecs work.

All images in the post showed 4 codecs in order:

  • lossless WAV (16-bit, 44100 Hz)
  • ADC (16-bit, 44100 Hz)
  • Opus (16-bit, resampled to 44100 Hz using --rate 44100 on opusdec)
  • xHE-AAC (16-bit, 44100 Hz)

Dynamic Range

-88.8 dBFS sine wave, then silence, then 0 dBFS sine wave

We see that ADC just... removed the quiet sine wave while the other codecs preserved it. Also, the harmonic spacing is uneven.
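For a sense of scale, here is a small sketch of how quiet -88.8 dBFS is in 16-bit PCM. It assumes that 0 dBFS corresponds to a peak sample value of 32767, and the helper name is hypothetical, made up for this illustration.

```python
# Assumption: 0 dBFS maps to the largest positive signed 16-bit sample value.
FULL_SCALE_16BIT = 32767

def dbfs_to_peak_sample(level_dbfs: float, full_scale: int = FULL_SCALE_16BIT) -> float:
    """Peak sample value (before rounding) of a sine at the given dBFS level."""
    return full_scale * 10 ** (level_dbfs / 20)

quiet_peak = dbfs_to_peak_sample(-88.8)
loud_peak = dbfs_to_peak_sample(0.0)

print(f"-88.8 dBFS peak: {quiet_peak:.2f} LSB")
print(f"    0 dBFS peak: {loud_peak:.0f} LSB")
```

Under that assumption, the quiet tone peaks at only about one quantization step, which is what makes it a useful stress test for how a codec treats very low-level detail.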

Noise

White noise, brown noise, then bandpassed noise

We now see that ADC has this weird -6 dB attenuation at around 13 kHz, and it's very audible.

Pure Tone

1 kHz sine, 10 kHz sine, then 15 kHz sine, all at almost full scale

We see in ADC lots of irregularly spaced harmonics, and for 10 kHz, there was a 12 kHz harmonic that was just -6 dB from the main tone.

Sine sweep

Now this is where things get interesting.

Sine sweep from 20 Hz to 20 kHz, in increasing amplitudes

That's a chessboard pattern. This amount of aliasing and mirroring is unacceptable.

Sure, looking at Opus, there's also noticeable aliasing, but at -37 dB, it's pretty much inaudible compared to the 0 dB sine sweep.

Quality Degradation Over Time

Holes in the spectrum

As the encoded ADC song plays, it continues to degrade starting from 10 seconds, until it reaches the point where the audio becomes unbearable.

The Author's Response

The author was very quick to post a rebuttal in their HydrogenAudio thread.

Thank you for the comprehensive analysis. You've highlighted precisely the fundamental limitations of transform-based codecs that ADC's DPCM architecture was designed to avoid.

I mean, yeah, Opus and xHE-AAC, along with most other modern codecs, are transform codecs. But the analysis actually shows far worse problems in ADC, in areas the other codecs handle just fine.

Your spectral displays are perfect demonstrations of the inherent flaws of MDCT/FFT approaches:

  1. Pre-echo & Temporal Smearing: xHE-AAC's "clean" tones come at the cost of temporal resolution. That -37dB aliasing in Opus' sweep? Classic window function trade-off. ADC's block structure maintains exact temporal boundaries.

No. xHE-AAC has switchable block sizes (short windows for transients, long for stationary). It has better temporal resolution than DPCM when it matters. That aliasing in Opus only happened during the full-scale 0 dBFS sweep, so I am guessing it might be due to clipping, or due to internal 48 kHz resampling. It is not a "classic window function trade-off".

  2. Artificial Harmonic Generation: The "many harmonics" in xHE-AAC/Opus on pure tones are mathematical artifacts of basis function mismatch, not signal content. ADC's predictor either reconstructs or doesn't - no spectral splatter.

ADC had a -6 dB harmonic. That's six decibels. That's about 50% of the amplitude (roughly 25% of the power) of the main tone. Modern codecs have harmonics at -60 to -80 dB. That's like, I dunno, a thousand times quieter. Don't lie.
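For reference, the standard decibel conversions can be sketched in a few lines. The function names are hypothetical. The formulas are the usual ones: amplitude ratio is 10^(dB/20) and power ratio is 10^(dB/10).

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Linear amplitude ratio for a level difference given in dB."""
    return 10 ** (db / 20)

def db_to_power_ratio(db: float) -> float:
    """Linear power ratio for a level difference given in dB."""
    return 10 ** (db / 10)

for level_db in (-6, -60, -80):
    amp = db_to_amplitude_ratio(level_db)
    power = db_to_power_ratio(level_db)
    print(f"{level_db:>4} dB -> amplitude x{amp:.6f}, power x{power:.10f}")
```

A -6 dB component sits at roughly half the amplitude of the main tone, while a -60 dB component is at one thousandth of its amplitude.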

  3. Band-Limiting Artifacts: That 16kHz low-pass in xHE-AAC at 60kbps? Transform codecs must hard-cut to preserve bits. ADC's noise shaping preserves full bandwidth until absolutely necessary.

The test literally showed a -6 dB notch at 13 kHz. Not to mention it degrades to... mess over time. It's performing worse at bandwidth preservation, if we're going to be honest here.

Also, xHE-AAC applying a 16 kHz low-pass is a sensible choice, not a case of "must hard-cut to preserve bits". Most people, including me, can't even hear above 16 kHz anyway, so xHE-AAC is right to put bits where they matter.

  4. Synthetic Signal Failure: -88.8dBFS sine waves expose quantization inefficiency in frequency domain. In perceptual coding, that signal is inaudible and should be discarded, not preserved with harmonics.

The author is talking about the source lossless WAV file that had the -88.8 dBFS sine wave. That's the intended lossless audio. The fact that Opus and xHE-AAC preserved such tiny detail, while ADC just literally dropped it, means something is wrong with ADC. It might be inaudible, but why remove it when you can still probably fit it at 64 kbps? Lossy codecs are only supposed to throw away audio when the bitrate budget is already full, and throwing it away will not result in audible artifacts.

Your "chessboard pattern" in sweeps is actually ADC's greatest strength: consistent 1-second block independence enabling perfect parallelization and [...]

Did the author just... say that having chessboard aliasing patterns in sweeps is actually ADC's greatest strength? "The catastrophic aliasing / mirroring isn't a bug, it's a feature"? Wow.

[...] 0ms seeking - something no transform codec can achieve due to overlap-add requirements.

Opus supports frames as short as 2.5 ms, and its algorithmic delay can be as low as 5 ms. AAC can seek to any frame. What are you even talking about? This is a lie.

The "quality degradation over time" you observed? That's transform codecs' temporal masking struggle with sustained complexity. DPCM's adaptive predictor resets cleanly each block - no state accumulation.

But... but that's literally what their codec did, not Opus or xHE-AAC. It's their codec, not the transform codecs. ADC is literally struggling with sustained complexity. And if ADC is indeed working in the time domain, then masking shouldn't be an issue, right? Unless... they're saying they use DPCM, but in fact, they're still working in the frequency domain somehow. Makes me wonder. The "adaptive predictor" (is that ADPCM? or DPCM?) doesn't seem to be resetting cleanly each block. Lots of state accumulation.

While transform codecs excel at synthetic signals (where their mathematical basis matches the input), they struggle with real-world audio's transient nature. ADC prioritizes temporal precision over spectral perfection - a deliberate architectural choice.

The 64kbps limit you tested is indeed the archiver's floor. For streaming, 128kbps+ is the target where these trade-offs become advantageous.

I mean, one of the encoded ADC files used up to 185 kbps even though I explicitly requested 64 kbps in one test I did. That's far from the 128 kbps target they're talking about. Opus and xHE-AAC followed the request just fine, even if not exactly, since both are inherently VBR.

Your analysis confirms ADC achieves something remarkable: competitive performance without inheriting transform codecs' fundamental mathematical constraints. The artifacts you see aren't bugs - they're features of a different paradigm.

...this isn't remarkable nor competitive. Stop lying.

As a final note, consider this: ADC is developed by a single individual without an audio engineering team, corporate funding, or the decades of research backing MPEG codecs. That it can even be compared to xHE-AAC (developed by hundreds of engineers over 15+ years with millions in funding) on any metric is remarkable. The artifacts you've identified are essentially the "price" of an architecture that delivers zero-latency seeking and perfect parallelization—features transform codecs fundamentally cannot match due to their windowed overlaps and spectral dependencies.

For a solo developer's project to provoke this level of analysis against industry giants speaks volumes. Most transform codecs would collapse completely if forced into ADC's architectural constraints. Perhaps the question isn't "why does ADC have these artifacts?" but rather "why, after 30 years of transform coding, do we still accept 50ms seek delays and poor parallel scaling as inevitable?"

That explains everything, I guess.

Author Gets Banned from HydrogenAudio

The author kept arguing with HydrogenAudio users, and once I provided ABX test results showing just how easy ADC is to ABX, they got very defensive. Perhaps because they had been using ChatGPT to come up with replies, they also got a lot of facts wrong, such as:

The recent discussions, including tests at very low bitrates (e.g., 22 kbps), have highlighted where the current implementation meets its natural limits versus its intended operational range.

Nobody tested ADC at 22 kbps. It was xHE-AAC that used 22 kbps, and that's because xHE-AAC decided it didn't need to use all the requested 64 kbps for a simple sine wave. ChatGPT is getting a lot of facts wrong, among other things.

Since they got too persistent, the HydrogenAudio mods decided to delete the thread.

And behold! They created another thread (which is against HydrogenAudio rules) just to start the conversation again!

They got a proper warning again:

[...] Repeating over and over how your software is so much better than everything else, is not "scientific discussion" material. Let users praise you for your work. The next offense to my ruling will be an official warning, affecting your ability to post.

I also added more ABX test results, and after the author created another HydrogenAudio account (which is forbidden), they finally got banned for violating TOS 16 (use of AI), 7 (creation of 2 threads on the same topic after the first got locked), and 12 (using more than one account).

Current Website

Current website as of 2026 March

So as of 2026 March, the website talks about stuff like "Redefining the DNA of Audio Compression" and "Experience real native sound".

Unfortunately, none of those mean anything. If you read the page, it talks about how audio compression relied on MDCT and this causes temporal smear and pre-echo, which "blurs the sharper edges of your music," and how ADC operates directly on the "raw audio waveform," which preserves the sound's "DNA".

Let's talk about audio compression for a second here.

Audio Compression

As you might have guessed, there is no such thing as "audio DNA". But audio compression is very much real, and we encounter it every day whether we know it or not.

MDCT Is Not Evil

As I stressed in the previous article, MDCT itself does not cause pre-echo or temporal smear. MDCT is mathematically reversible, and with proper TDAC, exhibits one of the features ADC is so obsessed about: perfect reconstruction.

So what causes pre-echo and temporal smear, then? It's quantization. So let's just not quantize then? Unfortunately, the real compression happens due to quantization, so it's not exactly avoidable. But if the MDCT coefficients were never quantized in the first place, performing the inverse (IMDCT) would restore the original audio samples. Perfectly. Exactly.
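To make that concrete, here is a minimal NumPy sketch of a textbook MDCT/IMDCT pair with a sine window and 50% overlap-add. It is not taken from any real codec. The frame size, the test signal, the quantization step, and all function names are assumptions chosen for illustration. It round-trips a signal once without quantization and once with coarse uniform quantization of the coefficients.

```python
import numpy as np

def mdct_basis(N):
    """N x 2N cosine basis of the MDCT for frames of 2N samples."""
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))

def sine_window(N):
    """Sine window; satisfies the Princen-Bradley condition w[n]^2 + w[n+N]^2 = 1."""
    n = np.arange(2 * N)
    return np.sin(np.pi * (n + 0.5) / (2 * N))

def mdct(x, N):
    """Windowed MDCT with hop size N. len(x) must be a multiple of N."""
    basis, w = mdct_basis(N), sine_window(N)
    padded = np.concatenate([np.zeros(N), x, np.zeros(N)])
    num_frames = len(padded) // N - 1
    frames = np.stack([padded[i * N : i * N + 2 * N] for i in range(num_frames)])
    return (frames * w) @ basis.T  # shape: (num_frames, N)

def imdct(coeffs, N):
    """Windowed IMDCT with overlap-add; the overlap cancels the time-domain aliasing."""
    basis, w = mdct_basis(N), sine_window(N)
    num_frames = coeffs.shape[0]
    out = np.zeros((num_frames + 1) * N)
    for i in range(num_frames):
        out[i * N : i * N + 2 * N] += w * (2.0 / N) * (coeffs[i] @ basis)
    return out[N:-N]  # drop the zero padding added in mdct()

rng = np.random.default_rng(0)
N = 64                                  # assumed frame hop, for illustration only
x = rng.uniform(-1.0, 1.0, size=10 * N)

coeffs = mdct(x, N)
lossless_err = np.max(np.abs(imdct(coeffs, N) - x))

step = 0.5                              # assumed coarse quantizer step
quantized = np.round(coeffs / step) * step
lossy_err = np.max(np.abs(imdct(quantized, N) - x))

print("max error without quantization:", lossless_err)
print("max error with quantization:   ", lossy_err)
```

Without quantization the round-trip error stays at the level of floating-point rounding. Once the coefficients are rounded to a coarse grid, the error becomes clearly nonzero. The transform itself is not where the loss comes from.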

However, the audio compression industry has relied on MDCT because by quantizing in the MDCT (frequency) domain, we can be smart about it: quantizing heavily the frequencies that the human ear wouldn't notice anyway, and putting extra care in quantizing frequencies that are very audible to the human ear. That way, we save on bitrates where it matters, and the ear doesn't know any better that it's listening to a degraded (compressed, lossy) audio.

So Why Use QMF subbands?

Honestly, I don't know. Subband coding is not some escape from MDCT. It isn't a new thing anyway--many old codecs such as MP2 (the predecessor to MP3), MP3 itself, and SBC (that Bluetooth audio codec that is infamous for sounding bad) all use QMF subbands as an important part of the codec.

But besides the fact that it's pretty much ancient already, it doesn't really provide an advantage over MDCT. It does not offer "total immunity to pre-echo" as the ADC website says. Not only do we lose the ability to perform psychoacoustics, but we also introduce aliasing to the audio.

Despite the website's claims of "Surgical accuracy: Our 8-band filter bank enables high-granularity noise distribution without ever going out of the time domain", I highly doubt that 8 bands is enough to call it "surgical".

So it's basically losing the benefits of MDCT, and gaining the disadvantages of QMF. It's a lose-lose situation.

Psycho-what?!

Psychoacoustics is the study of how the human brain interprets audio. Or at least, that's how I understand it. When we implement psychoacoustics during quantization, we understand what frequencies can be thrown away, and what needs to be protected to maintain sound quality.

With MDCT, and assuming an AAC long frame at 44.1 kHz, each coefficient is only about 21.5 Hz wide. That is extremely precise. But with that "surgical accuracy" of 8 QMF subbands, each subband is a gaping 2,756 Hz wide (assuming equal-width bands). That's 128 times wider than a typical MDCT coefficient. How is that surgical?
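The arithmetic behind those numbers is short enough to write out. This sketch assumes a 44.1 kHz sample rate (matching the test files), 1024 MDCT coefficients per AAC long frame, and 8 equal-width subbands covering 0 Hz to Nyquist. The equal-width split is an assumption about ADC's filter bank, not something I can confirm.

```python
SAMPLE_RATE_HZ = 44_100
NYQUIST_HZ = SAMPLE_RATE_HZ / 2

AAC_LONG_FRAME_COEFFS = 1024   # MDCT coefficients in an AAC long block
QMF_SUBBANDS = 8               # band count quoted on the ADC website

# Assumption: both filter banks split 0..Nyquist into equal-width slices.
mdct_bin_width_hz = NYQUIST_HZ / AAC_LONG_FRAME_COEFFS
qmf_band_width_hz = NYQUIST_HZ / QMF_SUBBANDS
width_ratio = qmf_band_width_hz / mdct_bin_width_hz

print(f"MDCT coefficient width: {mdct_bin_width_hz:.2f} Hz")
print(f"QMF subband width:      {qmf_band_width_hz:.2f} Hz")
print(f"Ratio:                  {width_ratio:.0f}x")
```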

That means we lose the capability to precisely model how psychoacoustics work, so we can't quantize selectively. At least not as selectively as with MDCT.

ADC not having psychoacoustics is not an advantage. It's a limitation of what QMF can properly do. The author of ADC proudly applauding ADC for not removing the inaudible frequencies above 20 kHz is a failure in itself. A lossy codec's job is to discard inaudible audio. The human ear can only hear up to 20 kHz, and that's actually being optimistic. Even I can't hear above 16 kHz. A smart lossy audio codec such as AAC or Opus, heck, even MP3 will throw those frequencies away to save bitrate, without making an audible difference on the audio, and yet ADC is willing to waste bits on data that is inaudible in the end.

As the website says,

Psychoacoustic Magic: We don't just focus on mathematical identity; we prioritize transparency and "air," modeling noise where the human ear can't find it.

Press X to doubt.

Roadmap

This article is the second of a series of articles focused on debunking ADC and its claims.

Article map

The following articles will be published over time. They will become links as each article is published.

  1. Revolutionary Codec or Overhyped Experiment?
  2. "Audio DNA" and Other Things That Mean Nothing <- you are here
  3. Inside ADC: A Deep Analysis of the Codec
  4. The Spectrograms Don't Lie: ODG and Where the Bits Actually Go
  5. Bonus: The SIC Image Codec and the Same Pattern of Overclaiming

Closing

The marketing language on ADC's website is a string of buzzwords that either mean nothing, describe ordinary existing techniques, or actively misrepresent what those techniques do. "Audio DNA" is not a thing. "Operates directly on the raw audio waveform" is not true once you apply QMF. "Surgical accuracy" with 8 bands is the opposite of surgical. And "psychoacoustic magic" that proudly preserves inaudible frequencies above 20 kHz is not a feature--it's a sign that the codec doesn't understand what it's supposed to be doing.

The author's response to criticism confirmed this pattern. Rather than engaging with the actual data, every failing was reframed as an intentional design choice, and every advantage held by competing codecs was dismissed as a "fundamental flaw" of those codecs. Chessboard aliasing is perfect recreation. Dropped quiet sine waves are correct perceptual decisions. Quality degradation over time is transform codecs' problem, apparently, for some reason, even though it's ADC that exhibited this problem.

The words "Audio DNA" are on the website to sound impressive to people who don't know just how complex the math and research behind audio codecs are, and how they actually work. But now you do.

In the next article, we will go deeper, into the codec itself: what it actually does under the hood, and what that tells us about those claims.