View Full Version : Diffing LPCM vs. BRR Formats
Karl Uppiano[_2_]
April 18th 09, 08:59 PM
I am interested in hearing what kind of information is being discarded when
going from linear PCM to bit-rate reduced formats like AAC, MP3 etc. In
theory, one could get the difference between the two formats by flipping the
phase 180 degrees on one file and then summing them back together. Listening
to audible comparisons of different formats and bit rates might be quite
interesting.
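In code, the basic idea might look something like this (just a sketch, and
the file names are placeholders; it assumes the lossy file has first been
decoded back to a WAV at the same sample rate, using the numpy and soundfile
packages):

    import numpy as np
    import soundfile as sf

    # Load the original and the decoded lossy version (placeholder names).
    orig, fs_orig = sf.read("original.wav")
    lossy, fs_lossy = sf.read("decoded_from_mp3.wav")
    assert fs_orig == fs_lossy, "sample rates must match"

    # Trim to a common length; subtraction is the same as inverting one
    # file's polarity (flipping the phase 180 degrees) and summing.
    n = min(len(orig), len(lossy))
    residual = orig[:n] - lossy[:n]

    # Write the residual out so it can be auditioned.
    sf.write("residual.wav", residual, fs_orig)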
I have Googled a number of different search strings, but I got no hits.
Before I embark on my own experiments, I thought I'd ask around a bit. Has
anyone here done something like this, or can you point me to some existing
results?
Karl Uppiano[_2_]
April 22nd 09, 06:50 PM
"Karl Uppiano" > wrote in message
...
> I am interested in hearing what kind of information is being discarded
> when going from linear PCM to bit-rate reduced formats like AAC, MP3 etc.
> In theory, one could get the difference between the two formats by
> flipping the phase 180 degrees on one file and then summing them back
> together. Listening to audible comparisons of different formats and bit
> rates might be quite interesting.
>
> I have Googled a number of different search strings, but I got no hits.
> Before I embark on my own experiments, I thought I'd ask around a bit. Has
> anyone here done something like this, or can you point me to some existing
> results?
Anybody?
Dave Platt
April 22nd 09, 09:24 PM
> I am interested in hearing what kind of information is being discarded
> when going from linear PCM to bit-rate reduced formats like AAC, MP3 etc.
> In theory, one could get the difference between the two formats by
> flipping the phase 180 degrees on one file and then summing them back
> together. Listening to audible comparisons of different formats and bit
> rates might be quite interesting.
As a control, you should experiment with the same sort of
cancellation/nulling using two identical audio signals, time-shifted
by one or more samples.
This will help you learn to recognize the sorts of sonic artifacts
which occur purely as a result of time offsets in the signals being
nulled (it's a comb-filtering effect). This may help you adjust the
nulling tests you do between lossy-coded and original signals, so that
you actually end up hearing the effect of the coding (i.e. removed
or altered information content) and not just minor errors in setting
the timing between the two signals.
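A rough sketch of that control in Python (numpy and soundfile assumed, and
the file name is a placeholder; the shift here is a whole number of samples):

    import numpy as np
    import soundfile as sf

    x, fs = sf.read("test_signal.wav")  # placeholder file name
    shift = 1                           # try 1, 2, 5, ... samples

    # Null the signal against a copy of itself delayed by 'shift' samples.
    # The difference x[n] - x[n - shift] is a comb filter, with notches at
    # multiples of fs/shift Hz.
    residual = x[shift:] - x[:-shift]
    sf.write("comb_null.wav", residual, fs)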
> I have Googled a number of different search strings, but I got no hits.
> Before I embark on my own experiments, I thought I'd ask around a bit. Has
> anyone here done something like this, or can you point me to some existing
> results?
A Google search on "lossy audio coding" turned up the following:
http://en.wikipedia.org/wiki/Audio_compression_(data)
which is a reasonable start.
As I understand it, most lossy audio codecs operate through two basic
mechanisms:
- Figuring out which portions of an audio signal are likely to be
audible to the human ear/brain system. The encoder attempts to
preserve these, but may discard (not-encode) other portions of the
audio which are below the expected threshold of audibility.
- Figuring out how accurately (or inaccurately) it can encode the
aspects of the signal it wants to keep. The intent here is to use
a more accurate encoding (e.g. more bits) in cases where inaccuracy
would probably be audible, and a less accurate encoding (fewer
bits) where the resulting distortion is likely to be inaudible. A
common example of this is that higher audio frequencies can usually
be encoded with fewer bits of accuracy... the distortion
(quantization noise) which results from this inaccuracy will lie at
frequencies near or above the human hearing range, and will thus be
difficult or impossible to hear.
So, in general, a lossy codec of this sort will discard some
frequencies entirely, and will have reduced accuracy at reproducing
the amplitude of other frequencies... thus adding some amount of
harmonic and intermodulation distortion.
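As a toy illustration of the second mechanism only (a made-up
frequency-dependent quantizer, nothing like a real codec's psychoacoustic
model), one could quantize a block's spectrum more coarsely as frequency
rises:

    import numpy as np

    def toy_quantize_block(block, fs):
        """Quantize one block's spectrum with a step size that grows
        with frequency -- a crude stand-in for bit allocation."""
        spec = np.fft.rfft(block)
        freqs = np.fft.rfftfreq(len(block), d=1.0 / fs)
        # Coarser quantization at high frequencies (arbitrary curve).
        step = 1e-4 * (1.0 + freqs / 2000.0)
        spec = step * np.round(spec / step)  # rounds real and imag parts
        return np.fft.irfft(spec, n=len(block))

Run a signal through this block by block and null it against the original,
and what remains is quantization noise that grows with frequency.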
Lossy codecs are usually block-oriented - they break the incoming
signal up into chunks, and encode each chunk individually (often with
some overlap between the chunks). Some encoders can reportedly be
prone to create a sort of "warbling" or "watery" sound, which might
occur if the encoder makes different decisions about encoding a
particular frequency range when it goes from one block to the next
(i.e. some frequencies might appear, disappear, reappear, etc.).
I understand that some of the encoding algorithms/implementations
can also have the effect of "smearing" transient signals or
creating pre- and post-echoes of such transients. This can be seen as
a loss of accurate *timing* information about the signal, as opposed
to *frequency* information.
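The pre-echo effect is easy to demonstrate with the same sort of toy
quantizer (again, just an illustration, not any particular codec): coarsely
quantize the spectrum of a block that is silent until a single click, and
the error spreads across the whole block, including *before* the click.

    import numpy as np

    block = np.zeros(1024)
    block[900] = 1.0                    # a click late in the block

    spec = np.fft.rfft(block)
    step = 0.05                         # deliberately coarse quantizer
    decoded = np.fft.irfft(step * np.round(spec / step), n=len(block))

    error = decoded - block
    print("max error before the click:", np.abs(error[:900]).max())
    # Nonzero: the quantization error is smeared over the whole block.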
--
Dave Platt > AE6EO
Friends of Jade Warrior home page: http://www.radagast.org/jade-warrior
I do _not_ wish to receive unsolicited commercial email, and I will
boycott any company which has the gall to send me such ads!
Karl Uppiano[_2_]
April 22nd 09, 11:45 PM
"Dave Platt" > wrote in message
...
>> I am interested in hearing what kind of information is being discarded
>> when going from linear PCM to bit-rate reduced formats like AAC, MP3 etc.
>> In theory, one could get the difference between the two formats by
>> flipping the phase 180 degrees on one file and then summing them back
>> together. Listening to audible comparisons of different formats and bit
>> rates might be quite interesting.
>
> As a control, you should experiment with the same sort of
> cancellation/nulling using two identical audio signals, time-shifted
> by one or more samples.
One of my assumptions was that I would need some kind of vernier adjustment
capability (for both time and amplitude), since it is unlikely that two
completely different file formats would decode with exactly the same gain
and time offset. Most likely, I would need to null out the majority of the
sound by ear.
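Something along these lines is what I had in mind for the vernier (a sketch
only: cross-correlation for the coarse time offset plus a least-squares gain
match; sub-sample alignment would need interpolation on top of this):

    import numpy as np
    from scipy.signal import correlate

    def align_and_null(ref, test):
        """Estimate the integer delay and gain of 'test' relative to
        'ref', then return the nulled residual (mono arrays assumed)."""
        # Coarse delay estimate: peak of the cross-correlation.
        lag = np.argmax(correlate(ref, test, mode="full")) - (len(test) - 1)
        if lag > 0:
            ref = ref[lag:]     # ref lags test: drop ref's leading samples
        elif lag < 0:
            test = test[-lag:]  # test lags ref: drop test's leading samples
        n = min(len(ref), len(test))
        ref, test = ref[:n], test[:n]
        # Least-squares gain match: g = <ref, test> / <test, test>.
        g = np.dot(ref, test) / np.dot(test, test)
        return ref - g * test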
> This will help you learn to recognize the sorts of sonic artifacts
> which occur purely as a result of time offsets in the signals being
> nulled (it's a comb-filtering effect). This may help you adjust the
> nulling tests you do between lossy-coded and original signals, so that
> you actually end up hearing the effect of the coding (i.e. removed
> or altered information content) and not just minor errors in setting
> the timing between the two signals.
We used to do "flanging" using analog tape recorders playing back identical
signals, sometimes with one inverted in phase. I assume the effect will be
similar.
>> I have Googled a number of different search strings, but I got no hits.
>> Before I embark on my own experiments, I thought I'd ask around a bit. Has
>> anyone here done something like this, or can you point me to some existing
>> results?
>
> A Google search on "lossy audio coding" turned up the following:
>
> http://en.wikipedia.org/wiki/Audio_compression_(data)
>
> which is a reasonable start.
Yes, there is lots of information about BRR on the web, but I could not find
anything specifically about differencing source files and their compressed
versions. I was hoping to find a site featuring samples from someone who had
already tried this.
> As I understand it, most lossy audio codecs operate through two basic
> mechanisms:
>
> - Figuring out which portions of an audio signal are likely to be
> audible to the human ear/brain system. The encoder attempts to
> preserve these, but may discard (not-encode) other portions of the
> audio which are below the expected threshold of audibility.
>
> - Figuring out how accurately (or inaccurately) it can encode the
> aspects of the signal it wants to keep. The intent here is to use
> a more accurate encoding (e.g. more bits) in cases where inaccuracy
> would probably be audible, and a less accurate encoding (fewer
> bits) where the resulting distortion is likely to be inaudible. A
> common example of this is that higher audio frequencies can usually
> be encoded with fewer bits of accuracy... the distortion
> (quantization noise) which results from this inaccuracy will lie at
> frequencies near or above the human hearing range, and will thus be
> difficult or impossible to hear.
>
> So, in general, a lossy codec of this sort will discard some
> frequencies entirely, and will have reduced accuracy at reproducing
> the amplitude of other frequencies... thus adding some amount of
> harmonic and intermodulation distortion.
>
> Lossy codecs are usually block-oriented - they break the incoming
> signal up into chunks, and encode each chunk individually (often with
> some overlap between the chunks). Some encoders can reportedly be
> prone to create a sort of "warbling" or "watery" sound, which might
> occur if the encoder makes different decisions about encoding a
> particular frequency range when it goes from one block to the next
> (i.e. some frequencies might appear, disappear, reappear, etc.).
The first bit-rate reduction I ever heard was at a National Association of
Broadcasters convention about 20 years ago. The "wateriness" was quite
objectionable. I don't remember what technology they were using at the time.
The compute horsepower was certainly not what it is today.
> I understand that some of the encoding algorithms/implementations
> can also have the effect of "smearing" transient signals or
> creating pre- and post-echoes of such transients. This can be seen as
> a loss of accurate *timing* information about the signal, as opposed
> to *frequency* information.
This is exactly the kind of information that I'm trying to get an intuitive
sense of by listening to the mix-minus, or residual error information. I
think it would be fascinating to hear the effects of time smearing vs. lower
resolution.