ludovic mirabel
Why DBTs in audio do not deliver (was: Finally ... The Furutech CD-do-something)

(Nousaine) wrote in message ...


See his full text below:
On 1 Jul 2003 15:10:59 GMT, (ludovic mirabel) wrote:
I can't possibly answer separately and in detail all those disagreeing with me; I have to live whatever remains to me of life. In particular I'll not get in between Mr. Miyaguchi and Nousaine in the sub-argument of their own.

So, for economy, four interactive, complementary answers.
First let's define what I am NOT saying. (No apologies for the
capitals. Google doesn't allow italics or underlining)
I'm NOT saying that the local variant of DBTs known as ABX is the
wrong tool to use in research, with careful design, proper statistics,
selected research topics and, last but not least, SELECTED, TRAINED
SUBJECTS.

I have no knowledge, opinion, interest in the particulars of such
research because it does not research differences between real-life
audio components.
I'm NOT saying that those audiophiles who enjoy ABX and believe that
they get valid results should not use it. As long as they do not think
that their results are valid for anyone but themselves.
ALSO as long as they keep in mind that their negative, null
results are valid for them at this particular stage of their ABX
training and musical experience. As long as they remember that if they
do not revisit their negative results they may be shutting themselves
off FOREVER from enlarging and enriching their hi-fi scope.
I'm NOT, emphatically NOT, saying that individuals shouldn't use
methods of their choice to disguise the brand of components that they
compare. I have one such method myself, which serves me well but may
not suit others.
What I do object to is the belief that to "prove" your opinions re
"difference" or "no difference" - a necessary preliminary to
preference (more about difference/preference in my answer to
Audioguy) - one has to undergo a procedure known as ABX.
Please, don't one of you tell me that "nobody says that". Speak
for yourself. Every week somebody says just that in RAHE. Sometimes
with pseudo-objective semantic word games: "But you have not had
a "controlled test" (meaning ABX, of course), so it is only your
impression" - as though it could be anything else. Or "Reviewers
should undergo controlled tests for credibility" - as though that made
stupid reviewers into clever ones. And of course you, Mr. Nousaine, said
it many times.

There are a few unspoken assumptions here.
FIRST UNPROVEN ASSUMPTION: everyone performs just as well when ABXing
as when listening for pleasure, blinded or not blinded.
Perhaps they do, perhaps they don't. In fact I presented evidence
suggesting that there is a MARKED DIFFERENCE in people's performance
when listening to a simple pink noise signal as opposed to listening
to a musical signal.
I then asked Mr. Miyaguchi a simple, straightforward question: Is
there a difference? Instead of a simple YES (there is a
difference) or NO (there isn't any - if anyone really wanted to play
the fool), I got convoluted explanations of WHY there is, further
beautified by you, Mr. Nousaine. Incidentally, I asked you the very
same question one month ago and never got an answer.
Why this dodging? You can bet I'd get a plain NO if there were NO
DIFFERENCE and EVERYBODY who performed flawlessly listening to
pink noise did just as well listening to music. But MOST
performed abominably (consistent with random guessing) when listening
to music. Explanation? Music is "more complex". Quite! And what else
is new? Most of us use our components to listen to music, not pink
noise. If your test has problems with music, is it the right one to
assess the differences in the MUSICAL performance of components?
Where is the evidence? Evidence, where I come from (see my answer to
Audioguy concerning that), is EXPERIMENTAL evidence, not circular
arguments like: "Why shouldn't it be so? It is good enough for
codecs, isn't it?" I listen to music, not codecs. And I need convincing
that pink noise is the best way to test a component for its transient
response to a cymbal, or its rendition of a violin or a cello. I'd
suggest to researchers: "Try again."
All you can name since 1990, Mr. Nousaine, are your own cable tests
from 1995. Where I come from (again - sorry!) one gives the magazine's
name, the date, the page, so that I can find out what the design was,
who proctored, how many subjects, how many tests, etc. Why so shy with
details?
You say: "Likewise the body of controlled listening test results
can be very useful to any individual that wishes to make use of them to guide
decisions (re component choice)"

Where does one find that body? Buried in the pyramids or the Alberta
tar sands?
Why not name a few recent, representative offshoots? Just to show
that consumer ABXing is any use other than for discomfiting the naive.

Where are the current tests on current components? You're not
serious saying that there's nothing new under the sun. Any difference
between the 24-bit and 16-bit CD players? Which DACs truly cope with
jitter? Are the Meridian and Tact room-equalising systems different
from any other equaliser? Any differences between the various digital
sound processing systems? Come on, Mr. Nousaine. Winning a verbal
contest against L. Mirabel is not everything. There is such a thing as
putting more information on the table.
Of course you are at least consistent. You truly believe that
there is no difference between any components at all. By your lights
such objectivist luminaries as Krueger and Pinkerton are pinkish
dissidents: "Actually if nominally competent components such as
wires, parts, bits and
amplifiers have never been shown to materially affect the sound of reproduced
music in normally reverberant conditions why would ANYONE need to conduct more
experimentation, or any listening test, to choose between components?"

And again, the profession of faith: "Examination of the extant body
of controlled listening tests available contains enough information to
aid any enthusiast in making good decisions."

And what information is it?
" As I've said before; there are many proponents of high-end sound of
wires, amps and parts ... but, so far, no one (in over 30 years) has
ever produced a single repeatable bias controlled experiment that
shows that nominally competent
products in a normally reverberant environment (listening room) have any sonic
contribution of their own.
Nobody! Never! How about some evidence? I'll believe in BigFoot ....just show
me the body!


And I'll believe that there are no differences between components
when you prove that your "bias-controlled test" does not have biases
of its own.
My other unanswered question to you one month ago was:
Where is the evidence that untrained, unselected
individuals perform identically when comparing complex musical
differences between components for a "test" as they do when they just
listen? Reasoning that they should is plausible, but reasoning that at
least some of them don't is not irrational either. A convincing,
controlled experiment with random control subjects etc. is missing.
Next consider the anti-common-sense assumption that Tom and
Dick do their DBT assignment equally well and both are an identical
match for Harry. Should they be? They are not identical in any other
task aptitude, in their fingerprints or their DNA.
If you agree that they would differ, how do you justify YOUR
challenges to all and sundry to prove their perceptions by ABX?
Perhaps they are as hopeless as I am at that task. Perhaps a violinist
will hear differences in the rendition of violin tone when not
bothered by a "test" but be a terrible subject for ABXing. Impossible?
Where is your experimental evidence that this does not happen?

Where is the experimentation to show that the poor ABX test subjects
would perform identically sitting and listening at home?
Finally, your telling ME that "I think Ludovic is "untrainable"
because he will accept only answers he already believes are true"
really takes the cake. I'm not propounding any "test". You are. I have
no faith to stick to. You BELIEVE in ABX. It is MY right to ask YOU
for evidence. And it is your job to give it.
You know it all perfectly well, because you know what "research" and
"evidence" mean. Why copy the tactics of those who have only ignorant
bluster to offer?

Ludovic Mirabel

We're talking about Greenhill's old test not because it is perfect but
because no better COMPONENT COMPARISON tests are available. In fact
none have been published, according to MTry's and Ramplemann's
bibliographies, since 1990.


This is simply not true. I have published 2 double-blind tests personally, one
of which covered 3 different wires, subsequent to 1990.


I frequently see the distinction being made between audio components
and other audio related things (such as codecs) when it comes to
talking about DBT's. What is the reason for this?

In my opinion, there are two topics which should not be mixed up:

1) The effectiveness of DBT's for determining whether an audible
difference exists
2) The practical usefulness of using DBT's for choosing one audio
producer (component or codec) over another.


Actually if nominally competent components such as wires, parts, bits and
amplifiers have never been shown to materially affect the sound of reproduced
music in normally reverberant conditions why would ANYONE need to conduct more
experimentation, or any listening test, to choose between components? Simply
choose the one with the other non-sonic characteristics (features, price,
terms, availability, cosmetics, style...) that suit your fancy.

Indeed 20 years ago, when I still had a day job, Radio Shack often had the
"perfect" characteristic to guide purchase, which was "open on Sunday."
I am not knowledgeable enough to decide on differences between
your and Greenhill's interpretations of the methods and results.
In my simplistic way I'd ask you to consider the following:
PINK NOISE signal: 10 out of 11 participants got the maximum possible
number of correct answers: 15 out of 15, i.e. 100%. ONE was 1 guess
short: he got only 14 out of 15.
When MUSIC was used as a signal, 1 (ONE) listener got 15 correct,
1 got 14 and one got 12. The others had results ranging from 7 and 8,
through 10, to (one) 11.
My question is: was there ANY significant difference between
those two sets of results? Is there a *possibility* that music
disagrees with ABX, or ABX with music?
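[For what it's worth, the significance question has a simple arithmetic answer via an exact binomial test. A minimal sketch in Python - the trial count of 15 and the 50% chance level come from the scores quoted above; the function name and the 5% cutoff are my own choices:]

```python
from math import comb

def p_value(correct, trials=15):
    """One-sided exact binomial p-value: the probability of scoring
    `correct` or better out of `trials` forced-choice trials by pure
    guessing (chance = 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials

# Scores reported above for the MUSIC program: 15, 14 and 12 correct
# beat chance at the usual 5% level; 11 or fewer do not.
music_scores = {s: p_value(s) for s in (15, 14, 12, 11, 10, 8, 7)}
```

[On these numbers, only the top three music listeners score significantly above chance, while 15/15 on pink noise has a guessing probability of about 3 in 100,000 - which is the asymmetry the question is pointing at.]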


No: it just means with the right set of music 2 dB is at the threshold. Don't
forget that listening position affects this stuff too. Also Mr Atkinson would
say that perhaps the lower scoring subjects didn't have personal control of the
switching.

Even between two samples of music (no pink noise involved), I can
certainly believe that a listening panel might have more or less
difficulty in determining if they hear an audible difference. It
doesn't follow that music in general is interfering with the ability
to discriminate differences when using a DBT.


Actually it simply shows that pink noise and other test signals are the most
sensitive of programs. It may be possible to divulge a 'difference' with noise
that would never be encountered with any known program material.

It's also possible that certain programs, such as Arny Krueger's special
signals, might disclose differences that may never be encountered with
commercially available music (or other) programs. So?

I would appreciate it if you would try and make it simple, leaving
"confidence levels" and such out of it. You're talking to ordinary
audiophiles wanting to hear whether your test will help them decide
what COMPONENTS to buy.


As before; you haven't ever been precluded from making any purchase decisions
by scientific evidence before; why should any disclosure affect that now or
in the future?

Examination of the extant body of controlled listening tests available contains
enough information to aid any enthusiast in making good decisions. Even IF the
existing evidence shows that wire is wire (and it does), how does that preclude
any person from making any purchase decision? In my way of thinking it just
might be useful for a given individual to know what has gone before (and what
hasn't).

I still don't see how this could do anything but IMPROVE decision-making.

See my first comments. It's too easy to mix up the topic of the
sensitivity of DBT's as instruments for detecting audible differences
with the topic of the practicality of using DBT's to choose hifi
hardware. The latter is impractical for the average audiophile.


No, it's not. Just as 0-60 times, skid-pad and EPA mileage tests simply cannot
be run by the typical individual, that doesn't mean that they cannot be used to
improve decision-making. Likewise the body of controlled listening test results
can be very useful to any individual who wishes to make use of them to guide
decisions.

Otherwise the only information one has is "guidance" from sellers, anecdotal
reports and "open" listening tests. The latter, of course, are quite subject to
non-sonic influence.

So IMO, a person truly interested in maximizing the sonic-quality throughput of
his system simply MUST examine the results of bias controlled listening tests
OR fall prey to non-sonic biasing factors, even if they are inadvertent.


Who can argue with motherhood? The problem is that there are NO
ABX COMPONENT tests being published - neither better nor worse, NONE.
I heard of several audio societies considering them. No results.
Not from the objectivist citadels, Detroit and Boston. Why? Did they
pan out?


Given the two dozen controlled listening tests of power amplifiers published
through 1991 doesn't it seem that no one needs to conduct more? Wires? The last
test I published was in 1995. Not late enough?

Why not? No manufacturer has EVER produced a single bias controlled experiment
that showed their wires had a sound of their own in over 30 years. Why should
one expect one now?

I certainly can't do it; although I've given it my level (no pun intended)
best. IOW, I can't produce an experiment that shows nominally competent wires
ain't wires .... 'cuz they ain't.

I can think of a couple of reasons:

1. It's expensive and time-consuming to perform this type of testing.
2. The audible differences are, in actuality, too subtle to hear, ABX
or not. Why bother with such a test?


Why bother performing a sound quality "test" that the manufacturers of the
equipment can't produce? IF amps ain't amps, wires ain't wires and parts ain't
parts, then why haven't the makers and sellers of this stuff produced
repeatable bias controlled listening tests that show this to be untrue?

Then there is the possibility that you seem to be focussing on,
ignoring the above two:

3. DBT's in general may be decreasing the ability to hear subtle
differences.


Actually they preclude the ability to "hear" non-sonic differences.

Which of the above reasons do you think is most likely?

Moving away from the question Greenhill was investigating (audible
differences between cables) and focusing only on DBT testing and
volume differences: it is trivial to perform a test of volume
difference, if the contention is being made that a DBT hinders the
listener from detecting 1.75 dB of volume difference. Especially if
the listeners have been trained specifically for detecting volume
differences prior to the test.
However, such an experiment would be exceedingly uninteresting, and I
have doubts it would sway the opinion of anybody participating in this
debate.

The volume difference was just a side-effect of a comparison between
cables.
And yes, TRAINED people would do better than Greenhill's "expert
audiophiles", i.e. rank amateurs just like us. Would some, though, do
better than the others and some remain untrainable? Just like us.


I think Ludovic is "untrainable" because he will accept only answers he already
believes are true.

I have no doubt that there are some people who are unreliable when it
comes to performing a DBT test. In a codec test using ABC/HR, if
somebody rates the hidden reference worse than the revealed reference
(both references are identical), his listening opinion is either
weighted less or thrown out altogether.
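[The screening rule described above can be sketched in a few lines. A hedged illustration only: the 1.0-5.0 impairment scale is typical of ABC/HR-style tests, but the strict set-aside policy and the function name are my assumptions - real tests may down-weight rather than discard:]

```python
def screen_trials(trials):
    """Each trial is (rating_of_hidden_reference, rating_of_coded_signal)
    on a 1.0-5.0 impairment scale, where 5.0 means 'identical to the
    revealed reference'. Rating the hidden reference below 5.0 claims a
    difference that cannot exist, so that trial is set aside."""
    kept, suspect = [], []
    for hidden_ref, coded in trials:
        (kept if hidden_ref >= 5.0 else suspect).append((hidden_ref, coded))
    return kept, suspect

# Three trials: in the second, the listener rates the hidden reference
# as impaired, so that opinion is flagged as unreliable.
kept, suspect = screen_trials([(5.0, 3.5), (4.2, 4.8), (5.0, 5.0)])
```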


What you are describing is 'reverse significance', which is typically an
inadvertent form of internal bias.

For what it's worth, I have performed enough ABX testing to convince
myself that it's possible for me to detect volume differences of 0.5 dB
using music, so I doubt very highly that a group test would fail to
show that 1.75 dB differences on a variety of different music are
audible using a DBT.
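[For readers unfamiliar with the protocol under debate: an ABX trial hides the identity of X, which is randomly A or B each time, and scores the listener's call against the truth. A minimal bookkeeping sketch - the simulated 'listener' here merely guesses, to show the chance baseline of roughly 50%; in a real session the answer would come from listening:]

```python
import random

def abx_session(trials=16, seed=None):
    """Run the bookkeeping for one ABX session. X is secretly A or B on
    each trial; a guessing listener stands in for the real response."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        x = rng.choice("AB")        # hidden assignment for this trial
        answer = rng.choice("AB")   # stand-in: replace with the real call
        correct += (answer == x)
    return correct

# Averaged over many guessing sessions, scores hover near trials/2,
# which is why a run well above that level is what counts as evidence.
mean = sum(abx_session(seed=s) for s in range(500)) / 500
```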

I can easily hear a 1 dB difference between channels, and a change
of 1 dB.
What I can't do is have 80 dB changed to 81 dB, then be asked whether
the third, unknown signal is 80 or 81 dB, and be consistently correct.
Perhaps I could if I trained as much as you have done. Perhaps not.
Some others could, some couldn't. We're all different. Produce a test
which will be valid for all ages, genders, extents of training, innate
musical and ABXing abilities, and all kinds of musical experience and
preference. Then prove BY EXPERIMENT that it works for COMPARING
COMPONENTS.
So that anyone can do it, and if he gets a null result, BE CERTAIN
that with more training or different musical experience he would not
hear what he did not hear before. Or perhaps just get on with widening
his musical experience and then re-compare (with his eyes covered if
he is marketing-susceptible).
Let's keep it simple. We're audiophiles here. We're talking about
MUSICAL REPRODUCTION DIFFERENCES between AUDIO COMPONENTS. I looked
at your internet graphs. They mean zero to me. I know M. Levinsohn,
Quad, Apogee, Acoustat - not the names of your codecs. You assure me
that they are relevant. Perhaps. Let's see BY EXPERIMENT if they are.
In the meantime enjoy your lab work.
Ludovic Mirabel


Are you really telling me that you didn't understand the gist of the
group listening test I pointed you to?

For one thing, it says that although people have different individual
preferences about how they evaluate codec quality, as a group, they
can identify trends. This, despite the variety of training, hearing
acuity, audio equipment, and listening environment.

Another point is that it would be more difficult to identify trends if
such a study included the opinions of people who judge the hidden
reference to be worse than the revealed reference (simultaneously
judging the encoded signal to be the same as the revealed reference).
In other words, there are people whose listening opinions can't be
trusted, and the DBT is designed to identify them.


That result identifies a form of experimental bias, does it not?

The last point is that I can see no reason why such procedures could
not (in theory, if perhaps not in practical terms) be applied to audio
components. Why don't you explain to me what the difference is (in
terms of sensitivity) between using DBT's for audio codecs and using
DBT's for audio components?

Darryl Miyaguchi


There is no difference. It seems to me that this poster may have never taken a
bias controlled listening test or, if he has, the results didn't fit with prior
held expectations. It's much easier to argue with the existing evidence than
prove that you can hear things that no human has been able to demonstrate, when
not peeking.

As I've said before; there are many proponents of high-end sound of wires, amps
and parts ... but, so far, no one (in over 30 years) has ever produced a single
repeatable bias controlled experiment that shows that nominally competent
products in a normally reverberant environment (listening room) have any sonic
contribution of their own.

Nobody! Never! How about some evidence? I'll believe in BigFoot ....just show
me the body!