Posted to rec.audio.high-end
A Brief History of CD DBTs

On Dec 18, 12:18 pm, wrote:
On Tuesday, December 18, 2012 12:18:29 PM UTC-5, Scott wrote:
That being, typically, breaking out ABX and failing
to ever control for same sound bias


There is no such phenomenon as same sound bias.


That is plainly wrong. Biases come in all sorts of flavors, including a
bias towards components sounding the same.

It has never been demonstrated experimentally. If you have data that shows otherwise, please share it with us.


Well, I am so glad you asked. I have some pretty good data on one
clear-cut example of same sound bias at work.
Let's take a trip down memory lane with Mr. Howard Ferstler and an
article he wrote for The Sensible Sound, in which he did an ABX DBT
between two amplifiers and concluded that it demonstrated the two
sounded the same. Let's look a little closer at what really went down.

Issue 88 of The $ensible Sound (Nov/Dec 2001, pp. 10-17)

Howard wrote in his article on page 14:

"According to the statistical analysis, and given the number of
trials I did, the likelihood of those scores being theresult of
anything but chance (even the one where I scored more than 60%right)
exceeded 95%." "Even though a 68% correct score looks like there may
have been significant audible differences with the 17 out of 25
mindnumbing trials I did, that score does achieve a 95% confidence
level, indicating that the the choices were still attributable to
chance."

John Atkinson pointed out to him the following facts:

“As has been pointed out on this newsgroup, not only by myself but also
by Arny Krueger, you were misrepresenting the results, presumably
because they were "blatantly at odds with [your] belief systems." Yes,
scoring 17 out of 25 in a blind test does almost reach the 95%
confidence level (94.6%, to be pedantic). But this means that there are
almost 19 chances in 20 that you _did_ hear a difference between the
amplifiers. You incorrectly wrote in a published article that your
scoring 17 out of 25 was more than 95% due to chance. However, it's
actually almost 95% _not_ due to chance. In other words, your own tests
suggested you heard a difference, but as you already "knew" there
wasn't an audible difference, you drew the wrong conclusion from your
own data.
Curiously, The Sensible Sound has yet to publish a retraction. :-)”
John Atkinson
Editor, Stereophile
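
For anyone who wants to check Atkinson's arithmetic, here is a minimal
sketch in Python (my own illustration, not something from the thread)
of the one-sided binomial test behind that 94.6% figure:

    from math import comb

    # One-sided binomial test: the probability of scoring 17 or more
    # out of 25 trials by pure guessing (per-trial chance = 1/2).
    n, k = 25, 17
    p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    print(f"P(17+ correct by chance) = {p_value:.4f}")     # 0.0539
    print(f"Confidence level         = {1 - p_value:.1%}")  # 94.6%

The chance explanation for 17 out of 25 has a probability of about
5.4%, which is exactly Atkinson's point: the score sits just under the
conventional 95% threshold on the "probably heard a difference" side,
not the "more than 95% due to chance" side Howard reported.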

So we have here a classic example of same sound bias affecting the
analysis of the data of an ABX DBT between amps. But wait, it gets
better. Check out how Howard tries to reconcile his positive result
with his same sound bias.

Howard Ferstler:

" The data you are referring to was but a small part of the series.
It was a fluke, because during the last part of that series of trials
I was literally guessing. I just kept pushing the button and making
wild stabs at what I thought I heard. After a while, I did not bother
to listen at all. I just kept pressing the same choice over and
over."

IOW he was deliberately falsifying data in order to get a null result.
I’d say that is proof positive of a same sound bias on the part of Mr.
Ferstler, wouldn’t you? And this ABX DBT was published in The Sensible
Sound despite the fact that not only was the analysis corrupted by a
clear same sound bias, but so was the data, deliberately!
Ironically, due to an apparent malfunction in Tom Nousaine’s ABX box,
the attempt at spiking the results to get a null serendipitously
wrought a false positive. So on top of that we have a malfunctioning
ABX box that Tom Nousaine has been using for all these ABX DBTs.

Didn’t you at some point cite this very test and other tests conducted
with Tom Nousaine’s ABX box as "scientific evidence?"

Ouch.


or even calibrate the sensitivity
of the test. Without such calibration, a single null result tells us
very little about what was and was not learned about the sound of the
components under test.


There is no need to "calibrate the sensitivity" of an ABX test of audio components, anymore than there is a need to calibrate the sensitivity of a DB pharmaceutical trial.


My goodness gracious, talk about getting it all wrong. First, ABX DBTs
involve playback equipment; pharmaceutical trials do not, so there is
nothing to "calibrate" in pharmaceutical trials. BUT they do use
control groups! That is, in effect, their calibration. Without the
control group the results mean nothing, because there is no
"calibrated" base to compare them to. So in effect they most
definitely are calibrated, or they are tossed out as very, very bad
science and just plain junk. That is bias-controlled testing 101.
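
To put a number on what "sensitivity" means here, a short sketch in
Python (my own illustration, with hypothetical numbers, not figures
from any of the published tests): the power of an n-trial ABX test,
i.e. the probability it reaches significance when the listener really
does hear a difference on some fraction of the trials.

    from math import comb

    def binom_tail(n, k, p):
        # P(X >= k) for X ~ Binomial(n, p)
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(k, n + 1))

    def abx_power(n, p_true, alpha=0.05):
        # Smallest score whose chance probability is at or below alpha...
        k_star = next(k for k in range(n + 1)
                      if binom_tail(n, k, 0.5) <= alpha)
        # ...and the chance a listener with true hit rate p_true reaches it.
        return binom_tail(n, k_star, p_true)

    # A listener who genuinely hears the difference on 70% of trials
    # still fails a 25-trial ABX test about half the time:
    print(f"{abx_power(25, 0.70):.0%}")  # 51%

A null from an uncalibrated test can therefore mean "no audible
difference" or simply "not enough trials"; without a known-audible
positive control you cannot tell which.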

In both cases, we care only about subjects' sensitivity to a given dose (or in the case of ABX, a given difference). We aren't trying to determine the minimum dose/difference the subjects
might respond to.


Wrong! In the pharmaceutical tests we don't care a bit about a
subject's sensitivity to a given dose. We care about the subjects'
sensitivity as compared to the *control group*. That is the
calibration!
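
And again as a minimal sketch (the response counts below are
hypothetical, purely for illustration): a drug trial's result is read
against its control arm, not in isolation.

    from math import sqrt, erf

    def two_prop_z(hits_a, n_a, hits_b, n_b):
        # One-sided two-proportion z-test: does the treatment arm
        # respond more often than the control arm?
        pa, pb = hits_a / n_a, hits_b / n_b
        pooled = (hits_a + hits_b) / (n_a + n_b)
        se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        z = (pa - pb) / se
        p_value = 0.5 * (1 - erf(z / sqrt(2)))  # upper-tail normal prob.
        return z, p_value

    # Hypothetical: 30 of 50 responders on the drug vs. 18 of 50 on placebo.
    z, p = two_prop_z(30, 50, 18, 50)
    print(f"z = {z:.2f}, one-sided p = {p:.3f}")  # z = 2.40, p = 0.008

The 30/50 figure means nothing on its own; the information is in the
comparison with the control arm's 18/50. That comparison is the
"calibration" I am talking about.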


Regardless, of course a single null result tells us very little.


Gosh, that is what I have been saying. So you agree. Great.


But my original post did not present a single result. It presented a substantial number of tests (not all null, btw) conducted over a long period of time by widely disparate groups.


And my comments about how it is very unscientific to put so much
weight on one null result were not a response to your original post.

However, I do have to ask: did you include any of the tests by Howard
Ferstler? That would be most unfortunate. Did you include tests
conducted with Tom Nousaine's defective ABX box? That would also be
unfortunate. Funny what we learn when we dig a little. Such is the
point of peer review. To say that some of the evidence presented in
these audio magazines is anecdotal is to be overly generous. That
should be obvious after what The Sensible Sound allowed to pass and be
reported as an ABX DBT of amplifiers.