Bob Marcus
Why DBTs in audio do not deliver (was: Finally ... The Furutech CD-do-something)

(S888Wheel) wrote in message ...
I said


I said I have never seen any scientifically valid empirical evidence on the
matter. You sent six articles, only two of which had the raw data I was asking
about. The two that had raw data had not been published in a peer-reviewed
scientific journal, so they do not qualify as scientifically valid.


Bob said


Huh? You seem not to understand the purpose of peer review. It does
not determine and is not the arbiter of "scientific validity."


Perhaps you don't understand it. It more or less is the arbiter of such
things. Any scientific claim that has not been through peer review and
publication is regarded as junk science without merit.


This is oversimplistic. There is plenty of good scientific research
conducted every day which never makes it into peer-reviewed journals
for all sorts of reasons. (And I've seen some real garbage pass peer
review, too.) The mere fact that something didn't appear in a
peer-reviewed journal may mean nothing more than that the
peer-reviewed journals in the field weren't interested in that
particular topic. Frankly, I'm not sure any peer-reviewed journal
would be interested in whether there were audible differences between
a couple of consumer-grade amps.

Well, at least according
to the research scientists I have asked. Maybe you know more about it than they
do. In the world of science, when one does research via experimentation, the
value of that data hinges on peer review.


Actually, its value depends on its replicability. Data which hasn't
been peer-reviewed may be replicable, and peer-reviewed data may not
be.

snip

Bob said

Let's remember that ALL negative ABX results are inconclusive. That
doesn't mean they can't tell us anything.

I was referring to the positive results in that test.


I got the distinct impression that there were no positive results in
that test. Isn't that what you were complaining about?

I said

Wrong. In the Dave Clark test listener #2 got 30/48 correct, with a statistical
reliability of hearing a difference of 94%. Listener #6 got 26/48, with a
statistical probability of 84% chance of hearing differences. Listener #15 got
15/21 correct, with an 81% chance of hearing a difference. Given the fact that
no tests were done to measure listeners' hearing acuity and no tests were done
to verify test sensitivity to known barely audible differences, one cannot
conclude anything other than that those listeners may have heard differences.
Bell curves have no meaning without data on the listeners' hearing acuity. The
logical thing would have been to do follow-up tests on those listeners to see
if it was just a fluctuation that fits within the predicted bell curve or if
they really could hear differences as the results suggest. Hence there is no
conclusive evidence from this test that, as you say, "no single listener was
able to reliably identify amps under blind conditions."



Bob said


I don't recall this article, but this conclusion seems to be well
supported by the data you cite. If the best performance of the group
wasn't statistically significant at a 95% confidence level, then it's
perfectly reasonable to say that no listener was able to identify the
amps in the test. (Note: Saying they couldn't is not the same as
saying they can't. As I noted above, we can never say definitively
that they can't; we can only surmise from their--and everybody
else's--inability to do so.)
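
(As a rough check on the percentages being argued over, the exact one-sided
binomial probability of scoring at least that well under pure guessing can be
computed directly. A minimal Python sketch, using only the scores and trial
counts you cited; the article's own figures may have been derived differently,
e.g. with a normal approximation, so treat these as illustrative.)

from math import comb

def tail_prob(k, n, p=0.5):
    """Probability of k or more correct answers out of n by chance alone."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Scores cited from the Clark test: listeners #2, #6, and #15.
for k, n in [(30, 48), (26, 48), (15, 21)]:
    pval = tail_prob(k, n)
    print(f"{k}/{n}: chance probability {pval:.3f} "
          f"({1 - pval:.1%} apparent confidence of a real difference)")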


That is ridiculous. If all of them had scored 94%, would it be reasonable to say
this?


If all of them had scored 94%, then we would have had a statistically
significant aggregate result. The fact that one of a panel of at least
15 did so is rather unsurprising. It's called an outlier.
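
(The "outlier" point can be made concrete. If every listener were simply
guessing, the chance that at least one member of the panel clears a nominal
94% level grows quickly with panel size. A minimal sketch, assuming for
illustration that all 15 listeners did 48 trials and using 30/48 as the
individual cutoff; the actual trial counts varied, so this is only indicative.)

from math import comb

def tail_prob(k, n, p=0.5):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_single = tail_prob(30, 48)            # one guessing listener reaching 30/48
panel = 15
p_any = 1 - (1 - p_single) ** panel     # at least one of 15 guessers doing so
print(f"single listener: {p_single:.3f}; at least one of {panel}: {p_any:.2f}")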

No. It all depends on how they fall on the bell curve. But even this is
problematic for two reasons. 1. The listeners were never tested for sensitivity
to subtle differences. The abilities of the participants will profoundly affect
any bell curve.


Sure, but is there any reason to believe this was a particularly badly
chosen panel? If a difference is audible, somebody in a randomly
selected panel of young to middle-aged men will probably nail it. I
gather from your report (remember, I don't have this article) that
nobody did.

2. Many different amps were used. We have no way of knowing that we didn't have
a mix of some amps sounding different and some sounding the same.


Aggregating data across different comparisons is meaningless. That
goes for individuals as well as panels, by the way.

The Counterpoint amp was identified with a probability of 94%. Given there
were 8 different combinations, it fell outside the predicted bell curve, if my
math is right. The bottom line is you cannot draw definitive conclusions either
way. If the one listener had made one more correct ID he would have been well
above 94%. I doubt that one can simply make 95% probability a barrier of truth.
It is ridiculous. Bell curves don't work that way.


But statistics DOES work that way. You have to specify your confidence
level before you do your analysis. You can't say, "Well, he got
94%, and that's close enough."
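
(Fixing the confidence level in advance amounts to fixing, before the test, the
minimum score that will count as significant. A minimal sketch for a 48-trial
comparison at the conventional 5% level; the cutoff it prints is a property of
the binomial distribution, not something taken from the article.)

from math import comb

def tail_prob(k, n, p=0.5):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, alpha = 48, 0.05
# Smallest score whose chance probability is at or below the 5% threshold.
cutoff = next(k for k in range(n + 1) if tail_prob(k, n) <= alpha)
print(f"{n} trials: need at least {cutoff} correct; "
      f"30/{n} has chance probability {tail_prob(30, n):.3f}")

Which is exactly why "one more correct ID" matters: the threshold is drawn
before anyone looks at the scores.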

Besides, no follow-up on any scores that push or cross the predicted outcome of
the bell curve makes for an incomplete study, IMO. Now let's not forget the
failure to even test the test for sensitivity to subtle differences.


What subtle differences would you have tested for? Since we don't know
what it is that would distinguish these amps, how could we have
screened the panel in advance?
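
(If "testing the test" means anything quantitative, it is a question of
statistical power: given a listener who genuinely hears a difference some
fraction of the time, how often would 48 trials flag it at the pre-set cutoff?
A minimal sketch, assuming a purely hypothetical true detection rate of 65%
and the 31/48 cutoff from the sketch above; the 65% figure is mine, not
anything from the article.)

from math import comb

def tail_prob(k, n, p=0.5):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, cutoff = 48, 31        # trials and the 5%-level cutoff for a guesser
true_rate = 0.65          # hypothetical listener: correct 65% of the time
power = tail_prob(cutoff, n, p=true_rate)
print(f"A listener correct {true_rate:.0%} of the time clears {cutoff}/{n} "
      f"with probability {power:.2f}")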

snip

Bob said

Yeah, that's close enough that it might be worth someone's while to
redo the test. But the original researcher is under no obligation to
second-guess his own work.


It isn't second-guessing; it is reasonable follow-up on inconclusive data. No,
he is under no obligation to do more testing, but as the test stands, anyone
who is trying to be scientific about this, or even reasonable, would have to
acknowledge that the tests as they stand are quite inconclusive. That is my
primary position on the data in those tests.


To repeat: ANY result that falls below the confidence level is
inconclusive. If you think some result is wrong, you need only
replicate the test.

Bob said

People who doubt that result, however, have
been free to try to replicate it for a couple of decades, I think.


My only criticism of the test itself is the lack of testing for listener and
system sensitivity. It would be a mistake to repeat that mistake.


By replication, we usually include efforts to improve on the test and
correct for methodological weaknesses, to see if we get a different
result. And now you've tossed in a second red herring: system
sensitivity. Care to define that and explain just how you'd expect the
researchers to "test" for it?

Bob said

That's how science works, my friend. You can't just stamp your foot
and say, "I don't find this conclusive!"


No foot stamping is needed. The test was quite inconclusive. Had the test been
put in front of a scientific peer review panel with the same data and the same
conclusions, that panel would have sent it back for corrections. The analysis
was wrong, scientifically speaking.


I thought we'd already agreed that you were unqualified to determine
what would and would not pass scientific muster.

Bob said

You have to come up with a
new result.


No. One does not have to do the test over to argue that the *conclusions* drawn
from that test are in error.


Um, yes you do. At the very least, you need some conflicting data on
your side. Otherwise, you are merely talking through your hat.

One does not have to do a test over to point out
errors in protocol.


Errors in protocol do not automatically invalidate findings. They
merely suggest reasons why the findings MIGHT not be replicable. To
know whether they are indeed not replicable, somebody must do another
experiment, or at least find other, conflicting data.

The lack of testing for sensitivity leaves the results open to
multiple interpretations. That is a fact.


That is a baseless opinion. You have offered no plausible reason to
doubt that either the panel or the equipment used was sufficiently
sensitive to produce reliable results. You haven't even defined in
measurable, technical terms what you mean by "sensitivity," let alone
offered a scientifically sound basis for claiming that any particular
level of "sensitivity" is necessary.

Bob said

That nobody--nobody!--has come up with the slightest bit
of real evidence to cast doubt on that conclusion in all this time is,
while not conclusive, certainly revealing.


Given that there seems to be an issue of interpretation of the data, you are
welcome to yours, as is Tom. Since I haven't agreed with such interpretations so
far, I am only interested in the data. The data I have seen thus far on the
audibility of amps is inconclusive. Very inconclusive.


If so, then why is it that no one, in any university psychology or
electrical engineering department in the world, has published anything
on this subject in the last decade? Inconclusive science tends to
invite feverish research. And yet the leading experts in the field
appear to have no curiosity about this matter at all. Could it be that
they--who are a bit more expert in these matters than you or
I--interpret this data differently than you do? Could it be that
they're right?

bob

BTW: Would you be willing to return Tom's favor to you by forwarding a
copy of this research to me?