Posted to rec.audio.high-end
From: Arny Krueger
Subject: AES article: hi-rez more like analog?

"bob" wrote in message

On Dec 14, 1:01 pm, "Arny Krueger" wrote:

I pasted their matrix into Excel and tried to do some
quick sums. I came up with Test Condition 1 = 23/60 and
Test Condition 2 = 31/60.


To be precise, Condition 2 is 31/54, since one subject's
results were not available. (It happens.)

IOW, one test
produced an outcome that was worse than random guessing,
and the other was random guessing.


Actually, both could be random guessing, since neither
hits even the 90% confidence level.
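
That's easy to check by hand. Here is a minimal sketch (Python,
assuming only the 23/60 and 31/54 tallies above and a fair-coin
null) of the one-sided probability of scoring at least that well
by pure guessing:

    # Exact binomial check of both scores against chance (p = 0.5).
    # Tallies assumed from the discussion above: 23/60 and 31/54.
    from math import comb

    def p_at_least(k, n, p=0.5):
        """Probability of k or more correct in n fair-coin trials."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(k, n + 1))

    for k, n in [(23, 60), (31, 54)]:
        print(f"{k}/{n}: P(at least {k} correct by guessing) = "
              f"{p_at_least(k, n):.3f}")

The tail probabilities come out around 0.97 and 0.17 respectively,
nowhere near the 0.10 needed for 90% confidence that the listeners
were hearing anything; 23/60 sits in the wrong tail entirely.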

Results that are worse than random guessing may cause
some head scratching, but they are not all that unusual
in experiments like this where communication between the
listeners can affect the outcome.


Not the case here, where each subject was tested
individually.


Perhaps not that individually.

Also, unlike an ABX test, there is no "wrong" answer here.


Sure there is.

Inconsistency is wrong.

Either conversion could be judged
better. (Unless, of course, you're Philips and you sell
hi-rez converters.)


The most probable explanation for worse-than-random
results is that some of the listeners were basing
their responses on their perceptions of what other
listeners were perceiving, and the total number of
independent responses was far less than what you get
from a naive count of the actual responses.


IOW, the actual situation was not 23 correct
responses out of 60 independent trials, but maybe
more like 4 out of 10, so the true numbers were so
small that the statistics don't really apply.


Again, this is an inaccurate description of the actual
test. The responses were independent.


But how independent?
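
Putting a number on how much independence matters: here is a
minimal sketch (Python) comparing the reported 23-of-60 tally
with the hypothetical 4-of-10 effective count suggested above.

    # How surprising a below-chance score is depends on the effective N.
    # 23/60 is the reported tally; 4/10 is the hypothetical effective count.
    from math import comb

    def p_at_most(k, n, p=0.5):
        """Probability of k or fewer correct in n fair-coin trials."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(0, k + 1))

    print(f"P(23 or fewer of 60 by guessing) = {p_at_most(23, 60):.3f}")
    print(f"P( 4 or fewer of 10 by guessing) = {p_at_most(4, 10):.3f}")

The same below-chance proportion that looks like a borderline
anomaly over 60 nominally independent trials (a tail of roughly
0.05) is completely unremarkable over 10 (roughly 0.38).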

I think these results would pretty well explain
themselves to just about anybody, were they reproduced
anywhere in the actual paper.


Short answer - the outcome was random guessing, and both
the test itself and the statistical analysis were
grievously flawed.


The test itself was not grievously flawed.


It acted that way. A good test of perception produces results that
are either random or positively correlated with the stimulus. When
the correlation is negative, something went badly wrong.
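
How negative? Under the simplifying assumption of balanced trials
and balanced responses, the implied stimulus/response correlation
for a two-alternative score is just phi = 2 * (proportion correct)
- 1, as in this minimal sketch (Python, using the two tallies from
above):

    # Implied stimulus/response correlation for a two-alternative score.
    # Assumes balanced trials and balanced responses, so that
    # phi = P(agree) - P(disagree) = 2 * (proportion correct) - 1.
    for correct, trials in [(23, 60), (31, 54)]:
        phi = 2 * correct / trials - 1
        print(f"{correct}/{trials}: implied phi = {phi:+.2f}")

Detection shows up as a positive phi and guessing as zero; the
roughly -0.23 for Condition 1 is the negative correlation that
points to something other than independent listening.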

The only
obvious problem was the one Scott mentioned--using
different mikes for the two conditions. But that only
matters if you're comparing the results under the two
conditions. Looked at individually, the two test
conditions tell us nothing.


That's a sign of a flawed test - it tells us less than it was
designed to find out.

The statistical analysis is another story.


I guess we can chalk this paper up as yet another "Proof
by complex statistical analysis" which defies common
sense.


Agreed.