Keith A. Hughes
 
Why DBTs in audio do not deliver (was: Finally ... The Furutech

S888Wheel wrote:

snip

I said


That is ridiculous. If all of them scored 94% it would be reasonable to
say this?



Kieth said


I believe Bob is talking about 95% confidence interval, *not* 95%
scores.


Yes I know.


Then talking about "scoring 94%" is meaningless.

Kieth said

And yes, it is very common to require a 95% confidence
level.


How often are 94% confidence level results regarded as a null when one is
seeking 95% confidence results, which is in fact a subjective choice? Bottom
line is the tests were inconclusive in this particular case.


You clearly don't understand statistical analysis. There is a null
hypothesis that states, basically, that the populations (in this
case, say the sound of Amp A and the sound of Amp B) are not
different. You set your confidence level *prior* to your analysis,
and yes 95% is ubiquitous, and you either meet that level, or you
don't. There is no gray area.

Irrespective of how "close" you may get to reaching the confidence
level, you're either in, or out. When out, the results *ARE*
conclusive - always. That conclusion is "you cannot reject the
null hypothesis at the 0.05 level", *by definition*. Period. You
cannot say for example "well, it was close, so they're probably
different". You *can* calculate what confidence level within which
you *can* reject the null hypothesis, if you want (it's
approximately 78%, by my calculations, for a result of 30/48
versus baseline expectations of 24/48). That number would be
considered insignificant by all statisticians I've worked with.
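
For anyone who wants to check numbers like these, here is a minimal sketch of an exact binomial test in Python (standard library only). This is an editor's illustration, not the calculation quoted above; the confidence figure you get depends on which test you apply, so an exact tail probability won't necessarily match the ~78% figure.

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, k = 48, 30                     # 30 correct out of 48; chance predicts 24
p_one = binom_tail(k, n)          # one-tailed p-value
p_two = min(1.0, 2 * p_one)       # two-tailed, symmetric when p = 0.5

print(f"one-tailed p = {p_one:.3f}, two-tailed p = {p_two:.3f}")
# p_two is above 0.05 (and above 0.1), so 30/48 fails to reject the
# null hypothesis at either level
```

Either way the conclusion is the same: 30/48 does not clear a 95% confidence level.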

But again, by expanding the CI, you risk erroneously rejecting the
null hypothesis unless the population is very large (i.e.
sufficiently so to approximate a two-tailed bell curve).
snip

No, the variance in those abilities is the *cause* of the bell
curve.


I was using the predicted bell curve outcome one would get if there are no
audible differences. That bell curve is dictated by the number of samples.


The "Bell" curve is dictated by varying response levels. The
degree to which the specific curve approximates a normal
distribution *is* a function of population size (or number of
samples if you prefer). Hence the need for a very high CI when
using small populations (15 panelists, e.g.), as you do
*not* have a high degree of confidence that the population has a
normal distribution.

Kieth said

That's why sample size (panel size in this context) is so
important, and why the weight given to one or two individuals'
performances, in any test, must be limited.


I took that into consideration when I claimed the test results were
inconclusive.


Well, again, I think you're missing the common usage of "conclusive"
relative to statistical analysis. As stated previously, there are
only two outcomes of statistical analysis (i.e. typical ANOVA),
reject or accept the null hypothesis. Either is conclusive, as
this one apparently was. The data don't allow you to reject the
null hypothesis at even the 0.1 level.

So, you cannot say a difference was shown. You can't say there was
no difference either. This appears to be the genesis of your
"inconclusive" apellation. But this is an incorrect
interpretation, as detailed above.

That is why I suggested the thing that should have been done was
further testing of those individuals and the equipment that scored near or
beyond that which the bell curve predicts. That is why I don't run around
claiming this test proves that some people hear differences. Let's also not
forget that we have no tests on listener or system sensitivity to subtle
audible differences.


Actually, we *assume* variation in listener abilities and
responses. Without such, there would be no bell curve, and
statistical analysis would be impossible. The *only* requirement
is that the test population is sufficiently large such that they
approximate the population as a whole, relative to response to the
stimuli under test. Again, the smaller the test population, the
tighter the CI *must* be due to the lower confidence in the test
population having the same distribution as the total population.
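
As a quick illustration of that point (my sketch, not from the article): the standard error of an observed proportion shrinks only with the square root of the panel size, so a small panel leaves a wide band that pure chance can explain.

```python
from math import sqrt

def prop_se(p, n):
    """Standard error of an observed proportion from n trials."""
    return sqrt(p * (1 - p) / n)

for n in (15, 48, 500):
    se = prop_se(0.5, n)
    print(f"n = {n:3d}: SE = {se:.3f} (+/- {1.96 * se:.3f} at 95%)")
```

With 15 panelists the 95% band around chance is roughly +/- 25 percentage points; with 500 it narrows to about +/- 4.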

So we have unknown variables.


Yes, always. Given appropriate controls and statistics, the
effects of most are accounted for in the analysis.

Further, the use of many amps
in no obvious pattern for comparisons introduces another variable that could
profoundly affect the bell curve in unknown ways if some amps sound like each
other and some don't.


Well, not having the article, I can't comment one way or another.

All of this leaves the interpretation of those results
wide open.


Well, no, as stated previously. It may, however, leave the
*question* of audibility open to further study.

snip

You should read more scientific literature then. It is the most
commonly used confidence level IME. Especially with limited
population size.


I will look into it, but I will be quite surprised if that number does not
heavily depend on the sample sizes. It has to vary with sample size.


Not usually. It varies with the criticality of the results. The
confidence one has in extrapolating the results to the general
population, irrespective of CI, increases as a function of sample
size.

snip

Nonsense, of course they do. The fact that there are two-tailed
bell curves, in most population responses, is the genesis for use
of high confidence intervals. Because there *are* tails, way out
on the edges of the population, allowance for these tails must be
made when comparing populations for significant heterogeneity.


Maybe you didn't get what I was trying to say. What does and does not fall
within the predictions of a bell curve depends heavily on the number of
samples.


You don't seem to understand what a Bell curve is. ALL data is in
the curve, so I don't know what you're trying to say by "does and
does not fall within the predictions of a bell curve". Bell curves
don't predict anything; they merely incorporate all data points,
and the shape is the result of a two-tailed normal distribution.
You can, with a normal population (your Bell curve) predict the %
of the population that fits within any X standard deviations from
the mean. That's what we're doing with CIs: we're *assuming* a
normal distribution - a requirement for statistics to work
(unfortunately).
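
For concreteness, the standard-deviation arithmetic can be checked with a few lines of Python (a sketch using the error function; the percentages are standard facts about the normal distribution):

```python
from math import erf, sqrt

def within_sigmas(x):
    """Fraction of a normal population within +/- x standard deviations."""
    return erf(x / sqrt(2))

for x in (1.0, 1.96, 2.0, 3.0):
    print(f"within +/- {x} sd: {within_sigmas(x):.2%}")
```

At +/- 1.96 standard deviations you get the familiar 95%, which is exactly where the 0.05 level comes from.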

snip

So you are drawing definitive conclusions from one test without even knowing
the sample size?


Only from the data you presented (as I've said, I don't have the
article). You picked individual results (such as the 30/48 vs
baseline of 24/48) as indicative of a significant outcome. I've
shown you, through the most common statistical tool used, that
30/48 and 24/48 are not significantly different.

Based on the data you provided, the conclusion of not rejecting
the null hypothesis seems perfectly sound.

I think you are leaping without looking. You are entitled to
your opinions. I don't find your arguments convincing so far.


Maybe you need a better understanding of statistical analysis, and
its limitations.

Where do I claim definitive results are required in support of some postulate?


Everywhere, as far as I can tell.

I would say definitive results are required for one to make claims of
definitive results.


The definitive result of failing to reject the null hypothesis, to
a chosen confidence level, requires no more data than was
apparently presented in the article. You seem to be confusing "no
difference was shown at the x.xx level" with "there are no
differences". The latter can never be said based on failure to
reject the null.
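
One way to see why failure to reject cannot be read as "no difference" is statistical power. The sketch below (an editor's illustration, not from the article) asks: if listeners really were 60% accurate, how often would a 48-trial test even reach the one-tailed 0.05 criterion?

```python
from math import comb

def tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, alpha = 48, 0.05
# smallest score that rejects the null (chance = 0.5) at alpha, one-tailed
critical = next(k for k in range(n + 1) if tail(k, n, 0.5) <= alpha)
# power: chance of reaching that score if listeners are genuinely 60% accurate
power = tail(critical, n, 0.6)
print(f"need {critical}/{n} to reject; power at p = 0.6 is {power:.2f}")
```

With power this low, a null result is entirely compatible with a real but modest difference, which is why "no difference was shown at this level" is the only safe statement.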

Certainly such a test, as part of a body of evidence, could be seen as
supportive, but even that I think is dodgy, given that some of the results as
far as I can see would call for further testing, and the absence of listener
sensitivity alone makes it impossible to make any specific claims about the results. I


As long as the panel sensitivity is representative of the general
population, sensitivity is irrelevant. No matter the test
protocol, one *must* assume the sensitivities are representative,
unless the sensitivity of the whole population is known, and a
panel is then chosen that represents a normal distribution
congruent with the overall population. And this never happens.

never said the test was not valid due to its failure to reject the null. I
simply said I think the results are such that they call for further investigation
and are on the border between a null and a positive. Not the sort of thing one can
base definitive conclusions on.


It is clearly definitive for the panel, under the conditions of
the test.


It's instructive to also note that peer reviewed journals publish,
not infrequently, two or more studies that contradict one another.
You seem to believe this can't be the case, because the "wrong"
ones would be sent back for "correction".


Nonsense. What I think will be sent back is a poor analysis.


Which you've yet to illustrate in this case.

If the analysis
fits the data it won't be sent back. If someone is making claims of definitive
conclusions based on this test one is jumping the gun. Scientific research
papers with huge sample sizes and apparently definitive results are usually
carefully worded to not make definitive claims. That is a part of proper
scientific prudence.


True. Does this article say "there are no sonic differences
between amps", or does it say (paraphrasing of course)
"differences between amps couldn't be demonstrated"? I understood
it to be the latter.

Kieht said

snip
Kieth said

snip

I never suggested the data should be ignored or was valueless, only that it was
inconclusive. I think the many uncontrolled variables in this specific test are
problematic. Don't you?


I don't have the article, so I don't know how many variables were
uncontrolled. The sensitivity issue is, I believe, a red herring
however.

BTW, if you're going to repeatedly use my name in replying, common
courtesy would be to spell it right - at least once.

Keith Hughes