From: Nousaine
Subject: Blindtest question

"Harry Lavo" wrote:

Thomas -

Thanks for the post of the Tag McLaren test link (and to Tom for the other
references). I've looked at the Tag link and suspect it's going to add to
the controversy here.


Actually there's no 'controversy' here. No proponent of amp/wire-sound has
ever shown that nominally competent amps or wires have any sound of their own
when played back over loudspeakers.

The only 'controversy' is over whether Arny Krueger's PCABX tests, conducted
with headphones and special programs, can be extrapolated to commercially
available programs and speakers in a normally reverberant environment.

The Tag-M results are fully within the range expected given the more than
2 dozen published experiments on amps and wires.

My comments on the test follow.

From the tone of the web info on this test, one can presume that Tag set out
to show that its relatively inexpensive gear was just as good as some
acknowledged industry standards. But...one wonders why Tag chose the 99%
confidence level?


Why not? But you can analyze it any way you want. That's the wonderful thing
about published results.

Being careful *not* to say that it was chosen in advance? Is it because,
had they used the more common and almost universally applied 95% level,
it would have shown that:

* When cable A was the "X" it was recognized at a statistically significant
level by the panel (and guess whose cable probably would "lose" in a
preference test versus a universally recognized standard of excellence,
chosen as "tops" by both Stereophile and TAS as well as by other industry
publications)

* One individual differentiated both cable A and the combined cables at a
significant level

Results summarized as follows:

Tag McLaren Published ABX Results

                 Sample   Correct needed at   Actual   Resulting
                  total      99%      95%      score    confidence

Panel Totals

Cables
  A                 96        60      53 e       52     94.8% e
  B                 84        54      48 e       38     coin toss
  Both             180       107      97 e       90     coin toss

Amps
  A                 96        60      53 e       47     coin toss
  B                 84        54      48 e       38     coin toss
  Both             180       107      97 e       85     coin toss

Top Individuals

Cables
  A                  8         8       7          6     94.5%
  B                  7         7       7          5     83.6%
  Both              15        13      11         11     95.8%

Amps
  A                  8         8       7          5     83.6%
  B                  7         7       7          5     83.6%
  Both              15        13      11         10     90.8%

e = extrapolated based on the scores required at sample sizes of 100 and 50
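
For anyone who would rather compute these thresholds exactly than
extrapolate from tables for 100 and 50 trials, the underlying calculation
is a one-sided binomial test against chance. A minimal sketch (Python with
scipy, my choice of tooling, nothing used by Tag McLaren or anyone in this
thread); note that its exact critical values will not necessarily match
the extrapolated "e" figures above:

    from scipy.stats import binom

    def abx_confidence(correct, trials):
        # Confidence that a score beats guessing: one-sided binomial
        # test of `correct` successes in `trials` tries vs. p = 0.5.
        return 1.0 - binom.sf(correct - 1, trials, 0.5)

    def correct_needed(trials, confidence=0.95):
        # Smallest score whose one-sided p-value reaches the given
        # confidence level, i.e. the critical value for the test.
        for k in range(trials + 1):
            if binom.sf(k - 1, trials, 0.5) <= 1.0 - confidence:
                return k

    # Panel cable-A row above: 52 correct out of 96 trials
    print(abx_confidence(52, 96))
    print(correct_needed(96, 0.95), correct_needed(96, 0.99))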

In general, the test, while seemingly objective, has more negatives than
positives when measured against the consensus of the objectivists (and some
subjectivists) in this group as to what constitutes a good ABX test:


This is what always happens with 'bad news.' Instead of giving us contradictory
evidence we get endless wishful 'data-dredging' to find any possible reason to
ignore the evidence.

In any other circle, when one thinks the results of a given experiment are
wrong, one simply replicates it to show the error OR produces a valid
experiment with contrary evidence.

TEST POSITIVES
*double blind
*level matched

TEST NEGATIVES
*short snippets
*no user control over switching and (apparently) no repeats
*no user control over content
*group test, no safeguards against visual interaction
*no group selection criteria apparent and no pre-training or testing


OK, how many of your sighted 'tests' have ignored one or all of these
positives or negatives?

The results and the summary of positives/negatives above raise some
interesting questions:


No, not really. All of the true questions about bias-controlled listening
tests have been addressed before.


* why, for example, should one cable be significantly identified when it is
"X" while the other fails miserably to be identified? This has to be due to
an interaction between the characteristics of the music samples chosen and
the characteristics of the cables under test, perhaps aggravated by the use
of short snippets with an inadequate time frame to establish the proper
evaluation context. Did the test itself create the overall null, where
people could not differentiate, based solely on the test not favoring B as
much as A?

* do the differences in the people scoring high on the two tests support
the idea that different people react to different attributes of the DUTs?
Or do they again suggest some interaction between the music chosen, the
characteristics of the individual pieces, and perhaps the evaluation time
frame?

* or is it possible that the ABX test itself, when used with short
snippets, makes some kinds of differences more apparent and others less
apparent, and thus, by working against exposing *all* kinds of differences,
helps create more *no difference* results than it should?

* since the panel is not identified and there was no training, do the
results suggest a "dumbing down" of differentiation relative to the scores
of the more able listeners? I am sure it will be suggested that the two
different high scorers were simply random outliers...I'm not so sure,
especially since the individual scoring high on the cable test hears the
cable differences exactly like the general sample, but at a higher rate
(required because of the smaller sample size), and the high scorer on the
amp test is in much the same position.

If some of these arguments sound familiar, they certainly raise echoes of
the issues raised here by subjectivists over the years...and yet these
specifics are rooted in the results of this one test.

I'd like to hear other views on this test.


These results are consistent with the more than 2 dozen other
bias-controlled listening tests of power amplifiers and wires.


"Thomas A" wrote in message
news:ahwVa.6957$cF.2308@rwcrnsc53...
(Nousaine) wrote in message
...
(Thomas A) wrote:

Is there any published DBT of amps, CD players or cables where the
number of trials is greater than 500?

If the difference is minuscule, isn't it likely that many "guesses"
are wrong, and that it would require many trials to reveal any subtle
difference?

Thomas
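
(The sample-size intuition in this question can be made concrete with a
standard normal-approximation power calculation. A rough sketch in Python
with scipy; the 60% and 55% listener accuracies are my own illustrative
assumptions, not figures from the thread:)

    from math import ceil, sqrt
    from scipy.stats import norm

    def trials_needed(p_true, alpha=0.05, power=0.80):
        # Approximate trials for a one-sided binomial test to detect
        # a true success rate p_true against chance (p = 0.5).
        z_a = norm.ppf(1 - alpha)   # critical z for the test
        z_b = norm.ppf(power)       # z for the desired power
        num = z_a * sqrt(0.25) + z_b * sqrt(p_true * (1 - p_true))
        return ceil((num / (p_true - 0.5)) ** 2)

    print(trials_needed(0.60))  # ~153 trials for a 60%-right listener
    print(trials_needed(0.55))  # ~617 trials -- indeed more than 500

So a listener who is right only 55% of the time does need several hundred
trials before the difference shows up reliably.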

With regard to amplifiers, as of May 1990 there had been such tests. In
1978 QUAD published an experiment with 576 trials. In 1980 Smith, Peterson
and Jackson published an experiment with 1104 trials; in 1989 Stereophile
published a 3530-trial comparison. In 1986 Clark & Masters published an
experiment with 772 trials. All were null.

There's a misconception that blind tests tend to have very small sample
sizes. As of 1990 the 23 published amplifier experiments had a mean of
426 trials and a median of 90 trials. If we exclude the 3530-trial
experiment the mean becomes 285 trials. The median remains unchanged.
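
(Those summary figures are easy to sanity-check from the numbers quoted;
a quick verification:)

    # A mean of 426 trials across 23 experiments implies this total:
    total = 426 * 23              # 9798 trials
    # Dropping the 3530-trial Stereophile test leaves 22 experiments:
    print((total - 3530) / 22)    # ~284.9 -> the quoted mean of 285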


OK, thanks. Is it possible to get the numbers for each test? I would
like to see if it is possible to do a meta-analysis in the amplifier
case. The test by Tag McLaren is an additional one:
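
(If the per-test scores do turn up, the simplest fixed-effect
meta-analysis is an exact binomial test on the pooled trials. A sketch in
Python with scipy; the (correct, trials) pairs below are placeholders, not
published scores:)

    from scipy.stats import binom

    # Placeholder (correct, trials) pairs -- substitute the actual
    # published per-experiment scores once they are collected.
    experiments = [(290, 576), (560, 1104), (390, 772)]

    # Fixed-effect pooling: sum successes and trials, then run one
    # exact one-sided binomial test against chance (p = 0.5).
    correct = sum(c for c, _ in experiments)
    trials = sum(n for _, n in experiments)
    p_value = binom.sf(correct - 1, trials, 0.5)
    print(f"pooled {correct}/{trials}: one-sided p = {p_value:.3f}")

Pooling this way assumes a common detection rate across tests; where that
is doubtful, a random-effects model is the next step.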


Thanks for the reference.