Stewart Pinkerton
 
Blindtest question

On Wed, 30 Jul 2003 03:26:24 GMT, "Harry Lavo"
wrote:

> From the tone of the web info on this test, one can presume that Tag set out
> to show its relatively inexpensive gear was just as good as some
> acknowledged industry standards. But... wonder why Tag chose the 99%
> confidence level, being careful *not* to say that it was chosen in
> advance? It is because, had they used the more common and almost
> universally-used 95% level, it would have shown that:


Can anyone smell fish? Specifically, red herring?

> * When cable A was the "X" it was recognized at a significant level by the
> panel (and guess whose cable probably would "lose" in a preference test
> versus a universally recognized standard of excellence chosen as "tops" by
> both Stereophile and TAS, as well as by other industry publications)


No Harry, *all* tests fell below the 95% level, except for one single
participant in the cable test, which just scraped in. Given that there
were 12 volunteers, there's less than 2:1 odds against this happening
when tossing coins. Interesting that you also failed to note that the
'best performers' in the cable test did *not* perform well in the
amplifier test, and vice versa.
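The 12-volunteer arithmetic can be sanity-checked in a few lines of Python (a sketch; the 5% per-listener false-alarm rate and independence between listeners are the assumptions):

```python
# If each of 12 independent listeners has a 5% chance of beating the
# 95% line by pure guessing, the chance that at least one does so is:
n_listeners = 12
alpha = 0.05  # per-listener false-alarm rate at the 95% confidence level

p_at_least_one = 1 - (1 - alpha) ** n_listeners
odds_against = (1 - p_at_least_one) / p_at_least_one

print(f"P(at least one lucky 'pass') = {p_at_least_one:.3f}")  # ~0.460
print(f"odds against = {odds_against:.2f}:1")                  # ~1.18:1
```

Roughly even odds, in other words, which is why a lone "just scraped in" score proves nothing on its own.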

You do love to cherry-pick in search of your *required* result, don't
you?

> * One individual differentiated both cable A and the combined cables at a
> significant level

> Results summarized as follows:
>
> Tag McLaren published ABX results
> (columns: sample size, 99% pass mark, 95% pass mark, actual score, confidence)
>
>                   Sample    99%    95%   Actual   Confidence
> Total test
>   Cables  A          96     60    53 e     52     94.8% e
>           B          84     54    48 e     38     coin toss
>           Both      180    107    97 e     90     coin toss
>   Amps    A          96     60    53 e     47     coin toss
>           B          84     54    48 e     38     coin toss
>           Both      180    107    97 e     85     coin toss
> Top individuals
>   Cables  A           8      8     7       6      94.5%
>           B           7      7     7       5      83.6%
>           Both       15     13    11      11      95.8%
>   Amps    A           8      8     7       5      83.6%
>           B           7      7     7       5      83.6%
>           Both       15     13    11      10      90.8%
>
> e = extrapolated based on the scores for sample sizes of 100 and 50

> In general, the test, while seemingly objective, has more negatives than
> positives when measured against the consensus of the objectivists (and some
> subjectivists) in this group as to what constitutes a good ABX test:
>
> TEST POSITIVES
> * double blind
> * level matched
>
> TEST NEGATIVES
> * short snippets
> * no user control over switching and (apparently) no repeats
> * no user control over content
> * group test, no safeguards against visual interaction
> * no group selection criteria apparent and no pre-training or testing

> The results and the summary of positives/negatives above raise some
> interesting questions:
>
> * Why, for example, should one cable be identified at a significant level
> when it was "X" while the other failed miserably to be identified? This has
> to be due to an interaction between the characteristics of the music samples
> chosen and the characteristics of the cables under test, perhaps aggravated
> by the use of short snippets with an inadequate time frame to establish the
> proper evaluation context.


No it doesn't, Harry. It doesn't *have* to be due to anything but random
chance.
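For reference, the pass marks in the table above are just one-tailed binomial thresholds against a coin-tossing null, and can be computed exactly with the Python standard library (a sketch; the function names are mine, and exact thresholds can differ slightly from the table's extrapolated "e" figures):

```python
from math import comb

def guess_prob(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of scoring k or
    better out of n trials by pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def pass_mark(n, alpha):
    """Smallest score out of n whose guessing probability is <= alpha,
    i.e. the pass threshold at (1 - alpha) one-tailed confidence."""
    return next(k for k in range(n + 1) if guess_prob(n, k) <= alpha)

# Exact 95% and 99% pass marks for the sample sizes in the table:
for n in (180, 96, 84, 15, 8, 7):
    print(n, pass_mark(n, 0.05), pass_mark(n, 0.01))
```

For the 8-trial individual test, for example, this gives 7/8 as the 95% mark, matching the table; 11/15 comes out just above 94% confidence exactly computed.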

> Did the test itself create the overall null, where people could not
> differentiate, based solely on the test not favoring B as much as A?

> * Do the differences in people scoring high on the two tests support the
> idea that different people react to different attributes of the DUTs? Or
> does it again suggest some interaction between the music chosen, the
> characteristics of the individual pieces, and perhaps the evaluation time
> frame?


No, since the high scorers on one test were not the high scorers in
the other test. It's called a distribution, Harry, and it is simply
more evidence that there were in fact no audible differences - as any
reasonable person would expect.
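The "distribution" point can be made quantitative: the chance that the *best* of 12 pure guessers reaches the 95% pass mark on an 8-trial sub-test is substantial (a sketch, assuming independent listeners; helper name is mine):

```python
from math import comb

def guess_prob(n, k, p=0.5):
    # P(X >= k) for a Binomial(n, p) coin-tossing listener
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 7/8 is the one-tailed 95% pass mark for an 8-trial sub-test.
p_one = guess_prob(8, 7)              # ~0.035 for a single guesser
p_best_of_12 = 1 - (1 - p_one) ** 12  # chance the panel's top scorer passes

print(f"P(best of 12 reaches 7/8 by luck) = {p_best_of_12:.2f}")  # ~0.35
```

So on any given sub-test there is roughly a one-in-three chance that the panel's best guesser looks "significant", which is why the top scorers changing between the cable and amplifier tests points to chance rather than golden ears.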

http://www.tagmclaren.com/members/news/news77.asp


--

Stewart Pinkerton | Music is Art - Audio is Engineering