Watch King
Comments about Blind Testing

I'm not sure this will work, because Google Groups seems to be an
unreliable post portal for this .rec group, but here goes.

Are we assuming this is a double blind test with an indicator display
and that you are testing something other than loudspeakers? CD
players, tuners and interconnect cables are the easiest to test. Phono
cartridges and loudspeakers are difficult and headphones are nearly
impossible. Power amps and preamps are in the middle difficulty-wise.

Whenever possible it is best to give the test listeners a sense of the
music when music is the source. So for an easy-to-test item there
would be a reasonable period of musical lead-in and then a countdown
as the music came up to test level. Then during a sustained passage
(some operatic overtures are good for this, as are some symphonic
passages, much of the quartet repertoire and some repetitive piano
music), the music would be brought to "test" level, after which a
number of 8-10 second comparisons could easily be made. Usually it is
best to have 3 or 4 direct head-to-head comparisons within one passage
because that allows the listener to be absolutely sure they can hear
clearly which test item is better than the other. (Of course Unsure
should always be a choice, but if Unsure is the most common response
then there is likely no audible difference between test product X and
test product Y.) With test program other than music, like pure spoken
voice or natural sounds, the listeners need to know what the material
really sounds like or the test isn't really valid. This is also the
case with material like single classical guitar (e.g. Segovia plays
Bach), single flute or single "a cappella" voice, and it is
especially important for any pipe organ music. Musical memory is
"helpful" here.
Go listen to an organ concert, record it binaurally and then use
headphones and when you test the CD player or interconnects, your
musical memory will help you judge.
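
As a rough illustration of the Unsure rule mentioned above, here is a
minimal Python sketch of how the answers for one head-to-head
comparison might be tallied. The function name, the labels and the
verdict wording are all made up for this example, and it is only a
head count, not a statistical test.

```python
from collections import Counter

def summarize_responses(responses, label_x="X", label_y="Y", unsure="Unsure"):
    """Count the choices for one comparison and give a rough verdict."""
    counts = Counter(responses)
    tally = {
        label_x: counts[label_x],
        label_y: counts[label_y],
        unsure: counts[unsure],
    }
    # If Unsure beats both products, read it as "no difference heard".
    if tally[unsure] > max(tally[label_x], tally[label_y]):
        verdict = "no clear difference"
    elif tally[label_x] > tally[label_y]:
        verdict = label_x + " preferred"
    elif tally[label_y] > tally[label_x]:
        verdict = label_y + " preferred"
    else:
        verdict = "tie"
    return tally, verdict

# Hypothetical answers from six listeners for one comparison.
sheets = ["X", "Unsure", "Unsure", "Y", "Unsure", "X"]
tally, verdict = summarize_responses(sheets)
print(tally, verdict)
```

With real sessions the same tally would be kept separately for each
piece of program material, since the choice can change from one
passage to the next.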

Of course speakers shouldn't be tested this way. Both speakers should
be played together, loudish, to warm up, with the test listeners "not
listening" (hands over ears will help), and then the warm speakers can
be compared. Alternatively a different speaker, not part of that
particular head-to-head comparison, can be played for the musical
lead-in, followed by a 1 second break before the comparisons start at
full test level between speaker X and speaker Y. Or alternatively,
with speakers (only), a "test" can be run at full loudness but the
results not counted, just to give the listeners the sense of what's
coming (more like a countdown: 8 switch 7 switch 6 switch etc.). Then,
after the first chorus and a return to the main, a real comparison
test can be run for useful results (e.g. 4-5, 4-5, 5-4, 5-4 END, with
4 and 5
randomly chosen numbers for the two tested speakers for this one
portion of the test). Follow that with another comparison and another
until the good parts of that song are used up.
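
The random display numbers and the 4-5, 4-5, 5-4, 5-4 pattern described
above could be generated along these lines. This is only a sketch in
Python: the function names, the 1-9 number range and the speaker names
are assumptions for illustration, not part of the procedure we used.

```python
import random

def label_speakers(speakers, rng, low=1, high=9):
    """Give each speaker a distinct random display number."""
    numbers = rng.sample(range(low, high + 1), len(speakers))
    return dict(zip(speakers, numbers))

def comparison_sequence(first, second, repeats=2):
    """Switch order: first-second pairs, then the same pairs reversed."""
    return [(first, second)] * repeats + [(second, first)] * repeats

def format_sequence(sequence):
    """Render the sequence the way it would be read out or displayed."""
    return ", ".join(f"{a}-{b}" for a, b in sequence) + " END"

rng = random.Random()  # pass a seeded Random to reproduce a session
labels = label_speakers(["speaker X", "speaker Y"], rng)
a, b = labels["speaker X"], labels["speaker Y"]
print(format_sequence(comparison_sequence(a, b)))
```

New numbers should be drawn for each portion of the test, so listeners
cannot carry a favorite number over from the previous comparison.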

But for CD players and interconnect cables the testing can be very
straightforward. The "moderator", alas, cannot take part in the test,
for the sake of unbiased presentation. All lead-ins, song intros and
explanations have to be prerecorded and "played" to the listening
testers. Once the test starts it must finish or all results are
unusable. Many "control restrictions" are needed, and they require
pre-test documentation of procedures. The musical program material
should be rotated throughout the program during different tests to
reduce the biases that "program material position" in the test
program can create. As often as possible, which item is tested first
should change. For high power switching of items like amps, speaker
cable and speakers, make the loudness turndown steps between test
items pretty short, on the order of 0.1 seconds from full loudness to
0, then switch, then turn up in 0.1 seconds. By putting time code onto
a CD and having the switches time code driven this can be
accomplished. We used telephone touchtone signals to activate the
numerical display box. The switch shutdown/turnup can be programmed
right onto the CD material although duplicate disks would need to be
synchronized somehow if 2 CD players were being compared.
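
One possible way to plan the rotation of program material and the
change of which item is heard first is sketched below in Python. The
scheme (a simple cyclic shift per session, with the first item flipped
for every passage from one session to the next) and all the names are
assumptions for illustration; other balanced orderings would do as
well.

```python
def session_plan(passages, items, session_index):
    """Return (passage, first_item, second_item) tuples for one session.

    passages is the master list of program material, items is the pair
    of products under test, session_index counts sessions from 0.
    """
    n = len(passages)
    plan = []
    for position in range(n):
        # Cyclic shift: each session starts one passage later in the list.
        original = (position + session_index) % n
        # Flip the first-heard item for each passage on every new session.
        swap = (original + session_index) % 2 == 1
        first, second = (items[1], items[0]) if swap else (items[0], items[1])
        plan.append((passages[original], first, second))
    return plan

# Hypothetical program material and test items.
passages = ["overture", "quartet", "piano"]
for session in range(2):
    print(session, session_plan(passages, ("X", "Y"), session))
```

In this sketch the overture opens session 0 with X heard first, and
closes session 1 with Y heard first, so neither its position nor the
item order stays fixed.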

The longest listening testers seem able to hold their sonic
concentration is between 20 and 40 minutes. It is an intense
experience. On the other hand testers don't seem to be able to fully
concentrate until about 2 comparisons into the test, or about 1-2
minutes. 20 minutes gives you barely enough time for one throwaway
opener and then 6 bits of test material, and 40 minutes can allow for
12 or so test passages, but people start getting headaches and
listening fatigue. If need be, run the test a number of times with
different program material and with intervals of 30-75 minutes between
tests. Don't drink too many liquids before a test session. Getting up
for the bathroom ends any test with "No valid results". In other words
no distractions should be tolerated (no cellphones, no doorbells, no
chatting or physical communication between test listeners, sadly no
crying babies, and especially no "Just listen to this" kind of
cueing). It's either done professionally or it's useless.

This is not to say that the character of test items A & B will not be
immediately noticeable after 15 minutes of testing. They may well be
different enough to be immediately recognizable, but keep
concentration so as to provide results which can be used to determine
which item is more accurate or "better". When using one of the very
rare "transparent" test listening speakers to test other items,
between one and three chairs is about all a Quad ESL 63 or Martin
Logan CL-3 can accommodate in the sweet listening spot. Only a very
tiny (point source) loudspeaker can produce the kind of superior
quality and wide soundstage with pinpoint imaging needed to run tests
with perhaps as many as a dozen possible test seats. The need for a
very small loudspeaker with high power handling, very low spurious
noise generated by the cabinet, constant directivity, a single driver
for the voice band, reasonable bandwidth and phase alignment
capability limits the number of loudspeakers that can be used to
perhaps 2 or 3 models that have ever been made in the history of
audio. Big boxes will not work for this kind of testing because front
row seats will hear something dramatically different from middle and
back seats.
Remove any chairs not full of test listeners. Use preprinted pages
with only 2 columns of numbers on them to allow the two test item
numbers to be circled or a box to be checked. Don't be surprised if
the choice changes with program materials.
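
For what it's worth, the preprinted pages can be produced from the
same list of display numbers that drives the test. A minimal Python
sketch, assuming a plain-text sheet with one row per comparison and a
check box beside each of the two numbers (the layout and the function
name are made up here):

```python
def answer_sheet(comparisons):
    """Build a two-column sheet: one row per comparison, a box per number."""
    rows = [f"[ ] {a}    [ ] {b}" for a, b in comparisons]
    return "\n".join(rows)

# Hypothetical display numbers for two comparisons.
print(answer_sheet([(4, 5), (7, 2)]))
```

Nothing but the two numbers appears on each row, so the sheet itself
gives the listener no hint about which product is which.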

Listening tests may be exciting but they may not be fun. No matter how
far people have travelled, how soon they might be leaving or how tight
their schedules are, if some component used but not being tested
develops a buzz or glitch, or if the test apparatus malfunctions,
don't use any of the results. Use the prerecorded "moderator" intros
to cue listeners as to what they might listen for (e.g. "the following
quartet is composed of flute, cello, violin and trumpet", or "on this
recording the piano is the only acoustic instrument and it is mic'd
with 2 overstring and one soundboard mix microphone", or "the test
comparison will be done during the middle of the 3 minute drum solo"),
because if there are anomalies to be heard, the testers should know
when to concentrate the most closely.

Watchking

Listening isn't a competitive sport, buying equipment is.

We don't get enough sand in our glass.

chung wrote in message news:4uVPb.124026$8H.329218@attbi_s03...
watch king wrote:
The loudspeakers
used in home hifi systems do not cool nearly as well as pro
loudspeakers, and the negative effects on sonic characteristics due
to temperature change are much greater than those demonstrated by
professional loudspeakers. And that is just one of the factors
involved with loudspeaker compression. These changes in sonic
characteristics make it nearly impossible to do any real research on
speaker wires that could be relevant to audiophile listening because
the test to make such comparison "fair" would be literally impossible
to design. By the time listeners could focus on the sound playback of
one of the test wire/products the second product in the test would
already be unfairly tested because the test loudspeaker system
(acoustic microscope equivalent) will not likely "sound" the same as
it did 30 seconds ago. This means that the test passage would need to
be made longer and restarted after a specific cooling off time and by
then the human acoustic memory is gone. This could be why so many
anecdotal testimonials involve hearing "things" after the time was
taken to disconnect one set of speaker wires and connect a second
set. The speakers probably cooled down and sounded better after the
"wire changing period".


So quick A/B switching and using short snippets of sound are the most
effective for discrimination. I also found pink noise to be very
revealing for detecting level and frequency response differences.