#121
Keith Hughes
 
Posts: n/a
Default

Harry Lavo wrote:

snip

...and I
know damn well what monadic, proto-monadic, and comparative tests can and
cannot measure, both real and imputed (or as you would say, imagined).


Really? Hmmm, let's see...

You tell me how else, other than using ABX itself, you can determine whether
a real perceived difference exists in one piece of audio gear versus
another. You can't...it has to be done across a large enough group of
people to have statistical significance


This is sheer nonsense, in the current context, as has been pointed out
to you previously (by me, and I'll note that you did *not* reply).
Statistical significance requires *ONLY* one participant, with multiple
trials. Take *you* for example; you can easily do a sufficient number
of trials, using whatever *blind* methodology you would like, on two
components (say cables) for which *you* have identified, sighted, a
consistent audible difference, to determine whether the chosen method
statistically 'validates' your sighted results. That's it, finis.

We are not talking about frequency distributions within a population,
the only thing that would require a large population sample size, we're
talking about using *your* method to detect (blind) the differences that
*you* clearly hear...sighted.
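
To make the statistics concrete: the significance of a single listener's
run of forced-choice trials can be computed exactly against the guessing
hypothesis (p = 0.5 per trial). A minimal sketch in Python -- the trial
counts are illustrative, not data:

from math import comb

def binomial_p(correct, trials, chance=0.5):
    # One-sided exact binomial test: probability of getting at least
    # `correct` right out of `trials` forced-choice trials by guessing.
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(correct, trials + 1))

# One listener, 16 blind trials, 13 correct:
print(binomial_p(13, 16))  # ~0.0106, significant at the 0.05 level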

...so one can say...tested blind,
this group of (audiophiles, I presume) listening to movements (X,Y,Z) found
"P" to have significantly higher ratings thatn "Q" on "transparency" and on
"overall realism of the orchestra" (simply used as an example).


Again, irrelevant. This approach presupposes that presence of
difference is an unknown, and/or that the frequency of detection
capability within the population is not known, neither of which is the
case here.

Then you
know the difference is real (albeit perceived subjectively).


*You* already say that you know the difference is real, right? That's
the point, and that's why this is mere obfuscation.

You then use
an ABX test among a broad sample of yet another similarly-screened group of
people, using short-snippets of movements X,Y,Z,


A fictitious constraint you gratuitously apply, yet again.

to see if in total they can
detect the difference. If the test allows them to do so, you have
validated the test.


Right, you would have validated your 'monadic' testing for detection of
'difference', as that is what the ABX protocol is designed to do.

If it does not, you have invalidated the test.


Clearly incorrect. This presupposes the superiority of the 'monadic'
test protocol. You get statistically significant results from *BOTH*
tests, and your conclusion is "see...ABX doesn't work!". Sorry, but
that just ignores basic statistical precepts. You cannot discount one
method just because it gives the results you don't want, when it has the
same level of significance (albeit with results in the opposite
direction) as your 'pet' method.

Finally,
once you have validated the test, it can be used by single individuals to
determine if they can reliably hear a difference between audio components
similar to what they would experience in a more normal listening situation.
If you are so sure that ABX testing works for open-ended evaluation of audio
components playing music, you should be supporting such an effort, not
ridiculing it. Because until you do, you are ****ing into the wind among
the large majority of audiophiles.


No, you have it backwards. Until *one* single individual can
demonstrate confirmation of differences they have already confirmed
sighted (level matched naturally), then there is no point in applying
the method to a large population to establish the frequency of
discrimination capability within the larger population. Whether you use
one person, or a thousand, makes no difference whatsoever.

If *one* person using *any* blind protocol consistently identifies a
difference, then that difference is real (to the level of significance
the data allows). *Then* you can compare against ABX, or any other
method. This is a test that you, personally, could easily conduct if
you were truly interested. So where's your data?

Keith Hughes
#122
Stewart Pinkerton
 
Posts: n/a
Default

On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote:

Stewart Pinkerton wrote:


That's how Science works. You observe something unusual, come up with
a theory to explain it, use that theory to predict something else, and
observe the truth or falsity of your prediction.


The light-bending experiment came *after* the theory. What was it he
"observed" that led to the theory?


He developed his theory from the Lorentzian interpretation of
Maxwell's work, and arguably had prior knowledge of the
Michelson-Morley experimental results. He had many giants on whose
shoulders to stand....

In our case, we are observing the fact that 90% of audiophiles find the
*sounds same* postulate so ridiculous and at odds with experience (not in
just a few instances, but in many, many instances) that the postulate is
rejected.


From whence comes this magical 90%, Harry? It seems as speculative as
your other comments. I suspect that the real number is exceeded by
those who believe that the world is ruled by shape-shifting reptiles.

And since we are dealing with a strictly subjective phenomenon,
this rejection must be dealt with as a "fact". Mark and Michael have been
working to point out why in theory the short-snippet, comparative testing
may have missed a crucial element...an element that seems to square with
what many audiophiles instinctively or intuitively feel is missing. Now it
is time for some experimentation.


Indeed it is - so go do some, instead of railing against the entire
body of accumulated knowledge about audio.

You might also bone up on his complete reluctance to embrace
quantum mechanics, despite being one of the founding fathers
(e.g., for explaining the photoelectric effect, for which he
won his only Nobel prize) and in spite of overwhelming physical
evidence.

I do know of his reluctance to accept quantum mechanics....never said he
didn't have his weaknesses. But it goes to show what happens when science
as faith replaces science as science.


Oh dear. No, it goes to show that even the greatest scientists can
sometimes refuse to accept scientific facts. I suppose that gives you
*some* excuse............


He refused to accept them because they were so far from what his entire
training had taught him *ought* to be. Thus his famous "roll of the dice"
quote. That's called "belief" and it overcame his scientific training.

#123
Stewart Pinkerton
 
Posts: n/a
Default

On 8 Oct 2005 01:59:58 GMT, "Harry Lavo" wrote:

wrote in message ...
Harry Lavo wrote:
Actually not, for the research it grew out of. But my contention is that
IMO and that of many others, it *may* be fatally flawed as a device for
the
open-ended evaluation of audio components. And that until it has been
*validated* for that purpose, it should be promoted and received with
substantial skepticism. My monadic test proposal is a legitimate way of
doing that validation.


Lest anyone think that Harry has been vested with any authority to
declare what is and is not a valid psychoacoustic test, this is
completely bass-ackwards. ABX is validated both by its constant use in
the field and by its ability to make and confirm predictions about
audibility. Whereas nobody in the field has ever used a monadic test to
determine the audibility of anything. Not once. Ever. And for good
reason.


Tell that to Harman International and see my comments below.


Harman International uses quick-switch level-matched DBTs. As do many
other major audio manufacturers.

So the first thing Harry needs to do, before he starts his Annus
Mirabilis Project, is to validate that monadic testing can be used as
an audibility test AT ALL. Can it even distinguish the kinds of things
that ABX tests easily distinguish? Can it distinguish anything? He
doesn't even know.

Look, I spent 20 years doing sensory and behavior research in food...and I
know damn well what monadic, proto-monadic, and comparative tests can and
cannot measure, both real and imputed (or as you would say, imagined).


In that case, you should already know that open-ended monadic tests
are not going to be much use for audio........................

You tell me how else, other than using ABX itself, you can determine whether
a real perceived difference exists in one piece of audio gear versus
another. You can't...it has to be done across a large enough group of
people to have statistical significance...so one can say...tested blind,
this group of (audiophiles, I presume) listening to movements (X,Y,Z) found
"P" to have significantly higher ratings thatn "Q" on "transparency" and on
"overall realism of the orchestra" (simply used as an example). Then you
know the difference is real (albeit perceived subjectively). You then use
an ABX test among a broad sample of yet another similarly-screened group of
people, using short-snippets of movements X,Y,Z, to see if in total they can
detect the difference. If the test allows them to do so, you have
validated the test. If it does not, you have invalidated the test. Finally,
once you have validated the test, it can be used by single individuals to
determine if they can reliably hear a difference between audio components
similar to what they would experience in a more normal listening situation.
If you are so sure that ABX testing works for open-ended evaluation of audio
components playing music, you should be supporting such an effort, not
ridiculing it. Because until you do, you are ****ing into the wind among
the large majority of audiophiles.


As noted above, you have it back-asswards. The audio industry - you
know, the one that *designs* all those wonderful toys we listen to -
has determined over many decades that quick-switched level-matched
DBTs are the gold standard. If *you* wish to challenge this, then
*you* must provide the evidence, not simply speculate.

Of course, the real truth of the matter is that ABX works very well
indeed, but fails to support your sighted impressions, which is why
you are convinced that there just *must* be something wrong with it.
Read my lips - wire is wire.

--

Stewart Pinkerton | Music is Art - Audio is Engineering
#124
 
Posts: n/a
Default

Keith Hughes wrote:
Harry Lavo wrote:
You tell me how else, other than using ABX itself, you can determine whether
a real perceived difference exists in one piece of audio gear versus
another. You can't...it has to be done across a large enough group of
people to have statistical significance


This is sheer nonsense, in the current context, as has been pointed out
to you previously (by me, and I'll note that you did *not* reply).
Statistical significance requires *ONLY* one participant, with multiple
trials. Take *you* for example; you can easily do a sufficient number
of trials, using whatever *blind* methodology you would like, on two
components (say cables) for which *you* have identified, sighted, a
consistent audible difference, to determine whether the chosen method
statistically 'validates' your sighted results. That's it, finis.


You're presuming that Harry actually wants an answer. But what if Harry
doesn't want an answer? By making his "test" too complex and expensive
to pull off, he ensures that it'll never happen, and he'll never have
to eat his words.

More than once I've proposed a much simpler approach that would really
test what audiophiles actually claim to do--determine preferences
between components. Unlike Harry's baroque approach, mine didn't
presume anything about how audiophiles actually listen. Harry never
responded to my posts, either.

bob
#125
Harry Lavo
 
Posts: n/a
Default

wrote in message ...
Harry Lavo wrote:
wrote in message
...
Harry Lavo wrote:
Actually not, for the research it grew out of. But my contention is
that
IMO and that of many others, it *may* be fatally flawed as a device
for
the
open-ended evaluation of audio components. And that until it has been
*validated* for that purpose, it should be promoted and received with
substantial skepticism. My monadic test proposal is a legitimate way
of
doing that validation.

Lest anyone think that Harry has been vested with any authority to
declare what is and is not a valid psychoacoustic test, this is
completely bass-ackwards. ABX is validated both by its constant use in
the field and by its ability to make and confirm predictions about
audibility. Whereas nobody in the field has ever used a monadic test to
determine the audibility of anything. Not once. Ever. And for good
reason.


Tell that to Harman International and see my comments below.


Harman does not use monadic tests to determine audibility. If they use
monadic tests for anything (and I haven't seen anything they've
published using such tests), it is to explore perceived differences
between components that are already known to be audibly different. No
one would use monadic tests to determine *whether* two things were
audibly different. At least not anyone who knew what they were doing.

So the first thing Harry needs to do, before he starts his Annus
Mirabilis Project, is to validate that monadic testing can be used as
an audibility test AT ALL. Can it even distinguish the kinds of things
that ABX tests easily distinguish? Can it distinguish anything? He
doesn't even know.


Look, I spent 20 years doing sensory and behavior research in food...and
I
know damn well what monadic, proto-monadic, and comparative tests can and
cannot measure, both real and imputed (or as you would say, imagined).


FOOD??? You complain that too much research using DBTs was listening to
things other than long musical passages, and then you say the better
test is the one you used for FOOD?


I assume you saw the quote here that said that audio research had to borrow
from the social sciences. Well so does food research. We use the same
types of tests, and for much the same reason...because sensory reaction is
subjective and needs to be objectified. Moreover, I have lots of first-hand
experience specifying and designing those tests, and helping to interpret
the results. Your hands-on experience doing the same?


You tell me how else, other than using ABX itself, you can determine
whether
a real perceived difference exists in one piece of audio gear versus
another.


I don't need anything else. ABX works. It allows me to make reliable
predictions back and forth. I can look at measurements and predict the
outcome of ABX tests--and be right. And I can look at the results of
ABX tests and predict the magnitude of measured differences--and be
right. You cannot do that with monadic tests, because you have no data.
And you probably wouldn't be able to do so even if you had the data,
because there would be so much noise in that data that it'd never tell
you anything.


As Mark and Michael have pointed out, you are engaged in the same circular
reasoning that has destroyed your credibility among the audiophile community
at large. That's one way never to have to think again...just assume away
any possibilities that might bring your favorite test into question.


Again, nobody uses monadic tests to determine *whether* there's a
difference. Nobody.


And you know everything there is to know about this, right? Including what
every audio research lab in the country is up to?



#126
Harry Lavo
 
Posts: n/a
Default

"Keith Hughes" wrote in message
...
Harry Lavo wrote:

snip

...and I know damn well what monadic, proto-monadic, and comparative tests
can and cannot measure, both real and imputed (or as you would say,
imagined).


Really? Hmmm, let's see...

You tell me how else, other than using ABX itself, you can determine
whether a real perceived difference exists in one piece of audio gear
versus another. You can't...it has to be done across a large enough
group of people to have statistical significance


This is sheer nonsense, in the current context, as has been pointed out to
you previously (by me, and I'll note that you did *not* reply).
Statistical significance requires *ONLY* one participant, with multiple
trials. Take *you* for example; you can easily do a sufficient number of
trials, using whatever *blind* methodology you would like, on two
components (say cables) for which *you* have identified, sighted, a
consistent audible difference, to determine whether the chosen method
statistically 'validates' your sighted results. That's it, finis.


Horse pucky! You are describing evaluation of an ABX test, Keith. There's
a whole different world of testing out there that you apparently are not
familiar with.


We are not talking about frequency distributions within a population, the
only thing that would require a large population sample size, we're
talking about using *your* method to detect (blind) the differences that
*you* clearly hear...sighted.


Yeah, and if a whole population hears them "blind" they are real. That's
the *only* way you can tell if they are real.


...so one can say...tested blind, this group of (audiophiles, I presume)
listening to movements (X,Y,Z) found "P" to have significantly higher
ratings than "Q" on "transparency" and on "overall realism of the
orchestra" (simply used as an example).


Again, irrelevant. This approach presupposes that presence of difference
is an unknown, and/or that the frequency of detection capability within
the population is not known, neither of which is the case here.


Once again, you are using the test in question as the standard, rather than
trying to independently confirm it for the purpose under question. Circular
reasoning.

Then you know the difference is real (albeit perceived subjectively).


*You* already say that you know the difference is real, right? That's the
point, and that's why this is mere obfuscation.


I say no such thing. I say the first step is to use monadic testing to
determine if in fact the difference is real.

You then use an ABX test among a broad sample of yet another
similarly-screened group of people, using short-snippets of movements
X,Y,Z,


A fictitious constraint you gratuitously apply, yet again.



Not at all. You want to validate the test technique, so you've got to do it
once each among a broadscale group so you are testing the technique, not any
one individual, and using short snippets, the way it is almost always done
because of fatigue/time constraints. The constraints are totally realistic.



to see if in total they can detect the difference. If the test allows
them to do so, you have validated the test.


Right, you would have validated your 'monadic' testing for detection of
'difference', as that is what the ABX protocol is designed to do.


Not at all, since you claim the ABX is *the most sensitive* test...if it
shows up in the monadic test, ABX should pick it up...if it doesn't, the test
is no good.

If it does not, you have invalidated the test.


Clearly incorrect. This presupposes the superiority of the 'monadic' test
protocol. You get statistically significant results from *BOTH* tests,
and your conclusion is "see...ABX doesn't work!". Sorry, but that just
ignores basic statistical precepts. You cannot discount one method just
because it gives the results you don't want, when it has the same level of
significance (albeit with results in the opposite direction) as your 'pet'
method.


Sorry yourself. See my comments just above.

Finally, once you have validated the test, it can be used by single
individuals to determine if they can reliably hear a difference between
audio components similar to what they would experience in a more normal
listening situation. If you are so sure that ABX testing works for
open-ended evaluation of audio components playing music, you should be
supporting such an effort, not ridiculing it. Because until you do, you
are ****ing into the wind among the large majority of audiophiles.


No, you have it backwards. Until *one* single individual can demonstrate
confirmation of differences they have already confirmed sighted (level
matched naturally), then there is no point in applying the method to a
large population to establish the frequency of discrimination capability
within the larger population. Whether you use one person, or a thousand,
makes no difference whatsoever.


Sure it does. It takes the artificially constrained test apparatus out of
the equation. If the difference is real, and the sample size is large
enough, the monadic test will reveal it. Then it is simply a matter of
whether or not ABX does the same.

If *one* person using *any* blind protocol consistently identifies a
difference, then that difference is real (to the level of significance the
data allows). *Then* you can compare against ABX, or any other method.
This is a test that you, personally, could easily conduct if you were
truly interested. So where's your data?


Again, you are assuming the test is valid, rather than validating the test.
Totally circular.

#127
Harry Lavo
 
Posts: n/a
Default

"Stewart Pinkerton" wrote in message
...
On 8 Oct 2005 01:59:58 GMT, "Harry Lavo" wrote:

wrote in message
...
Harry Lavo wrote:
Actually not, for the research it grew out of. But my contention is
that
IMO and that of many others, it *may* be fatally flawed as a device for
the
open-ended evaluation of audio components. And that until it has been
*validated* for that purpose, it should be promoted and received with
substantial skepticism. My monadic test proposal is a legitimate way
of
doing that validation.

Lest anyone think that Harry has been vested with any authority to
declare what is and is not a valid psychoacoustic test, this is
completely bass-ackwards. ABX is validated both by its constant use in
the field and by its ability to make and confirm predictions about
audibility. Whereas nobody in the field has ever used a monadic test to
determine the audibility of anything. Not once. Ever. And for good
reason.


Tell that to Harman International and see my comments below.


Harman International uses quick-switch level-matched DBTs. As do many
other major audio manufacturers.



Not for evaluating speakers, they don't. They use sequential monadic
testing. Because they found it squared better with their objective tests.



So the first thing Harry needs to do, before he starts his Annus
Mirabilis Project, is to validate that monadic testing can be used as
an audibility test AT ALL. Can it even distinguish the kinds of things
that ABX tests easily distinguish? Can it distinguish anything? He
doesn't even know.

Look, I spent 20 years doing sensory and behavior research in food...and I
know damn well what monadic, proto-monadic, and comparative tests can and
cannot measure, both real and imputed (or as you would say, imagined).


In that case, you should already know that open-ended monadic tests
are not going to be much use for audio........................



Yeah, right. That's why I'm spending all of this time trying to convince
others on this newsgroup....

Your hands-on experience designing and using sophisticated tests, please?



You tell me how else, other than using ABX itself, you can determine
whether
a real perceived difference exists in one piece of audio gear versus
another. You can't...it has to be done across a large enough group of
people to have statistical significance...so one can say...tested blind,
this group of (audiophiles, I presume) listening to movements (X,Y,Z)
found
"P" to have significantly higher ratings thatn "Q" on "transparency" and
on
"overall realism of the orchestra" (simply used as an example). Then you
know the difference is real (albeit perceived subjectively). You then use
an ABX test among a broad sample of yet another similarly-screened group
of
people, using short-snippets of movements X,Y,Z, to see if in total they
can
detect the difference. If the test allows them to do so, you have
validated the test. If it does not, you have invalidated the test.
Finally,
once you have validated the test, it can be used by single individuals to
determine if they can reliably hear a difference between audio components
similar to what they would experience in a more normal listening
situation.
If you are so sure that ABX testing works for open-ended evaluation of
audio
components playing music, you should be supporting such an effort, not
ridiculing it. Because until you do, you are ****ing into the wind among
the large majority of audiophiles.


As noted above, you have it back-asswards. The audio industry - you
know, the one that *designs* all those wonderful toys we listen to -
has determined over many decades that quick-switched level-matched
DBTs are the gold standard. If *you* wish to challenge this, then
*you* must provide the evidence, not simply speculate.


No, they have determined that it is a useful tool for dealing with certain
development attributes, using trained listeners and pre-training sessions to
identify the attributes under investigation. That is a far cry from the
open-ended evaluation of audio components in their overall ability to convey
the musical experience.

Similarly, in the food industry we used blind comparative testing for
establishing certain taste and textural attributes. But we wouldn't think of
using it for final evaluation.



Of course, the real truth of the matter is that ABX works very well
indeed, but fails to support your sighted impressions, which is why
you are convinced that there just *must* be something wrong with it.
Read my lips - wire is wire.


No, the truth of the matter is that it works well in spotting frequency and
volume level irregularities and artifacts from compression, which is what it
was developed for. You and the other objectivists *assume* it works equally
well for the open-ended evaluation of home audio components, but you have
never validated it for same. Simple as that, and you run scared and retreat
into circular reasoning whenever it is pointed out.

#128
Harry Lavo
 
Posts: n/a
Default

"Stewart Pinkerton" wrote in message
...
On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote:

Stewart Pinkerton wrote:


That's how Science works. You observe something unusual, come up with
a theory to explain it, use that theory to predict something else, and
observe the truth or falsity of your prediction.


The light-bending experiment came *after* the theory. What was it he
"observed" that led to the theory?


He developed his theory from the Lorentzian interpretation of
Maxwell's work, and arguably had prior knowledge of the
Michelson-Morley experimental results. He had many giants on whose
shoulders to stand....


Okay, you've shown you can dazzle. Now please identify what "observations"
he developed his theory to explain.


In our case, we are observing the fact that 90% of audiophiles find the
*sounds same* postulate so ridiculous and at odds with experience (not in
just a few instances, but in many, many instances) that the postulate is
rejected.


From whence comes this magical 90%, Harry? It seems as speculative as
your other comments. I suspect that the real number is exceeded by
those who believe that the world is ruled by shape-shifting reptiles.



It doesn't matter whether it is 90%, or 95%, or 80%, or 75%. The percentage
of audiophiles who honestly believe all electronics essentially sound the same
is a small minority....the vast majority simply do not buy the assertion.


And since we are dealing with a strictly subjective phenomenon,
this rejection must be dealt with as a "fact". Mark and Michael have
been
working to point out why in theory the short-snippet, comparative testing
may have missed a crucial element...an element that seems to square with
what many audiophiles instinctively or intuitively feel is missing. Now
it
is time for some experimentation.


Indeed it is - so go do some, instead of railing against the entire
body of accumulated knowledge about audio.



I'm beginning to work on how the validation test might actually be
executed...


You might also bone up on his complete reluctance to embrace
quantum mechanics, despite being one of the founding fathers
(e.g., for explaining the photoelectric effect, for which he
won his only Nobel prize) and in spite of overwhelming physical
evidence.

I do know of his reluctance to accept quantum mechanics....never said he
didn't have his weaknesses. But it goes to show what happens when
science
as faith replaces science as science.

Oh dear. No, it goes to show that even the greatest scientists can
sometimes refuse to accept scientific facts. I suppose that gives you
*some* excuse............


He refused to accept them because they were so far from what his entire
training had taught him *ought* to be. Thus his famous "roll of the dice"
quote. That's called "belief" and it overcame his scientific training.



No response, so I assume concurrence. See any similarity to the "science"
as practiced here by some objectivists?

#129
 
Posts: n/a
Default

Harry Lavo wrote:
"Stewart Pinkerton" wrote in message
...
On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote:

Stewart Pinkerton wrote:


That's how Science works. You observe something unusual, come up with
a theory to explain it, use that theory to predict something else, and
observe the truth or falsity of your prediction.

The light-bending experiment came *after* the theory. What was it he
"observed" that led to the theory?


He developed his theory from the Lorentzian interpretation of
Maxwell's work, and arguably had prior knowledge of the
Michelson-Morley experimental results. He had many giants on whose
shoulders to stand....


Okay, you've shown you can dazzle. Now please identify what "observations"
he developed his theory to explain.


Hi Harry,

Einstein was trying to explain the results of the Michelson-Morley
experiment, which failed to find a difference in travel time between
two perpendicular light beams--strange, because it was expected that
the absolute motion of the Earth combined with the theory that light
travelled through an absolutely fixed ether would lead to different
travel times. Einstein's wildly brilliant solution was to propose that
there is no absolute motion, that the speed of light looks the same to
all observers.

The point of Stewart & Bob is that Einstein's theory was based on
troublesome observations. And, they say, there are no "troublesome
observations" in audio; they have a way to detect if differences are
audible, and this is consistent with the reigning theory of the ear's
function.

Where I think they are wrong is basing their model on the assumption
that the ear and brain can be observed objectively without regard to
observations carried out on the inside (observing one's own perception
and listening to others describe their perceptions). So they end up
with a model that describes the ear and brain very well---under one set
of conditions.

Secondly, I think they have "no troublesome observations" because they
invoke perceptual illusion to explain away any observation they don't
like, while at the same time admitting there are too many contributing
factors to explain any given perception--it "could be" illusion, so it
"must be" illusion; but we can never explain why any particular
illusion occurred.

Mike
#130
Steven Sullivan
 
Posts: n/a
Default

wrote:
Harry Lavo wrote:
wrote in message ...
Harry Lavo wrote:
Actually not, for the research it grew out of. But my contention is that
IMO and that of many others, it *may* be fatally flawed as a device for
the
open-ended evaluation of audio components. And that until it has been
*validated* for that purpose, it should be promoted and received with
substantial skepticism. My monadic test proposal is a legitimate way of
doing that validation.

Lest anyone think that Harry has been vested with any authority to
declare what is and is not a valid psychoacoustic test, this is
completely bass-ackwards. ABX is validated both by its constant use in
the field and by its ability to make and confirm predictions about
audibility. Whereas nobody in the field has ever used a monadic test to
determine the audibility of anything. Not once. Ever. And for good
reason.


Tell that to Harman International and see my comments below.


Harman does not use monadic tests to determine audibility. If they use
monadic tests for anything (and I haven't seen anything they've
published using such tests), it is to explore perceived differences
between components that are already known to be audibly different. No
one would use monadic tests to determine *whether* two things were
audibly different. At least not anyone who knew what they were doing.



Indeed, when Sean Olive gave a talk on his work at an August 2004 AES
meeting, here is how he described the uses of various DBTs (note that the
requirement for *double blind* methodology goes without saying):

http://www.aes.org/sections/la/PastM...004-08-31.html

"Sean began by describing three types of listening tests:

* Difference
* Descriptive analysis
* Preference / affective

The difference test, obviously, is used for determining whether two audio
devices under test are audibly different from each other. A common method
is double-blind ABX testing.

The descriptive analysis test is for gathering impressions of comparative
audio quality from the listeners. If an ABX test reveals that device “A”
sounds audibly different from device “B,” the descriptive analysis test
would determine in what way they sound different. The descriptive analysis
test has limited usefulness in audio, though.

And after the determinations of “whether different” and “how different,”
the preference or affective test asks the question, “Which one sounds
better?”

Each test has its own appropriate and inappropriate applications, as well
as its own strengths and potential pitfalls. In any test, biases have to
be controlled in order to obtain meaningful data. Most of his descriptions
of testing methods involved tests of loudspeakers, but the principles can
be put to use with other audio gear as well."
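
Procedurally, the difference test described above reduces to a simple
loop. A minimal simulation sketch (the detection-probability model here
is an illustrative assumption, not Olive's procedure):

import random

def run_abx(n_trials, p_detect):
    # Simulate an ABX session: on each trial X is randomly A or B.
    # With probability p_detect the listener genuinely hears the
    # difference and answers correctly; otherwise they guess.
    correct = 0
    for _ in range(n_trials):
        if random.random() < p_detect:
            correct += 1      # true detection
        elif random.random() < 0.5:
            correct += 1      # lucky guess
    return correct

print(run_abx(16, 0.0))  # pure guessing: about 8 of 16 correct on average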


You tell me how else, other than using ABX itself, you can determine whether
a real perceived difference exists in one piece of audio gear versus
another.


I don't need anything else. ABX works. It allows me to make reliable
predictions back and forth. I can look at measurements and predict the
outcome of ABX tests--and be right. And I can look at the results of
ABX tests and predict the magnitude of measured differences--and be
right. You cannot do that with monadic tests, because you have no data.
And you probably wouldn't be able to do so even if you had the data,
because there would be so much noise in that data that it'd never tell
you anything.


Again, nobody uses monadic tests to determine *whether* there's a
difference. Nobody.



This call for 'validation' -- which at least one Stereophile
reader parrots in the October 2005 letters column, in response
to Jon Iverson's specious article on ABX tests ("The Blind Leading
the Blind?" Aug 2005) -- is interesting. Those making this
call should ask themselves:

1) Do ABX tests ever yield a 'difference' result for two
*certainly* identical sources (e.g., a phantom switch)?
No, they don't.

2) Do 'sighted' listening tests ever yield a
'difference' result in such tests? Yes, they do.

--

-S


#131
Steven Sullivan
 
Posts: n/a
Default

Stewart Pinkerton wrote:
On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote:


Stewart Pinkerton wrote:


That's how Science works. You observe something unusual, come up with
a theory to explain it, use that theory to predict something else, and
observe the truth or falsity of your prediction.


The light-bending experiment came *after* the theory. What was it he
"observed" that led to the theory?


He developed his theory from the Lorentzian interpretation of
Maxwell's work, and arguably had prior knowledge of the
Michelson-Morley experimental results. He had many giants on whose
shoulders to stand....


In our case, we are observing the fact that 90% of audiophiles find the
*sounds same* postulate so ridiculous and at odds with experience (not in
just a few instances, but in many, many instances) that the postulate is
rejected.


From whence comes this magical 90%, Harry? It seems as speculative as
your other comments. I suspect that the real number is exceeded by
those who believe that the world is ruled by shape-shifting reptiles.


And certainly exceeded by those who are subjectively *sure*
that coincidence represents a
preordained pattern -- by Harry's logic we should be interrogating
the laws of probability. Hey, all those people who *dream* something
that *happens* later, or who get a phone call *right after* they
thought of the caller, can't be wrong...can they?

When 'audiophiles' begin finding differences using DBTs, that can't
be traced to measurable differences....THEN they can start asking
that science look into the 'problem'. Until then, it's not
a problem, it's just a bunch of cultish hobbyists refusing
to accept a reasonable explanation that offends their sensibilities.


--

-S
#132
Steven Sullivan
 
Posts: n/a
Default

wrote:
Keith Hughes wrote:
Harry Lavo wrote:
You tell me how else, other than using ABX itself, you can determine whether
a real perceived difference exists in one piece of audio gear versus
another. You can't...it has to be done across a large enough group of
people to have statistical significance


This is sheer nonsense, in the current context, as has been pointed out
to you previously (by me, and I'll note that you did *not* reply).
Statistical significance requires *ONLY* one participant, with multiple
trials. Take *you* for example; you can easily do a sufficient number
of trials, using whatever *blind* methodology you would like, on two
components (say cables) for which *you* have identified, sighted, a
consistent audible difference, to determine whether the chosen method
statistically 'validates' your sighted results. That's it, finis.


You're presuming that Harry actually wants an answer. But what if Harry
doesn't want an answer? By making his "test" too complex and expensive
to pull off, he ensures that it'll never happen, and he'll never have
to eat his words.


An ABX for 'preference' to gather the sort of data that Sean Olive
gathered in his studies of loudspeaker preference, would be cumbersome
indeed, since Olive varied the training of subjects, the musical
program, and compared more than two speakers per session.

For loudspeakers, it was a safe bet to *assume* the things are likely
to sound different, and proceed with a more efficient DBT for
gathering preference data for such a multivariable matrix.

And unsurprisingly, preference was significantly different for
all four speakers -- what one would expect for speakers that really
sound different from each other. Olive's more interesting
finding was that the *ranking* of loudspeaker preference for the four
was the same for trained and untrained listeners, indicating that when
listeners aren't listening sighted, their impressions converge on what
'sounds good'.

Amusingly, the 'worst-sounding' speaker turned out to be one that
had been rated 'speaker of the year' by one of the audio mags,
and had made it into the A+ class (gee, I wonder
what magazine this was....). It also *measured* the worst.



--

-S
#133
Stewart Pinkerton
 
Posts: n/a
Default

On 9 Oct 2005 00:53:49 GMT, "Harry Lavo" wrote:

"Stewart Pinkerton" wrote in message
...
On 8 Oct 2005 01:59:58 GMT, "Harry Lavo" wrote:

wrote in message
...
Harry Lavo wrote:
Actually not, for the research it grew out of. But my contention is
that
IMO and that of many others, it *may* be fatally flawed as a device for
the
open-ended evaluation of audio components. And that until it has been
*validated* for that purpose, it should be promoted and received with
substantial skepticism. My monadic test proposal is a legitimate way
of
doing that validation.

Lest anyone think that Harry has been vested with any authority to
declare what is and is not a valid psychoacoustic test, this is
completely bass-ackwards. ABX is validated both by its constant use in
the field and by its ability to make and confirm predictions about
audibility. Whereas nobody in the field has ever used a monadic test to
determine the audibility of anything. Not once. Ever. And for good
reason.

Tell that to Harman International and see my comments below.


Harman International uses quick-switch level-matched DBTs. As do many
other major audio manufacturers.


Not for evaluating speakers, they don't. They use sequential monadic
testing. Because they found it squared better with their objective tests.


That is for evaluating *preference*, which takes place *after*
difference has been proven by quick-switched DBTs.

What we're talking about here is the establishment of *difference*,
for which no one has *ever* used monadic testing - for the very good
reason that it's insufficiently sensitive.

So the first thing Harry needs to do, before he starts his Annus
Mirabilis Project, is to validate that monadic testing can be used as
an audibility test AT ALL. Can it even distinguish the kinds of things
that ABX tests easily distinguish? Can it distinguish anything? He
doesn't even know.

Look, I spent 20 years doing sensory and behavior research in food...and I
know damn well what monadic, proto-monadic, and comparative tests can and
cannot measure, both real and imputed (or as you would say, imagined).


In that case, you should already know that open-ended monadic tests
are not going to be much use for audio........................


Yeah, right. That's why I'm spending all of this time trying to convince
others on this newsgroup....


I said that you *should* know it, not that you had actually grasped
the concept...................

Your hands on experience designing and using sophisticated tests, please?


No need to re-invent the wheel for audio, I *use* quick-switched
level-matched DBTs frequently. I also spent twenty years in the
Defence and Aerospace industry designing extremely sophisticated test
equipment so yes, I have considerable experience of designing and
using sophisticated tests for precision analogue electronics and audio
equipment, with a dynamic range and bandwidth considerably in excess
of anything you'll see in domestic audio.

I am happy to believe that you know how to conduct the Pepsi
Challenge, but I'm not sure what that has to do with audio.....

As noted above, you have it back-asswards. The audio industry - you
know, the one that *designs* all those wonderful toys we listen to -
has determined over many decades that quick-switched level-matched
DBTs are the gold standard. If *you* wish to challenge this, then
*you* must provide the evidence, not simply speculate.


No, they have determined that it is a useful tool for dealing with certain
development attributes, using trained listeners and pre-training sessions to
identify the attributes under investigation.


Indeed they have.

That is a far cry from the
open-ended evaluation of audio components in their overall ability to convey
the musical experience.


Indeed it is - because *no one* would use open-ended evaluation for
determining the existence of subtle differences. It's simply not
adequately sensitive.

Similarly, in the food industry we used blind comparative testing for
establishing certain taste and textural attributes. But we wouldn't think of
using it for final evaluation.


What has this to do with audio?

I seldom use a sledgehammer for polishing my car........

Of course, the real truth of the matter is that ABX works very well
indeed, but fails to support your sighted impressions, which is why
you are convinced that there just *must* be something wrong with it.
Read my lips - wire is wire.

No, the truth of the matter is that it works well in spotting frequency and
volume level irregularities and artifacts from compression, which is what it
was developed for.


Actually, it works very well for spotting *any* truly audible
difference - just not those which are entirely down to your overactive
imagination!

You and the other objectivists *assume* it works equally
well for the open-ended evaluation of home audio components, but you have
never validated it for same. Simple as that, and you run scared and retreat
into circular reasoning whenever it is pointed out.


No Harry, *you* run scared every time you are asked to *demonstrate*
the validity of your own speculations.

--

Stewart Pinkerton | Music is Art - Audio is Engineering
#134
Stewart Pinkerton
 
Posts: n/a
Default

On 9 Oct 2005 00:50:05 GMT, "Harry Lavo" wrote:

wrote in message ...


ABX works. It allows me to make reliable
predictions back and forth. I can look at measurements and predict the
outcome of ABX tests--and be right. And I can look at the results of
ABX tests and predict the magnitude of measured differences--and be
right. You cannot do that with monadic tests, because you have no data.
And you probably wouldn't be able to do so even if you had the data,
because there would be so much noise in that data that it'd never tell
you anything.


As Mark and Michael have pointed out, you are engaged in the same circular
reasoning that has destroyed your credibility among the audiophile community
at large. That's one way never to have to think again...just assume away
any possibilities that might bring your favorite test into question.


Did you actually *read* that statement before you hit the 'send'
button? Firstly, where do you get off claiming any knowledge of 'the
audiophile community'? Secondly, and much more amusingly, "assume away
any possibilities that might bring your favorite test into question"
is an *exact* description of what you three are doing.


Again, nobody uses monadic tests to determine *whether* there's a
difference. Nobody.


And you know everything there is to know about this, right? Including what
every audio research lab in the country is up to?


Yes, right up until you can provide *evidence* to the contrary. And
let's just keep this to properly established companies with genuine
R&D facilities, shall we? Peter Qvortrup et al don't count -
especially as you seem to be hung up on 'credibility'!

--

Stewart Pinkerton | Music is Art - Audio is Engineering
#135
 
Posts: n/a
Default

Harry Lavo wrote:

Re Harman:

They use sequential monadic
testing. Because they found it squared better with their objective tests.


Evidence, please.

snip

No, the truth of the matter is that it works well in spotting frequency and
volume level irregularities and artifacts from compression, which is what it
was developed for.


No, it was developed to test for audible differences, independent of
what those differences were. The fact that frequency and volume level
differences ARE the only relevant differences between audio components
make it perfectly suited to comparing said components.

You and the other objectivists *assume* it works equally
well for the open-ended evaluation of home audio components, but you have
never validated it for same.


Don't need to, for the reason noted above.

Simple as that, and you run scared and retreat
into circular reasoning whenever it is pointed out.


Oh, please. Anytime you want to prove us wrong, you go right ahead,
Harry. We're not stopping you.

bob


#136
Keith Hughes
 
Posts: n/a
Default

Harry Lavo wrote:
"Keith Hughes" wrote in message
...

Harry Lavo wrote:

snip

This is sheer nonsense, in the current context, as has been pointed out to
you previously (by me, and I'll note that you did *not* reply).
Statistical significance requires *ONLY* one participant, with multiple
trials. Take *you* for example; you can easily do a sufficient number of
trials, using whatever *blind* methodology you would like, on two
components (say cables) for which *you* have identified, sighted, a
consistent audible difference, to determine whether the chosen method
statistically 'validates' your sighted results. That's it, finis.



Horse pucky! You are describing evaluation of an ABX test, Keith.


Where Harry? Please show us how "using whatever *blind* methodology you
would like" constrains you to "ABX", or even intimates that ABX is involved.

There's a whole different world of testing out there that you apparently are not
familiar with.


The same applies to you as well, obviously. So what? We're talking about
a narrow subject here.

We are not talking about frequency distributions within a population, the
only thing that would require a large population sample size, we're
talking about using *your* method to detect (blind) the differences that
*you* clearly hear...sighted.


Yeah, and if a whole population hears them "blind" they are real. That's
the *only* way you can tell if they are real.


That is ludicrous. You are saying, with that statement, that if I, for
example, can discriminate A and B in 60 of 60 blind, level matched
trials, the only way to verify that there really *is* a difference is to
increase the test population size. If you really believe that, then I
certainly can't help you.
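
For scale, the chance probability of that 60-of-60 result is trivial to
compute, and is effectively zero:

# Probability of 60 correct out of 60 binary trials by pure guessing:
print(f"{0.5 ** 60:.2e}")   # ~8.67e-19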

snip

Again, irrelevant. This approach presupposes that presence of difference
is an unknown, and/or that the frequency of detection capability within
the population is not known, neither of which is the case here.


Once again, you are using the test in question as the standard, rather than
trying to independently confirm it for the purpose under question. Circular
reasoning.


Where on Earth did you get that from??? "Neither of which is the case
here" does not refer to ABX or results therefrom, it refers to the
presence of a large population of audiophiles (90% right?) who already
*easily* and *reliably* discriminate between cables, amps, etc., using
*some* method. We do not need to 'poll' the population, as it were, we
need only verify extant observations using the same methods (blind) used
to make those observations initially.

Then you know the difference is real (albeit perceived subjectively).


*You* already say that you know the difference is real, right? That's the
point, and that's why this is mere obfuscation.


I say no such thing. I say the first step is to use monadic testing to
determine if in fact the difference is real.


OK, sorry, my mistake. You've never said that you could hear
differences between cables, amps, CD players...right.

You then use an ABX test among a broad sample of yet another
similarly-screened group of people, using short-snippets of movements
X,Y,Z,


A fictitious constraint you gratuitously apply, yet again.


Not at all. You want to validate the test technique, so you've got to do it
once each among a broadscale group so you are testing the technique, not any
one individual, and using short snippets, the way it is almost always done
because of fatigue/time constraints. The constraints are totally realistic.


If there is always time to do leisurely sighted evaluations, then
clearly there is time to do leisurely AB, ABX, etc. testing. If you are
really interested.

to see if in total they can detect the difference. If the test allows
them to do so, you have validated the test.


Right, you would have validated your 'monadic' testing for detection of
'difference', as that is what the ABX protocol is designed to do.


Not at all, since you claim the ABX is *the most sensitive* test...if it
shows up in the monadic test, ABX should pick it up...if it doesn't, the test
is no good.


For "difference", remember?


If it does not, you have invalidated the test.


Clearly incorrect. This presupposes the superiority of the 'monadic' test
protocol. You get statistically significant results from *BOTH* tests,
and your conclusion is "see...ABX doesn't work!". Sorry, but that just
ignores basic statistical precepts. You cannot discount one method just
because it gives the results you don't want, when it has the same level of
significance (albeit with results in the opposite direction) as your 'pet'
method.

Sorry yourself. See my comments just above.


Which, to the extent they are not erroneous, are irrelevant to the
statement. If you have two methods that give different results, to the
same level of significance, you cannot *just* choose the one you like.
That's BASIC statistics. You need a referee test, or barring that, a
clear determination of the root cause of the disparity (i.e. flawed
assumptions or execution for one, or both, methods).

snip

Whether you use one person, or a thousand,
makes no difference whatsoever.

Sure it does. It takes the artificially constrained test apparatus out of
the equation. If the difference is real, and the sample size is large
enough, the monadic test will reveal it. Then it is simply a matter of
whether or not ABX does the same.


*If* these 'artificial constraints' significantly affect subject
response, as you claim, then you will *not* find it, no matter what the
sample size. You will have added another independent variable (i.e.
individual response to the 'constraint') that could inhibit or enhance
the probe response, and you won't know which, or if either, is the case.

We're talking discrimination here (i.e. a binary probe/response case),
not preference, wherein you can allow multiple independent variables
and, using a multivariate analysis tool such as RSM, identify response
as well as interactions of variables.


If *one* person using *any* blind protocol consistently identifies a
difference, then that difference is real (to the level of significance the
data allows). *Then* you can compare against ABX, or any other method.
This is a test that you, personally, could easily conduct if you were
truly interested. So where's your data?

Again, you are assuming the test is valid, rather than validating the test.
Totally circular.


Again, you totally ignore the reality of the situation. I'm saying you
have to have valid *Data* to first suspect, then validate the test, i.e.
an observation under controlled, blind conditions, using *ANY method
OTHER than ABX* that can be used to challenge ABX test. This whole
"validate the test" pogrom appears to me to be simple misdirection.
Let's recap the basic situation:

1. We have a subject population that can discriminate (according to
them) between A and B under sighted conditions. We do not need to sample
a large population for 'preference' as your monadic test scenario would,
we have a ready-made subject base, each with their own 'method' they
have "validated" sighted.

2. We can recreate *all* of the conditions 'they' normally use to make
those discriminations (duration, configuration, location, relaxation,
prestidigitation...etc.) with the *sole* exception of foreknowledge of
whether they listen to A or to B.

3. If, under the *exact same conditions + blind*, 'they' can no longer
discriminate, then their initial discrimination results are
unsubstantiated, and must be assumed to be invalid.

4. If, under the *exact same conditions + blind*, they *CAN*
discriminate, then their initial discrimination results are confirmed.

5. If subsequent testing of confirmed discrimination results, via ABX,
results in a null response, *then* the method is inappropriate for that use.
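
The decision logic of steps 3-5 can be written down explicitly. A minimal
sketch, assuming forced-choice binary trials scored with an exact binomial
test against chance (the names and thresholds are illustrative):

from math import comb

def significant(correct, trials, alpha=0.05):
    # Exact one-sided binomial test against guessing (p = 0.5).
    p = sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials
    return p < alpha

def evaluate(own_correct, own_trials, abx_correct, abx_trials):
    if not significant(own_correct, own_trials):
        return "step 3: sighted result unsubstantiated"
    if not significant(abx_correct, abx_trials):
        return "step 5: difference confirmed blind, but ABX missed it"
    return "step 4: difference confirmed; ABX agrees"

print(evaluate(55, 60, 30, 60))  # -> the step 5 branch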

So you see, *the* method you decry is not even part of this scenario
until Step 5. I performed this test with both cables and DACs; Step 3
got me. You, Harry Lavo, a single individual, could perform such a
test, at your leisure, and prove us all wrong (if you're right). So
what's the problem? Why all the waffling on about methods? If you have
a method that you think works for discrimination, then use it blind and
see. It's really that simple.

Keith Hughes
#137
 
Posts: n/a
Default

wrote:

Secondly, I think they have "no troublesome observations" because they
invoke perceptual illusion to explain away any observation they don't
like, while at the same time admitting there are too many contributing
factors to explain any given perception--it "could be" illusion, so it
"must be" illusion;


Not what we said. What we said was, it could be an illusion, so you
cannot say it was not an illusion. IOW, a sighted observation tells us
nothing. It is not evidence of anything.

but we can never explain why any particular
illusion occurred.


No, we can't. But we can say with a high degree of probability that it
*was* an illusion.

bob
#138
Harry Lavo
 
Posts: n/a
Default

wrote in message
...
Harry Lavo wrote:
"Stewart Pinkerton" wrote in message
...
On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote:

Stewart Pinkerton wrote:

That's how Science works. You observe something unusual, come up with
a theory to explain it, use that theory to predict something else, and
observe the truth or falsity of your prediction.

The light-bending experiment came *after* the theory. What was it he
"observed" that led to the theory?

He developed his theory from the Lorentzian interpretation of
Maxwell's work, and arguably had prior knowledge of the
Michelson-Morley experimental results. He had many giants on whose
shoulders to stand....


Okay, you've shown you can dazzle. Now please interpret what
"observations" he developed his theory to explain.


Hi Harry,

Einstein was trying to explain the results of the Michelson-Morley
experiment, which failed to find a difference in travel time between
two perpendicular light beams--strange, because it was expected that
the absolute motion of the Earth combined with the theory that light
travelled through an absolutely fixed ether would lead to different
travel times. Einstein's wildly brilliant solution was to propose that
there is no absolute motion, that the speed of light looks the same to
all observers.

The point of Stewart & Bob is that Einstein's theory was based on
troublesome observations. And, they say, there are no "troublesome
observations" in audio; they have a way to detect if differences are
audible, and this is consistent with the reigning theory of the ear's
function.

Where I think they are wrong is basing their model on the assumption
that the ear and brain can be observed objectively without regard to
observations carried out on the inside (observing one's own perception
and listening to others describe their perceptions). So they end up
with a model that describes the ear and brain very well---under one set
of conditions.

Secondly, I think they have "no troublesome observations" because they
invoke perceptual illusion to explain away any observation they don't
like, while at the same time admitting there are too many contributing
factors to explain any given perception--it "could be" illusion, so it
"must be" illusion; but we can never explain why any particular
illusion occurred.

Mike


Thanks, Michael, for the actual explanation, which apparently was too
mundane for Stewart to attempt.

We seem in basic agreement that the big problem is that musical
interpretation by the ear/brain is totally subjective and can only be
described from within. We may differ in where the implications go from
there (although perhaps not). But until proponents of short-snippet
comparative testing can find some way of validating that their test does
not interfere with normal musical cognition (which many of us, myself
included, believe it does), then it will not be accepted by many/most
audiophiles.

  #139   Report Post  
Harry Lavo
 
Posts: n/a
Default

"Steven Sullivan" wrote in message
...
wrote:
Harry Lavo wrote:
wrote in message
...
Harry Lavo wrote:
Actually not, for the research it grew out of. But my contention is
that IMO and that of many others, it *may* be fatally flawed as a
device for the open-ended evaluation of audio components. And that
until it has been *validated* for that purpose, it should be promoted
and received with substantial skepticism. My monadic test proposal is
a legitimate way of doing that validation.

Lest anyone think that Harry has been vested with any authority to
declare what is and is not a valid psychoacoustic test, this is
completely bass-ackwards. ABX is validated both by its constant use in
the field and by its ability to make and confirm predictions about
audibility. Whereas nobody in the field has ever used a monadic test to
determine the audibility of anything. Not once. Ever. And for good
reason.

Tell that to Harman International and see my comments below.


Harman does not use monadic tests to determine audibility. If they use
monadic tests for anything (and I haven't seen anything they've
published using such tests), it is to explore perceived differences
between components that are already known to be audibly different. No
one would use monadic tests to determine *whether* two things were
audibly different. At least not anyone who knew what they were doing.



Indeed, when Sean Olive gave a talk on his work at an August 2004 AES
meeting, here is how he described the uses of various DBTs (note that the
requirement for *double blind* methodology goes without saying):

http://www.aes.org/sections/la/PastM...004-08-31.html

"Sean began by describing three types of listening tests:

* Difference
* Descriptive analysis
* Preference / affective

The difference test, obviously, is used for determining whether two audio
devices under test are audibly different from each other. A common method
is double-blind ABX testing.

The descriptive analysis test is for gathering impressions of comparative
audio quality from the listeners. If an ABX test reveals that device "A"
sounds audibly different from device "B," the descriptive analysis test
would determine in what way they sound different. The descriptive analysis
test has limited usefulness in audio, though.

And after the determinations of "whether different" and "how different,"
the preference or affective test asks the question, "Which one sounds
better?"


Descriptive tests can be used for this purpose as well, although it takes
larger sample sizes. Harman uses the descriptive tests to profile the
nature and degree of differences between the speakers. Interestingly,
some of their preference and discriminatory tests apparently conflicted,
but the descriptive analysis and preference data apparently correlate
(with the caveat that I have not yet seen the reprints and am getting
this info second hand).


Each test has its own appropriate and inappropriate applications, as well
as its own strengths and potential pitfalls. In any test, biases have to
be controlled in order to obtain meaningful data. Most of his descriptions
of testing methods involved tests of loudspeakers, but the principles can
be put to use with other audio gear as well."


Steven, this may seem new and "news" to you, but these are standard types of
tests. I basically used them in the food industry for years, as well as
other types of research. I'm glad Harman has brought them to the attention
of the audio community, because there are more and better tests for many
uses than abx...and I'm happy to see that Harman is using some of them.



You tell me how else, other than using ABX itself, you can determine
whether a real perceived difference exists in one piece of audio gear
versus another.


I don't need anything else. ABX works. It allows me to make reliable
predictions back and forth. I can look at measurements and predict the
outcome of ABX tests--and be right. And I can look at the results of
ABX tests and predict the magnitude of measured differences--and be
right. You cannot do that with monadic tests, because you have no data.
And you probably wouldn't be able to do so even if you had the data,
because there would be so much noise in that data that it'd never tell
you anything.


Again, nobody uses monadic tests to determine *whether* there's a
difference. Nobody.



This call for 'validation' -- which at least one Stereophile
reader parrots in the October 2005 letters column, in response
to Jon Iverson's specious article on ABX tests ("The Blind LEading
the Blind?" Aug 2005) -- is interesting. Those making this
call should ask themselves:



To the best of my knowledge, I was among the first (if not the first) to
raise such a call...here...about two years ago. The need has simply become
apparent to more people as the discussion has continued on various
newsgroups.



1) Do ABX tests ever yield a 'difference' result for two
*certainly* identical sources (e.g., a phantom switch)?
No, they don't.

2) Do 'sighted' listening tests ever yield a
'difference' result in such tests? Yes, they do.


Nobody is arguing in this case for sighted tests. You are using it as a
strawman. And points 1) and 2) have nothing to do with validating abx,
since the suspicion against it has nothing to do with either.

  #140   Report Post  
Harry Lavo
 
Posts: n/a
Default

wrote in message ...
Harry Lavo wrote:

Re Harman:

They use sequential monadic
testing. Because they found it squared better with their objective
tests.


Evidence, please.



Based on hearsay at this point, but I will be getting their reprints. I
should have inserted the word "apparently" after "because" and before
"they".

snip

No, the truth of the matter is that it works well in spotting
frequency and volume level irregularities and artifacts from
compression, which is what it was developed for.


No, it was developed to test for audible differences, independent of
what those differences were. The fact that frequency and volume level
differences ARE the only relevant differences between audio components
make it perfectly suited to comparing said components.

You and the other objectivists *assume* it works equally
well for the open-ended evaluation of home audio components, but you have
never validated it for same.


Don't need to, for the reason noted above.

Simple as that, and you run scared and retreat
into circular reasoning whenever it is pointed out.


Oh, please. Anytime you want to prove us wrong, you go right ahead,
Harry. We're not stopping you.


Well, don't hold your breath, but perhaps over the next year or two.........



  #141   Report Post  
Harry Lavo
 
Posts: n/a
Default

"Stewart Pinkerton" wrote in message
...
On 9 Oct 2005 00:50:05 GMT, "Harry Lavo" wrote:

wrote in message
...


ABX works. It allows me to make reliable
predictions back and forth. I can look at measurements and predict the
outcome of ABX tests--and be right. And I can look at the results of
ABX tests and predict the magnitude of measured differences--and be
right. You cannot do that with monadic tests, because you have no data.
And you probably wouldn't be able to do so even if you had the data,
because there would be so much noise in that data that it'd never tell
you anything.


As Mark and Michael have pointed out, you are engaged in the same circular
reasoning that has destroyed your credibility among the audiophile
community at large. That's one way never to have to think again...just
assume away any possibilities that might bring your favorite test into
question.


Did you actually *read* that statement before you hit the 'send'
button? Firstly, where do you get off claiming any knowledge of 'the
audiophile community'? Secondly, and much more amusingly, "assume away
any possibilities that might bring your favorite test into question"
is an *exact* description of what you three are doing.


No, we have not assumed it away. We have cited the need for verification
and validation. You are the one refusing to accept that fact...that your
test needs to be validated for its intended purpose and hasn't been...in
fact in another post you say it would be ridiculous to assume its use for
that very purpose.




Again, nobody uses monadic tests to determine *whether* there's a
difference. Nobody.


And you know everything there is to know about this, right? Including what
every audio research lab in the country is up to?


Yes, right up until you can provide *evidence* to the contrary. And
let's just keep this to properly established companies with genuine
R&D facilities, shall we? Peter Qvortrup et al don't count -
especially as you seem to be hung up on 'credibility'!



Well, actually Harman has already apparently found that monadic tests give
them more useful information for understanding what goes on between the
actual reproduction of music by audio equipment and its perception by
listeners. I think you would find that similar tests on, say, CD players
would yield somewhat similar results, but with less variation and larger
sample sizes required. I'm not sure about amplifiers or cables...but a
validation test would certainly find out now, wouldn't it?

  #142   Report Post  
Harry Lavo
 
Posts: n/a
Default

"Keith Hughes" wrote in message
...
Harry Lavo wrote:
"Keith Hughes" wrote in message
...

Harry Lavo wrote:

snip

This is sheer nonsense, in the current context, as has been pointed out
to you previously (by me, and I'll note that you did *not* reply).
Statistical significance requires *ONLY* one participant, with multiple
trials. Take *you* for example; you can easily do a sufficient number of
trials, using whatever *blind* methodology you would like, on two
components (say cables) for which *you* have identified, sighted, a
consistent audible difference, to determine whether the chosen method
statistically 'validates' your sighted results. That's it, finis.



Horse pucky! You are describing evaluation of an ABX test, Keith.


Where Harry? Please show us how "using whatever *blind* methodology you
would like" constrains you to "ABX", or even intimates that ABX is
involved.


Because ABX and its cousins are the only tests that use repeated comparisons
among individual users. That is audiometric testing, not social science
testing such as preference testing (AB) and descriptive testing (monadic, or
comparative monadic).


There's a whole different world of testing out there that you apparently
are not familiar with.


The same applies to you as well, obviously. So what? We're talking about a
narrow subject here.



No, *you* are talking about a narrow subject. You are also talking about
taking a test developed for use in a narrow way and applying it for use in a
much broader way. That is why the potential test set must also be
broadened.



We are not talking about frequency distributions within a population, the
only thing that would require a large population sample size, we're
talking about using *your* method to detect (blind) the differences that
*you* clearly hear...sighted.


Yeah, and if a whole population hears them "blind" they are real. That's
the *only* way you can tell if they are real.


That is ludicrous. You are saying, with that statement, that if I, for
example, can discriminate A and B in 60 of 60 blind, level matched trials,
the only way to verify that there really *is* a difference is to increase
the test population size. If you really believe that, then I certainly
can't help you.


No, I would accept that as a real difference for you. And therefore in
all probability a real difference, although perhaps one that only a few
percent of the population might hear. If so, it would also show up in a
monadic test if the test sample is large enough (it only takes a few
percent far off the centerline of the bell-curve to create significant
differences). But...

I am talking about differences that audiophiles claim to hear that don't
show up in ABX testing. If they are real, they will show up in monadic
perceptual testing, since it is a "cleaner" test (e.g. fewer intervening
variables versus normal listening). If there are no differences, they
won't show up. But before we can conclude that ABX will also pick up
these differences (and I'm talking here about things like depth of
soundstage, transparency, holography, etc.) we have to know if they are
"real" (statistically) under conditions approximating relaxed home
listening. Conditions that are a far cry from normal ABX-type testing.

ABX has been validated for volume threshold detection and other
volume-related artifacts; it has not been validated for other possible
perception differences or the open-ended evaluation of audio components
claimed to be heard under normal home use conditions.

snip

Again, irrelevant. This approach presupposes that presence of difference
is an unknown, and/or that the frequency of detection capability within
the population is not known, neither of which is the case here.


Once again, you are using the test in question as the standard, rather
than trying to independently confirm it for the purpose under question.
Circular reasoning.


Where on Earth did you get that from??? "Neither of which is the case
here" does not refer to ABX or results therefrom, it refers to the
presence of a large population of audiophiles (90% right?) who already
*easily* and *reliably* discriminate between cables, amps, etc., using
*some* method. We do not need to 'poll' the population, as it were, we
need only verify extant observations using the same methods (blind) used
to make those observations initially.


I made no claims that 90% of audiophiles can easily and reliably
discriminate. I said the 90% don't buy into abx testing as a valid means
of evaluating the musicality of audio components. The proposed use of
monadic testing as a control is to determine that in fact such
differences can be discriminated by enough audiophiles under more ideal
musical listening and test conditions than ABX to serve as a benchmark
for ABX testing. I'm not talking about volume or frequency-response
aberrations here.


Then you know the difference is real (albeit perceived subjectively).

*You* already say that you know the difference is real, right? That's
the point, and that's why this is mere obfuscation.


I say no such thing. I say the first step is to use monadic testing to
determine if in fact the difference is real.


OK, sorry, my mistake. You've never said that you could hear differences
between cables, amps, CD players...right.



Apology accepted. Let me be perfectly clear...I said *if* the monadic
phase *shows* a statistically-significant difference then you *know* it
is real (even if conventional measurements don't show differences in
frequency response or volume to explain it). This gets at the difficulty
EE's in particular have with accepting potential differences. If it is
*real* it shows up; if it is *not* it doesn't. Obviously, we would only
want to use a control test where a "real" difference showed up that is
not volume or frequency related.


You then use an ABX test among a broad sample of yet another
similarly-screened group of people, using short-snippets of movements
X,Y,Z,

A fictitious constraint you gratuitously apply, yet again.


Not at all. You want to validate the test technique, so you've got to do
it once each among a broadscale group so you are testing the technique,
not any one individual, and using short snippets, the way it is almost
always done because of fatigue/time constraints. The constraints are
totally realistic.


If there is always time to do leisurely sighted evaluations, then clearly
there is time to do leisurely AB, ABX, etc. testing. If you are really
interested.


Not so. For the monadic test...each person does just one test. For the
comparative tests...they must do fifteen to twenty. For this to be
practical, reality dictates (and actual practice entails) that short
musical snippets are used for the testing, since for open-ended
evaluation of musical reproduction several types of musical example must
be used.


to see if in total they can detect the difference. If they test allows
them to do so, you have validated the test.

Right, you would have validated your 'monadic' testing for detection of
'difference', as that is what the ABX protocol is designed to do.


Not at all, since you claim the ABX is *the most sensitive* test...if it
shows up in the monadic, ABX should pick it up...if it doesn't, the test
is no good.


For "difference", remember?


Right. Let me be more clear. If it shows up as a statistically significant
"difference" in the monadic test (between two cells employing the two
equipment variables under test) then it *should* show up in ABX as a
significant difference.


If it does not, you have invalidated the test.

Clearly incorrect. This presupposes the superiority of the 'monadic'
test protocol. You get statistically significant results from *BOTH*
tests, and your conclusion is "see...ABX doesn't work!". Sorry, but that
just ignores basic statistical precepts. You cannot discount one method
just because it gives the results you don't want, when it has the same
level of significance (albeit with results in the opposite direction) as
your 'pet' method.

Sorry yourself. See my comments just above.


Which, to the extent they are not erroneous, are irrelevant to the
statement. If you have two methods that give different results, to the
same level of significance, you cannot *just* choose the one you like.
That's BASIC statistics. You need a referee test, or barring that, a clear
determination of the root cause of the disparity (i.e. flawed assumptions
or execution for one, or both, methods).


Right. That's why I have proposed the classic monadic, descriptive test as
the control....because it is the least interruptive type of test that can be
done and therefore there are fewer potentially intervening and contaminating
variables. And obviously, all the tests must be executed properly and with
proper controls.


snip

Whether you use one person, or a thousand,
makes no difference whatsoever.

Sure it does. It takes the artificially constrained test apparatus out
of the equation. If the difference is real, and the sample size is large
enough, the monadic test will reveal it. Then it is simply a matter of
whether or not ABX does the same.


*If* these 'artificial constraints' significantly affect subject response,
as you claim, then you will *not* find it, no matter what the sample size.
You will have added another independent variable (i.e. individual response
to the 'constraint') that could inhibit or enhance the probe response, and
you won't know which, or if either, is the case.

We're talking discrimination here (i.e. a binary probe/response case), not
preference, wherein you can allow multiple independent variables and,
using a multivariate analysis tool such as RSM, identify response as well
as interactions of variables.


You can do the same with monadic testing...exactly the same thing. You
just need larger sample sizes for a given level of significance.
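
The "larger sample sizes" point can be made quantitative. A rough
sketch, using the standard normal-approximation formula and invented
effect sizes, of how many listeners each monadic cell would need to
detect a given mean rating shift at 5% significance and 80% power:

# Per-cell sample size for a two-cell monadic test, normal approximation:
# n = 2 * (z_alpha + z_beta)^2 / d^2, where d = shift / rating SD.
from scipy.stats import norm

def n_per_cell(shift, sd, alpha=0.05, power=0.80):
    d = shift / sd                  # standardized effect size
    z_a = norm.ppf(1 - alpha / 2)   # 1.96 for two-sided 5%
    z_b = norm.ppf(power)           # 0.84 for 80% power
    return 2 * (z_a + z_b) ** 2 / d ** 2

print(round(n_per_cell(1.0, 1.5)))  # obvious shift:  ~35 listeners per cell
print(round(n_per_cell(0.2, 1.5)))  # subtle shift:  ~880 listeners per cell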


If *one* person using *any* blind protocol consistently identifies a
difference, then that difference is real (to the level of significance
the data allows). *Then* you can compare against ABX, or any other
method. This is a test that you, personally, could easily conduct if you
were truly interested. So where's your data?

Again, you are assuming the test is valid, rather than validating the
test. Totally circular.


Again, you totally ignore the reality of the situation. I'm saying you
have to have valid *Data* to first suspect, then validate the test, i.e.
an observation under controlled, blind conditions, using *ANY method OTHER
than ABX* that can be used to challenge ABX test. This whole "validate
the test" pogrom appears to me to be simple misdirection. Let's recap the
basic situation:

1. We have a subject population that can discriminate (according to them)
between A and B under sighted conditions. We do not need to sample a large
population for 'preference' as your monadic test scenario would, we have a
ready-made subject base, each with their own 'method' they have
"validated" sighted.

2. We can recreate *all* of the conditions 'they' normally use to make
those discriminations (duration, configuration, location, relaxation,
prestidigitation...etc.) with the *sole* exception of foreknowledge of
whether they listen to A or to B.

3. If, under the *exact same conditions + blind*, 'they' can no longer
discriminate, then their initial discrimination results are
unsubstantiated, and must be assumed to be invalid.

4. If, under the *exact same conditions + blind*, they *CAN*
discriminate, then their initial discrimination results are confirmed.

5. If subsequent testing of confirmed discrimination results, via ABX,
results in a null response, *then* the method is inappropriate for that
use.

So you see, *the* method you decry is not even part of this scenario until
Step 5. I performed this test with both cables and DAC's; Step 3 got me.
You, Harry Lavo, a single individual, could perform such a test, at your
leisure, and prove us all wrong (if you're right). So what's the problem?
Why all the waffling on about methods? If you have a method that you
think works for discrimination, then use it blind and see. It's really
that simple.


The sequence you describe is exactly the one I have outlined, Keith. Except
that steps 1-4 would be carried out among a large population and step 2
would be combined with step 3 and step 4 (although it could be done exactly
as you outline). However, I am not interested in what the group *thinks*
they can hear sighted, I want to know *that* they can hear it blind in a
monadic test. I don't want to measure "phantom" differences...first I want
to establish that there is in fact a "real" difference blind that can be
discriminated by a relatively large group of people. But I want to use a
non-intrusive test to do it. *Then* do an ABX test wherein a similarly
large group of people use ABX, *except* that instead of a few doing 17
samples, seventeen (for each) do one sample. This separates the test (e.g.
ABX vs. monadic) from the individual doing the testing. The final test once
ABX is validated is then for individuals to use it themselves, twenty times
each if preferred.
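
For what it's worth, the null-hypothesis arithmetic for seventeen
listeners doing one trial each is the same pooled binomial as for one
listener doing seventeen trials; what changes is the interpretation (a
population effect versus an individual one). A sketch with made-up
counts:

# Pooled ABX score: N independent trials at p = 0.5 under the null,
# whether from one listener or from one trial each by N listeners.
from math import comb

def tail_p(correct, trials):
    """One-sided exact binomial tail probability under guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials

print(tail_p(13, 17))  # 13 of 17 correct: p ~ 0.025, significant at 5%
print(tail_p(11, 17))  # 11 of 17 correct: p ~ 0.17, consistent with guessing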

  #143   Report Post  
 
Posts: n/a
Default

Harry Lavo wrote:
wrote in message ...
Harry Lavo wrote:

Re Harman:

They use sequential monadic
testing. Because they found it squared better with their objective
tests.


Evidence, please.



Based on hearsay at this point, but I will be getting their reprints. I
should have inserted the word "apparently" after "because" and before "they".


Thought so. A much more logical reason why they would have used monadic
tests is that they were comparing dozens of speakers at a time. To do
match-pair comparisons of all of them would have taken well-nigh
forever. I think you'll be disappointed when you see those reprints.

And note, once again, that they were doing *preference* tests, not
*discrimination* tests. You still haven't come up with a single example
of anyone using monadic testing for discrimination.

bob
  #146   Report Post  
Steven Sullivan
 
Posts: n/a
Default

wrote:
wrote:
Harry Lavo wrote:
wrote in message ...
Harry Lavo wrote:

Re Harman:

They use sequential monadic
testing. Because they found it squared better with their objective
tests.

Evidence, please.



Based on hearsay at this point, but I will be getting their reprints. I
should have inserted the word "apparently" after "because" and before "they".


Thought so. A much more logical reason why they would have used monadic
tests is that they were comparing dozens of speakers at a time.


On second thought...


I went and read Sean Olive's three AES papers from 2003-04 and wouldn't
you know? He doesn't use monadic testing *at all*.


What's worse, he uses *short snippets*.


And even *quick switching*.



So does Toole in his 28-page 1984 JAES paper, 'Subjective Measurements
of Loudspeaker Sound Quality and Listener Performance'. The paper, like
most JAES papers, is available electronically for $20 from the AES
website (I cheaped out and xeroxed it at a local engineering library).

His section 1, 'Subjective Measurement -- Development of the Technique',
is a particularly interesting read. It's illustrative of Toole's
thoroughness to simply note the subsection headings of this section:
1.1 A Brief History
1.2 Improving the Technique
1.3 Sources of Variability
1.4 Controlling the Variables
1.4.1 Controlling the technical and environmental variables
1.4.2 Controlling the listener variables
1.4.3 Controlling the experimental variables
1.5 Choosing an Experimental Method
1.5.1 A Discussion of the Options
1.5.2 Scaling the Listener Responses
1.5.3 Technical Ratings
1.5.4 Experimental Apparatus


1.1 is a survey of listening tests from 1950 to the early 80s, from
which one notes that even at the dawn of audio press listening tests
(reviews) in the early 60's -- a time when statistical analysis began
to appear in scientific listening tests -- the sighted method was known
to be substandard: "Listening tests continued, of course, with the
audio press developing its own version, known as the product review.
In spite of the large audience and influence that these product
assessments had, the tests themselves were usually of the most
rudimentary kind. The large variations in opinion resulting from these
widely publicized tests simply confused the picture, cultivating a
public mistrust in measurements and a reliance on 'golden eared'
listeners.
In 1975 Cooke presented a careful analysis of the prevailing practice
of [loudspeaker] listening tests and concluded that a great many
yielded results that were so influenced by extraneous factors as to be
misleading"

At this point one can't help but realize how little has changed in the
audio press since then.



--

-S
  #147   Report Post  
Stewart Pinkerton
 
Posts: n/a
Default

On 9 Oct 2005 21:44:59 GMT, "Harry Lavo" wrote:

"Stewart Pinkerton" wrote in message
...
On 9 Oct 2005 00:50:05 GMT, "Harry Lavo" wrote:


Again, nobody uses monadic tests to determine *whether* there's a
difference. Nobody.

And you know everything there is to know about this, right? Including what
every audio research lab in the country is up to?


Yes, right up until you can provide *evidence* to the contrary. And
let's just keep this to properly established companies with genuine
R&D facilities, shall we? Peter Qvortrup et al don't count -
especially as you seem to be hung up on 'credibility'!


Well, actually Harman has already apparently found that monadic tests give
them more useful information for understanding what goes on between the
actual reproduction of music by audio equipment and its perception by
listeners.


Well actually no it hasn't, not for determining *difference*. Only
*after* difference has been established do they move to *preference*
testing, where monadic testing is certainly appropriate. Note however
that it is necessary to *first* establish difference. Without
difference, preference is nonsensical.

I think you would find that similar tests on, say, CD players
would yield somewhat similar results, but with less variation and larger
sample sizes required. I'm not sure about amplifiers or cables...but a
validation test would certainly find out now, wouldn't it?


Harry, ABX is perfectly valid - that's why the professionals use it.
The armchair quarter-backing of those who don't like the results it
gives will not alter this fact.

--

Stewart Pinkerton | Music is Art - Audio is Engineering
  #148   Report Post  
Harry Lavo
 
Posts: n/a
Default

wrote in message ...
Harry Lavo wrote:
wrote in message
...
Harry Lavo wrote:

Re Harman:

They use sequential monadic
testing. Because they found it squared better with their objective
tests.

Evidence, please.



Based on hearsay at this point, but I will be getting their reprints. I
should have inserted the word "apparently" after "because" and before
"they".


Thought so. A much more logical reason why they would have used monadic
tests is that they were comparing dozens of speakers at a time. To do
match-pair comparisons of all of them would have taken well-nigh
forever. I think you'll be disappointed when you see those reprints.

And note, once again, that they were doing *preference* tests, not
*discrimination* tests. You still haven't come up with a single example
of anyone using monadic testing for discrimination.


You don't seem to understand that *preference* tests are matched pair tests.

But their descriptive evaluations were monadic.

  #150   Report Post  
Keith Hughes
 
Posts: n/a
Default

Harry Lavo wrote:
"Keith Hughes" wrote in message
...

Harry Lavo wrote:


snip

Where Harry? Please show us how "using whatever *blind* methodology you
would like" constrains you to "ABX", or even intimates that ABX is
involved.


Because ABX and its cousins are the only tests that use repeated comparisons
among individual users. That is audiometric testing, not social science
testing such as preference testing (AB) and descriptive testing (monadic, or
comparative monadic).


And this is audio, Harry, not social science. When you do taste testing
for foods, you are free to rely on *all* organoleptic perceptual
components as that *is* the context of use. That's why this type of
test is not suitable for verification of *audible* differences - it is
not designed to control the organoleptic components that contribute to a
"preference".

There's a whole different world of testing out there that you apparently
are not familiar with.


The same applies to you as well, obviously. So what? We're talking about a
narrow subject here.


No, *you* are talking about a narrow subject. You are also talking about
taking a test developed for use in a narrow way and applying it for use in a
much broader way. That is why the potential test set must also be
broadened.


Now, *We* are...audio.

snip

No, I would accept that as a real difference for you. And therefore in
all probability a real difference, although perhaps one that only a few
percent of the population might hear.


Thank you for making my point. Your test is a population distribution
test, *not* a discrimination test. In the scenario I presented, the
results were not real *for me*, they were real. From that point, we can
discuss investigative ways for determining cause and extent. One need
continue with a larger sample size *ONLY* if distribution within the
population is of interest.

If so, it would also show up in a monadic
test if the test sample is large enough (it only takes a few percent far off
the centerline of the bell-curve to create significant differences). But...


If it were just a few percent off the centerline, there would be a very
low significance, buried in the noise. Tukey's fences would likely
identify them as outliers.
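
Assuming "Tukey's" here means Tukey's fences, the usual quartile-based
outlier screen, a minimal sketch with invented ratings:

# Tukey's fences: flag values beyond 1.5 * IQR outside the quartiles.
import statistics

def tukey_outliers(data):
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < lo or x > hi]

ratings = [6, 7, 7, 6, 8, 7, 6, 7, 10, 7, 6, 2]  # two extreme raters
print(tukey_outliers(ratings))                   # -> [10, 2]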

I am talking about differences that audiophiles claim to hear that don't
show up in ABX testing. If they are real, they will show up in monadic
perceptual testing, since it is a "cleaner" test (e.g. fewer intervening
variables versus normal listening). If there are no differences, they
won't show up. But before we can conclude that ABX will also pick up these
differences (and I'm talking here about things like depth of soundstage,
transparency, holography, etc.) we have to know if they are "real"
(statistically) under conditions approximating relaxed home listening.
Conditions that are a far cry from normal ABX-type testing.


Yes, you're talking about differences that have not been demonstrated
under *any* test scenario other than sighted.

snip

I made no claims that 90% of audiophiles can easily and reliably discriminate.


You claimed that 90% of audiophiles believed, counter to the
objectivists, that components sounded different, and could be
distinguished. The "easily and reliably" reflects the opinions
typically espoused here, a la our radioactive buddy.

I said the 90% don't buy into abx testing as a valid means of evaluating the
musicality of audio components.


And thus, the belief in ability to discriminate is based solely on
sighted evaluations, right?

snip

1. We have a subject population that can discriminate (according to them)
between A and B under sighted conditions. We do not need to sample a large
population for 'preference' as your monadic test scenario would, we have a
ready-made subject base, each with their own 'method' they have
"validated" sighted.

2. We can recreate *all* of the conditions 'they' normally use to make
those discriminations (duration, configuration, location, relaxation,
prestidigitation...etc.) with the *sole* exception of foreknowledge of
whether they listen to A or to B.

3. If, under the *exact same conditions + blind*, 'they' can no longer
discriminate, then their initial discrimination results are
unsubstantiated, and must be assumed to be invalid.

4. If, under the *exact same conditions + blind*, they *CAN*
discriminate, then their initial discrimination results are confirmed.

5. If subsequent testing of confirmed discrimination results, via ABX,
results in a null response, *then* the method is inappropriate for that
use.


snip

The sequence you describe is exactly the one I have outlined, Keith. Except
that steps 1-4 would be carried out among a large population and step 2
would be combined with step 3 and step 4 (although it could be done exactly
as you outline).


Except that each person would conduct only *one* trial, which of course
introduces another huge source of error (albeit random, *if* properly
executed)...the reason that a huge sample size is needed for that type
of test. Not nearly the "clean" test you claim.

However, I am not interested in what the group *thinks*
they can hear sighted, I want to know *that* they can hear it blind in a
monadic test.


Well, you clearly are interested Harry, because the *only* reason to
question whether the boundaries of ABX testing can extend beyond where
you believe it to be 'validated', is the presence of anecdotal evidence
based on sighted evaluation. What other reasons are there
(non-phenomenological that is)?

I don't want to measure "phantom" differences...first I want
to establish that there is in fact a "real" difference blind that can be
discriminated by a relatively large group of people. But I want to use a
non-intrusive test to do it.


Well, first, you don't know that the test is "non-intrusive" to a
greater extent than is ABX. You merely assume that it is. Second, you
have *no* validation of your monadic testing for detecting *audible*
differences. Further, you assume that audio and organoleptic perceptions
are testable in the same fashion, and that the intrusiveness of any
particular test constraint applies equally to both (these are implicit in
your belief that the monadic test is suitable). Neither of which has
been verified. Thus, you are using an unverified method as a reference
against which to 'validate' a test that has been at least partially
validated, by your own admission, for the sensory mode under test. An
untenable approach IMO.

*Then* do an ABX test wherein a similarly
large group of people use ABX, *except* that instead of a few doing 17
samples, seventeen (for each) do one sample. This separates the test (e.g.
ABX vs. monadic) from the individual doing the testing.


No, Harry, it does not "separate the test from the individual" at all.
That process is the same - you *assume* that multiple presentations are
inherently more intrusive, and thus data-correlative, than is a single
presentation, something I believe you have no data to support, relative
to audio. There's a very valid reason that repetitive trials for
organoleptic perception testing are problematic - the senses quickly
become habituated, and discrimination ability is reduced. AFAIK,
barring fatigue (or high volume related artifacts which should, of
course, be controlled for in the test), this has not been shown to be an
issue with auditory testing.

The final test once
ABX is validated is then for individuals to use it themselves, twenty times
each if preferred.


Again, Harry, you want to use a test that has not shown *any* utility
for audible testing for a reference to 'validate' one that has. Your
call to "validate the test" applies to an even greater measure to the
test you propose as the reference.

Keith Hughes


  #151   Report Post  
Alan Hoyle
 
Posts: n/a
Default

On 8 Oct 2005 17:16:07 GMT, Stewart Pinkerton wrote:

[snip]

Read my lips - wire is wire.


Wait a minute, now I'm confused: I thought we were arguing about
"blind" testing, not "deaf" testing.... :-)

-alan

--
Alan Hoyle - - http://www.alanhoyle.com/
"I don't want the world, I just want your half." -TMBG
Get Horizontal, Play Ultimate.
  #152   Report Post  
 
Posts: n/a
Default

"Harry Lavo" wrote in message
...
"Steven Sullivan" wrote in message


1) Do ABX tests ever yield a 'difference' result for two
*certainly* identical sources (e.g., a phantom switch)?
No, they don't.

2) Do 'sighted' listening tests ever yield a
'difference' result in such tests? Yes, they do.


Nobody is arguing in this case for sighted tests. You are using it as a
strawman. And points 1) and 2) have nothing to do with validating abx,
since the suspicion against it has nothing to do with either.


I don't suppose you would be willing to make a suggestion as to how one
might "validate" the reliability of an ABX test--at least to your
satisfaction? IOW, what procedure would be convincing to an ABX
doubter? I've always figured that double blind testing, of which ABX is
an example, is the way to validate OTHER tests--the gold standard as it
were.

I'm willing to devote a substantial amount of time to doing a satisfactory
job of validating the ABX concept, but only if it will convince the
skeptics.

Norm Strong

  #153   Report Post  
Stewart Pinkerton
 
Posts: n/a
Default

On 9 Oct 2005 21:12:16 GMT, "Harry Lavo" wrote:

wrote in message
...
Harry Lavo wrote:
"Stewart Pinkerton" wrote in message
...
On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote:

Stewart Pinkerton wrote:

That's how Science works. You observe something unusual, come up with
a theory to explain it, use that theory to predict something else, and
observe the truth or falsity of your prediction.

The light-bending experiment came *after* the theory. What was it he
"observed" that led to the theory?

He developed his theory from the Lorentzian interpretation of
Maxwell's work, and arguably had prior knowledge of the
Michelson-Morley experimental results. He had many giants on whose
shoulders to stand....

Okay, you've shown you can dazzle. Now please interpret what
"observations" he developed his theory to explain.


Hi Harry,

Einstein was trying to explain the results of the Michelson-Morley
experiment, which failed to find a difference in travel time between
two perpendicular light beams--strange, because it was expected that
the absolute motion of the Earth combined with the theory that light
travelled through an absolutely fixed ether would lead to different
travel times. Einstein's wildly brilliant solution was to propose that
there is no absolute motion, that the speed of light looks the same to
all observers.

The point of Stewart & Bob is that Einstein's theory was based on
troublesome observations. And, they say, there are no "troublesome
observations" in audio; they have a way to detect if differences are
audible, and this is consistent with the reigning theory of the ear's
function.

Where I think they are wrong is basing their model on the assumption
that the ear and brain can be observed objectively without regard to
observations carried out on the inside (observing one's own perception
and listening to others describe their perceptions). So they end up
with a model that describes the ear and brain very well---under one set
of conditions.

Secondly, I think they have "no troublesome observations" because they
invoke perceptual illusion to explain away any observation they don't
like, while at the same time admitting there are too many contributing
factors to explain any given perception--it "could be" illusion, so it
"must be" illusion; but we can never explain why any particular
illusion occurred.

Mike


Thanks, Michael, for the actual explanation, which apparently was too
mundane for Stewart to attempt.


Actually, the moderator bounced my explanation, as he felt you might
find my comments hurtful........................ :-)

We seem in basic agreement that the big problem is that musical
interpretation by the ear/brain is totally subjective and can only be
described from within. We may differ in where the implications go from
there (although perhaps not). But until proponents of short-snippet
comparative testing can find some way of validating that their test does
not interfere with normal musical cognition (which many of us, myself
included, believe it does), then it will not be accepted by many/most
audiophiles.


The test is validated every day in the R&D labs of major players in
the audio industry. That you three don't *like* the results it gives,
e.g. that wire is just wire, doesn't invalidate the test.

You are the ones making the extraordinary claims, so where is *your*
evidence in support?

--

Stewart Pinkerton | Music is Art - Audio is Engineering
  #154   Report Post  
Harry Lavo
 
Posts: n/a
Default

"Stewart Pinkerton" wrote in message
...
On 9 Oct 2005 21:44:59 GMT, "Harry Lavo" wrote:


snip



Well actually no it hasn't, not for determining *difference*. Only
*after* difference has been established do they move to *preference*
testing, where monadic testing is certainly appropriate. Note however
that it is necessary to *first* establish difference. Without
difference, preference is nonsensical.


I've been told that their preference tests discriminated better than their
discrimination tests, which is why they didn't use them, but I won't state
that as fact until I've obtained and read the test write-ups myself. If
that finding is actually true, it would cast grave doubt on abx, since
preference tests are simple blind AB tests.

  #155   Report Post  
 
Posts: n/a
Default

Harry Lavo wrote:

But their descriptive evaluations were monadic.


Only if "monadic" means whatever Harry Lavo wants it to mean at any
given moment. You're the one who's lectured us for two years about how
standard ABX tests were insufficiently sensitive because they forced
the listener into "comparative mode." Well what the hell kind of mode
do you think a listener is in when he's evaluating four speakers at
once, changing at 15-30 second intervals, with 3-second gaps between
changes?

Try reading the research before you start pontificating about it,
Harry.

bob


  #156   Report Post  
 
Posts: n/a
Default

Keith Hughes wrote:
Harry Lavo wrote:

I don't want to measure "phantom" differences...first I want
to establish that there is in fact a "real" difference blind that can be
discriminated by a relatively large group of people. But I want to use a
non-intrusive test to do it.


Well, first, you don't know that the test is "non-intrusive" to a
greater extent than is ABX. You merely assume that it is.


Actually, Harry's "test," such as it is, would be far more intrusive
than an ABX test, because it forces subjects to listen to and for the
things Harry thinks they should be listening to and for. Whereas an ABX
test doesn't ask you to do that; it allows you to listen however you
would if you were deciding which of two cables to purchase.

Second, you
have *no* validation of your monadic testing for detecting *audible*
differences. Further, you assume that audio and organoleptic perceptions
are testable in the same fashion, and that the intrusiveness of any
particular test constraint applies equally to both (these are implicit in
your belief that the monadic test is suitable). Neither of which has
been verified. Thus, you are using an unverified method as a reference
against which to 'validate' a test that has been at least partially
validated, by your own admission, for the sensory mode under test. An
untenable approach IMO.


Yeah, the whole idea that a standard listening test needs to be
"validated" against a test that's never in history been used for that
purpose is absurd.

bob
  #157   Report Post  
Harry Lavo
 
Posts: n/a
Default

wrote in message
...
"Harry Lavo" wrote in message
...
"Steven Sullivan" wrote in message


1) Do ABX tests ever yield a 'difference' result for two
*certainly* identical sources (e.g., a phantom switch)?
No, they don't.

2) Do 'sighted' listening tests ever yield a
'difference' result in such tests? Yes, they do.


Nobody is arguing in this case for sighted tests. You are using it as a
strawman. And points 1) and 2) have nothing to do with validating abx,
since the suspicion against it has nothing to do with either.


I don't suppose you would be willing to make a suggestion as to how one
might "validate" the reliability of an ABX test--at least to your
satisfaction? IOW, what procedure would be convincing to an ABX
doubter? I've always figured that double blind testing, of which ABX is
an example, is the way to validate OTHER tests--the gold standard as it
were.

I'm willing to devote a substantial amount of time to doing a satisfactory
job of validating the ABX concept, but only if it will convince the
skeptics.


Norm, what do you think my last half dozen posts have been about?
They've been about how to go about validating ABX...by using broad-based
monadic testing and full musical excerpts to identify a subtle yet real
difference unrelated to frequency response or signal levels. This test
is the one least likely to interfere with normal listening habits and
therefore the one most likely to catch such differences if they exist.
Once such a difference is confirmed, then the test would be replicated
using ABX techniques. If ABX picks up the difference, it is validated.
And if it is validated, many of us will stop arguing and start using it.
If it is not validated, hopefully some objectivists would consider
abandoning it.

  #158   Report Post  
Harry Lavo
 
Posts: n/a
Default

"Keith Hughes" wrote in message
...
Harry Lavo wrote:
"Keith Hughes" wrote in message
...

Harry Lavo wrote:


snip

Where Harry? Please show us how "using whatever *blind* methodology you
would like" constrains you to "ABX", or even intimates that ABX is
involved.


Because ABX and its cousins are the only tests that use repeated
comparisons among individual users. That is audiometric testing, not
social science testing such as preference testing (AB) and descriptive
testing (monadic, or comparative monadic).


And this is audio, Harry, not social science. When you do taste testing
for foods, you are free to rely on *all* organoleptic perceptual
components as that *is* the context of use. That's why this type of test
is not suitable for verification of *audible* differences - it is not
designed to control the organoleptic components that contribute to a
"preference".


There is a heavy social science side to audio that has been ignored....for
the interpretation of sound, particularly musical value judgements (e.g. is
the bass "right", does the orchestra sound "lifelike") are subjective
judgements that can only be reported by people....no different than people
reporting whether they liked a certain food, or color, or flavor, or thought
a certain imitation sour cream mix tasted "almost like the real thing".
Testing that ignores this aspect of the audiophile experience is
automatically suspect, and that is one of the reasons why the abx test is
not embraced by most audiophiles.


There's a whole different world of testing out there that you apparently
are not familiar with.

The same applies to you as well, obviously. So what? We're talking about
a narrow subject here.


No, *you* are talking about a narrow subject. You are also talking about
taking a test developed for use in a narrow way and applying it for use
in a much broader way. That is why the potential test set must also be
broadened.


Now, *We* are...audio.

snip


Your irony escapes me.


No, I would accept that as a real difference for you. And therefore in
all probability a real difference, although perhaps one that only a few
percent of the population might hear.


Thank you for making my point. Your test is a population distribution
test, *not* a discrimination test. In the scenario I presented, the
results were not real *for me*, they were real. From that point, we can
discuss investigative ways for determining cause and extent. One need
continue with a larger sample size *ONLY* if distribution within the
population is of interest.


Sorry, Keith, there are very sophisticated probability measurements to
determine significance between two distributed populations, and if the
difference exists (and the population of testers is large enough) it
will be picked up.

Furthermore, the percentage of people hearing the same thing is likely to be
higher because the test is less demanding and more closely approximates
normal listening.
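
A quick Monte Carlo sketch (all numbers invented) of what "large
enough" means when only a few percent of listeners hear a difference:
simulate two monadic rating cells in which 5% of one cell shifts its
rating, and count how often a two-sample t-test reaches p < 0.05.

# Simulated two-cell monadic test where only a 5% subgroup hears a shift.
import random
from scipy import stats

def detection_rate(n_per_cell, frac_hears=0.05, shift=2.0, runs=300):
    hits = 0
    for _ in range(runs):
        a = [random.gauss(7, 1.5) for _ in range(n_per_cell)]
        b = [random.gauss(7, 1.5) +
             (shift if random.random() < frac_hears else 0)
             for _ in range(n_per_cell)]
        if stats.ttest_ind(a, b).pvalue < 0.05:
            hits += 1
    return hits / runs

for n in (100, 1000, 4000):
    print(n, detection_rate(n))  # rises with cell size; a 5% subgroup
                                 # needs cells in the thousands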


If so, it would also show up in a monadic test if the test sample is
large enough (it only takes a few percent far off the centerline of the
bell-curve to create significant differences). But...


If it were just a few percent off the centerline, there would be a very
low significance, buried in the noise. Tukey's fences would likely
identify them as outliers.


In measuring probabilities against a null hypothesis in distributed
samples, if there are outliers there is a reason for them....that's one
of the beauties of using a distributed population. There is virtually no
chance of a true outlier screwing up the results.


I am talking about differences that audiophiles claim to hear that don't
show up in ABX testing. If they are real, they will show up in monadic
perceptual testing, since it is a "cleaner" test (e.g. fewer intervening
variables versus normal listening). If there are no differences, they
won't show up. But before we can conclude that ABX will also pick up
these differences (and I'm talking here about things like depth of
soundstage, transparency, holography, etc.) we have to know if they are
"real" (statistically) under conditions approximating relaxed home
listening. Conditions that are a far cry from normal ABX-type testing.


Yes, you're talking about differences that have not been demonstrated
under *any* test scenario other than sighted.


First, how many published component tests are you citing to support this
claim? Cite them, please. And of those, how many have *not* been ABX
tests? In other words, how do you determine that the differences not
being found are not the result of the test technique and environment
itself?


snip

I made no claims that 90% of audiophiles can easily and reliably
discriminate.


You claimed that 90% of audiophiles believed, counter to the objectivists,
that components sounded different, and could be distinguished. The
"easily and reliably" reflects the opinions typically espoused here, a la
our radioactive buddy.


Thank you for toning down your claim.


I said the 90% don't buy into abx testing as a valid means of evaluating
the musicality of audio components.


And thus, the belief in ability to discriminate is based solely on sighted
evaluations, right?


Wrong. ABX tests are only one kind of blind test. To the best of my
knowledge nobody in the audio industry has yet had the motivation or
resources to undertake the kind of validation testing I have proposed. That
is another kind. Simple AB preference tests, done blind, are a third. I
can probably name another four or six variations on these.

Again, my argument is not with blind testing, other than its practicality for home
use in the purchase of equipment. My problem is with short-snippet,
comparative testing, of which ABX is the leading example. And the same goes
for most other opponents whose position I have run into here on usenet.
Your suggestion that we oppose blind testing is a strawman that is often
used on usenet to avoid engaging over the real issues raised against ABX and
its ilk.


snip

1. We have a subject population that can discriminate (according to them)
between A and B under sighted conditions. We do not need to sample a
large population for 'preference' as your monadic test scenario would, we
have a ready-made subject base, each with their own 'method' they have
"validated" sighted.

2. We can recreate *all* of the conditions 'they' normally use to make
those discriminations (duration, configuration, location, relaxation,
prestidigitation...etc.) with the *sole* exception of foreknowledge of
whether they listen to A or to B.

3. If, under the *exact same conditions + blind*, 'they' can no longer
discriminate, then their initial discrimination results are
unsubstantiated, and must be assumed to be invalid.

4. If, under the *exact same conditions + blind*, they *CAN*
discriminate, then their initial discrimination results are confirmed.

5. If subsequent testing of confirmed discrimination results, via ABX,
results in a null response, *then* the method is inappropriate for that
use.


snip

The sequence you describe is exactly the one I have outlined, Keith.
Except that steps 1-4 would be carried out among a large population and
step 2 would be combined with step 3 and step 4 (although it could be
done exactly as you outline).


Except that each person would conduct only *one* trial, which of course
introduces another huge source of error (albeit random, *if* properly
executed)...the reason that a huge sample size is needed for that type of
test. Not nearly the "clean" test you claim.

However, I am not interested in what the group *thinks* they can hear
sighted, I want to know *that* they can hear it blind in a monadic test.


Well, you clearly are interested, Harry, because the *only* reason to
question whether the boundaries of ABX testing can extend beyond where you
believe it to be 'validated' is the presence of anecdotal evidence based
on sighted evaluation. What other reasons are there (non-phenomenological,
that is)?


I'm not saying I'm not interested in the question, I'm saying that for
purposes of establishing a control test it is irrelevant. The control must
be both perceived and "real" in the sense of being measurable with
statistical significance in monadic testing. Things that are perceived but
are not real are totally irrelevant to the necessary control. The purpose
of the control test is to see if ABX testing can pick up real differences
that are not volume- or frequency-response-related.


I don't want to measure "phantom" differences...first I want to establish
that there is in fact a "real" difference that can be discriminated blind
by a relatively large group of people. But I want to use a non-intrusive
test to do it.


Well, first, you don't know that the test is "non-intrusive" to a greater
extent than is ABX. You merely assume that it is. Second, you have *no*
validation of your monadic testing for detecting *audible* differences.
Further, you assume that audio and organoleptic perceptions are testable
in the same fashion, and that the intrusiveness of any particular test
constraint applies equally to both (these are implicit in your belief that
the monadic test is suitable). Neither assumption has been verified. Thus,
you are using an unverified method as a reference against which to
'validate' a test that has been at least partially validated, by your own
admission, for the sensory mode under test. An untenable approach, IMO.


Absolutely, I don't know. But I do know test design, and there has been
plenty of discussion here about how people listen in arguing these
issues...even Arnie's 10 criteria deal with some of them. I can certainly
say that the monadic test as proposed comes a lot closer than does ABX.
First, it only has to be done once, so it can use full segments of music,
establishing musical context and allowing time for differences to surface.
Second, it does not require active comparison at all; it simply requires
normal audiophile-type reactions to the music and the sound. Third, any
"rating" is done after the listening is over, not during it, and is based on
recall...recall that can take into account perceptions both acute and vague,
as well as feeling-states.

Compare that to ABX, where one must somehow make fifteen to twenty choices,
with multiple comparisons before each choice, using an intervening box that
changes the physical parameters of the test setup, and of necessity
(because of the time/stress factor) using only short snippets of music.
Very relaxing and enjoyable, right?

The monadic testing approximates as much as possible a normal listening
environment and lets statistics tell us about differences in after-the-fact
reported impressions/ratings. The ABX forces us to make constant, rational
comparisons in a rush to beat fatigue.
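
As an aside, the "fifteen to twenty choices" figure is not arbitrary; it
falls out of binomial arithmetic. A short sketch, illustrative only and
supplied by neither poster, of the fewest correct answers needed to beat
chance at the 0.05 level:

    # Why ABX runs tend toward 15-20 trials: the minimum correct
    # answers out of n that reach p < 0.05 against pure guessing.
    from math import comb

    def min_correct(trials, alpha=0.05):
        for k in range(trials, -1, -1):
            tail = sum(comb(trials, j) for j in range(k, trials + 1)) / 2**trials
            if tail > alpha:
                return k + 1
        return 0

    for n in (8, 12, 16, 20):
        print(f"{n} trials: {min_correct(n)} correct needed")
    # 8 -> 7, 12 -> 10, 16 -> 12, 20 -> 15: fewer trials demand a much
    # higher hit rate, one pressure toward longer (more fatiguing) runs.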


*Then* do an ABX test wherein a similarly large group of people use ABX,
*except* that instead of a few individuals each doing seventeen trials,
seventeen individuals each do one trial. This separates the test (e.g. ABX
vs. monadic) from the individual doing the testing.


No, Harry, it does not "separate the test from the individual" at all.
That process is the same - you *assume* that multiple presentations are
inherently more intrusive, and thus data-correlative, than a single
presentation, something I believe you have no data to support, relative to
audio. There's a very valid reason that repetitive trials for
organoleptic perception testing are problematic - the senses quickly
become habituated, and discrimination ability is reduced. AFAIK, barring
fatigue (or high-volume-related artifacts, which should, of course, be
controlled for in the test), this has not been shown to be an issue with
auditory testing.


You must be kidding. This is one of the things most commented upon by
people using the technique...how quickly they lose the ability to
discriminate as fatigue grows. Even those using and supporting the test
often report it as fatiguing and rather grueling, with a sense of great
uncertainty developing in the late stages. The ITU guidelines even
comment upon this aspect of the test as one of the reasons for limiting
the number of trials.


The final test, once ABX is validated, is then for individuals to use it
themselves, twenty times each if preferred.


Again, Harry, you want to use a test that has not shown *any* utility for
audible testing as a reference to 'validate' one that has. Your call to
"validate the test" applies in even greater measure to the test you
propose as the reference.


Keith, I am trying, along with others here, to show why ABX should be
viewed with some skepticism for the purpose of open-ended evaluation of
audio components, until and unless it is validated. It is that simple.

  #159   Report Post  
 
Posts: n/a
Default

Harry Lavo wrote:

We were speaking specifically of his latest round of loudspeaker tests,
which Sean himself describes as "monadic".


Either provide a quote and citation, or admit you are making this up.

bob