#81

Mark DeBellis wrote:
The question that interests me now is whether the implications of an
identification ("Was that SACD or CD?") test need be the same as those
of a discrimination ("Are A and B the same or different?") test. Does
the research show, in particular, that an identification test (the
kind I undertook) is among the kinds of tests that are reliable for
determining whether two sources sound different?


If by "identification test," you mean that you listen to a single
signal and decide whether it is CD or SACD, that is extremely
difficult, because you must remember what both SACD and CD sound like.
(And, as I've said before, our aural memory for such small sonic
differences is far too short to do that.)

Whereas, in a proper same-different test (or an ABX test, which is a
variant), you have both signals available to you at all times, and can
switch immediately between them, which allows you to compare directly.

Unfortunately, most home users cannot do that sort of a test, because
it requires you not only to level-match the two (relatively easy) but
also time-sync them (very hard). It would be too easy to tell that the
two were different if one were running even fractionally ahead of the
other.
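
If both sources can first be captured to files, though, the time-sync
problem becomes tractable in software. A minimal sketch, assuming both
captures share a sample rate; the function name and approach are
illustrative only, not a standard tool:

import numpy as np
from scipy.signal import correlate

def time_align(a, b):
    """Shift b so it lines up sample-for-sample with a."""
    c = correlate(a, b, mode="full")
    d = (len(b) - 1) - int(np.argmax(c))  # samples by which b lags a
    if d >= 0:
        b = b[d:]                              # b starts late: advance it
    else:
        b = np.concatenate([np.zeros(-d), b])  # b starts early: pad its head
    return b[: len(a)], d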

bob
#82

Harry Lavo wrote:

We aren't looking to determine differences, Bob.


You're the one who started this whole conversation by insisting that an
ABX test was inadequate. Well, the ONLY purpose of an ABX test is to
determine difference. If your argument is that an ABX test is not
adequate for determining something it was not designed to determine,
then you've been wasting our time.

We're looking to evaluate
audio components' sonic signatures and subjective shading of musical
reproduction. And there has been no confirmation that ABX or a straight AB
difference test can show up all the various shadings that show up in
longer-term listening evaluations.


There is no evidence that "various shadings" really do show up (rather
than simply being imagined by the listener) in longer-term listening
evaluations of components that cannot be distinguished in ABX tests.
You are once again assuming your conclusion.

bob
#83 vlad

Harry Lavo wrote:
"vlad" wrote in message
...
Harry,

It was a few weeks ago that you described your "monadic" test for the
first time in this group. Now you are talking about this test as an
established fact.

Even if somebody went to the hassle and expense of
implementing your suggestion, it is not at all obvious that the test
would produce any results. I think the most likely outcome is that it
would find your subjective terms like "warmth", "depth", etc. not
correlated to the sound of the recording. I would bet that the
distribution of any particular term would be completely random across
different users.

But of course then you would require not 200 participants but 2,000,
or something else that will again make the proposed test unfeasible. And
you will continue to speculate about the validity of your imaginary test.

Please, either provide some proof that your so-called "monadic" test
works or stop speculating about it.

vlad


Harry, you did not address my statement about your "monadic" test
procedure. Let me repeat it here -

-- Even if somebody went to the hassle and expense of
-- implementing your suggestion, it is not at all obvious that the test
-- would produce any results. I think the most likely outcome is that it
-- would find your subjective terms like "warmth", "depth", etc. not
-- correlated to the sound of the recording.

Also you are trying to present your test as a means of "validation"
of ABX/DBT tests. ABX/DBT tests do not need validation. They test the
audibility of differences in physical devices (amps, wires, etc.) and for
this purpose they work just fine according to experts in this field.

Your 'monadic' testing is designed to measure subjective
differences.
You probably can measure subjective preferences, I will give you that.
For instance, after testing 10,000 subjects you can conclude that 52%
favor box A, 45% box B, and 3% are undecided. After all the effort and
money spent on this test, will these results have any value? Subjective
is subjective and that is all.

For instance, many people have a subjective preference for LPs. But that
does not make LP an accurate reproduction medium. Should we stick to
LPs for music listening? We have much better means now to store and
transfer audio signals. It is a matter of preference for some people,
that's it. Nobody argues with preferences.



Well, I guess I can understand why you feel that way. But fact is, Vlad, I
postulated such a test as a key part (the "control" part) of a validation
test here nearly two years ago.


DBT does not need validation by "monadic" tests.

I let the matter drop after much
controversy, and only recently brought it up again (in another forum, but it
has spilled over here). I also realized that perhaps understanding of what
I was proposing was buried in the complexity of the overall testing needed
to validate quick-switch testing, so I have tried to make my explanations
as simple as possible.

The reason I say it is a standard test is that it is widely used in the
social sciences, psychological and behavioral sciences, and in the medical
sciences. Audio is a field where it has not traditionally been used, at
least to my knowledge. Partly this may be structural (there are not a lot
of large companies worried about the quality of musical reproduction, after
all). But more likely it is because the field has been dominated by sound
research conducted by physicists, electrical engineers, and audiologists.
However, more recently scientists have made rapid progress in brain research
with the growing realization that how we hear is very complex, and how we
hear music even more so. There is growing realization that musical
evaluation must be treated as a subjective phenomenon, and that means
treating its measurement using the tools of the social and psychological
scientists, and the medical scientists, not necessarily the physical
scientists.


Musical perception is a subjective phenomenon, and always was.


As difficult as you may find it to believe that ratings of things like
"warmth" or "depth" or "dimensional" have meaning, those kinds of subjective
yet descriptive phrases are widely used in subjective research. Of course,
part of the art of researchers in a given field is determining the best,
most precise, way of asking the question to minimize confusion. You don't
want to say "on a scale of one to five, rate this item on "warmth"". You
doubtless would construct a scale that said " on a scale of one to five,
where 'one' is a relatively cool tone, and 'five' is a relatively warm tone,
where would you place the sound you just heard?". Or something to that
effect.


No, I think first of all you will find that if you take two amps
or wires that are indistinguishable in a DBT, then the results of your
subjective evaluation test will be all over the map. I would expect
that the subjective impressions of the subjects will be very poorly
correlated, if correlated at all, with particular pieces of equipment.

So before pouring any money or effort into this kind of testing, I would
first ask why you think this test will give results at all. My
second question would be what you are going to do with the results.
Subjective preferences tend to change with time and can easily be
influenced by the last review in Stereophile.

I personally don't care about the subjective feelings of people I
don't know.


Part of the research art is developing, and oft-times pretesting, the
questions so that you know they are meaningful and subject to minimal
misinterpretation. This is all practical "art", and there are commercial
researchers who are quite good at it.


What do you mean by that?

vlad
#84

Mark DeBellis wrote:
On 24 Jun 2005 01:09:45 GMT, Gary Eickmeier
wrote:


You can't do a "quick switch" test with two sources that run at
different speeds because you can't synchronize them, which would be a
dead giveaway in itself, so that is a bad example.

If you want to use that example, you will have to listen first to one,
then the other, in its entirety, then decide if the speed difference is
audible. If so, then do a blind series, listening to a known version,
then to a randomly chosen one, and decide whether it is the same or
different. In this manner you will eventually arrive at a number for a
speed differential that is at the audible threshold. That is the basic
idea of how audio research is done. You may find that speed differences
of 1.01 will be inaudible to most, but audible to some with perfect
pitch. If this is interesting enough a question for you, then do the
research and report it.


p.s. Suppose one carried out research such as this and found, for a
given one-minute-long excerpt, the audible threshold. So a
given subject could reliably discriminate between the excerpt and a
version that is 1.01 times as fast (say). What theoretical reason would we
have to think that, if we did a quick switch test (see my previous
email for a suggestion about how to do it), the subject would be able
to tell the excerpts apart in that test?


Because the difference in pitch would be the way you'd be telling them
apart. (You certainly don't think you can tell the difference between a
passage that is 60 seconds long and a passage that is 60.6 seconds
long, do you?) And we know that differences in pitch are much easier to
detect when you can switch directly between the samples.

I don't understand the point about perfect pitch, because I am
supposing that one version is faster than the other, not that the
speed and pitch are both higher (as would be the case with analog
tape). Maybe I am not seeing your point though.


If one version is faster than the other, then the pitch will be higher,
whatever the medium. The only exception would be if you were to use
digital signal processing to correct for this. In that case, you
probably won't be able to tell them apart without a stopwatch unless
the difference is substantial. Our resident conductor would presumably
do somewhat better, because she is trained to be sensitive to subtle
differences in tempo. But even she would have her limits.

bob
#85

Harry Lavo wrote:

But that is a result of the fact that music itself is
subjective, and *cannot* be measured objectively. The closest you can come
perhaps is to substitute some kind of psychophysiological measurements.


Do you really believe all that???

Notation?
Music theory?
Tuning systems?
Harmonic series?
Compositional devices?

Just to name a few of the obvious ones.


#86 Mark DeBellis

On 27 Jun 2005 15:01:05 GMT, wrote:

Mark DeBellis wrote:
The question that interests me now is whether the implications of an
identification ("Was that SACD or CD?") test need be the same as those
of a discrimination ("Are A and B the same or different?") test. Does
the research show, in particular, that an identification test (the
kind I undertook) is among the kinds of tests that are reliable for
determining whether two sources sound different?


If by "identification test," you mean that you listen to a single
signal and decide whether it is CD or SACD, that is extremely
difficult, because you must remember what both SACD and CD sound like.
(And, as I've said before, our aural memory for such small sonic
differences is far too short to do that.)

Whereas, in a proper same-different test (or an ABX test, which is a
variant), you have both signals available to you at all times, and can
switch immediately between them, which allows you to compare directly.

Unfortunately, most home users cannot do that sort of a test, because
it requires you not only to level-match the two (relatively easy) but
also time-sync them (very hard). It would be too easy to tell that the
two were different if one were running even fractionally ahead of the
other.


By an identification test I mean one where you can switch back and
forth between the signals, but where what you have to decide is not
whether they are the same or different, but which one is CD and which
is SACD. This, I think, is difficult for some of the same reasons
that the test you describe above is difficult, so what should be
inferred from a subject's failure to get a high percentage of correct
answers on my kind of identification test is not necessarily the same
as what should be inferred from a subject's failure to get a high
score on a proper "same-different" test.

Mark
#87 Mark DeBellis

On 27 Jun 2005 14:52:21 GMT, Gary Eickmeier
wrote:

Mark DeBellis wrote:
The question that interests me now is whether the implications of an
identification ("Was that SACD or CD?") test need be the same as those
of a discrimination ("Are A and B the same or different?") test. Does
the research show, in particular, that an identification test (the
kind I undertook) is among the kinds of tests that are reliable for
determining whether two sources sound different?


I'm not sure what you mean by "identification test." There is no such
paradigm in what I have read. It is much more difficult to listen to a
randomly selected source and try to "identify" it than to compare two
sources and decide "same" or "different." In an ABX test, for example,
you can listen to the two known sources as long as you want, switch back
and forth between them and listen for differences, see if you can get a
"fix" on just what each sounds like, then go for a test. In the test,
you would select A or B, then let the comparator select X, and decide
whether X is A or B. You usually do this by quick switching between A
and X, then B and X, and deciding same or different. If X is same as A,
then you put A as the identification of it, and press on to trial 2.

If the differences are really audible, the trials will be child's play.
If they sound identical, you will be guessing and probably know it.

Anyway, the task is to decide same or different, not to identify the
source when presented with a single signal.


Thank you. That confirms my belief and I appreciate the elegant
description of the testing paradigm.

Mark
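
In code form, the trial structure Gary describes boils down to something
like the following minimal Python sketch. The listener is modeled as a
callable, the 95% criterion in the comment is ordinary binomial
arithmetic, and none of the names here come from the thread itself:

import random

def run_abx(listener, n_trials=16):
    """listener(x) -> 'A' or 'B'; x is the hidden identity of X each trial."""
    correct = 0
    for _ in range(n_trials):
        x = random.choice("AB")  # the comparator secretly assigns X
        if listener(x) == x:     # listener auditions A, B, X, then calls it
            correct += 1
    return correct

# A listener who only guesses averages n_trials / 2 correct; with 16
# trials, 12 or more correct clears the usual one-sided 95% criterion.
print(run_abx(lambda x: random.choice("AB")))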
#88 Jenn

In article , wrote:

Mark DeBellis wrote:
On 24 Jun 2005 01:09:45 GMT, Gary Eickmeier
wrote:


You can't do a "quick switch" test with two sources that run at
different speeds because you can't synchronize them, which would be a
dead giveaway in itself, so that is a bad example.

If you want to use that example, you will have to listen first to one,
then the other, in its entirety, then decide if the speed difference is
audible. If so, then do a blind series, listening to a known version,
then to a randomly chosen one, and decide whether it is the same or
different. In this manner you will eventually arrive at a number for a
speed differential that is at the audible threshold. That is the basic
idea of how audio research is done. You may find that speed differences
of 1.01 will be inaudible to most, but audible to some with perfect
pitch. If this is interesting enough a question for you, then do the
research and report it.


p.s. Suppose one carried out research such as this and found, for a
given one-minute-long excerpt, the audible threshold. So a
given subject could reliably discriminate between the excerpt and a
version that is 1.01 times as fast (say). What theoretical reason would we
have to think that, if we did a quick switch test (see my previous
email for a suggestion about how to do it), the subject would be able
to tell the excerpts apart in that test?


Because the difference in pitch would be the way you'd be telling them
apart. (You certainly don't think you can tell the difference between a
passage that is 60 seconds long and a passage that is 60.6 seconds
long, do you?) And we know that differences in pitch are much easier to
detect when you can switch directly between the samples.

I don't understand the point about perfect pitch, because I am
supposing that one version is faster than the other, not that the
speed and pitch are both higher (as would be the case with analog
tape). Maybe I am not seeing your point though.


If one version is faster than the other, then the pitch will be higher,
whatever the medium. The only exception would be if you were to use
digital signal processing to correct for this. In that case, you
probably won't be able to tell them apart without a stopwatch unless
the difference is substantial. Our resident conductor would presumably
do somewhat better, because she is trained to be sensitive to subtle
differences in tempo. But even she would have her limits.


Hey, I have MANY limitations! :-) I have found that I have sensitivity
in regard to tempi at about 3 beats per min. That is, I can tell that
one performance is slower or faster at about that threshold. I CAN'T pick
specific tempi out of the air with that degree of sensitivity. Others
can come pretty close to that. That's why I don't care for conducting
ballet, for example. Dancers need things REALLY exact in tempo, and I
just don't enjoy working that way; it's anti-musical to me.

I'm learning a lot through this discussion, btw. Thanks to the
participants.
#90

vlad wrote:
So before pouring any money or effort into this kind of testing, I would
first ask why you think this test will give results at all.


Because he doesn't like the results we've already got. No other reason.

The problem with using monadic tests for the purpose of determining
whether any difference is discernible between two components is that
you will get a large (and incalculable) number of false negatives.
You will get negative results:
1) when subjects really can't distinguish between the two,
2) when they could but didn't in this particular test (the standard
false negative that all such tests face), and
3) when subjects could distinguish between the two, but their
impressions based on whatever criteria you asked them about did not
lean consistently in a single direction. For example, if they could all
hear a difference between LP and CD, but half of them preferred one and
found it more lifelike/musical/etc., and the other half had exactly the
opposite reaction, the results would be inconclusive. And what good is
a test for difference that can't even distinguish between things that
sound as different as LP and CD?
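
To see how that third failure mode plays out numerically, here is a toy
simulation; the numbers are invented purely for illustration:

import numpy as np
rng = np.random.default_rng(0)

n = 300                    # subjects per monadic cell
# In each cell, half the subjects rate their medium +1 "lifelike" and
# half rate it -1, plus some noise -- everyone hears a difference, but
# the preferences split:
lp_cell = np.where(rng.random(n) < 0.5, 1.0, -1.0) + rng.normal(0, 0.5, n)
cd_cell = np.where(rng.random(n) < 0.5, 1.0, -1.0) + rng.normal(0, 0.5, n)

print(lp_cell.mean(), cd_cell.mean())  # both near 0: the cell averages
                                       # wash out to a "no difference" verdict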

bob


#91 Harry Lavo

wrote in message ...
Harry Lavo wrote:

We aren't looking to determine differences, Bob.


You're the one who started this whole conversation by insisting that an
ABX test was inadequate. Well, the ONLY purpose of an ABX test is to
determine difference. If your argument is that an ABX test is not
adequate for determining something it was not designed to determine,
then you've been wasting our time.



It started because an ABX test was proposed as a means of making listening
decisions for audio equipment.
The fact that *difference* is the wrong measure is just one of the problems
with this approach.

We're looking to evaluate
audio components' sonic signatures and subjective shading of musical
reproduction. And there has been no confirmation that ABX or a straight AB
difference test can show up all the various shadings that show up in
longer-term listening evaluations.


There is no evidence that "various shadings" really do show up (rather
than simply being imagined by the listener) in longer-term listening
evaluations of components that cannot be distinguished in ABX tests.
You are once again assuming your conclusion.


The shadings can be presumed to be there, as they are heard by many people,
until proven otherwise. And they can't be proven otherwise except through
something like a monadic control test. The "shadings" are subjective; it
requires a test that can determine whether subjective perception is real or
not, and that means ratings among a large cross-section of audiophiles, with
statistical analysis applied.

#92 Mark DeBellis

On 27 Jun 2005 21:34:43 GMT, wrote:

Mark DeBellis wrote:

Suppose one carried out research such as this and found, for a
given one-minute-long excerpt, the audible threshold. So a
given subject could reliably discriminate between the excerpt and a
version that is 1.01 times as fast (say). What theoretical reason would we
have to think that, if we did a quick switch test (see my previous
email for a suggestion about how to do it), the subject would be able
to tell the excerpts apart in that test?


Because the difference in pitch would be the way you'd be telling them
apart. (You certainly don't think you can tell the difference between a
passage that is 60 seconds long and a passage that is 60.6 seconds
long, do you?) And we know that differences in pitch are much easier to
detect when you can switch directly between the samples.

I don't understand the point about perfect pitch, because I am
supposing that one version is faster than the other, not that the
speed and pitch are both higher (as would be the case with analog
tape). Maybe I am not seeing your point though.


If one version is faster than the other, then the pitch will be higher,
whatever the medium. The only exception would be if you were to use
digital signal processing to correct for this. In that case, you
probably won't be able to tell them apart without a stopwatch unless
the difference is substantial. Our resident conductor would presumably
do somewhat better, because she is trained to be sensitive to subtle
differences in tempo. But even she would have her limits.


In the example in question, I am supposing that the speed is higher
but not the pitch. If the only way to do this is by digital signal
processing, then so be it. There will be an audible threshold at
which a subject can reliably discriminate between the excerpt and the
sped-up excerpt. My question stands: What theoretical reason would we
have to think that, if we did a quick switch test (see my previous
email for a suggestion about how to do it), the subject would be able
to tell the excerpts apart in that test?

Mark
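
As an aside, the stimulus Mark describes -- same pitch, slightly faster --
is straightforward to generate today with a phase-vocoder time stretch.
A sketch assuming the librosa and soundfile Python libraries are
installed; "clip.wav" is a placeholder filename:

import librosa
import soundfile as sf

y, sr = librosa.load("clip.wav", sr=None)            # keep native rate
y_fast = librosa.effects.time_stretch(y, rate=1.01)  # 1% faster, same pitch
sf.write("clip_fast.wav", y_fast, sr)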
#93

Mark DeBellis wrote:

By an identification test I mean one where you can switch back and
forth between the signals, but where what you have to decide is not
whether they are the same or different, but which one is CD and which
is SACD.


That would be slightly easier than just listening to a single one, but
it still requires you to remember the criteria by which you had
previously distinguished (or *thought* you'd distinguished) them. So
it's still harder than a same-different test, or ABX, or similar.

This, I think, is difficult for some of the same reasons
that the test you describe above is difficult, so what should be
inferred from a subject's failure to get a high percentage of correct
answers on my kind of identification test is not necessarily the same
as what should be inferred from a subject's failure to get a high
score on a proper "same-different" test.


Agreed. In particular, there's probably some subset of sonic
differences which you wouldn't detect in an identification test, but
you would in a same-different test.

bob
#94 Harry Lavo

wrote in message ...
Harry Lavo wrote:

But that is a result of the fact that music itself is
subjective, and *cannot* be measured objectively. The closest you can come
perhaps is to substitute some kind of psychophysiological measurements.


Do you really believe all that???

Notation?
Music theory?
Tuning systems?
Harmonic series?
Compositional devices?

Just to name a few of the obvious ones.


I see your point. Let me correct my statement: the "experiencing" of music
itself is subjective, and *cannot* be measured objectively.

Now hopefully you can agree to that, which is the part that is relevant to a
listening test.

#95 Harry Lavo

wrote in message ...
vlad wrote:
So before pouring any money or effort into this kind of testing, I would
first ask why you think this test will give results at all.


Because he doesn't like the results we've already got. No other reason.


Thanks for the gratuitous insult, Bob.


The problem with using monadic tests for the purpose of determining
whether any difference is discernible between two components is that
you will get a large (and incalculable) number of false negatives.
You will get negative results:
1) when subjects really can't distinguish between the two,
2) when they could but didn't in this particular test (the standard
false negative that all such tests face), and
3) when subjects could distinguish between the two, but their
impressions based on whatever criteria you asked them about did not
lean consistently in a single direction. For example, if they could all
hear a difference between LP and CD, but half of them preferred one and
found it more lifelike/musical/etc., and the other half had exactly the
opposite reaction, the results would be inconclusive. And what good is
a test for difference that can't even distinguish between things that
sound as different as LP and CD?


Basically, Bob, this exposition shows that you have no idea of how scaling
works to measure differences. Please read my current posts before you
*decide* (based on erroneous beliefs) why it doesn't work. If I am to
believe you, I just wasted twenty-five years of work and my companies
didn't make the hundreds of millions of dollars based on it that they
thought they did.



#96 Harry Lavo

"vlad" wrote in message
...
Harry Lavo wrote:


snip


Harry, you did not address my statement about your "monadic" test
procedure. Let me repeat it here -

-- Even if somebody went to the hassle and expense of
-- implementing your suggestion, it is not at all obvious that the test
-- would produce any results. I think the most likely outcome is that it
-- would find your subjective terms like "warmth", "depth", etc. not
-- correlated to the sound of the recording.


Well, your thoughts are your thoughts. But I have done a lot of research in
food, where ratings are subjective, and I simply disagree. If one amp, for
example, performs in a way that can be characterized as "cool" and another
as "warm", people's ratings will reflect that even if they think they are
rating the music rather than the amp. Although there probably is no reason
to deceive them, since the test is monadic. There will be substantial
scatter; they won't march in lockstep. But the averages will reflect the
difference, and if the difference in averages is great enough, they will
reach the 95% significance level. Then you can conclude that amp "A" is
warmer-sounding than amp "B".

Likewise, you can ask for overall preference and a whole series of ratings
on characteristics. Together they will tell you if and how the two amps
differ.

Keep in mind that this goes beyond measurement. If the "coolness" is a
static frequency response dip, it would also likely be heard in an ABX test.
If it is the way the timbre changes dynamically, heard over an extended
listen, it might not. In this *subjective* test, it really doesn't matter
what is creating the perception; the test simply determines whether the
perception difference is real or not.
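
To make the mechanics concrete, the analysis amounts to comparing the two
cell averages with an ordinary significance test. A minimal sketch with
simulated ratings (in a real test they would come from the two monadic
cells; all numbers here are invented):

import numpy as np
from scipy import stats
rng = np.random.default_rng(1)

cell_a = rng.normal(3.4, 1.0, 250).clip(1, 5)  # amp "A" cell, 1-5 warmth scale
cell_b = rng.normal(3.0, 1.0, 250).clip(1, 5)  # amp "B" cell

t, p = stats.ttest_ind(cell_a, cell_b)
print(f"mean A={cell_a.mean():.2f}  mean B={cell_b.mean():.2f}  p={p:.4f}")
# p < 0.05 would support "A is warmer-sounding than B" at the 95% level,
# despite the within-cell scatter.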


Also you are trying to present your test as a means of "validation"
of ABX/DBT tests. ABX/DBT tests do not need validation. They test the
audibility of differences in physical devices (amps, wires, etc.) and for
this purpose they work just fine according to experts in this field.


These tests detect differences that are volume-related, such as frequency
response, loudness, and standard distortions. They don't do so well, many of
us believe, on things that are more complex perceptually, such as imaging,
transparency, dynamic phase coherence, dimensionality, etc.


Your 'monadic' testing is designed to measure subjective
differences.


That is correct.

You probably can measure subjective preferences, I will give you that.


That's a start. You can also measure differences in perception, believe me.

Telling me, with my background, that you can't is like telling an EE that
you can't measure harmonic distortion.


For instance, after testing 10,000 subjects you can conclude that 52%
favor box A, 45% box B, and 3% are undecided. After all the effort and
money spent on this test, will these results have any value? Subjective
is subjective and that is all.


Well, for starters, such a preference by 10,000 people would be statistically
significant beyond a doubt. So you can say for sure "the two amps sound
different, and 'A' is preferred".

Now also suppose that those 10,000 subjects determine that Amp "A"
"sounds less constrained" on dynamic peaks, versus Amp "B" (at the 95%
confidence level), and they also determine that Amp "A" sounds "easier to
listen 'into' on soft passages" than Amp "B", again at the 95% level.

That would give a pretty good indication of why Amp "A" was preferred. Its
value, and what was done with the information, would depend on who did the
test and for what purpose.

(Incidentally, a sample size of 200-300 people per cell is usually an
acceptable trade-off between test cost and statistical sensitivity.)

Now in my case I proposed such a test as part of an overall series of tests
to determine if the short-form, quick-switch, comparative tests could give
the same results. If so, their worth would be proven for open-ended
evaluation of audio components. If not, they would be misleading for this
use, however valuable for other uses they might be. Or the test technique
might have to be altered slightly. For example, let us hypothesize some
possible results:

- a standard ABX test of 20 trials, conducted among ten similar people,
fails to reveal a statistical difference.
- a standard ABX test of 20 trials, conducted among ten similar people,
just reaches the 95% difference threshold.
- a standard AB preference test of 20 trials shows roughly the same
preference and significance among 10 people.
- a standard AB preference test of 20 trials shows no statistical preference
among 10 people, but shows a statistical preference when 20 people are
included.

All of these would have major implications for those using comparative tests
for the purposes of open-ended evaluation of audio components. But only
once the *benchmark* or control had been established.
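
For reference on those hypothetical 20-trial outcomes, the 95% criterion
for a single listener is plain binomial arithmetic (pooling across several
listeners is a separate analysis not sketched here):

from math import comb

def tail(k, n=20):
    """P(at least k correct out of n by guessing)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

for k in (13, 14, 15):
    print(k, round(tail(k), 3))
# prints 13 0.132 / 14 0.058 / 15 0.021 -- so 15 of 20 is the threshold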

For instance, many people have a subjective preference for LPs. But that
does not make LP an accurate reproduction medium. Should we stick to
LPs for music listening? We have much better means now to store and
transfer audio signals. It is a matter of preference for some people,
that's it. Nobody argues with preferences.


If I were a Sony executive and my testing among 300 people showed a
statistically significant 60-40 preference for vinyl over CD, I might think
hard about the product and marketing implications of same. Likewise, if I
had hard evidence that SACD was preferred over CD, I'd certainly be thinking
hard about how to capitalize on that fact.




Well, I guess I can understand why you feel that way. But fact is, Vlad, I
postulated such a test as a key part (the "control" part) of a validation
test here nearly two years ago.


DBT does not need validation by "monadic" tests.


The double-blind technique as a concept certainly does not. However,
quick-switch comparative testing certainly does for the purpose of
open-ended evaluation of audio components, since these tests were designed
for a whole 'nother purpose.


I let the matter drop after much
controversy, and only recently brought it up again (in another forum, but it
has spilled over here). I also realized that perhaps understanding of what
I was proposing was buried in the complexity of the overall testing needed
to validate quick-switch testing, so I have tried to make my explanations
as simple as possible.

The reason I say it is a standard test is that it is widely used in the
social sciences, psychological and behavioral sciences, and in the medical
sciences. Audio is a field where it has not traditionally been used, at
least to my knowledge. Partly this may be structural (there are not a lot
of large companies worried about the quality of musical reproduction, after
all). But more likely it is because the field has been dominated by sound
research conducted by physicists, electrical engineers, and audiologists.
However, more recently scientists have made rapid progress in brain research
with the growing realization that how we hear is very complex, and how we
hear music even more so. There is growing realization that musical
evaluation must be treated as a subjective phenomenon, and that means
treating its measurement using the tools of the social and psychological
scientists, and the medical scientists, not necessarily the physical
scientists.


Musical perception is a subjective phenomenon, and always was.


Then you must use a test that measures this subjective phenomenon in its
fullest. That means the test itself has to be designed to interfere as
little as possible with the actual act of listening and evaluation. This is
where quick-switch, comparative testing has a conceptual weakness, since
it completely alters the listening experience and most likely the portions
of the brain involved in this activity.




As difficult as you may find it to believe that ratings of things like
"warmth" or "depth" or "dimensional" have meaning, those kinds of subjective
yet descriptive phrases are widely used in subjective research. Of course,
part of the art of researchers in a given field is determining the best,
most precise, way of asking the question to minimize confusion. You don't
want to say "on a scale of one to five, rate this item on 'warmth'". You
doubtless would construct a scale that said "on a scale of one to five,
where 'one' is a relatively cool tone, and 'five' is a relatively warm tone,
where would you place the sound you just heard?". Or something to that
effect.


No, I think first of all you will find that if you take two amps
or wires that are indistinguishable in a DBT, then the results of your
subjective evaluation test will be all over the map. I would expect
that the subjective impressions of the subjects will be very poorly
correlated, if correlated at all, with particular pieces of equipment.


Au contraire... if there truly is no difference, the averages of the two
cells evaluating the amps or wires will be identical from a statistical
standpoint; that is, they would fail to differ at a statistically
significant level. Within each evaluating cell there would be a lot of
scatter, but the averages are what are used in such a test.


So before pouring any money or effort into this kind of testing, I would
first ask why you think this test will give results at all. My
second question would be what you are going to do with the results.
Subjective preferences tend to change with time and can easily be
influenced by the last review in Stereophile.

I personally don't care about the subjective feelings of people I
don't know.


I think I've answered all of this above. I've proposed it as a control test
for the short-form tests. And if I were a marketing or R&D exec at Sony or
Harman International, I'd consider using it for other purposes, as apparently
Harman has.



Part of the research art is developing, and oft-times pretesting, the
questions so that you know they are meaningful and subject to minimal
misinterpretation. This is all practical "art", and there are commercial
researchers who are quite good at it.


What do you mean by that?


I mean there are firms whose job it is to help companies design, conduct,
and evaluate tests. And one of the skills a company that does this has to
develop is the ability to design and pretest questions that make sense and
increase response coherence. I happened to study under the founder of one
such company while obtaining my MBA from Northwestern back in the early
'60s. Dr. Sidney Levy was a highly regarded leader in the field of
behavioral psychology. And then for twenty-five years as an executive I
helped design and make decisions based on such testing for a major consumer
packaged goods company, working with many such companies.

#97

Jenn wrote:

Hey, I have MANY limitations! :-) I have found that I have sensitivity
in regard to tempi at about 3 beats per min.


3 beats per minute out of how many? Three beats of Largo is a lot
longer than 3 beats of Presto.

That is, I can tell that
one performance is slower or faster at about that threshold. I CAN'T pick
specific tempi out of the air with that degree of sensitivity. Others
can come pretty close to that. That's why I don't care for conducting
ballet, for example. Dancers need things REALLY exact in tempo, and I
just don't enjoy working that way; it's anti-musical to me.

I'm learning a lot through this discussion, btw. Thanks to the
participants.


Don't say that. You'll only encourage us.

bob
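
To put bob's "out of how many?" into numbers: a fixed 3 BPM threshold is a
very different relative difference at different tempi. Plain arithmetic,
with the tempo figures chosen only for illustration:

for tempo in (40, 60, 120, 180):  # roughly largo through presto
    print(f"{tempo:3d} BPM: 3 BPM is {3 / tempo:.1%} of the tempo")
# 40 BPM: 7.5%   60 BPM: 5.0%   120 BPM: 2.5%   180 BPM: 1.7%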
#98

Harry Lavo wrote:
wrote in message ...
Harry Lavo wrote:

But that is a result of the fact that music itself is
subjective, and *cannot* be measured objectively. The closest you can come
perhaps is to substitute some kind of psychophysiological measurements.


Do you really believe all that???

Notation?
Music theory?
Tuning systems?
Harmonic series?
Compositional devices?

Just to name a few of the obvious ones.


I see your point. Let me correct my statement: the "experiencing" of music
itself is subjective, and *cannot* be measured objectively.


Now hopefully you can agree to that, which is the part that is relevant to a
listening test.


I wouldn't disagree, except that soliciting responses under controlled
conditions is also relevant, which is a bogeyman for you for reasons you
have yet to adequately explain.

They are not mutually exclusive, despite your assertions otherwise.

Saying the experience of music is subjective is sort of belaboring the
obvious. If I had to pick a single word to describe it better, I would use
'abstraction'.
#99

Harry Lavo wrote:
wrote in message ...
vlad wrote:
So before pouring any money or effort into this kind of testing, I would
first ask why you think this test will give results at all.


Because he doesn't like the results we've already got. No other reason.


Thanks for the gratuitous insult, Bob.



The problem with using monadic tests for the purpose of determining
whether any difference is discernible between two components is that
you will get a large (and incalculable) number of false negatives.
You will get negative results:
1) when subjects really can't distinguish between the two,
2) when they could but didn't in this particular test (the standard
false negative that all such tests face), and
3) when subjects could distinguish between the two, but their
impressions based on whatever criteria you asked them about did not
lean consistently in a single direction. For example, if they could all
hear a difference between LP and CD, but half of them preferred one and
found it more lifelike/musical/etc., and the other half had exactly the
opposite reaction, the results would be inconclusive. And what good is
a test for difference that can't even distinguish between things that
sound as different as LP and CD?


Basically, Bob, this exposition shows that you have no idea of how scaling
works to measure differences. Please read my current posts before you
*decide* (based on erroneous beliefs) why it doesn't work. If I am to
believe you, I just wasted twenty-five years of work and my companies
didn't make the hundreds of millions of dollars based on it that they
thought they did.


And those were audio tests. Correct?
#100 Mark DeBellis

On 28 Jun 2005 03:11:37 GMT, wrote:

Mark DeBellis wrote:

By an identification test I mean one where you can switch back and
forth between the signals, but where what you have to decide is not
whether they are the same or different, but which one is CD and which
is SACD.


That would be slightly easier than just listening to a single one, but
it still requires you to remember the criteria by which you had
previously distinguished (or *thought* you'd distinguished) them. So
it's still harder than a same-different test, or ABX, or similar.

This, I think, is difficult for some of the same reasons
that the test you describe above is difficult, so what should be
inferred from a subject's failure to get a high percentage of correct
answers on my kind of identification test is not necessarily the same
as what should be inferred from a subject's failure to get a high
score on a proper "same-different" test.


Agreed. In particular, there's probably some subset of sonic
differences which you wouldn't detect in an identification test, but
you would in a same-different test.


Yes indeed. Here is another example which may prove useful. Suppose
you have two signals. The first consists of the pattern dot-dot-dee
(where dot and dee are different pitches, say), repeated over and
over, where each dot or dee lasts one second. The second pattern
consists of dot-dot-dot-dee, repeated over and over. Say that these
signals are synchronized to begin at the start of the patterns, and
then each goes on in its own way.

Suppose now that the test consists only in comparing short
corresponding snippets of the two signals, two seconds in length (so
you hear only the snippet, not its surrounding context). If the task
is to say whether the two signals are the same, that will be easy,
because there will be different sounds on at least some of the
samples, assuming enough samples are allowed.

If on the other hand the task is to say which signal is which, it will
be impossible, because the samples are too short. This is an example
of the "forest-for-trees" phenomenon that I worried about in an
earlier post. And the difference between the dot-dot-dee pattern and
the dot-dot-dot-dee pattern is an example of a difference in
properties of temporally extended passages.

That is why I invoked that notion earlier, in order to give at least a
partial explanation of why identification tests might be inadequate.
I want to emphasize, this is a problem for *identification* tests; I
am not saying here that it is a problem for "proper same-different"
tests.

Mark
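
Mark's example is easy to check mechanically. A toy sketch, with tokens
standing in for the one-second tones; nothing here is from the thread
itself:

A = ("dot dot dee " * 20).split()      # ddD pattern, repeated
B = ("dot dot dot dee " * 15).split()  # dddD pattern, repeated

# Same/different: some corresponding two-second snippets do differ.
print(any(A[i:i+2] != B[i:i+2] for i in range(0, 58, 2)))  # True

# Identification: the set of two-second snippets is identical for both
# streams, so no single snippet can reveal which stream it came from.
def snips(s):
    return {tuple(s[i:i+2]) for i in range(len(s) - 1)}

print(snips(A) == snips(B))  # True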


#102

Mark DeBellis wrote:
In the example in question, I am supposing that the speed is higher
but not the pitch. If the only way to do this is by digital signal
processing, then so be it. There will be an audible threshold


No, there won't. You're not measuring audibility here. You're measuring
perception of elapsed time. (That is, if you are comparing a one-minute
segment to the same segment stretched out to one minute and X seconds.)
Now, you may also be measuring perception of tempo, and I would guess
that focusing on tempo would be much more effective than trying to
judge the relative length of two long musical segments. And you
certainly don't need to listen to a full minute to judge the tempo, do
you?

at
which a subject can reliably discriminate between the excerpt and the
sped-up excerpt. My question stands: What theoretical reason would we
have to think that, if we did a quick switch test (see my previous
email for a suggestion about how to do it), the subject would be able
to tell the excerpts apart in that test?


I've no idea which post you're referring to, but your question displays
a misconception about quick-switching tests. They do not *require*
switching; they *allow* switching. A subject can, in a quick-switching
test, listen to the entire one-minute segment, if he so chooses.
Subjects tend not to do so, however, because it tends not to work.

Let me pose your question a different way. Which method would work
better:

1) The DeBellis Method: Listen to a full one-minute segment, then
listen to the same segment (possibly now stretched to 1.X minutes), and
determine whether the two are the same.

2) The Marcus Method (if I may be so bold): Listen to sets of two
beats, and determine whether the distance between the beats is the same
or different.

I don't know offhand whether this experiment has been done (though I
can't imagine I'm the first to think of it). Absent any data, I see no
reason to believe that your method would be more sensitive than mine.
And given the general pattern of findings in psychoacoustics, I'd bet
on mine.
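
For scale, here is the same 1% difference expressed both ways; the 120 BPM
figure is only an example, and neither method's actual threshold is being
claimed:

tempo = 120                 # beats per minute (assumed example)
beat = 60 / tempo           # 0.5 s between beats
print(f"DeBellis method: 60.0 s vs {60 * 1.01:.1f} s total duration")
print(f"Marcus method:   {beat*1000:.0f} ms vs {beat*1.01*1000:.0f} ms "
      "between adjacent beats")
# 60.0 s vs 60.6 s, or 500 ms vs 505 ms -- the same 1% either way; the
# empirical question is which judgment listeners make more reliably.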

bob
#103

Harry Lavo wrote:
Basically, Bob, this exposition shows that you have no idea of how scaling
works to measure differences. Please read my current posts before you
*decide* (based on erroneous beliefs) why it doesn't work. If I am to
believe you, I just wasted twenty-five years of work and my companies
didn't make the hundreds of millions of dollars based on it that they
thought they did.


Basically, Harry, this exposition shows that you are not paying
attention to what I am saying. Please read my posts more carefully.

If, indeed, you spent 25 years trying to determine whether cereal A
tastes different than cereal B, then I would question the sanity of
your employers. But I suspect that you spent most of that time trying
to answer other questions, like which cereal people preferred, and what
they preferred about it.

Now, again assuming the sanity of your employers, they would not have
spent the kind of money that monadic testing costs if they had any
doubt about whether cereal A and cereal B tasted different.

And that is why all of your experience is completely irrelevant to the
basic objectivist-subjectivist divide in audio. In your career, you
were dealing with comparisons and evaluations of products that were
known to differ. Here, we are talking about components that are not
known to differ. That's Question #1. That's why we're talking about
difference, Harry. And I think we know why you keep trying to change
the subject.

bob
#104

Harry Lavo wrote:
wrote in message ...
Harry Lavo wrote:

We aren't looking to determine differences, Bob.


You're the one who started this whole conversation by insisting that an
ABX test was inadequate. Well, the ONLY purpose of an ABX test is to
determine difference. If your argument is that an ABX test is not
adequate for determining something it was not designed to determine,
then you've been wasting our time.



It started because an ABX test was proposed as a means of making listening
decisions for audio equipment.


So it apparently started because you misread something, and then
decided to pick a fight about it. No one's ever suggested using ABX
tests to "make listening decisions" here. It's been proposed only as a
way to confirm impressions that components sound different. Stop
fighting the straw men, Harry. It doesn't help your cause.

The fact that *difference* is the wrong measure is just one of the problems
with this approach.

We're looking to evaluate
audio components' sonic signatures and subjective shading of musical
reproduction. And there has been no confirmation that ABX or a straight AB
difference test can show up all the various shadings that show up in
longer-term listening evaluations.


There is no evidence that "various shadings" really do show up (rather
than simply being imagined by the listener) in longer-term listening
evaluations of components that cannot be distinguished in ABX tests.
You are once again assuming your conclusion.


The shadings can be presumed to be there, as they are heard by many people,
until proven otherwise.


Spoken like a true anti-empiricist. People who don't pick and choose
which science they wish to believe in will understand that things don't
exist just because people--even "many" people--claim they exist. Human
perception is not that simple.

And they can't be proven otherwise except through
something like a monadic control test. The "shadings" are subjective; it
requires a test that can determine whether subjective perception is real or
not, and that means ratings among a large cross-section of audiophiles, with
statistical analysis applied.


Just to sum up here, it is your position that ABX tests are inadequate
because:

1) they do not measure things they are not designed to measure; and,

2) they cannot detect things we do not know exist.

Glad we've got that straight.

bob
#106 Harry Lavo

wrote in message ...
Harry Lavo wrote:
wrote in message ...
Harry Lavo wrote:

But that is a result of the fact that music itself is
subjective, and *cannot* be measured objectively. The closest you can come
perhaps is to substitute some kind of psychophysiological measurements.

Do you really believe all that???

Notation?
Music theory?
Tuning systems?
Harmonic series?
Compositional devices?

Just to name a few of the obvious ones.


I see your point. Let me correct my statement: the "experiencing" of music
itself is subjective, and *cannot* be measured objectively.


Now hopefully you can agree to that, which is the part that is relevant to a
listening test.


I wouldn't disagree, except that soliciting responses under controlled
conditions is also relevant, which is a bogeyman for you for reasons you
have yet to adequately explain.

They are not mutually exclusive, despite your assertions otherwise.


We simply don't know that. Knowledge of the brain suggests they may be, or
at the very least that the two place demands on the brain different enough
that "controlled conditions" which impose quick-switching, short-snippet,
comparative choices interfere with normal musical perception.

The reason for a control test is to determine which assumptions are
correct.

Saying the experience of music is subjective is sort of belaboring the
obvious. If I had to pick a single word to describe it better, I would use
'abstraction'.


Well, it may be obvious. But the tests being used to say "no difference"
have been shown only to be highly sensitive to more objective volume/partial
volume differences... so other aspects of subjectivity may well be blocked.

#107 Harry Lavo

wrote in message ...
Harry Lavo wrote:
wrote in message
...
vlad wrote:
So before pouring any money or effort into this kind of testing, I would
first ask why you think this test will give results at all.

Because he doesn't like the results we've already got. No other reason.


Thanks for the gratuitous insult, Bob.



The problem with using monadic tests for the purpose of determining
whether any difference is discernible between two components is that
you will get a large (and incalculable) number of false negatives.
You will get negative results:
1) when subjects really can't distinguish between the two,
2) when they could but didn't in this particular test (the standard
false negative that all such tests face), and
3) when subjects could distinguish between the two, but their
impressions based on whatever criteria you asked them about did not
lean consistently in a single direction. For example, if they could all
hear a difference between LP and CD, but half of them preferred one and
found it more lifelike/musical/etc., and the other half had exactly the
opposite reaction, the results would be inconclusive. And what good is
a test for difference that can't even distinguish between things that
sound as different as LP and CD?


Basically, Bob, this exposition shows that you have no idea of how scaling
works to measure differences. Please read my current posts before you
*decide* (based on erroneous beliefs) why it doesn't work. If I am to
believe you, I just wasted twenty-five years of work and my companies
didn't make the hundreds of millions of dollars based on it that they
thought they did.


And those were audio tests. Correct?


Bob's critique was of test design and use, not audio per se. Test design
and use are practices in and of themselves, applicable to testing in any
field. It makes no difference in this case whether it's food, drugs, or
audio... scalar ratings work and are evaluated the same way in a monadic
test.

#108 Harry Lavo

"Mark DeBellis" wrote in message
...
On 28 Jun 2005 03:11:37 GMT, wrote:

Mark DeBellis wrote:

By an identification test I mean one where you can switch back and
forth between the signals, but where what you have to decide is not
whether they are the same or different, but which one is CD and which
is SACD.


That would be slightly easier than just listening to a single one, but
it still requires you to remember the criteria by which you had
previously distinguished (or *thought* you'd distinguished) them. So
it's still harder than a same-different test, or ABX, or similar.

This, I think, is difficult for some of the same reasons
that the test you describe above is difficult, so what should be
inferred from a subject's failure to get a high percentage of correct
answers on my kind of identification test is not necessarily the same
as what should be inferred from a subject's failure to get a high
score on a proper "same-different" test.


Agreed. In particular, there's probably some subset of sonic
differences which you wouldn't detect in an identification test, but
you would in a same-different test.


Yes indeed. Here is another example which may prove useful. Suppose
you have two signals. The first consists of the pattern dot-dot-dee
(where dot and dee are different pitches, say), repeated over and
over, where each dot or dee lasts one second. The second pattern
consists of dot-dot-dot-dee, repeated over and over. Say that these
signals are synchronized to begin at the start of the patterns, and
then each goes on in its own way.

Suppose now that the test consists only in comparing short
corresponding snippets of the two signals, two seconds in length (so
you hear only the snippet, not its surrounding context). If the task
is to say whether the two signals are the same, that will be easy,
because there will be different sounds on at least some of the
samples, assuming enough samples are allowed.

If on the other hand the task is to say which signal is which, it will
be impossible, because the samples are too short. This is an example
of the "forest-for-trees" phenomenon that I worried about in an
earlier post. And the difference between the dot-dot-dee pattern and
the dot-dot-dot-dee pattern is an example of a difference in
properties of temporally extended passages.

That is why I invoked that notion earlier, in order to give at least a
partial explanation of why identification tests might be inadequate.
I want to emphasize, this is a problem for *identification* tests; I
am not saying here that it is a problem for "proper same-different"
tests.


It would seem to have relevance to these tests when it comes to the
open-ended evaluation of musical reproduction, as opposed to white or pink
noise testing.

The latter are granular enough to work no matter what. Musical
interpretation is much more complex and requires many parts of the brain,
as well as the ear itself, to respond. Differences between pieces of
equipment using music as a signal source *must* have a context if the
brain/ear combination is to sort out any differences in perception.

My guess is that this is why white noise testing and volume differences are
picked up so readily in these tests. They are simplistic and continuous.

Music and musical reproduction, on the other hand, .........

#109 Mark DeBellis

On 25 Jun 2005 02:28:36 GMT, "Buster Mudd"
wrote:


But in order for a psychologist to postulate unconscious representation
they need to observe something in a subject's behavior that suggests
that Perceived-But-Not-Brought-To-Consciousness thing *was* affecting
the subject's cognitive economy. This gets right back to my previous
question: How would you go about *proving* (confirming? demonstrating?)
that something was in someone's "cognitive economy" if that something
could not enable that someone to perform a task?


Mark wrote:
Well, just as you say, by observing behavior that, together
with everything else that is observed, is best explained
by that hypothesis, in the context of a larger theory.


p.s. If you are thinking: but specifically what behavior or what sort
of behavior? DeBellis isn't telling me that! That is because the
relevant behavior would vary from one case to another. It would
depend on what the mental item was and what role it was playing in
somebody's psychology. The relevant behavior would be specified in
the psychological theory itself, not by you or me looking at the
theory, as it were, "from outside."

Mark
  #110   Report Post  
Keith Hughes
 
Posts: n/a
Default

Harry Lavo wrote:
wrote in message ...

Harry Lavo wrote:


We aren't looking to determine differences, Bob.


You're the one who started this whole conversation by insisting that an
ABX test was inadequate. Well, the ONLY purpose of an ABX test is to
determine difference. If your argument is that an ABX test is not
adequate for determining something it was not designed to determine,
then you've been wasting our time.


It started because an ABX test was proposed as a means of making listening
decisions for audio equipment.
The fact that *difference* is the wrong measure is just one of the problems
with this approach.


Clearly you must be joking. Difference is *the* requisite predicate. If
you cannot determine a difference, due to sonic characteristics only,
then a preference (as between components) must be based on non-sonic
attributes. QED.


We're looking to evaluate
audio components' sonic signatures and subjective shading of musical
reproduction. And there has been no confirmation that ABX or a straight AB
difference test can show up all the various shadings that show up in
longer-term listening evaluations.


There is no evidence that "various shadings" really do show up (rather
than simply being imagined by the listener) in longer-term listening
evaluations of components that cannot be distinguished in ABX tests.
You are once again assuming your conclusion.



The shadings can be presumed to be there, as they are heard by many people,
until proven otherwise. And they can't be proven otherwise except through
something like a monadic control test. The "shadings" are subjective; it
requires a test that can determine whether subjective perception is real or
not, and that means ratings among a large cross-section of audiophiles, with
statistical analysis applied.


You keep repeating this misguided idea that a "monadic / proto-monadic"
test must be applied to some vast population to have any meaning. As a
research method to identify the frequency/distribution of some attribute
or parameter, and extrapolate that to the general population, this
method has merit. However, relative to the situation being discussed
here, it is merely a dodge. Why? Because population distribution is
irrelevant within the current context. You're talking about a test for
identification of *preference* within the population, where there is a
*known* difference in presented stimuli. That's a basic precept in the
method. There is no *known* difference in stimuli in the current context
- that's the whole argument.

Luckily, however, you already have a population subset, yourself
included, who claim to possess an attribute (i.e. who can distinguish,
sighted, the differences within a myriad of devices believed by many to
be indistinguishable, and believe that those differences are *real* and
reproducible), and thus the test need only involve that subset. Conduct
the test among the identified subset, construct the test to utilize
blind controls and level matching, then test in whatever manner, using
whatever scoring system, and for whatever period, you wish. Perform
sufficient replicates to generate a statistically valid data set, and
you're done.
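
To make "sufficient replicates" concrete, here is a minimal sketch (my
illustration, not part of Keith's post) of the usual arithmetic for a
forced-choice blind test: the null hypothesis is 50% guessing, and an exact
one-sided binomial tail gives the p-value. The trial counts are made-up
examples.

from math import comb

def binomial_p_value(correct, trials, chance=0.5):
    # Probability of getting at least `correct` right out of `trials`
    # by guessing alone (exact one-sided binomial tail).
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(correct, trials + 1))

# Example: 12 correct identifications in 16 trials.
print(round(binomial_p_value(12, 16), 4))  # 0.0384 -- unlikely by chance alone

At the conventional 0.05 level, 12 of 16 would count as statistically
significant discrimination; with many fewer trials, chance performance is
much harder to rule out.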

Will this be universally transferable to the whole population? No, but
again, that's irrelevant. It will, however, identify whether there is
such an attribute (ability to distinguish cable differences for e.g.)
within the *ONLY* population subset of interest. There is no utility in
testing outside that subset until the existence of the 'perceived'
attribute is confirmed, or not.

You see, testing only yourself, Mr. Lavo, using proper controls, would
be sufficient to confirm the existence of the ability you claim. Your
failure to confirm such an ability could not be extrapolated to the
population, but that's not the intent. So what keeps you from doing
just that? I did, and my observed (and obvious) differences in
cables...disappeared.

Keith Hughes


  #111   Report Post  
Mark DeBellis
 
Posts: n/a
Default

In the example I posted previously ...

Suppose now that the test consists only in comparing short
corresponding snippets of the two signals, two seconds in length ...


please change the sample length to *one* second.

(If the samples are two seconds long, and if each dot lasts an entire
second ("dot" is a bad name for that, I know), and if a sample can
start partway through a dot, and if the dot begins with an
articulation so you can tell that a dot is beginning, then it would
not be impossible to tell that you were hearing three dots in a row,
if the sample started partway through a dot. This problem disappears
if the sample length is changed to one second.)
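
A small Python sketch may make the corrected example easier to see (my
illustration; it assumes one-second snippets aligned to whole seconds, as
described above). Corresponding snippets from the two signals frequently
differ, so a same-different test succeeds given enough samples; but every
snippet is just a "dot" or a "dee", both of which occur in both signals, so
no single snippet can identify which signal is which.

import random

PATTERN_A = ["dot", "dot", "dee"]         # repeats indefinitely
PATTERN_B = ["dot", "dot", "dot", "dee"]  # repeats indefinitely

def snippet(pattern, t):
    # The one-second event heard at integer second t.
    return pattern[t % len(pattern)]

random.seed(0)
times = [random.randrange(1000) for _ in range(20)]
diffs = sum(snippet(PATTERN_A, t) != snippet(PATTERN_B, t) for t in times)
print(diffs, "of 20 corresponding snippets differ")    # difference is detectable
print(sorted({snippet(PATTERN_A, t) for t in times}))  # ['dee', 'dot'] -- a
print(sorted({snippet(PATTERN_B, t) for t in times}))  # single snippet names
                                                       # neither signal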

  #112   Report Post  
Steven Sullivan
 
Posts: n/a
Default

wrote:
Harry Lavo wrote:
wrote in message ...
Harry Lavo wrote:

But that is a result of the fact that music itself is
subjective, and *cannot* be measured objectively. The closest you can
come
perhaps is to substitute some kind of psychophysiological measurements.

Do you really believe all that???

Notation?
Music theory?
Tuning systems?
Harmonic series?
Compositional devices?

Just to name a few of the obvious ones.


I see your point. Let me correct my statement: the "experiencing" of music
itself is subjective, and *cannot* be measured objectively.


Now hopefully you can agree to that, which is the part that is relevant to a
listening test.


I wouldn't disagree, except that soliciting responses under controlled
conditions is also relevant, which is a bogeyman for you for reasons you
have yet to adequately explain.



I would propose that the 'experiencing' of music isn't inherently
beyond scientific investigation, as brain scanning technology advances.
Certainly the 'experiencing' of music has been the subject of psychological
investigation.

Of course one can 'experience' things that have no physical existence, making
'experience' alone rather iffy as a basis for objective claims of
difference. I am pretty sure that Harry would report a different
'experience' of the *same* musical selection, using the same playback
gear, played twice in succession, if Harry was led to believe that he
was hearing different gear.

Possibly this 'experiential' difference would even have a
physical manifestation, visible in a brain scan. Imagination does.



They are not mutually exclusive, despite your assertions otherwise.


Saying the experience of music is subjective is sort of belaboring the obvious.
If I had to pick a single word to describe it better, I would use 'abstraction'.


--

-S
"You know what love really is? It's like you've swallowed a great big
secret. A warm wonderful secret that nobody else knows about." - 'Blame it
on Rio'
  #114   Report Post  
 
Posts: n/a
Default

Harry Lavo wrote:

We simply don't know that. Knowledge of the brain suggests they may be, or
at the very least that they make different enough demands on the brain that
"controlled conditions" which impose the need for quick-switching,
short-snippet, comparative choices interfere with normal
musical perception.


Then you would agree that all musicians are unmusical, because the
effort involved in just playing the right notes at the right time (objective)
destroys their emotional perception of music. Playing all those right notes
at the right time also involves training (read: rehearsal, where
musicians break pieces up into parts, make exercises out of passages,
compare snippets of interpretive ideas played back to back, etc., and then
have to put it all back together), which is something else that you seem
to think destroys music.

I think it's absurd. Sorry.

  #115   Report Post  
 
Posts: n/a
Default

Harry Lavo wrote:
wrote in message ...
Harry Lavo wrote:
wrote in message
...
vlad wrote:
So before pouring any money or efforts in this kind of testing I would
ask first why you think that this test will give results at all.

Because he doesn't like the results we've already got. No other reason.


Thanks for the gratuitous insult, Bob.



The problem with using monadic tests for the purpose of determining
whether any difference is discernible between two components is that
you will get a large (and incalculable) number of false negatives.
You will get negative results:
1) when subjects really can't distinguish between the two,
2) when they could but didn't in this particular test (the standard
false negative that all such tests face), and
3) when subjects could distinguish between the two, but their
impressions based on whatever criteria you asked them about did not
lean consistently in a single direction. For example, if they could all
hear a difference between LP and CD, but half of them preferred one and
found it more lifelike/musical/etc., and the other half had exactly the
opposite reaction, the results would be inconclusive. And what good is
a test for difference that can't even distinguish between things that
sound as different as LP and CD?
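
(To put numbers on the third failure mode: this sketch and its ratings are
my illustration, not Bob's. If every listener hears a difference but
preferences split evenly, the group-mean ratings come out identical, and the
monadic comparison reads as a null result.)

import statistics

control_group = [6, 7, 6, 7, 6, 7, 6, 7]  # all hear the same thing
split_group   = [9, 4, 9, 4, 9, 4, 9, 4]  # all hear a difference; half
                                          # prefer it, half dislike it
print(statistics.mean(control_group), statistics.mean(split_group))  # 6.5 6.5
# Identical means despite a large, real perceptual difference.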


Basically, Bob, this exposition shows that you have no idea of how
scaling
works to measure differences. Please read my current posts before you
*decide* (based on erroneous beliefs) why it doesn't work. If I am to
believe you, I just wasted twenty-five years of work and my company(s)
didn't make the hundreds of millions of dollars based on it that they
thought they did.


And those were audio tests. Correct?


Bob's critique was of test design and use, not audio per se. Test design
and use are practices in and of themselves, applicable to testing in any
field. It makes no difference in this case whether it's food, drugs, or
audio...scalar ratings work and are evaluated the same way in a monadic
test.


Yes, which leads me to my point. The details of how a test is
implemented depend on what you are testing. You keep trying to take
your experience with food-tasting tests and apply it to audio testing,
apparently without reading the scientific literature on hearing perception.

I doubt if you really understand the difference between marketing research and
basic research.


  #116   Report Post  
Buster Mudd
 
Posts: n/a
Default

Mark DeBellis wrote:

Yes indeed. Here is another example which may prove useful. Suppose
you have two signals. The first consists of the pattern dot-dot-dee
(where dot and dee are different pitches, say), repeated over and
over, where each dot or dee lasts one second. The second pattern
consists of dot-dot-dot-dee, repeated over and over. Say that these
signals are synchronized to begin at the start of the patterns, and
then each goes on in its own way.

Suppose now that the test consists only in comparing short
corresponding snippets of the two signals, two seconds in length (so
you hear only the snippet, not its surrounding context). If the task
is to say whether the two signals are the same, that will be easy,
because there will be different sounds on at least some of the
samples, assuming enough samples are allowed.

If on the other hand the task is to say which signal is which, it will
be impossible, because the samples are too short. This is an example
of the "forest-for-trees" phenomenon that I worried about in an
earlier post. And the difference between the dot-dot-dee pattern and
the dot-dot-dot-dee pattern is an example of a difference in
properties of temporally extended passages.



"Because the samples are too short"??? Unless your dots & dees are
plodding along at an excruciatingly lethargic adagio, the two-second
samples *wouldn't* be shorter than a single complete iteration of this
recurring dot-dot-dee or dot-dot-dot-dee pattern...which is all that
would be required for most folks to identify which signal is which.
  #118   Report Post  
Harry Lavo
 
Posts: n/a
Default

"Keith Hughes" wrote in message
...
Harry Lavo wrote:
wrote in message
...

Harry Lavo wrote:


We aren't looking to determine differences, Bob.

You're the one who started this whole conversation by insisting that an
ABX test was inadequate. Well, the ONLY purpose of an ABX test is to
determine difference. If your argument is that an ABX test is not
adequate for determining something it was not designed to determine,
then you've been wasting our time.


It started because an ABX test was proposed as a means of making
listening decisions for audio equipment.
The fact that *difference* is the wrong measure is just one of the
problems with this approach.


Clearly you must be joking. Difference is *the* requisite predicate. If
you cannot determine a difference, due to sonic characteristics only, then
a preference (as between components) must be based on non-sonic
attributes. QED.


Difference is a necessary condition to explain differences in sonic
perception. The problem is that AB or ABX testing has never been decisively
shown to capture in its "difference" measurement *all* the
things that can lead to a perceived difference. Thus the need for a control
test.




We're looking to evaluate
audio components' sonic signatures and subjective shading of musical
reproduction. And there has been no confirmation that ABX or a straight AB
difference test can show up all the various shadings that show up in
longer-term listening evaluations.

There is no evidence that "various shadings" really do show up (rather
than simply being imagined by the listener) in longer-term listening
evaluations of components that cannot be distinguished in ABX tests.
You are once again assuming your conclusion.



The shadings can be presumed to be there, as they are heard by many people,
until proven otherwise. And they can't be proven otherwise except
through something like a monadic control test. The "shadings" are
subjective; it requires a test that can determine whether subjective
perception is real or not, and that means ratings among a large
cross-section of audiophiles, with statistical analysis applied.


You keep repeating this misguided idea that a "monadic / proto-monadic"
test must be applied to some vast population to have any meaning. As a
research method to identify the frequency/distribution of some attribute
or parameter, and extrapolate that to the general population, this method
has merit. However, relative to the situation being discussed here, it is
merely a dodge. Why? Because population distribution is irrelevant within
the current context. You're talking about a test for identification of
*preference* within the population, where there is a *known* difference in
presented stimuli. That's a basic precept in the method. There is no
*known* difference in stimuli in the current context - that's the whole
argument.


I have proposed it only as a means of validating ABX and AB testing, to make
sure that they can deliver the goods in the more esoteric perceptual areas.
It has never been done, and until it is, the use of such tests, while
beguiling because of their simplicity, is simply a matter of faith in the
test technique. Not science.


Luckily, however, you already have a population subset, yourself included,
who claim to possess an attribute (i.e. who can distinguish, sighted, the
differences within a myriad of devices believed by many to be
indistinguishable, and believe that those differences are *real* and
reproducible), and thus the test need only involve that subset. Conduct
the test among the identified subset, construct the test to utilize blind
controls and level matching, then test in whatever manner, using whatever
scoring system, and for whatever period, you wish. Perform sufficient
replicates to generate a statistically valid data set, and you're done.


You can't use the test you believe might be inaccurate to validate itself.
Think about it.



Will this be universally transferable to the whole population? No, but
again, that's irrelevant. It will, however, identify whether there is
such an attribute (ability to distinguish cable differences for e.g.)
within the *ONLY* population subset of interest. There is no utility in
testing outside that subset until the existence of the 'perceived'
attribute is confirmed, or not.



Again you miss the basic point. The test is not *PROVEN* to work for all
conditions of perceived sonic difference.



You see, testing only yourself, Mr. Lavo, using proper controls, would be
sufficient to confirm the existence of the ability you claim. Your
failure to confirm such an ability could not be extrapolated to the
population, but that's not the intent. So what keeps you from doing just
that? I did, and my observed (and obvious) differences in
cables...disappeared.



Yep, so you bought the argument. Did you ever seriously question the
underlying premises of the test itself? Did you ever think about the
difference in how you listened during the test, and how you listen when
relaxing and enjoying music? Did you pause to consider that the ear/brain
function in *listening to music* is very complex and context-derived? If
not, then you've bought into a faith. But it is not science. If it were
truly science, its advocates (not its skeptics) would be pushing to
absolutely, positively verify it. That has not happened.

  #119   Report Post  
Harry Lavo
 
Posts: n/a
Default

wrote in message ...
Harry Lavo wrote:

We simply don't know that. Knowledge of the brain suggests they may be, or
at the very least that they make different enough demands on the brain that
"controlled conditions" which impose the need for quick-switching,
short-snippet, comparative choices interfere with normal
musical perception.


Then you would agree that all musicians are unmusical, because the
effort involved in just playing the right notes at the right time (objective)
destroys their emotional perception of music. Playing all those right notes
at the right time also involves training (read: rehearsal, where
musicians break pieces up into parts, make exercises out of passages,
compare snippets of interpretive ideas played back to back, etc., and then
have to put it all back together), which is something else that you seem
to think destroys music.

I think it's absurd. Sorry.


And I would suggest that a musician performing is more akin to an audiophile
taking a test than to one kicking back and simply experiencing the
music.


  #120   Report Post  
Harry Lavo
 
Posts: n/a
Default

"Steven Sullivan" wrote in message
...
wrote:
Harry Lavo wrote:
wrote in message
...
Harry Lavo wrote:

But that is a result of the fact that music itself is
subjective, and *cannot* be measured objectively. The closest you
can
come
perhaps is to substitute some kind of psychophysiological
measurements.

Do you really believe all that???

Notation?
Music theory?
Tuning systems?
Harmonic series?
Compositional devices?

Just to name a few of the obvious ones.


I see your point. Let me correct my statement: the "experiencing" of
music
itself is subjective, and *cannot* be measured objectively.


Now hopefully you can agree to that, which is the part that is relevant
to a
listening test.


I wouldn't disagree, except that soliciting responses under controlled
conditions is also relevant, which is a bogeyman for you for reasons you
have yet to adequately explain.



I would propose that the 'experiencing' of music isn't inherently
beyond scientific investigation, as brain scanning technology advances.
Certainly the 'experiencing' of music has been the subject of
psychological
investigation.


For what it is worth, Steven, I agree with you on this and hope more of this
type of work is done. From some of the articles I scanned briefly while
looking for the Oohashi article, it would appear more and more is being
done.


Of course one can 'experience' things that have no physical existence,
making
'experience' alone rather iffy as a basis for objective claims of
difference. I am pretty sure that Harry would report a different
'experience' of the *same* musical selection, using the same playback
gear, played twice in succession, if Harry was led to believe that he
was hearing different gear.

Possibly this 'experiential' difference would even have a
physical manifestation, visible in a brain scan. Imagination does.


Agree with you again. However, I am talking about a test that measures the
average response of two similar groups of people to the same musical
stimulus, but played through two different pieces of gear. Therefore any
difference can only be ascribed to the equipment. That's what test design is
meant to do...control all other variables, either by eliminating them or
by normalizing them.
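
A minimal sketch of how such a two-group comparison might be scored (the
protocol details, scale, and ratings below are my assumptions for
illustration, not Harry's design): each matched group rates the same musical
selection through one piece of gear on a scalar attribute, and the group
means are compared with Welch's t-test.

from scipy import stats

group_gear_A = [7, 6, 8, 7, 5, 7, 6, 8, 7, 6]  # e.g. "warmth" on a 1-9 scale
group_gear_B = [6, 5, 7, 6, 6, 5, 7, 6, 5, 6]

t, p = stats.ttest_ind(group_gear_A, group_gear_B, equal_var=False)
print(f"t = {t:.2f}, p = {p:.3f}")  # a small p suggests a real mean difference

With many attributes rated at once, a correction for multiple comparisons
would also be needed before ascribing any single difference to the equipment.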



snip remainder as not commented upon

