#121
Harry Lavo wrote:
snip ..and I know damn well what monadic, proto-monadic, and comparative tests can and cannot measure, both real and imputed (or as you would say, imagined).

Really? Hmmm, let's see...

You tell me how else, other than using ABX itself, you can determine whether a real perceived difference exists in one piece of audio gear versus another. You can't...it has to be done across a large enough group of people to have statistical significance

This is sheer nonsense, in the current context, as has been pointed out to you previously (by me, and I'll note that you did *not* reply). Statistical significance requires *ONLY* one participant, with multiple trials. Take *you* for example; you can easily do a sufficient number of trials, using whatever *blind* methodology you would like, on two components (say cables) for which *you* have identified, sighted, a consistent audible difference, to determine whether the chosen method statistically 'validates' your sighted results. That's it, finis. We are not talking about frequency distributions within a population, the only thing that would require a large population sample size; we're talking about using *your* method to detect (blind) the differences that *you* clearly hear...sighted.

...so one can say...tested blind, this group of (audiophiles, I presume) listening to movements (X,Y,Z) found "P" to have significantly higher ratings than "Q" on "transparency" and on "overall realism of the orchestra" (simply used as an example).

Again, irrelevant. This approach presupposes that presence of difference is an unknown, and/or that the frequency of detection capability within the population is not known, neither of which is the case here.

Then you know the difference is real (albeit perceived subjectively).

*You* already say that you know the difference is real, right? That's the point, and that's why this is mere obfuscation.

You then use an ABX test among a broad sample of yet another similarly-screened group of people, using short-snippets of movements X,Y,Z,

A fictitious constraint you gratuitously apply, yet again.

to see if in total they can detect the difference. If the test allows them to do so, you have validated the test.

Right, you would have validated your 'monadic' testing for detection of 'difference', as that is what the ABX protocol is designed to do.

If it does not, you have invalidated the test.

Clearly incorrect. This presupposes the superiority of the 'monadic' test protocol. You get statistically significant results from *BOTH* tests, and your conclusion is "see...ABX doesn't work!". Sorry, but that just ignores basic statistical precepts. You cannot discount one method just because it gives the results you don't want, when it has the same level of significance (albeit with results in the opposite direction) as your 'pet' method.

Finally, once you have validated the test, it can be used by single individuals to determine if they can reliably hear a difference between audio components similar to what they would experience in a more normal listening situation. If you are so sure that ABX testing works for open-ended evaluation of audio components playing music, you should be supporting such an effort, not ridiculing it. Because until you do, you are ****ing into the wind among the large majority of audiophiles.

No, you have it backwards. Until *one* single individual can demonstrate confirmation of differences they have already confirmed sighted (level matched, naturally), there is no point in applying the method to a large population to establish the frequency of discrimination capability within the larger population. Whether you use one person, or a thousand, makes no difference whatsoever. If *one* person using *any* blind protocol consistently identifies a difference, then that difference is real (to the level of significance the data allows). *Then* you can compare against ABX, or any other method. This is a test that you, personally, could easily conduct if you were truly interested. So where's your data?

Keith Hughes
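[Editorial note] The claim that statistical significance requires only one participant with multiple trials can be illustrated with a short binomial calculation. This is a sketch added for clarity, not part of the original thread; the 16-trial figure is an arbitrary example:

```python
from math import comb

def binomial_p_value(correct: int, trials: int, p_chance: float = 0.5) -> float:
    """One-sided p-value: probability of scoring >= `correct` hits out of
    `trials` purely by guessing (chance = p_chance per trial in ABX)."""
    return sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
               for k in range(correct, trials + 1))

# A single listener doing 16 blind trials: 12 or more correct answers
# would occur by pure guessing with probability below 0.05.
print(round(binomial_p_value(12, 16), 4))  # → 0.0384
```

So one listener, by themselves, can produce a statistically significant result; a large panel is only needed to estimate how common the ability is in a population, which is a different question.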
#122
On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote:
Stewart Pinkerton wrote: That's how Science works. You observe something unusual, come up with a theory to explain it, use that theory to predict something else, and observe the truth or falsity of your prediction. The light-bending experiment came *after* the theory.

What was it he "observed" that led to the theory?

He developed his theory from the Lorentzian interpretation of Maxwell's work, and arguably had prior knowledge of the Michelson-Morley experimental results. He had many giants on whose shoulders to stand....

In our case, we are observing the fact that 90% of audiophiles find the *sounds same* postulate so ridiculous and at odds with experience (not in just a few instances, but in many, many instances) that the postulate is rejected.

From whence comes this magical 90%, Harry? It seems as speculative as your other comments. I suspect that the real number is exceeded by those who believe that the world is ruled by shape-shifting reptiles.

And since we are dealing with strictly subjective phenomena, this rejection must be dealt with as a "fact". Mark and Michael have been working to point out why in theory the short-snippet, comparative testing may have missed a crucial element...an element that seems to square with what many audiophiles instinctively or intuitively feel is missing. Now it is time for some experimentation.

Indeed it is - so go do some, instead of railing against the entire body of accumulated knowledge about audio.

You might also bone up on his complete reluctance to embrace quantum mechanics, despite being one of the founding fathers (e.g., for explaining the photoelectric effect, for which he won his only Nobel prize) and in spite of overwhelming physical evidence.

I do know of his reluctance to accept quantum mechanics....never said he didn't have his weaknesses. But it goes to show what happens when science as faith replaces science as science.

Oh dear.

No, it goes to show that even the greatest scientists can sometimes refuse to accept scientific facts. I suppose that gives you *some* excuse............

He refused to accept them because they were so far from what his entire training had taught him *ought* to be. Thus his famous "roll of the dice" quote. That's called "belief" and it overcame his scientific training.
#123
On 8 Oct 2005 01:59:58 GMT, "Harry Lavo" wrote:
wrote in message ... Harry Lavo wrote: Actually not, for the research it grew out of. But my contention is that IMO and that of many others, it *may* be fatally flawed as a device for the open-ended evaluation of audio components. And that until it has been *validated* for that purpose, it should be promoted and received with substantial skepticism. My monadic test proposal is a legitimate way of doing that validation.

Lest anyone think that Harry has been vested with any authority to declare what is and is not a valid psychoacoustic test, this is completely bass-ackwards. ABX is validated both by its constant use in the field and by its ability to make and confirm predictions about audibility. Whereas nobody in the field has ever used a monadic test to determine the audibility of anything. Not once. Ever. And for good reason.

Tell that to Harman International and see my comments below.

Harman International uses quick-switch level-matched DBTs. As do many other major audio manufacturers.

So the first thing Harry needs to do, before he starts his Annus Mirabilis Project, is to validate that monadic testing can be used as an audibility test AT ALL. Can it even distinguish the kinds of things that ABX tests easily distinguish? Can it distinguish anything? He doesn't even know.

Look, I spent 20 years doing sensory and behavior research in food...and I know damn well what monadic, proto-monadic, and comparative tests can and cannot measure, both real and imputed (or as you would say, imagined).

In that case, you should already know that open-ended monadic tests are not going to be much use for audio........................

You tell me how else, other than using ABX itself, you can determine whether a real perceived difference exists in one piece of audio gear versus another. You can't...it has to be done across a large enough group of people to have statistical significance...so one can say...tested blind, this group of (audiophiles, I presume) listening to movements (X,Y,Z) found "P" to have significantly higher ratings than "Q" on "transparency" and on "overall realism of the orchestra" (simply used as an example). Then you know the difference is real (albeit perceived subjectively). You then use an ABX test among a broad sample of yet another similarly-screened group of people, using short-snippets of movements X,Y,Z, to see if in total they can detect the difference. If the test allows them to do so, you have validated the test. If it does not, you have invalidated the test. Finally, once you have validated the test, it can be used by single individuals to determine if they can reliably hear a difference between audio components similar to what they would experience in a more normal listening situation. If you are so sure that ABX testing works for open-ended evaluation of audio components playing music, you should be supporting such an effort, not ridiculing it. Because until you do, you are ****ing into the wind among the large majority of audiophiles.

As noted above, you have it back-asswards. The audio industry - you know, the one that *designs* all those wonderful toys we listen to - has determined over many decades that quick-switched level-matched DBTs are the gold standard. If *you* wish to challenge this, then *you* must provide the evidence, not simply speculate. Of course, the real truth of the matter is that ABX works very well indeed, but fails to support your sighted impressions, which is why you are convinced that there just *must* be something wrong with it. Read my lips - wire is wire.

-- Stewart Pinkerton | Music is Art - Audio is Engineering
#124
Keith Hughes wrote:
Harry Lavo wrote: You tell me how else, other than using ABX itself, you can determine whether a real perceived difference exists in one piece of audio gear versus another. You can't...it has to be done across a large enough group of people to have statistical significance

This is sheer nonsense, in the current context, as has been pointed out to you previously (by me, and I'll note that you did *not* reply). Statistical significance requires *ONLY* one participant, with multiple trials. Take *you* for example; you can easily do a sufficient number of trials, using whatever *blind* methodology you would like, on two components (say cables) for which *you* have identified, sighted, a consistent audible difference, to determine whether the chosen method statistically 'validates' your sighted results. That's it, finis.

You're presuming that Harry actually wants an answer. But what if Harry doesn't want an answer? By making his "test" too complex and expensive to pull off, he ensures that it'll never happen, and he'll never have to eat his words. More than once I've proposed a much simpler approach that would really test what audiophiles actually claim to do--determine preferences between components. Unlike Harry's baroque approach, mine didn't presume anything about how audiophiles actually listen. Harry never responded to my posts, either.

bob
#125
wrote in message ...
Harry Lavo wrote: wrote in message ... Harry Lavo wrote: Actually not, for the research it grew out of. But my contention is that IMO and that of many others, it *may* be fatally flawed as a device for the open-ended evaluation of audio components. And that until it has been *validated* for that purpose, it should be promoted and received with substantial skepticism. My monadic test proposal is a legitimate way of doing that validation.

Lest anyone think that Harry has been vested with any authority to declare what is and is not a valid psychoacoustic test, this is completely bass-ackwards. ABX is validated both by its constant use in the field and by its ability to make and confirm predictions about audibility. Whereas nobody in the field has ever used a monadic test to determine the audibility of anything. Not once. Ever. And for good reason.

Tell that to Harman International and see my comments below.

Harman does not use monadic tests to determine audibility. If they use monadic tests for anything (and I haven't seen anything they've published using such tests), it is to explore perceived differences between components that are already known to be audibly different. No one would use monadic tests to determine *whether* two things were audibly different. At least not anyone who knew what they were doing.

So the first thing Harry needs to do, before he starts his Annus Mirabilis Project, is to validate that monadic testing can be used as an audibility test AT ALL. Can it even distinguish the kinds of things that ABX tests easily distinguish? Can it distinguish anything? He doesn't even know.

Look, I spent 20 years doing sensory and behavior research in food...and I know damn well what monadic, proto-monadic, and comparative tests can and cannot measure, both real and imputed (or as you would say, imagined).

FOOD??? You complain that too much research using DBTs was listening to things other than long musical passages, and then you say the better test is the one you used for FOOD?

I assume you saw the quote here that said that audio research had to borrow from the social sciences. Well so does food research. We use the same types of tests, and for much the same reason...because sensory reaction is subjective and needs to be objectified. Moreover, I have lots of first hand experience specifying and designing those tests, and helping to interpret the results. Your hands on experience doing the same?

You tell me how else, other than using ABX itself, you can determine whether a real perceived difference exists in one piece of audio gear versus another.

I don't need anything else. ABX works. It allows me to make reliable predictions back and forth. I can look at measurements and predict the outcome of ABX tests--and be right. And I can look at the results of ABX tests and predict the magnitude of measured differences--and be right. You cannot do that with monadic tests, because you have no data. And you probably wouldn't be able to do so even if you had the data, because there would be so much noise in that data that it'd never tell you anything.

As Mark and Michael have pointed out, you are engaged in the same circular reasoning that has destroyed your credibility among the audiophile community at large. That's one way never to have to think again...just assume away any possibilities that might bring your favorite test into question.

Again, nobody uses monadic tests to determine *whether* there's a difference. Nobody.

And you know everything there is to know about this, right? Including what every audio research lab in the country is up to?
#126
"Keith Hughes" wrote in message
... Harry Lavo wrote: snip ..and I know damn well what monadic, proto-monadic, and comparative tests can and cannot measure, both real and imputed (or as you would say, imagined).

Really? Hmmm, let's see...

You tell me how else, other than using ABX itself, you can determine whether a real perceived difference exists in one piece of audio gear versus another. You can't...it has to be done across a large enough group of people to have statistical significance

This is sheer nonsense, in the current context, as has been pointed out to you previously (by me, and I'll note that you did *not* reply). Statistical significance requires *ONLY* one participant, with multiple trials. Take *you* for example; you can easily do a sufficient number of trials, using whatever *blind* methodology you would like, on two components (say cables) for which *you* have identified, sighted, a consistent audible difference, to determine whether the chosen method statistically 'validates' your sighted results. That's it, finis.

Horse pucky! You are describing evaluation of an ABX test, Keith. There's a whole different world of testing out there that you apparently are not familiar with.

We are not talking about frequency distributions within a population, the only thing that would require a large population sample size, we're talking about using *your* method to detect (blind) the differences that *you* clearly hear...sighted.

Yeah, and if a whole population hears them "blind" they are real. That's the *only* way you can tell if they are real.

...so one can say...tested blind, this group of (audiophiles, I presume) listening to movements (X,Y,Z) found "P" to have significantly higher ratings than "Q" on "transparency" and on "overall realism of the orchestra" (simply used as an example).

Again, irrelevant. This approach presupposes that presence of difference is an unknown, and/or that the frequency of detection capability within the population is not known, neither of which is the case here.

Once again, you are using the test in question as the standard, rather than trying to independently confirm it for the purpose under question. Circular reasoning.

Then you know the difference is real (albeit perceived subjectively).

*You* already say that you know the difference is real, right? That's the point, and that's why this is mere obfuscation.

I say no such thing. I say the first step is to use monadic testing to determine if in fact the difference is real.

You then use an ABX test among a broad sample of yet another similarly-screened group of people, using short-snippets of movements X,Y,Z,

A fictitious constraint you gratuitously apply, yet again.

Not at all. You want to validate the test technique, so you've got to do it once each among a broadscale group so you are testing the technique, not any one individual, and using short snippets, the way it is almost always done because of fatigue/time constraints. The constraints are totally realistic.

to see if in total they can detect the difference. If the test allows them to do so, you have validated the test.

Right, you would have validated your 'monadic' testing for detection of 'difference', as that is what the ABX protocol is designed to do.

Not at all, since you claim the ABX is *the most sensitive* test...if it shows up in the monadic, ABX should pick it up...if it doesn't, the test is no good.

If it does not, you have invalidated the test.

Clearly incorrect. This presupposes the superiority of the 'monadic' test protocol. You get statistically significant results from *BOTH* tests, and your conclusion is "see...ABX doesn't work!". Sorry, but that just ignores basic statistical precepts. You cannot discount one method just because it gives the results you don't want, when it has the same level of significance (albeit with results in the opposite direction) as your 'pet' method.

Sorry yourself. See my comments just above.

Finally, once you have validated the test, it can be used by single individuals to determine if they can reliably hear a difference between audio components similar to what they would experience in a more normal listening situation. If you are so sure that ABX testing works for open-ended evaluation of audio components playing music, you should be supporting such an effort, not ridiculing it. Because until you do, you are ****ing into the wind among the large majority of audiophiles.

No, you have it backwards. Until *one* single individual can demonstrate confirmation of differences they have already confirmed sighted (level matched, naturally), there is no point in applying the method to a large population to establish the frequency of discrimination capability within the larger population. Whether you use one person, or a thousand, makes no difference whatsoever.

Sure it does. It takes the artificially constrained test apparatus out of the equation. If the difference is real, and the sample size is large enough, the monadic test will reveal it. Then it is simply a matter of whether or not ABX does the same.

If *one* person using *any* blind protocol consistently identifies a difference, then that difference is real (to the level of significance the data allows). *Then* you can compare against ABX, or any other method. This is a test that you, personally, could easily conduct if you were truly interested. So where's your data?

Again, you are assuming the test is valid, rather than validating the test. Totally circular.
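[Editorial note] The "large enough sample" requirement in the monadic proposal has a concrete statistical cost that a within-subject blind test avoids. A rough sketch of the per-group sample size a between-groups rating comparison would need (my own illustration, not from the thread; the effect sizes and the conventional 0.05 significance / 0.80 power figures are assumptions):

```python
from statistics import NormalDist

def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate per-group sample size for detecting a standardized mean
    rating difference (Cohen's d) in a two-sided two-group comparison,
    using the standard normal-approximation power formula."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z.inv_cdf(power)            # quantile corresponding to desired power
    return 2 * ((z_alpha + z_beta) / effect_size_d) ** 2

# A subtle difference (d = 0.2) needs roughly 392 listeners per group;
# a moderate one (d = 0.5) still needs about 63 per group.
print(round(n_per_group(0.2)), round(n_per_group(0.5)))
```

By contrast, a single listener running repeated blind trials on themselves sidesteps between-listener variance entirely, which is the statistical point Keith is pressing.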
#127
"Stewart Pinkerton" wrote in message
... On 8 Oct 2005 01:59:58 GMT, "Harry Lavo" wrote: wrote in message ... Harry Lavo wrote: Actually not, for the research it grew out of. But my contention is that IMO and that of many others, it *may* be fatally flawed as a device for the open-ended evaluation of audio components. And that until it has been *validated* for that purpose, it should be promoted and received with substantial skepticism. My monadic test proposal is a legitimate way of doing that validation.

Lest anyone think that Harry has been vested with any authority to declare what is and is not a valid psychoacoustic test, this is completely bass-ackwards. ABX is validated both by its constant use in the field and by its ability to make and confirm predictions about audibility. Whereas nobody in the field has ever used a monadic test to determine the audibility of anything. Not once. Ever. And for good reason.

Tell that to Harman International and see my comments below.

Harman International uses quick-switch level-matched DBTs. As do many other major audio manufacturers.

Not for evaluating speakers, they don't. They use sequential monadic testing. Because they found it squared better with their objective tests.

So the first thing Harry needs to do, before he starts his Annus Mirabilis Project, is to validate that monadic testing can be used as an audibility test AT ALL. Can it even distinguish the kinds of things that ABX tests easily distinguish? Can it distinguish anything? He doesn't even know.

Look, I spent 20 years doing sensory and behavior research in food...and I know damn well what monadic, proto-monadic, and comparative tests can and cannot measure, both real and imputed (or as you would say, imagined).

In that case, you should already know that open-ended monadic tests are not going to be much use for audio........................

Yeah, right. That's why I'm spending all of this time trying to convince others on this newsgroup.... Your hands on experience designing and using sophisticated tests, please?

You tell me how else, other than using ABX itself, you can determine whether a real perceived difference exists in one piece of audio gear versus another. You can't...it has to be done across a large enough group of people to have statistical significance...so one can say...tested blind, this group of (audiophiles, I presume) listening to movements (X,Y,Z) found "P" to have significantly higher ratings than "Q" on "transparency" and on "overall realism of the orchestra" (simply used as an example). Then you know the difference is real (albeit perceived subjectively). You then use an ABX test among a broad sample of yet another similarly-screened group of people, using short-snippets of movements X,Y,Z, to see if in total they can detect the difference. If the test allows them to do so, you have validated the test. If it does not, you have invalidated the test. Finally, once you have validated the test, it can be used by single individuals to determine if they can reliably hear a difference between audio components similar to what they would experience in a more normal listening situation. If you are so sure that ABX testing works for open-ended evaluation of audio components playing music, you should be supporting such an effort, not ridiculing it. Because until you do, you are ****ing into the wind among the large majority of audiophiles.

As noted above, you have it back-asswards. The audio industry - you know, the one that *designs* all those wonderful toys we listen to - has determined over many decades that quick-switched level-matched DBTs are the gold standard. If *you* wish to challenge this, then *you* must provide the evidence, not simply speculate.

No, they have determined that it is a useful tool for dealing with certain development attributes, using trained listeners and pre-training sessions to identify the attributes under investigation. That is a far cry from the open-ended evaluation of audio components in their overall ability to convey the musical experience. Similarly, in the food industry we used blind comparative testing for establishing certain taste and textural attributes. But we wouldn't think of using it for final evaluation.

Of course, the real truth of the matter is that ABX works very well indeed, but fails to support your sighted impressions, which is why you are convinced that there just *must* be something wrong with it. Read my lips - wire is wire.

No, the truth of the matter is that it works well in spotting frequency and volume level irregularities and artifacts from compression, which is what it was developed for. You and the other objectivists *assume* it works equally well for the open-ended evaluation of home audio components, but you have never validated it for same. Simple as that, and you run scared and retreat into circular reasoning whenever it is pointed out.
#128
"Stewart Pinkerton" wrote in message
... On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote: Stewart Pinkerton wrote: That's how Science works. You observe something unusual, come up with a theory to explain it, use that theory to predict something else, and observe the truth or falsity of your prediction. The light-bending experiment came *after* the theory.

What was it he "observed" that led to the theory?

He developed his theory from the Lorentzian interpretation of Maxwell's work, and arguably had prior knowledge of the Michelson-Morley experimental results. He had many giants on whose shoulders to stand....

Okay, you've shown you can dazzle. Now please interpret what "observations" he developed his theory to explain.

In our case, we are observing the fact that 90% of audiophiles find the *sounds same* postulate so ridiculous and at odds with experience (not in just a few instances, but in many, many instances) that the postulate is rejected.

From whence comes this magical 90%, Harry? It seems as speculative as your other comments. I suspect that the real number is exceeded by those who believe that the world is ruled by shape-shifting reptiles.

It doesn't matter whether it is 90%, or 95%, or 80%, or 75%. The percentage of audiophiles who honestly believe all electronics essentially sound the same is a small minority....the vast majority simply do not buy the assertion.

And since we are dealing with strictly subjective phenomena, this rejection must be dealt with as a "fact". Mark and Michael have been working to point out why in theory the short-snippet, comparative testing may have missed a crucial element...an element that seems to square with what many audiophiles instinctively or intuitively feel is missing. Now it is time for some experimentation.

Indeed it is - so go do some, instead of railing against the entire body of accumulated knowledge about audio.

I'm beginning to work on how the validation test might actually be executed.

You might also bone up on his complete reluctance to embrace quantum mechanics, despite being one of the founding fathers (e.g., for explaining the photoelectric effect, for which he won his only Nobel prize) and in spite of overwhelming physical evidence.

I do know of his reluctance to accept quantum mechanics....never said he didn't have his weaknesses. But it goes to show what happens when science as faith replaces science as science.

Oh dear.

No, it goes to show that even the greatest scientists can sometimes refuse to accept scientific facts. I suppose that gives you *some* excuse............

He refused to accept them because they were so far from what his entire training had taught him *ought* to be. Thus his famous "roll of the dice" quote. That's called "belief" and it overcame his scientific training.

No response, so I assume concurrence. See any similarity to the "science" as practiced here by some objectivists?
#129
Harry Lavo wrote:
"Stewart Pinkerton" wrote in message ... On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote: Stewart Pinkerton wrote: That's how Science works. You observe something unusual, come up with a theory to explain it, use that theory to predict something else, and observe the truth or falsity of your prediction. The light-bending experiment came *after* the theory.

What was it he "observed" that led to the theory?

He developed his theory from the Lorentzian interpretation of Maxwell's work, and arguably had prior knowledge of the Michelson-Morley experimental results. He had many giants on whose shoulders to stand....

Okay, you've shown you can dazzle. Now please interpret what "observations" he developed his theory to explain.

Hi Harry, Einstein was trying to explain the results of the Michelson-Morley experiment, which failed to find a difference in travel time between two perpendicular light beams--strange, because it was expected that the absolute motion of the Earth combined with the theory that light travelled through an absolutely fixed ether would lead to different travel times. Einstein's wildly brilliant solution was to propose that there is no absolute motion, that the speed of light looks the same to all observers.

The point of Stewart & Bob is that Einstein's theory was based on troublesome observations. And, they say, there are no "troublesome observations" in audio; they have a way to detect if differences are audible, and this is consistent with the reigning theory of the ear's function.

Where I think they are wrong is basing their model on the assumption that the ear and brain can be observed objectively without regard to observations carried out on the inside (observing one's own perception and listening to others describe their perceptions). So they end up with a model that describes the ear and brain very well--under one set of conditions.

Secondly, I think they have "no troublesome observations" because they invoke perceptual illusion to explain away any observation they don't like, while at the same time admitting there are too many contributing factors to explain any given perception--it "could be" illusion, so it "must be" illusion; but we can never explain why any particular illusion occurred.

Mike
#130
wrote:
Harry Lavo wrote: wrote in message ... Harry Lavo wrote: Actually not, for the research it grew out of. But my contention is that IMO and that of many others, it *may* be fatally flawed as a device for the open-ended evaluation of audio components. And that until it has been *validated* for that purpose, it should be promoted and received with substantial skepticism. My monadic test proposal is a legitimate way of doing that validation. Lest anyone think that Harry has been vested with any authority to declare what is and is not a valid psychoacoustic test, this is completely bass-ackwards. ABX is validated both by its constant use in the field and by its ability to make and confirm predictions aboout audibility. Whereas nobody in the field has ever used a monadic test to determine the audibility of anything. Not once. Ever. And for good reason. Tell that to Harman International and see my comments below. Harman does not use monadic tests to determine audibility. If they use monadic tests for anything (and I haven't seen anything they've published using such tests), it is to explore perceived differences between components that are already known to be audibly different. No one would use monadic tests to determine *whether* two things were audibly different. At least not anyone who knew what they were doing. Indeed, when Sean Olive gave a talk on his work at an August 2004 AES meeting, here is how he described the uses of various DBTs (note that the requirement for *double blind* methodology goes without saying) http://www.aes.org/sections/la/PastM...004-08-31.html "Sean began by describing three types of listening tests: * Difference * Descriptive analysis * Preference / affective The difference test, obviously, is used for determining whether two audio devices under test are audibly different from each other. A common method is double-blind ABX testing. The descriptive analysis test is for gathering impressions of comparative audio quality from the listeners. 
If an ABX test reveals that device “A” sounds audibly different from device “B,” the descriptive analysis test would determine in what way they sound different. The descriptive analysis test has limited usefulness in audio, though. And after the determinations of “whether different” and “how different,” the preference or affective test asks the question, “Which one sounds better?” Each test has its own appropriate and inappropriate applications, as well as its own strengths and potential pitfalls. In any test, biases have to be controlled in order to obtain meaningful data. Most of his descriptions of testing methods involved tests of loudspeakers, but the principles can be put to use with other audio gear as well." You tell me how else, other than using ABX itself, you can determine whether a real perceived difference exists in one piece of audio gear versus another. I don't need anything else. ABX works. It allows me to make reliable predictions back and forth. I can look at measurements and predict the outcome of ABX tests--and be right. And I can look at the results of ABX tests and predict the magnitude of measured differences--and be right. You cannot do that with monadic tests, because you have no data. And you probably wouldn't be able to do so even if you had the data, because there would be so much noise in that data that it'd never tell you anything. Again, nobody uses monadic tests to determine *whether* there's a difference. Nobody. This call for 'validation' -- which at least one Stereophile reader parrots in the October 2005 letters column, in response to Jon Iverson's specious article on ABX tests ("The Blind Leading the Blind?" Aug 2005) -- is interesting. Those making this call should ask themselves:
1) Do ABX tests ever yield a 'difference' result for two *certainly* identical sources (e.g., a phantom switch)? No, they don't.
2) Do 'sighted' listening tests ever yield a 'difference' result in such tests? Yes, they do.
-- -S |
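The ABX "difference test" described in the post above is mechanically simple, and a short sketch may help readers unfamiliar with it. This is my own illustration, not anything from the thread: the function name and the `p_detect` listener model are assumptions. Each trial hides either A or B behind X; a listener who genuinely hears the difference on that trial answers correctly, and otherwise guesses.

```python
import random

def abx_session(n_trials, p_detect, seed=1):
    """Simulate an ABX difference test.

    Each trial, X is secretly A or B. With probability p_detect the
    listener truly hears the difference and names X correctly;
    otherwise they flip a mental coin. Returns the number of correct
    identifications out of n_trials.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        x = rng.choice("AB")           # hidden assignment for this trial
        if rng.random() < p_detect:
            answer = x                 # genuine detection
        else:
            answer = rng.choice("AB")  # pure guess
        correct += (answer == x)
    return correct
```

A listener with `p_detect = 0` scores about half of the trials correct, which is exactly the chance baseline the statistical analysis compares against; any real sensitivity pushes the score above it.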
#131
Stewart Pinkerton wrote:
On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote: Stewart Pinkerton wrote: That's how Science works. You observe something unusual, come up with a theory to explain it, use that theory to predict something else, and observe the truth or falsity of your prediction. The light-bending experiment came *after* the theory. What was it he "observed" that led to the theory? He developed his theory from the Lorentzian interpretation of Maxwell's work, and arguably had prior knowledge of the Michelson-Morley experimental results. He had many giants on whose shoulders to stand.... In our case, we are observing the fact that 90% of audiophiles find the *sounds same* postulate so ridiculous and at odds with experience (not in just a few instances, but in many, many instances) that the postulate is rejected. From whence comes this magical 90%, Harry? It seems as speculative as your other comments. I suspect that the real number is exceeded by those who believe that the world is ruled by shape-shifting reptiles. And certainly exceeded by those who are subjectively *sure* that coincidence represents a preordained pattern -- by Harry's logic we should be interrogating the laws of probability. Hey, all those people who *dream* something that *happens* later, or who get a phone call *right after* they thought of the caller, can't be wrong...can they? When 'audiophiles' begin finding differences using DBTs that can't be traced to measurable differences....THEN they can start asking that science look into the 'problem'. Until then, it's not a problem, it's just a bunch of cultish hobbyists refusing to accept a reasonable explanation that offends their sensibilities. -- -S |
#133
On 9 Oct 2005 00:53:49 GMT, "Harry Lavo" wrote:
"Stewart Pinkerton" wrote in message ... On 8 Oct 2005 01:59:58 GMT, "Harry Lavo" wrote: wrote in message ... Harry Lavo wrote: Actually not, for the research it grew out of. But my contention is that IMO and that of many others, it *may* be fatally flawed as a device for the open-ended evaluation of audio components. And that until it has been *validated* for that purpose, it should be promoted and received with substantial skepticism. My monadic test proposal is a legitimate way of doing that validation. Lest anyone think that Harry has been vested with any authority to declare what is and is not a valid psychoacoustic test, this is completely bass-ackwards. ABX is validated both by its constant use in the field and by its ability to make and confirm predictions aboout audibility. Whereas nobody in the field has ever used a monadic test to determine the audibility of anything. Not once. Ever. And for good reason. Tell that to Harman International and see my comments below. Harman International uses quick-switch level-matched DBTs. As do many other major audio manufacturers. Not for evaluating speakers, they don't. They use sequential monadic testing. Because they found it squared better with their objective tests. That is for evaluating *preference*, which takes place *after* difference has been proven by quick-switched DBTs. What we;re talking about here is the establishment of *difference*, for which no one has *ever* used monadic testing - for the very good reason that it's insufficiently sensitive. So the first thing Harry needs to do, before he starts his Annus Mirabilis Project, is to validate that monadic testing can be used as an audibility test AT ALL. Can it even distinguish the kinds of things that ABX tests easily distinguish? Can it distinguish anything? He doesn't even know. 
Look, I spent 20 years doing sensory and behavior research in food...and I know damn well what monadic, proto-monadic, and comparative tests can and cannot measure, both real and imputed (or as you would say, imagined). In that case, you should already know that open-ended monadic tests are not going to be much use for audio........................ Yeah, right. That's why I'm spending all of this time trying to convince others on this newsgroup.... I said that you *should* know it, not that you had actually grasped the concept................... Your hands-on experience designing and using sophisticated tests, please? No need to re-invent the wheel for audio, I *use* quick-switched level-matched DBTs frequently. I also spent twenty years in the Defence and Aerospace industry designing extremely sophisticated test equipment so yes, I have considerable experience of designing and using sophisticated tests for precision analogue electronics and audio equipment, with a dynamic range and bandwidth considerably in excess of anything you'll see in domestic audio. I am happy to believe that you know how to conduct the Pepsi Challenge, but I'm not sure what that has to do with audio..... As noted above, you have it back-asswards. The audio industry - you know, the one that *designs* all those wonderful toys we listen to - has determined over many decades that quick-switched level-matched DBTs are the gold standard. If *you* wish to challenge this, then *you* must provide the evidence, not simply speculate. No, they have determined that it is a useful tool for dealing with certain development attributes, using trained listeners and pre-training sessions to identify the attributes under investigation. Indeed they have. That is a far cry from the open-ended evaluation of audio components in their overall ability to convey the musical experience. Indeed it is - because *no one* would use open-ended evaluation for determining the existence of subtle differences. 
It's simply not adequately sensitive. Similarly, in the food industry we used blind comparative testing for establishing certain taste and textural attributes. But wouldn't think of using it for final evaluation. What has this to do with audio? I seldom use a sledgehammer for polishing my car........ Of course, the real truth of the matter is that ABX works very well indeed, but fails to support your sighted impressions, which is why you are convinced that there just *must* be something wrong with it. Read my lips - wire is wire. No, the truth of the matter is that it works well in spotting frequency and volume level irregularities and artifacts from compression, which is what it was developed for. Actually, it works very well for spotting *any* truly audible difference - just not those which are entirely down to your overactive imagination! You and the other objectivists *assume* it works equally well for the open-ended evaluation of home audio components, but you have never validated it for same. Simple as that, and you run scared and retreat into circular reasoning whenever it is pointed out. No Harry, *you* run scared every time you are asked to *demonstrate* the validity of your own speculations. -- Stewart Pinkerton | Music is Art - Audio is Engineering |
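The "level-matched" qualifier that recurs throughout this exchange is itself a concrete, checkable condition, not a matter of opinion. As a hedged sketch (my own illustration with hypothetical function names, not a procedure described by any poster), two playback chains can be compared by RMS level; matching to within about 0.1 dB is the figure commonly cited for this kind of test:

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(samples_a, samples_b):
    """Level difference between two chains, in decibels."""
    return 20.0 * math.log10(rms(samples_a) / rms(samples_b))

# Two chains playing the same 1 kHz test tone, one of them 2% hotter:
tone_a = [math.sin(2 * math.pi * 1000 * t / 48000) for t in range(4800)]
tone_b = [1.02 * s for s in tone_a]

diff = level_difference_db(tone_b, tone_a)
# 20 * log10(1.02) is about 0.17 dB -- enough to fail a 0.1 dB criterion,
# and enough to be mistaken for a "quality" difference by a listener.
```

The point of the check is that a fraction of a decibel of mismatch is itself audible in direct comparison, so an unmatched test confounds level with every other variable under study.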
#134
On 9 Oct 2005 00:50:05 GMT, "Harry Lavo" wrote:
wrote in message ... ABX works. It allows me to make reliable predictions back and forth. I can look at measurements and predict the outcome of ABX tests--and be right. And I can look at the results of ABX tests and predict the magnitude of measured differences--and be right. You cannot do that with monadic tests, because you have no data. And you probably wouldn't be able to do so even if you had the data, because there would be so much noise in that data that it'd never tell you anything. As Mark and Michael have pointed out, you are engaged in the same circular reasoning that has destroyed your credibility among the audiophile community at large. That's one way never to have to think again...just assume away any possibilities that might bring your favorite test into question. Did you actually *read* that statement before you hit the 'send' button? Firstly, where do you get off claiming any knowledge of 'the audiophile community'? Secondly, and much more amusingly, "assume away any possibilities that might bring your favorite test into question" is an *exact* description of what you three are doing. Again, nobody uses monadic tests to determine *whether* there's a difference. Nobody. And you know everything there is to know about this, right? Including what every audio research lab in the country is up to? Yes, right up until you can provide *evidence* to the contrary. And let's just keep this to properly established companies with genuine R&D facilities, shall we? Peter Qvortrup et al don't count - especially as you seem to be hung up on 'credibility'! -- Stewart Pinkerton | Music is Art - Audio is Engineering |
#135
Harry Lavo wrote:
Re Harman: They use sequential monadic testing. Because they found it squared better with their objective tests. Evidence, please. snip No, the truth of the matter is that it works well in spotting frequency and volume level irregularities and artifacts from compression, which is what it was developed for. No, it was developed to test for audible differences, independent of what those differences were. The fact that frequency and volume level differences ARE the only relevant differences between audio components makes it perfectly suited to comparing said components. You and the other objectivists *assume* it works equally well for the open-ended evaluation of home audio components, but you have never validated it for same. Don't need to, for the reason noted above. Simple as that, and you run scared and retreat into circular reasoning whenever it is pointed out. Oh, please. Anytime you want to prove us wrong, you go right ahead, Harry. We're not stopping you. bob |
#136
Harry Lavo wrote:
"Keith Hughes" wrote in message ... Harry Lavo wrote: snip This is sheer nonsense, in the current context, as has been pointed out to you previously (by me, and I'll note that you did *not* reply). Statistical significance requires *ONLY* one participant, with multiple trials. Take *you* for example; you can easily do a sufficient number of trials, using whatever *blind* methodology you would like, on two components (say cables) for which *you* have identified, sighted, a consistent audible difference, to determine whether the chosen method statistically 'validates' your sighted results. That's it, finis. Horse pucky!. You are desribing evaluation of an ABX test, Keith. Where Harry? Please show us how "using whatever *blind* methodology you would like" constrains you to "ABX", or even intimates that ABX is involved. There's a whole different world of testing out there that you apparently are not familiar with. The same applies to you as well, obviously. So what? We're talking about a narrow subject here. We are not talking about frequency distributions within a population, the only thing that would require a large population sample size, we're talking about using *your* method to detect (blind) the differences that *you* clearly hear...sighted. Yeah, and if a whole population hears them "blind" they are real. That's the *only* way you can tell if they are real. That is ludicrous. You are saying, with that statement, that if I, for example, can discriminate A and B in 60 of 60 blind, level matched trials, the only way to verify that there really *is* a difference is to increase the test population size. If you really believe that, then I certainly can't help you. snip Again, irrelevant. This approach presupposes that presence of difference is an unknown, and/or that the frequency of detection capability within the population is not known, neither of which is the case here. 
Once again, you are using the test in question as the standard, rather than trying to independently confirm it for the purpose under question. Circular reasoning. Where on Earth did you get that from??? "Neither of which is the case here" does not refer to ABX or results therefrom, it refers to the presence of a large population of audiophiles (90% right?) who already *easily* and *reliably* discriminate between cables, amps, etc., using *some* method. We do not need to 'poll' the population, as it were, we need only verify extant observations using the same methods (blind) used to make those observations initially. Then you know the difference is real (albeit perceived subjectively). *You* already say that you know the difference is real, right? That's the point, and that's why this is mere obfuscation. I say no such thing. I say the first step is to use monadic testing to determine if in fact the difference is real. OK, sorry, my mistake. You've never said that you could hear differences between cables, amps, CD players...right. You then use an ABX test among a broad sample of yet another similarly-screened group of people, using short-snippets of movements X,Y,Z, A fictitious constraint you gratuitously apply, yet again. Not at all. You want to validate the test technique, so you've got to do it once each among a broadscale group so you are testing the technique, not any one individual, and using short snippets, the way it is almost always done because of fatigue/time constraints. The constraints are totally realistic. If there is always time to do leisurely sighted evaluations, then clearly there is time to do leisurely AB, ABX, etc. testing. If you are really interested. to see if in total they can detect the difference. If the test allows them to do so, you have validated the test. Right, you would have validated your 'monadic' testing for detection of 'difference', as that is what the ABX protocol is designed to do. 
Not at all, since you claim the ABX is *the most sensitive* test...if it shows up in the monadic, ABX should pick it up...if it doesn't, the test is no good. For "difference", remember? If it does not, you have invalidated the test. Clearly incorrect. This presupposes the superiority of the 'monadic' test protocol. You get statistically significant results from *BOTH* tests, and your conclusion is "see...ABX doesn't work!". Sorry, but that just ignores basic statistical precepts. You cannot discount one method just because it gives the results you don't want, when it has the same level of significance (albeit with results in the opposite direction) as your 'pet' method. Sorry yourself. See my comments just above. Which, to the extent they are not erroneous, are irrelevant to the statement. If you have two methods that give different results, to the same level of significance, you cannot *just* choose the one you like. That's BASIC statistics. You need a referee test, or barring that, a clear determination of the root cause of the disparity (i.e. flawed assumptions or execution for one, or both, methods). snip Whether you use one person, or a thousand, makes no difference whatsoever. Sure it does. It takes the artificially constrained test apparatus out of the equation. If the difference is real, and the sample size is large enough, the monadic test will reveal it. Then it is simply a matter of whether or not ABX does the same. *If* these 'artificial constraints' significantly affect subject response, as you claim, then you will *not* find it, no matter what the sample size. You will have added another independent variable (i.e. individual response to the 'constraint') that could inhibit or enhance the probe response, and you won't know which, or if either, is the case. We're talking discrimination here (i.e. 
a binary probe/response case), not preference, wherein you can allow multiple independent variables and, using a multivariate analysis tool such as RSM, identify response as well as interactions of variables. If *one* person using *any* blind protocol consistently identifies a difference, then that difference is real (to the level of significance the data allows). *Then* you can compare against ABX, or any other method. This is a test that you, personally, could easily conduct if you were truly interested. So where's your data? Again, you are assuming the test is valid, rather than validating the test. Totally circular. Again, you totally ignore the reality of the situation. I'm saying you have to have valid *Data* to first suspect, then validate the test, i.e. an observation under controlled, blind conditions, using *ANY method OTHER than ABX* that can be used to challenge ABX test. This whole "validate the test" pogrom appears to me to be simple misdirection. Let's recap the basic situation: 1. We have a subject population that can discriminate (according to them) between A and B under sighted conditions. We do not need to sample a large population for 'preference' as your monadic test scenario would, we have a ready-made subject base, each with their own 'method' they have "validated" sighted. 2. We can recreate *all* of the conditions 'they' normally use to make those discriminations (duration, configuration, location, relaxation, prestidigitation...etc.) with the *sole* exception of foreknowledge of whether they listen to A or to B. 3. If, under the *exact same conditions + blind*, 'they' can no longer discriminate, then their initial discrimination results are unsubstantiated, and must be assumed to be invalid. 4. If, under the *exact same conditions + blind*, they *CAN* discriminate, then their initial discrimination results are confirmed. 5. 
If subsequent testing of confirmed discrimination results, via ABX, results in a null response, *then* the method is inappropriate for that use. So you see, *the* method you decry is not even part of this scenario until Step 5. I performed this test with both cables and DAC's; Step 3 got me. You, Harry Lavo, a single individual, could perform such a test, at your leisure, and prove us all wrong (if you're right). So what's the problem? Why all the waffling on about methods? If you have a method that you think works for discrimination, then use it blind and see. It's really that simple. Keith Hughes |
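Hughes's single-listener argument rests on the exact binomial test: under the null hypothesis that the subject is only guessing, each blind trial is a fair coin flip, so the significance of, say, the 60-of-60 score mentioned above can be computed directly, with no larger population needed. A minimal sketch (the function name is my own, not from the thread):

```python
from math import comb

def binomial_p_value(hits, trials, p_null=0.5):
    """One-sided exact binomial test: the probability of scoring at
    least `hits` correct out of `trials` if the subject is guessing
    with per-trial success probability p_null."""
    return sum(comb(trials, k) * p_null**k * (1 - p_null)**(trials - k)
               for k in range(hits, trials + 1))

# One listener, 60 of 60 correct: the chance probability is 2**-60.
print(binomial_p_value(60, 60))   # ~ 8.7e-19
# Even 12 of 16 correct clears the conventional 0.05 threshold:
print(binomial_p_value(12, 16))   # ~ 0.038
```

This is why sample size of *people* is irrelevant to establishing that a difference is real for one listener: the trials themselves supply the statistics, and any blind protocol that yields a p-value this small has detected *something*, whatever one thinks of the protocol.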
#137
wrote:
Secondly, I think they have "no troublesome observations" because they invoke perceptual illusion to explain away any observation they don't like, while at the same time admitting there are too many contributing factors to explain any given perception--it "could be" illusion, so it "must be" illusion; Not what we said. What we said was, it could be an illusion, so you cannot say it was not an illusion. IOW, a sighted observation tells us nothing. It is not evidence of anything. but we can never explain why any particular illusion occurred. No, we can't. But we can say with a high degree of probability that it *was* an illusion. bob |
#138
wrote in message
... Harry Lavo wrote: "Stewart Pinkerton" wrote in message ... On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote: Stewart Pinkerton wrote: That's how Science works. You observe something unusual, come up with a theory to explain it, use that theory to predict something else, and observe the truth or falsity of your prediction. The light-bending experiment came *after* the theory. What was it he "observed" that led to the theory? He developed his theory from the Lorentzian interpretation of Maxwell's work, and arguably had prior knowledge of the Michelson-Morley experimental results. He had many giants on whose shoulders to stand.... Okay, you've shown you can dazzle. Now please interpret what "observations" he developed his theory to explain. Hi Harry, Einstein was trying to explain the results of the Michelson-Morley experiment, which failed to find a difference in travel time between two perpendicular light beams--strange, because it was expected that the absolute motion of the Earth combined with the theory that light travelled through an absolutely fixed ether would lead to different travel times. Einstein's wildly brilliant solution was to propose that there is no absolute motion, that the speed of light looks the same to all observers. The point of Stewart & Bob is that Einstein's theory was based on troublesome observations. And, they say, there are no "troublesome observations" in audio; they have a way to detect if differences are audible, and this is consistent with the reigning theory of the ear's function. Where I think they are wrong is basing their model on the assumption that the ear and brain can be observed objectively without regard to observations carried out on the inside (observing one's own perception and listening to others describe their perceptions). So they end up with a model that describes the ear and brain very well---under one set of conditions. 
Secondly, I think they have "no troublesome observations" because they invoke perceptual illusion to explain away any observation they don't like, while at the same time admitting there are too many contributing factors to explain any given perception--it "could be" illusion, so it "must be" illusion; but we can never explain why any particular illusion occurred. Mike Thanks, Michael, for the actual explanation, which apparently was too mundane for Stewart to attempt. We seem in basic agreement that the big problem is that musical interpretation by the ear/brain is totally subjective and can only be described from within. We may differ in where the implications go from there (although perhaps not). But until proponents of short-snippet comparative testing can find some way of validating that their test does not interfere with normal musical cognition, which it seems to many, myself included, that it does, then it will not be accepted by many/most audiophiles. |
#139
"Steven Sullivan" wrote in message
... wrote: Harry Lavo wrote: wrote in message ... Harry Lavo wrote: Actually not, for the research it grew out of. But my contention is that IMO and that of many others, it *may* be fatally flawed as a device for the open-ended evaluation of audio components. And that until it has been *validated* for that purpose, it should be promoted and received with substantial skepticism. My monadic test proposal is a legitimate way of doing that validation. Lest anyone think that Harry has been vested with any authority to declare what is and is not a valid psychoacoustic test, this is completely bass-ackwards. ABX is validated both by its constant use in the field and by its ability to make and confirm predictions aboout audibility. Whereas nobody in the field has ever used a monadic test to determine the audibility of anything. Not once. Ever. And for good reason. Tell that to Harman International and see my comments below. Harman does not use monadic tests to determine audibility. If they use monadic tests for anything (and I haven't seen anything they've published using such tests), it is to explore perceived differences between components that are already known to be audibly different. No one would use monadic tests to determine *whether* two things were audibly different. At least not anyone who knew what they were doing. Indeed, when Sean Olive gave a talk on his work at an August 2004 AES meeting, here is how he described the uses of various DBTs (note that the requirement for *double blind* methodology goes without saying) http://www.aes.org/sections/la/PastM...004-08-31.html "Sean began by describing three types of listening tests: * Difference * Descriptive analysis * Preference / affective The difference test, obviously, is used for determining whether two audio devices under test are audibly different from each other. A common method is double-blind ABX testing. 
The descriptive analysis test is for gathering impressions of comparative audio quality from the listeners. If an ABX test reveals that device "A" sounds audibly different from device "B," the descriptive analysis test would determine in what way they sound different. The descriptive analysis test has limited usefulness in audio, though. And after the determinations of "whether different" and "how different," the preference or affective test asks the question, "Which one sounds better?" Descriptive tests can be used for this purpose as well, although it takes larger sample sizes. Harman uses the descriptive tests to profile the nature and degree of differences between the speakers. Interestingly, some of their preference and discriminatory tests apparently conflicted, but the descriptive analysis and preference data apparently correlate (with the caveat that I have not yet seen the reprints and am getting this info second hand). Each test has its own appropriate and inappropriate applications, as well as its own strengths and potential pitfalls. In any test, biases have to be controlled in order to obtain meaningful data. Most of his descriptions of testing methods involved tests of loudspeakers, but the principles can be put to use with other audio gear as well." Steven, this may seem new and "news" to you, but these are standard types of tests. I basically used them in the food industry for years, as well as other types of research. I'm glad Harman has brought them to the attention of the audio community, because there are more and better tests for many uses than ABX...and I'm happy to see that Harman is using some of them. You tell me how else, other than using ABX itself, you can determine whether a real perceived difference exists in one piece of audio gear versus another. I don't need anything else. ABX works. It allows me to make reliable predictions back and forth. I can look at measurements and predict the outcome of ABX tests--and be right. 
And I can look at the results of ABX tests and predict the magnitude of measured differences--and be right. You cannot do that with monadic tests, because you have no data. And you probably wouldn't be able to do so even if you had the data, because there would be so much noise in that data that it'd never tell you anything. Again, nobody uses monadic tests to determine *whether* there's a difference. Nobody. This call for 'validation' -- which at least one Stereophile reader parrots in the October 2005 letters column, in response to Jon Iverson's specious article on ABX tests ("The Blind Leading the Blind?" Aug 2005) -- is interesting. Those making this call should ask themselves: To the best of my knowledge, I was among the first (if not the first) to raise such a call...here...about two years ago. The need has simply become apparent to more people as the discussion has continued on various newsgroups. 1) Do ABX tests ever yield a 'difference' result for two *certainly* identical sources (e.g., a phantom switch)? No, they don't. 2) Do 'sighted' listening tests ever yield a 'difference' result in such tests? Yes, they do. Nobody is arguing in this case for sighted tests. You are using it as a strawman. And points 1) and 2) have nothing to do with validating ABX, since the suspicion against it has nothing to do with either. |
#140
wrote in message ...
Harry Lavo wrote: Re Harman: They use sequential monadic testing. Because they found it squared better with their objective tests. Evidence, please. Based on hearsay at this point, but I will be getting their reprints. I should have inserted the word "apparently" after "because" and before "they". snip No, the truth of the matter is that it works well in spotting frequency and volume level irregularities and artifacts from compression, which is what it was developed for. No, it was developed to test for audible differences, independent of what those differences were. The fact that frequency and volume level differences ARE the only relevant differences between audio components makes it perfectly suited to comparing said components. You and the other objectivists *assume* it works equally well for the open-ended evaluation of home audio components, but you have never validated it for same. Don't need to, for the reason noted above. Simple as that, and you run scared and retreat into circular reasoning whenever it is pointed out. Oh, please. Anytime you want to prove us wrong, you go right ahead, Harry. We're not stopping you. Well, don't hold your breath, but perhaps over the next year or two......... |
#141
"Stewart Pinkerton" wrote in message
... On 9 Oct 2005 00:50:05 GMT, "Harry Lavo" wrote: wrote in message ... ABX works. It allows me to make reliable predictions back and forth. I can look at measurements and predict the outcome of ABX tests--and be right. And I can look at the results of ABX tests and predict the magnitude of measured differences--and be right. You cannot do that with monadic tests, because you have no data. And you probably wouldn't be able to do so even if you had the data, because there would be so much noise in that data that it'd never tell you anything. As Mark and Michael have pointed out, you are engaged in the same circular reasoning that has destroyed your credibility among the audiophile community at large. That's one way never to have to think again...just assume away any possibilities that might bring your favorite test into question. Did you actually *read* that statement before you hit the 'send' button? Firstly, where do you get off claiming any knowledge of 'the audiophile community'? Secondly, and much more amusingly, "assume away any possibilities that might bring your favorite test into question" is an *exact* description of what you three are doing. No, we have not assumed it away. We have cited the need for verification and validation. You are the one refusing to accept that fact...that your test needs to be validated for its intended purpose and hasn't been...in fact in another post you say it would be ridiculous to assume its use for that very purpose. Again, nobody uses monadic tests to determine *whether* there's a difference. Nobody. And you know everything there is to know about this, right? Including what every audio research lab in the country is up to? Yes, right up until you can provide *evidence* to the contrary. And let's just keep this to properly established companies with genuine R&D facilities, shall we? Peter Qvortrup et al don't count - especially as you seem to be hung up on 'credibility'! 
Well, actually Harman has already apparently found that monadic tests give them more useful information for understanding what goes on between the actual reproduction of music by audio equipment and its perception by listeners. I think you would find that similar tests on, say, CD players would yield somewhat similar results, but with less variation and larger sample sizes required. I'm not sure about amplifiers or cables...but a validation test would certainly find out now, wouldn't it? |
#142
"Keith Hughes" wrote in message
... Harry Lavo wrote: "Keith Hughes" wrote in message ... Harry Lavo wrote: snip This is sheer nonsense, in the current context, as has been pointed out to you previously (by me, and I'll note that you did *not* reply). Statistical significance requires *ONLY* one participant, with multiple trials. Take *you* for example; you can easily do a sufficient number of trials, using whatever *blind* methodology you would like, on two components (say cables) for which *you* have identified, sighted, a consistent audible difference, to determine whether the chosen method statistically 'validates' your sighted results. That's it, finis. Horse pucky! You are describing evaluation of an ABX test, Keith. Where Harry? Please show us how "using whatever *blind* methodology you would like" constrains you to "ABX", or even intimates that ABX is involved. Because ABX and its cousins are the only tests that use repeated comparisons among individual users. That is audiometric testing, not social science testing such as preference testing (AB) and descriptive testing (monadic, or comparative monadic). There's a whole different world of testing out there that you apparently are not familiar with. The same applies to you as well, obviously. So what? We're talking about a narrow subject here. No, *you* are talking about a narrow subject. You are also talking about taking a test developed for use in a narrow way and applying it for use in a much broader way. That is why the potential test set must also be broadened. We are not talking about frequency distributions within a population, the only thing that would require a large population sample size, we're talking about using *your* method to detect (blind) the differences that *you* clearly hear...sighted. Yeah, and if a whole population hears them "blind" they are real. That's the *only* way you can tell if they are real. That is ludicrous. 
You are saying, with that statement, that if I, for example, can discriminate A and B in 60 of 60 blind, level matched trials, the only way to verify that there really *is* a difference is to increase the test population size. If you really believe that, then I certainly can't help you. No, I would accept that as a real difference for you. And therefore in all probability a real difference, although perhaps one that only a few percent of the population might hear. If so, it would also show up in a monadic test if the test sample is large enough (it only takes a few percent far off the centerline of the bell-curve to create significant differences). But.. I am talking about differences that audiophiles claim to hear that don't show up in ABX testing. If they are real, they will show up in monadic perceptual testing, since it is a "cleaner" test (e.g. fewer intervening variables versus normal listening). If there are no differences, they won't show up. But before we can conclude that ABX will also pick up these differences (and I'm talking here about things like depth of soundstage, transparency, holography, etc.) we have to know if they are "real" (statistically) under conditions approximating relaxed home listening. Conditions that are a far cry from normal ABX-type testing. ABX has been validated for volume threshold detection and other volume-related artifacts; it has not been validated for other possible perception differences or the open-ended evaluation of audio components claimed to be heard under normal home use conditions. snip Again, irrelevant. This approach presupposes that presence of difference is an unknown, and/or that the frequency of detection capability within the population is not known, neither of which is the case here. Once again, you are using the test in question as the standard, rather than trying to independently confirm it for the purpose under question. Circular reasoning. Where on Earth did you get that from??? 
"Neither of which is the case here" does not refer to ABX or results therefrom, it refers to the presence of a large population of audiophiles (90% right?) who already *easily* and *reliably* discriminate between cables, amps, etc., using *some* method. We do not need to 'poll' the population, as it were, we need only verify extant observations using the same methods (blind) used to make those observations initially. I made no claims that 90% of audiophiles can easily and reliably discrimate. I said the 90% don't buy into abx testing as a valid means of evaluating the musicality of audio components. The proposed use of monadic testing as a control is to determine that in fact such differences can be discrimated by enough audiophiles under more ideal musical listening and test conditions than ABX to serve as a benchmark for ABX testing. I'm not talking about volume or frequency-response aberrations here. Then you know the difference is real (albeit perceived subjectively). *You* already say that you know the difference is real, right? That's the point, and that's why this is mere obfuscation. I say no such thing. I say the first step is to use monadic testing to determine if in fact the difference is real. OK, sorry, my mistake. You've never said that you could hear differences between cables, amps, CD players...right. Apology accepted. Let me be perfectly clear...I said *if* the monadic phase *shows* a statistically-significant difference then you *know* it is real (even if conventional measurements don't show differences in frequency response or volume to explain it). This gets at the difficulty EE's in particular have with accepting potential differnences. If it is *real* it shows up; if it is "not* it doesn't. Obviously, we would only want to use a control test where a "real" difference showed up that is not volume or frequency related. 
You then use an ABX test among a broad sample of yet another similarly-screened group of people, using short-snippets of movements X,Y,Z, A fictitious constraint you gratuitously apply, yet again. Not at all. You want to validate the test technique, so you've got to do it once each among a broadscale group so you are testing the technique, not any one individual, and using short snippets, the way it is almost always done because of fatigue/time constraints. The constraints are totally realistic. If there is always time to do leisurely sighted evaluations, then clearly there is time to do leisurely AB, ABX, etc. testing. If you are really interested. Not so. For the monadic test...each person does just one test. For the comparative tests...they must do fifteen to twenty. For this to be practical, reality dictates (and actual practice entails) that short musical snippets are used for the testing since for open-ended evaluation of musical reproduction several types of musical example must be used. to see if in total they can detect the difference. If the test allows them to do so, you have validated the test. Right, you would have validated your 'monadic' testing for detection of 'difference', as that is what the ABX protocol is designed to do. Not at all, since you claim the ABX is *the most sensitive* test...if it shows up in the monadic ABX should pick it up...if it doesn't the test is no good. For "difference", remember? Right. Let me be more clear. If it shows up as a statistically significant "difference" in the monadic test (between two cells employing the two equipment variables under test) then it *should* show up in ABX as a significant difference. If it does not, you have invalidated the test. Clearly incorrect. This presupposes the superiority of the 'monadic' test protocol. You get statistically significant results from *BOTH* tests, and your conclusion is "see...ABX doesn't work!". Sorry, but that just ignores basic statistical precepts. 
You cannot discount one method just because it gives the results you don't want, when it has the same level of significance (albeit with results in the opposite direction) as your 'pet' method. Sorry yourself. See my comments just above. Which, to the extent they are not erroneous, are irrelevant to the statement. If you have two methods that give different results, to the same level of significance, you cannot *just* choose the one you like. That's BASIC statistics. You need a referee test, or barring that, a clear determination of the root cause of the disparity (i.e. flawed assumptions or execution for one, or both, methods). Right. That's why I have proposed the classic monadic, descriptive test as the control....because it is the least interruptive type of test that can be done and therefore there are fewer potentially intervening and contaminating variables. And obviously, all the tests must be executed properly and with proper controls. snip Whether you use one person, or a thousand, makes no difference whatsoever. Sure it does. It takes the artificially constrained test apparatus out of the equation. If the difference is real, and the sample size is large enough, the monadic test will reveal it. Then it is simply a matter of whether or not ABX does the same. *If* these 'artificial constraints' significantly affect subject response, as you claim, then you will *not* find it, no matter what the sample size. You will have added another independent variable (i.e. individual response to the 'constraint') that could inhibit or enhance the probe response, and you won't know which, or if either, is the case. We're talking discrimination here (i.e. a binary probe/response case), not preference, wherein you can allow multiple independent variables and, using a multivariate analysis tool such as RSM, identify response as well as interactions of variables. You can do the same with monadic testing...exactly the same thing. 
You just need larger sample sizes for a given level of significance. If *one* person using *any* blind protocol consistently identifies a difference, then that difference is real (to the level of significance the data allows). *Then* you can compare against ABX, or any other method. This is a test that you, personally, could easily conduct if you were truly interested. So where's your data? Again, you are assuming the test is valid, rather than validating the test. Totally circular. Again, you totally ignore the reality of the situation. I'm saying you have to have valid *Data* to first suspect, then validate the test, i.e. an observation under controlled, blind conditions, using *ANY method OTHER than ABX* that can be used to challenge ABX test. This whole "validate the test" pogrom appears to me to be simple misdirection. Let's recap the basic situation: 1. We have a subject population that can discriminate (according to them) between A and B under sighted conditions. We do not need to sample a large population for 'preference' as your monadic test scenario would, we have a ready-made subject base, each with their own 'method' they have "validated" sighted. 2. We can recreate *all* of the conditions 'they' normally use to make those discriminations (duration, configuration, location, relaxation, prestidigitation...etc.) with the *sole* exception of foreknowledge of whether they listen to A or to B. 3. If, under the *exact same conditions + blind*, 'they' can no longer discriminate, then their initial discrimination results are unsubstantiated, and must be assumed to be invalid. 4. If, under the *exact same conditions + blind*, they *CAN* discriminate, then their initial discrimination results are confirmed. 5. If subsequent testing of confirmed discrimination results, via ABX, results in a null response, *then* the method is inappropriate for that use. So you see, *the* method you decry is not even part of this scenario until Step 5. 
I performed this test with both cables and DAC's; Step 3 got me. You, Harry Lavo, a single individual, could perform such a test, at your leisure, and prove us all wrong (if you're right). So what's the problem? Why all the waffling on about methods? If you have a method that you think works for discrimination, then use it blind and see. It's really that simple. The sequence you describe is exactly the one I have outlined, Keith. Except that steps 1-4 would be carried out among a large population and step 2 would be combined with step 3 and step 4 (although it could be done exactly as you outline). However, I am not interested in what the group *thinks* they can hear sighted, I want to know *that* they can hear it blind in a monadic test. I don't want to measure "phantom" differences...first I want to establish that there is in fact a "real" difference blind that can be discriminated by a relatively large group of people. But I want to use a non-intrusive test to do it. *Then* do an ABX test wherein a similarly large group of people use ABX, *except* that instead of a few doing 17 samples, seventeen (for each) do one sample. This separates the test (e.g. ABX vs. monadic) from the individual doing the testing. The final test once ABX is validated is then for individuals to use it themselves, twenty times each if preferred. |
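Keith's claim that one listener with enough trials yields statistical significance is just the tail of a binomial distribution, and readers can check the arithmetic themselves. Below is a minimal sketch (Python standard library only; the 16-trial numbers are illustrative choices, not figures taken from anyone's post):

```python
from math import comb

def binomial_p_value(correct: int, trials: int, p_chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: the probability of scoring
    `correct` or more out of `trials` by guessing alone."""
    return sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
               for k in range(correct, trials + 1))

# One listener, 16 blind trials: 12 correct clears the usual 0.05
# significance level; 10 correct does not.
print(round(binomial_p_value(12, 16), 4))  # 0.0384
print(round(binomial_p_value(10, 16), 4))  # 0.2272
```

Note that nothing in the calculation depends on how many *people* participated; only the number of trials matters, which is the substance of the "one participant, multiple trials" argument.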
#143
Harry Lavo wrote:
wrote in message ... Harry Lavo wrote: Re Harman: They use sequential monadic testing. Because they found it squared better with their objective tests. Evidence, please. Based on hearsay at this point, but I will be getting their reprints. I should have inserted the word "apparently" after "because" and before "they". Thought so. A much more logical reason why they would have used monadic tests is that they were comparing dozens of speakers at a time. To do match-pair comparisons of all of them would have taken well-nigh forever. I think you'll be disappointed when you see those reprints. And note, once again, that they were doing *preference* tests, not *discrimination* tests. You still haven't come up with a single example of anyone using monadic testing for discrimination. bob |
#147
On 9 Oct 2005 21:44:59 GMT, "Harry Lavo" wrote:
"Stewart Pinkerton" wrote in message ... On 9 Oct 2005 00:50:05 GMT, "Harry Lavo" wrote: Again, nobody uses monadic tests to determine *whether* there's a difference. Nobody. And you know everything there is to know about this, right? Including what every audio research lab in the country is up to? Yes, right up until you can provide *evidence* to the contrary. And let's just keep this to properly established companies with genuine R&D facilities, shall we? Peter Qvortrup et al don't count - especially as you seem to be hung up on 'credibility'! Well, actually Harman has already apparently found that monadic tests give them more useful information for understanding what goes on between the actual reproduction of music by audio equipment and its perception by listeners. Well actually no it hasn't, not for determing *difference*. Only *after* difference has been established do they move to *preference* testing, where monadic testing is certainly appropriate. Not e hoever that it is necessary to *first* establish difference. Without difference, preference is nonsensical. I think you would find that similar tests on, say, CD players would yield somewhat similar results, but with less variation and larger sample sizes required. I'm not sure about amplifiers or cables...but a validation test would certainly find out now, wouldn't it? Harry, ABX is perfectly valid - that's why the professionals use it. The armchair quarter-backing of those who don't like the results it gives, will not alter this fact. -- Stewart Pinkerton | Music is Art - Audio is Engineering |
#148
wrote in message ...
Harry Lavo wrote: wrote in message ... Harry Lavo wrote: Re Harman: They use sequential monadic testing. Because they found it squared better with their objective tests. Evidence, please. Based on hearsay at this point, but I will be getting their reprints. I should have inserted the word "apparently" after "because" and before "they". Thought so. A much more logical reason why they would have used monadic tests is that they were comparing dozens of speakers at a time. To do match-pair comparisons of all of them would have taken well-nigh forever. I think you'll be disappointed when you see those reprints. And note, once again, that they were doing *preference* tests, not *discrimination* tests. You still haven't come up with a single example of anyone using monadic testing for discrimination. You don't seem to understand that *preference* tests are matched pair tests. But their descriptive evaluations were monadic. |
#149
wrote in message ...
wrote: Harry Lavo wrote: wrote in message ... Harry Lavo wrote: Re Harman: They use sequential monadic testing. Because they found it squared better with their objective tests. Evidence, please. Based on hearsay at this point, but I will be getting their reprints. I should have inserted the word "apparently" after "because" and before "they". Thought so. A much more logical reason why they would have used monadic tests is that they were comparing dozens of speakers at a time. On second thought... I went and read Sean Olive's three AES papers from 2003-04 and wouldn't you know? He doesn't use monadic testing *at all*. What's worse, he uses *short snippets*. And even *quick switching*. Oh well, Harry, you'll always have Oohashi. bob We were speaking specifically of his latest round of loudspeaker tests, which Sean himself describes as "monadic". Save your sarcasm. |
#150
Harry Lavo wrote:
"Keith Hughes" wrote in message ... Harry Lavo wrote: snip Where Harry? Please show us how "using whatever *blind* methodology you would like" constrains you to "ABX", or even intimates that ABX is involved. Because ABX and its cousins are the only tests that use repeated comparisons among individual users. That is audiometric testing, not social science testing such as preference testing (AB) and descriptive testing (monadic, or comparative monadic). And this is audio, Harry, not social science. When you do taste testing for foods, you are free to rely on *all* organoleptic perceptual components as that *is* the context of use. That's why this type of test is not suitable for verification of *audible* differences - it is not designed to control the organoleptic components that contribute to a "preference". There's a whole different world of testing out there that you apparently are not familiar with. The same applies to you as well, obviously. So what? We're talking about a narrow subject here. No, *you* are talking about a narrow subject. You are also talking about taking a test developed for use in a narrow way and applying it for use in a much broader way. That is why the potential test set must also be broadened. Now, *We* are...audio. snip No I would accept that as a real difference for you. And therefore in all probablility a real difference although perhaps one that only a few percent of the population might hear. Thank you for making my point. Your test is a population distribution test, *not* a discrimination test. In the scenario I presented, the results were not real *for me*, they were real. From that point, we can discuss investigative ways for determining cause and extent. One need continue with a larger sample size *ONLY* if distribution within the population is of interest. 
If so, it would also show up in a monadic test if the test sample is large enough (it only takes a few percent far off the centerline of the bell-curve to create significant differences) But.. If it were just a few percent off the centerline, there would be a very low significance, buried in the noise. Tukey's test would likely identify them as outliers. I am talking about differences that audiophiles claim to hear that don't show up in ABX testing. If they are real, they will show up in monadic perceptual testing, since it is a "cleaner" test (e.g. fewer intervening variables versus normal listening). If there are no differences, they won't show up. But before we can conclude that ABX will also pick up these differences (and I'm talking here about things like depth of soundstage, transparency, holography, etc.) we have to know if they are "real" (statistically) under conditions approximating relaxed home listening. Conditions that are a far cry from normal ABX-type testing. Yes, you're talking about differences that have not been demonstrated under *any* test scenario other than sighted. snip I made no claims that 90% of audiophiles can easily and reliably discriminate. You claimed that 90% of audiophiles believed, counter to the objectivists, that components sounded different, and could be distinguished. The "easily and reliably" reflects the opinions typically espoused here, a la our radioactive buddy. I said the 90% don't buy into abx testing as a valid means of evaluating the musicality of audio components. And thus, the belief in ability to discriminate is based solely on sighted evaluations, right? snip 1. We have a subject population that can discriminate (according to them) between A and B under sighted conditions. We do not need to sample a large population for 'preference' as your monadic test scenario would, we have a ready-made subject base, each with their own 'method' they have "validated" sighted. 2. 
We can recreate *all* of the conditions 'they' normally use to make those discriminations (duration, configuration, location, relaxation, prestidigitation...etc.) with the *sole* exception of foreknowledge of whether they listen to A or to B. 3. If, under the *exact same conditions + blind*, 'they' can no longer discriminate, then their initial discrimination results are unsubstantiated, and must be assumed to be invalid. 4. If, under the *exact same conditions + blind*, they *CAN* discriminate, then their initial discrimination results are confirmed. 5. If subsequent testing of confirmed discrimination results, via ABX, results in a null response, *then* the method is inappropriate for that use. snip The sequence you describe is exactly the one I have outlined, Keith. Except that steps 1-4 would be carried out among a large population and step 2 would be combined with step 3 and step 4 (although it could be done exactly as you outline). Except that each person would conduct only *one* trial, which of course introduces another huge source of error (albeit random, *if* properly executed)...the reason that a huge sample size is needed for that type of test. Not nearly the "clean" test you claim. However, I am not interested in what the group *thinks* they can hear sighted, I want to know *that* they can hear it blind in a monadic test. Well, you clearly are interested Harry, because the *only* reason to question whether the boundaries of ABX testing can extend beyond where you believe it to be 'validated', is the presence of anecdotal evidence based on sighted evaluation. What other reasons are there (non-phenomenological that is)? I don't want to measure "phantom" differences...first I want to establish that there is in fact a "real" difference blind that can be discriminated by a relatively large group of people. But I want to use a non-intrusive test to do it. Well, first, you don't know that the test is "non-intrusive" to a greater extent than is ABX. 
You merely assume that it is. Second, you have *no* validation of your monadic testing for detecting *audible* differences. Further, you assume that audio and organoleptic perceptions are testable in the same fashion, and that the intrusiveness of any particular test constraint apply equally to both (these are implicit in your belief that the monadic test is suitable). Neither of which has been verified. Thus, you are using an unverified method as a reference against which to 'validate' a test that has been at least partially validated, by your own admission, for the sensory mode under test. An untenable approach IMO. *Then* do an ABX test wherein a similarly large group of people use ABX, *except* that instead of a few doing 17 samples, seventeen (for each) do one sample. This separates the test (e.g. ABX vs. monadic) from the individual doing the testing. No, Harry, it does not "separate the test from the individual" at all. That process is the same - you *assume* that multiple presentations are inherently more intrusive, and thus data-correlative, than is a single presentation, something I believe you have no data to support, relative to audio. There's a very valid reason that repetitive trials for organoleptic perception testing are problematic - the senses quickly become habituated, and discrimination ability is reduced. AFAIK, barring fatigue (or high volume related artifacts which should, of course, be controlled for in the test), this has not been shown to be an issue with auditory testing. The final test once ABX is validated is then for individuals to use it themselves, twenty times each if preferred. Again, Harry, you want to use a test that has not shown *any* utility for audible testing for a reference to 'validate' one that has. Your call to "validate the test" applies to an even greater measure to the test you propose as the reference. Keith Hughes |
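The sample-size question both sides keep circling — one listener doing many trials versus a one-trial-per-person monadic design — can be put in numbers with the standard normal-approximation formula for a one-sided binomial test. This is a sketch under assumed significance and power targets (alpha = 0.05 one-sided, 80% power), which no post in the thread actually specifies:

```python
from math import sqrt, ceil

def trials_needed(p_true: float, p_chance: float = 0.5,
                  z_alpha: float = 1.6449, z_power: float = 0.8416) -> int:
    """Normal-approximation sample size for a one-sided binomial test:
    how many blind observations are needed to detect a listener (or
    population) whose true hit rate is p_true. The z defaults
    correspond to one-sided alpha=0.05 and 80% power."""
    num = (z_alpha * sqrt(p_chance * (1 - p_chance))
           + z_power * sqrt(p_true * (1 - p_true)))
    return ceil((num / (p_true - p_chance)) ** 2)

print(trials_needed(0.70))  # 37 observations
print(trials_needed(0.55))  # 617 observations: a subtle effect needs far more data
```

The formula counts *observations*, indifferent to whether 37 observations come from one listener doing 37 trials or 37 listeners doing one trial each; what differs between those designs is the error structure Keith describes (between-subject variability in the one-trial case), not the raw count.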
#151
On 8 Oct 2005 17:16:07 GMT, Stewart Pinkerton wrote:
[snip] Read my lips - wire is wire. Wait a minute, now I'm confused: I thought we were arguing about "blind" testing, not "deaf" testing.... :-) -alan -- Alan Hoyle - - http://www.alanhoyle.com/ "I don't want the world, I just want your half." -TMBG Get Horizontal, Play Ultimate. |
#152
"Harry Lavo" wrote in message
... "Steven Sullivan" wrote in message 1) Do ABX tests ever yield a 'difference' result for two *certainly* identical sources (e.g., a phantom switch)? No, they don't. 2) Do 'sighted' listening tests ever yield a 'difference' result in such tests? Yes, they do. Nobody is arguing in this case for sighted tests. You are using it as a strawman. And points 1) and 2) have nothing to do with validating abx, since the suspicion against it has nothing to do with either. I don't suppose you would be willing to make a suggestion as to how one might "validate" the reliability of an ABX test--at least to your satisfaction? IOW, what procedure that would be convincing to an ABX doubter? I've always figured that double blind testing, of which ABX is an example, as being the way to validate OTHER tests--the gold standard as it were. I'm willing to devote a substantial amount of time to doing a satisfactory job of validating the ABX concept, but only if it will convince the skeptics. Norm Strong |
#153
On 9 Oct 2005 21:12:16 GMT, "Harry Lavo" wrote:
wrote in message ... Harry Lavo wrote: "Stewart Pinkerton" wrote in message ... On 7 Oct 2005 21:54:54 GMT, "Harry Lavo" wrote: Stewart Pinkerton wrote: That's how Science works. You observe something unusual, come up with a theory to explain it, use that theory to predict something else, and observe the truth or falsity of your prediction. The light-bending experiment came *after* the theory. What was it he "observed" that led to the theory? He developed his theory from the Lorentzian interpretation of Maxwell's work, and arguably had prior knowledge of the Michelson-Morley experimental results. He had many giants on whose shoulders to stand.... Okay, you've shown you can dazzle. Now please interpret what "observations" he developed his theory to explain. Hi Harry, Einstein was trying to explain the results of the Michelson-Morley experiment, which failed to find a difference in travel time between two perpendicular light beams--strange, because it was expected that the absolute motion of the Earth combined with the theory that light travelled through an absolutely fixed ether would lead to different travel times. Einstein's wildly brilliant solution was to propose that there is no absolute motion, that the speed of light looks the same to all observers. The point of Stewart & Bob is that Einstein's theory was based on troublesome observations. And, they say, there are no "troublesome observations" in audio; they have a way to detect if differences are audible, and this is consistent with the reigning theory of the ear's function. Where I think they are wrong is basing their model on the assumption that the ear and brain can be observed objectively without regard to observations carried out on the inside (observing one's own perception and listening to others describe their perceptions). So they end up with a model that describes the ear and brain very well---under one set of conditions. 
Secondly, I think they have "no troublesome observations" because they invoke perceptual illusion to explain away any observation they don't like, while at the same time admitting there are too many contributing factors to explain any given perception--it "could be" illusion, so it "must be" illusion; but we can never explain why any particular illusion occurred. Mike Thanks, Michael, for the actual explanation, which apparently was too mundane for Stewart to attempt. Actually, the moderator bounced my explanation, as he felt you might find my comments hurtful........................ :-) We seem in basic agreement that the big problem is that musical interpretation by the ear/brain is totally subjective and can only be described from within. We may differ in where the implications go from there (although perhaps not). But until proponents of short-snippet comparative testing can find some way of validating that their test does not interfere with normal musical cognition, which it seems to many, myself included, it does, then it will not be accepted by many/most audiophiles. The test is validated every day in the R&D labs of major players in the audio industry. That you three don't *like* the results it gives, e.g. that wire is just wire, doesn't invalidate the test. You are the ones making the extraordinary claims, so where is *your* evidence in support? -- Stewart Pinkerton | Music is Art - Audio is Engineering |
#154
"Stewart Pinkerton" wrote in message
... On 9 Oct 2005 21:44:59 GMT, "Harry Lavo" wrote: snip Well actually no it hasn't, not for determing *difference*. Only *after* difference has been established do they move to *preference* testing, where monadic testing is certainly appropriate. Not e hoever that it is necessary to *first* establish difference. Without difference, preference is nonsensical. I've been told that their preference tests discriminated better than their discrimination tests, which is why they didn't use them, but I won't state that as fact until I've obtained and read the test write-ups myself. If that finding is actually true, it would cast grave doubt on abx, since preference tests are simple blind AB tests. |
#155
Harry Lavo wrote:
But their descriptive evaluations were monadic. Only if "monadic" means whatever Harry Lavo wants it to mean at any given moment. You're the one who's lectured us for two years about how standard ABX tests were insufficiently sensitive because they forced the listener into "comparative mode." Well what the hell kind of mode do you think a listener is in when he's evaluating four speakers at once, changing at 15-30 second intervals, with 3-second gaps between changes? Try reading the research before you start pontificating about it, Harry. bob |
#156
Keith Hughes wrote:
Harry Lavo wrote: I don't want to measure "phantom" differences...first I want to establish that there is in fact a "real" difference blind that can be discriminated by a relatively large group of people. But I want to use a non-intrusive test to do it. Well, first, you don't know that the test is "non-intrusive" to a greater extent than is ABX. You merely assume that it is. Actually, Harry's "test," such as it is, would be far more intrusive than an ABX test, because it forces subjects to listen to and for the things Harry thinks they should be listening to and for. Whereas an ABX test doesn't ask you to do that; it allows you to listen however you would if you were deciding which of two cables to purchase. Second, you have *no* validation of your monadic testing for detecting *audible* differences. Further, you assume that audio and organoleptic perceptions are testable in the same fashion, and that the intrusiveness of any particular test constraint apply equally to both (these are implicit in your belief that the monadic test is suitable). Neither of which has been verified. Thus, you are using an unverified method as a reference against which to 'validate' a test that has been at least partially validated, by your own admission, for the sensory mode under test. An untenable approach IMO. Yeah, the whole idea that a standard listening test needs to be "validated" against a test that's never in history been used for that purpose is absurd. bob |
#157
wrote in message
... "Harry Lavo" wrote in message ... "Steven Sullivan" wrote in message 1) Do ABX tests ever yield a 'difference' result for two *certainly* identical sources (e.g., a phantom switch)? No, they don't. 2) Do 'sighted' listening tests ever yield a 'difference' result in such tests? Yes, they do. Nobody is arguing in this case for sighted tests. You are using it as a strawman. And points 1) and 2) have nothing to do with validating abx, since the suspicion against it has nothing to do with either. I don't suppose you would be willing to make a suggestion as to how one might "validate" the reliability of an ABX test--at least to your satisfaction? IOW, what procedure that would be convincing to an ABX doubter? I've always figured that double blind testing, of which ABX is an example, as being the way to validate OTHER tests--the gold standard as it were. I'm willing to devote a substantial amount of time to doing a satisfactory job of validating the ABX concept, but only if it will convince the skeptics. Norm, what do you think my half dozen last posts have been about? They've been about how to go about validating ABX...by using broadbased monadic testing and full musical excerpts to identify a subtle yet real difference unrelated to frequency response or signal levels. This test is the test least likely to interfere with normal listening habits and therefore the one most likely to catch such differences if they exist. Once such a difference is confirmed, then the test would be replicated using ABX techniques. If ABX picks up the difference, it is validated. And if it is validated, many of us will stop arguing and start using it. If it is not validated, hopefully some objectivists would consider abandoning it. |
#158
"Keith Hughes" wrote in message
... Harry Lavo wrote: "Keith Hughes" wrote in message ... Harry Lavo wrote: snip Where, Harry? Please show us how "using whatever *blind* methodology you would like" constrains you to "ABX", or even intimates that ABX is involved. Because ABX and its cousins are the only tests that use repeated comparisons among individual users. That is audiometric testing, not social science testing such as preference testing (AB) and descriptive testing (monadic, or comparative monadic). And this is audio, Harry, not social science. When you do taste testing for foods, you are free to rely on *all* organoleptic perceptual components as that *is* the context of use. That's why this type of test is not suitable for verification of *audible* differences - it is not designed to control the organoleptic components that contribute to a "preference". There is a heavy social science side to audio that has been ignored....for the interpretation of sound, particularly musical value judgements (i.e. is the bass "right", does the orchestra sound "lifelike") are subjective judgements that can only be reported by people....no different than people reporting whether they liked a certain food, or color, or flavor, or thought a certain imitation sour cream mix tasted "almost like the real thing". Testing that ignores this aspect of the audiophile experience is automatically suspect, and that is one of the reasons why the ABX test is not embraced by most audiophiles. There's a whole different world of testing out there that you apparently are not familiar with. The same applies to you as well, obviously. So what? We're talking about a narrow subject here. No, *you* are talking about a narrow subject. You are also talking about taking a test developed for use in a narrow way and applying it for use in a much broader way. That is why the potential test set must also be broadened. Now, *we* are...audio. snip Your irony escapes me. No, I would accept that as a real difference for you.
And therefore in all probability a real difference, although perhaps one that only a few percent of the population might hear. Thank you for making my point. Your test is a population distribution test, *not* a discrimination test. In the scenario I presented, the results were not real *for me*, they were real. From that point, we can discuss investigative ways for determining cause and extent. One need continue with a larger sample size *ONLY* if distribution within the population is of interest. Sorry, Keith, there are very sophisticated probability measurements to determine significance between two distributed populations, and if the difference exists (and the population of testers is large enough) it will be picked up. Furthermore, the percentage of people hearing the same thing is likely to be higher because the test is less demanding and more closely approximates normal listening. If so, it would also show up in a monadic test if the test sample is large enough (it only takes a few percent far off the centerline of the bell curve to create significant differences). But... If it were just a few percent off the centerline, there would be a very low significance, buried in the noise. Tukey's test would likely identify them as outliers. In measuring probabilities against a null hypothesis in distributed samples, if there are outliers there is a reason for them....that's one of the beauties of using a distributed population. There is virtually no chance of a true outlier screwing up the results. I am talking about differences that audiophiles claim to hear that don't show up in ABX testing. If they are real, they will show up in monadic perceptual testing, since it is a "cleaner" test (e.g. fewer intervening variables versus normal listening). If there are no differences, they won't show up. But before we can conclude that ABX will also pick up these differences (and I'm talking here about things like depth of soundstage, transparency, holography, etc.)
we have to know if they are "real" (statistically) under conditions approximating relaxed home listening. Conditions that are a far cry from normal ABX-type testing. Yes, you're talking about differences that have not been demonstrated under *any* test scenario other than sighted. First, how many published component tests are you citing to support this fact? Cite them, please. And of those, how many have *not* been ABX tests? In other words, how do you determine that the differences not being found are not the result of the test technique and environment itself? snip I made no claims that 90% of audiophiles can easily and reliably discriminate. You claimed that 90% of audiophiles believed, counter to the objectivists, that components sounded different, and could be distinguished. The "easily and reliably" reflects the opinions typically espoused here, a la our radioactive buddy. Thank you for toning down your claim. I said the 90% don't buy into ABX testing as a valid means of evaluating the musicality of audio components. And thus, the belief in ability to discriminate is based solely on sighted evaluations, right? Wrong. ABX tests are only one kind of blind test. To the best of my knowledge nobody in the audio industry has yet had the motivation or resources to undertake the kind of validation testing I have proposed. That is another kind. Simple AB preference tests, done blind, are a third. I can probably name another four or six variations on these. Again, my argument is not with blind testing, other than its practicality for home use in the purchase of equipment. My problem is with short-snippet, comparative testing, of which ABX is the leading example. And the same goes for most other opponents whose position I have run into here on usenet. Your suggestion that we oppose blind testing is a strawman that is often used on usenet to avoid engaging over the real issues raised against ABX and its ilk. snip 1.
We have a subject population that can discriminate (according to them) between A and B under sighted conditions. We do not need to sample a large population for 'preference' as your monadic test scenario would, we have a ready-made subject base, each with their own 'method' they have "validated" sighted. 2. We can recreate *all* of the conditions 'they' normally use to make those discriminations (duration, configuration, location, relaxation, prestidigitation...etc.) with the *sole* exception of foreknowledge of whether they listen to A or to B. 3. If, under the *exact same conditions + blind*, 'they' can no longer discriminate, then their initial discrimination results are unsubstantiated, and must be assumed to be invalid. 4. If, under the *exact same conditions + blind*, they *CAN* discriminate, then their initial discrimination results are confirmed. 5. If subsequent testing of confirmed discrimination results, via ABX, results in a null response, *then* the method is inappropriate for that use. snip The sequence you describe is exactly the one I have outlined, Keith. Except that steps 1-4 would be carried out among a large population and step 2 would be combined with step 3 and step 4 (although it could be done exactly as you outline). Except that each person would conduct only *one* trial, which of course introduces another huge source of error (albeit random, *if* properly executed)...the reason that a huge sample size is needed for that type of test. Not nearly the "clean" test you claim. However, I am not interested in what the group *thinks* they can hear sighted, I want to know *that* they can hear it blind in a monadic test. Well, you clearly are interested Harry, because the *only* reason to question whether the boundaries of ABX testing can extend beyond where you believe it to be 'validated', is the presence of anecdotal evidence based on sighted evaluation. What other reasons are there (non-phenomenological that is)? 
I'm not saying I'm not interested in the question, I'm saying that for purposes of establishing a control test it is irrelevant. The control must be both perceived and "real" in the sense of being able to be measured with statistical significance in monadic testing. Things that are perceived but are not real are totally irrelevant to the necessary control. The purpose of the control test is to see if ABX testing can pick up real differences that are not volume or frequency response related. I don't want to measure "phantom" differences...first I want to establish that there is in fact a "real" difference blind that can be discriminated by a relatively large group of people. But I want to use a non-intrusive test to do it. Well, first, you don't know that the test is "non-intrusive" to a greater extent than is ABX. You merely assume that it is. Second, you have *no* validation of your monadic testing for detecting *audible* differences. Further, you assume that audio and organoleptic perceptions are testable in the same fashion, and that the intrusiveness of any particular test constraint applies equally to both (these are implicit in your belief that the monadic test is suitable). Neither of which has been verified. Thus, you are using an unverified method as a reference against which to 'validate' a test that has been at least partially validated, by your own admission, for the sensory mode under test. An untenable approach IMO. Absolutely, I don't know. But I do know test design, and there has been plenty of discussion about how people listen here in arguing these issues...even Arnie's 10 criteria deal with some of them. I can certainly say that the monadic test as proposed comes a lot closer than does ABX. First, it only has to be done once, so it can use full segments of music, establishing musical context and allowing time for differences to surface. Second, it does not require active comparison at all.
It simply requires normal audiophile-type reactions to the music and the sound. Third, any "rating" is done after the listening is over, not during it, and is based on recall...recall that can take into account perceptions both acute and vague, as well as feeling-states. Compare that to ABX, where one must somehow get fifteen to twenty choices made, with multiple comparisons to be made before each choice, using an intervening box thus changing the physical parameters of the test setup, and of necessity (because of the time/stress factor) using only short snippets of music. Very relaxing and enjoyable, right? The monadic testing approximates as much as possible a normal listening environment and lets statistics tell us about differences in after-the-fact reported impressions/ratings. The ABX forces us to make constant, rational comparisons in a rush to beat fatigue. *Then* do an ABX test wherein a similarly large group of people use ABX, *except* that instead of a few doing 17 samples, seventeen (for each) do one sample. This separates the test (e.g. ABX vs. monadic) from the individual doing the testing. No, Harry, it does not "separate the test from the individual" at all. That process is the same - you *assume* that multiple presentations are inherently more intrusive, and thus data-correlative, than is a single presentation, something I believe you have no data to support, relative to audio. There's a very valid reason that repetitive trials for organoleptic perception testing are problematic - the senses quickly become habituated, and discrimination ability is reduced. AFAIK, barring fatigue (or high-volume-related artifacts which should, of course, be controlled for in the test), this has not been shown to be an issue with auditory testing. You must be kidding. This is one of the things most commented upon by people using the technique...how quickly they lose the ability to discriminate along with growing fatigue.
Even those using and supporting the test often report it as fatiguing and rather grueling, with a sense of great uncertainty developing in the late stages. The ITU guidelines even comment upon this aspect of the test, as one of the reasons for limiting the number of trials. The final test once ABX is validated is then for individuals to use it themselves, twenty times each if preferred. Again, Harry, you want to use a test that has not shown *any* utility for audible testing as a reference to 'validate' one that has. Your call to "validate the test" applies in even greater measure to the test you propose as the reference. Keith, I am trying along with others here to show why ABX should be viewed with some skepticism for the purpose of open-ended evaluation of audio components, until and unless it is validated. It is that simple.
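The "sophisticated probability measurements to determine significance between two distributed populations" and the Tukey outlier screening invoked in this post can be sketched concretely. The following is only an illustration (Python, with invented ratings, not data from any real listening test): two independent listener groups each rate one component on a 1-10 scale, Tukey's fences flag outliers, and a Welch two-sample t statistic tests whether the mean ratings differ.

```python
# Sketch only: invented ratings for two monadic cells, one group
# hearing component P and one hearing component Q, each rating
# "transparency" on a 1-10 scale.
import math
import statistics

p_ratings = [7, 8, 6, 9, 7, 8, 7, 6, 8, 9, 7, 8]
q_ratings = [6, 7, 5, 6, 7, 6, 5, 7, 6, 6, 7, 5]

def tukey_outliers(data):
    """Points outside Tukey's fences (1.5 * IQR beyond the quartiles)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < lo or x > hi]

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

print(tukey_outliers(p_ratings))   # -> [] (no outliers in this invented data)
print(welch_t(p_ratings, q_ratings))
```

With real data, the t statistic would be compared against the t distribution at the Welch-adjusted degrees of freedom. The point is only that "significance between two distributed populations" is an ordinary two-sample test, one that needs a large cell size precisely because, in the monadic design, each subject contributes a single observation.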
#159
Harry Lavo wrote:
We were speaking specifically of his latest round of loudspeaker tests, which Sean himself describes as "monadic". Either provide a quote and citation, or admit you are making this up. bob
#160