#1
Arny's PCABX site
I played around with PCABX. The training room was unavailable, so I proceeded to the easiest sample I could find, which was the Bryston 2BLP amplifier, reference versus 5-time pass through. I played the "trumpet" clips. The sound was harsh, unlifelike, and non-musical. None of the features in a signal that I normally use to evaluate equipment were present... there was no beauty, there was no beauty in the decay. There was no musical shape, no way to sense dynamics, no ebb and flow, no meter, nothing to make the toe tap.

It was very hard to hear any difference between A and B, but I thought A was a tad brighter. I did 13 trials and really didn't feel sure I knew at all what was going on. I did not have the clips repeat; I manually hit stop before each clip was done. When the clips repeated, the effect was harsh and unmusical. I basically listened to the clips once and paused a few seconds, then listened to the next. The "instant-switch" feature was not instantly switching; there was a long pause which interrupted the flow, and the next clip came on abruptly. My impression after switching like that was always that the abrupt switch disturbed my ability to hear anything significant. So I didn't do it that way.

I scored 10/13 or 7.5% probability of guessing. So I must have been hearing something, although I was never sure what I was doing. If this is representative of the industry's general science, no wonder its conclusions seem to have little to do with sound quality in normal listening. Even after I thought I could tell the difference between those clips, I had absolutely no way to refer that to a normal listening experience.

The objectivists sometimes say that we should use blind tests to tell if there is a difference, then sighted tests to pick a preference. If you do that, you definitely won't be picking your preference on the basis of sound alone. What you heard in the blind test may not relate in any way to what you hear in normal listening, or even be audible in normal listening.

It's kind of ironic that somebody would say you need to know what something sounds like on the basis of sound alone, but then decide what component to buy based on its looks.

-Mike
#2
X-Newsgroups: rec.audio.high-end
In article you wrote:

I played around with PCABX. The training room was unavailable, so I proceeded to the easiest sample I could find, which was the Bryston 2BLP amplifier, reference versus 5-time pass through. I played the "trumpet" clips. The sound was harsh, unlifelike, and non-musical. None of the features in a signal that I normally use to evaluate equipment were present... there was no beauty, there was no beauty in the decay. There was no musical shape, no way to sense dynamics, no ebb and flow, no meter, nothing to make the toe tap. It was very hard to hear any difference between A and B, but I thought A was a tad brighter. I did 13 trials and really didn't feel sure I knew at all what was going on. I did not have the clips repeat; I manually hit stop before each clip was done. When the clips repeated, the effect was harsh and unmusical. I basically listened to the clips once and paused a few seconds, then listened to the next. The "instant-switch" feature was not instantly switching; there was a long pause which interrupted the flow, and the next clip came on abruptly. My impression after switching like that was always that the abrupt switch disturbed my ability to hear anything significant. So I didn't do it that way. I scored 10/13 or 7.5% probability of guessing. So I must have been hearing something, although I was never sure what I was doing. If this is representative of the industry's general science, no wonder its conclusions seem to have little to do with sound quality in normal listening. Even after I thought I could tell the difference between those clips, I had absolutely no way to refer that to a normal listening experience.

PCABX is one software ABX implementation; there are others. See the list at http://www.hydrogenaudio.org/forums/...16&mode=linear

Software ABX is most commonly used for codec development, where it has proven extremely useful. Explore hydrogenaudio.org for more details. You can fairly easily set up your own tests using more 'musical' or 'natural' test material, if you like.

Btw, 10/13 translates to just a hair less than the p=0.05 value preferred by most scientists; in fact if we round to two places it *is* 0.05. Sixteen or more trials would give somewhat more robust statistics.

The objectivists sometimes say that we should use blind tests to tell if there is a difference, then sighted tests to pick a preference. If you do that, you definitely won't be picking your preference on the basis of sound alone.

And any knowledgeable objectivist knows that. If you want to be sure you've established preferences on sound alone, that's got to be a DBT too. That's how they do it with speakers at Harman/JBL, for example.

What you heard in the blind test may not relate in any way to what you hear in normal listening, or even be audible in normal listening. It's kind of ironic that somebody would say you need to know what something sounds like on the basis of sound alone, but then decide what component to buy based on its looks.

Not at all! If they are of the opinion that certain things are likely to sound alike -- based on 'sound alone' tests like DBT as well as on physical and engineering principles -- then it's *entirely* rational to decide what to buy based on *looks*, price, etc.

And if ABX didn't relate to normal listening, then it would never reliably 'work'. There would be no correlation between, say, magnitude of difference and ability to discern it in ABX -- at any level. Yet there is. "Training" for comparative listening involves identifying levels of difference that pass an ABX, then decreasing the difference. Eventually a threshold level is reached where the detection rate is no better than chance. We don't see detection rate *increasing* as level difference goes down, for example. If blind testing was so grossly inappropriate and unreliable a way to test for audible difference, we would.

Whereas with sighted tests, we can of course rather easily influence the listener to score 'differences' where *none* exist... or even to have them report differences getting *larger* when they are in fact decreasing. And we can rather easily reverse the bias of the responses too. The responses can be *demonstrably* divorced from the sound. We can do this by letting the listener 'know' what they are listening to by some means *other than* just listening.

--
-S
It's not my business to do intelligent work. -- D. Rumsfeld, testifying before the House Armed Services Committee
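For anyone who wants to check the 10/13 arithmetic discussed above, here is a minimal Python sketch of an exact one-sided binomial calculation. It assumes every trial is an independent 50/50 guess under the null hypothesis; the function names `p_at_least` and `min_correct_for` are made up for illustration and are not part of PCABX. Under that assumption, 10 or more correct out of 13 works out to 378/8192, about 0.046, which agrees with the "just a hair less than 0.05" figure.

```python
from math import comb


def p_at_least(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Chance of scoring `correct` or better in `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def min_correct_for(trials: int, alpha: float = 0.05):
    """Smallest score whose guessing probability is at or below `alpha`.

    Returns None when even a perfect score cannot reach `alpha`.
    """
    for score in range(trials + 1):
        if p_at_least(score, trials) <= alpha:
            return score
    return None


if __name__ == "__main__":
    print(f"P(>=10 of 13 by guessing) = {p_at_least(10, 13):.4f}")
    for n in (13, 16):
        print(f"{n} trials: need at least {min_correct_for(n)} correct")
```

With the same assumptions, a 16-trial run needs at least 12 correct to get under 0.05, so a longer run mostly buys finer resolution around the cutoff, not an easier target.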
#3
On 7 Apr 2005 01:09:48 GMT, "Michael Mossey" wrote:

The objectivists sometimes say that we should use blind tests to tell if there is a difference, then sighted tests to pick a preference. If you do that, you definitely won't be picking your preference on the basis of sound alone.

No one ever suggested that a *buying* decision should be made *only* on the basis of sound quality.

What you heard in the blind test may not relate in any way to what you hear in normal listening, or even be audible in normal listening. It's kind of ironic that somebody would say you need to know what something sounds like on the basis of sound alone, but then decide what component to buy based on its looks.

Nothing ironic about it. I'd love an Oracle CD player, but I seriously doubt that it sounds any different than my Pioneer DV-575A. Now, if blind testing had shown that it *did* sound different, and that in fact it sounded *worse* (common enough in 'high end' gear), that would be a different matter.

--
Stewart Pinkerton | Music is Art - Audio is Engineering
#4
Steven Sullivan wrote:
X-Newsgroups: rec.audio.high-end
In article you wrote:

PCABX is one software ABX implementation; there are others. See the list at http://www.hydrogenaudio.org/forums/...16&mode=linear

Software ABX is most commonly used for codec development, where it has proven extremely useful. Explore hydrogenaudio.org for more details. You can fairly easily set up your own tests using more 'musical' or 'natural' test material, if you like. Btw, 10/13 translates to just a hair less than the p=0.05 value preferred by most scientists; in fact if we round to two places it *is* 0.05. Sixteen or more trials would give somewhat more robust statistics.

The objectivists sometimes say that we should use blind tests to tell if there is a difference, then sighted tests to pick a preference. If you do that, you definitely won't be picking your preference on the basis of sound alone.

And any knowledgeable objectivist knows that. If you want to be sure you've established preferences on sound alone, that's got to be a DBT too. That's how they do it with speakers at Harman/JBL, for example.

What you heard in the blind test may not relate in any way to what you hear in normal listening, or even be audible in normal listening. It's kind of ironic that somebody would say you need to know what something sounds like on the basis of sound alone, but then decide what component to buy based on its looks.

Not at all! If they are of the opinion that certain things are likely to sound alike -- based on 'sound alone' tests like DBT as well as on physical and engineering principles -- then it's *entirely* rational to decide what to buy based on *looks*, price, etc.

Alright, I will concede you can buy based on anything you like.

And if ABX didn't relate to normal listening, then it would never reliably 'work'. There would be no correlation between, say, magnitude of difference and ability to discern it in ABX -- at any level. Yet there is. "Training" for comparative listening involves identifying levels of difference that pass an ABX, then decreasing the difference. Eventually a threshold level is reached where the detection rate is no better than chance. We don't see detection rate *increasing* as level difference goes down, for example. If blind testing was so grossly inappropriate and unreliable a way to test for audible difference, we would.

I'm glad you are making an argument here and not just bashing my original point. I don't quite follow you, however. It seems to me that you are saying ABX provides data that is consistent with a model of the ear, i.e. "greater signal in" provides "greater signal out." However, anything I've suggested about normal listening is consistent with that model.

Whereas with sighted tests, we can of course rather easily influence the listener to score 'differences' where *none* exist... or even to have them report differences getting *larger* when they are in fact decreasing. And we can rather easily reverse the bias of the responses too. The responses can be *demonstrably* divorced from the sound. We can do this by letting the listener 'know' what they are listening to by some means *other than* just listening.

I don't put much faith in sighted listening, nor have I ever.

-Mike
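The threshold argument quoted above (detection falls toward chance as the difference shrinks, and never rises) can be pictured with a toy model. This is only a sketch under loud assumptions: the Weibull-shaped curve, its `threshold_db` and `steepness` values, and the function names are all hypothetical and not fitted to any real listening data. The one standard ingredient is the guessing correction: a listener who truly hears the difference with probability d, and otherwise flips a coin, is expected to be right d + (1 - d)/2 of the time in ABX.

```python
import math


def p_truly_heard(level_diff_db: float,
                  threshold_db: float = 0.5,
                  steepness: float = 2.0) -> float:
    """Hypothetical Weibull-shaped chance that a level difference is heard.

    Both parameters are illustrative assumptions, not measured values.
    The curve is exactly 0 at zero difference and rises toward 1.
    """
    return 1.0 - math.exp(-((level_diff_db / threshold_db) ** steepness))


def expected_abx_hit_rate(level_diff_db: float) -> float:
    """Expected fraction correct: heard trials plus coin-flip on the rest."""
    d = p_truly_heard(level_diff_db)
    return d + (1.0 - d) * 0.5


if __name__ == "__main__":
    for diff in (2.0, 1.0, 0.5, 0.25, 0.1, 0.0):
        print(f"{diff:4.2f} dB -> {expected_abx_hit_rate(diff):.3f}")
```

In this model the hit rate only ever slides down toward 0.5 as the difference is reduced, which is the pattern the post says is actually observed in ABX training.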
#5
"Steven Sullivan" wrote in message ...

X-Newsgroups: rec.audio.high-end
In article you wrote:

Btw, 10/13 translates to just a hair less than the p=0.05 value preferred by most scientists;

This isn't quite the case. The 0.05 level of significance is not so much preferred as a long-established rule of thumb. Recalling that the phrase "rule of thumb" is said to describe Solomon's maximum allowed diameter for the stick a husband could use to beat his wife, the rule of thumb may not make sense in a contemporary situation. In situations where a "miss" may have high consequences, the tolerance may be much tighter. Where we are talking about relatively low risk, like guiding a consumer purchase, it may be much looser. Personally, if the audio component wasn't overly expensive, I'd be thrilled to buy something where it met a .75 tolerance level.

BTW, thanks for your interesting comments on the psychology of hearing. There's nothing more refreshing than an open mind.
#6
Michael Mossey wrote:
Steven Sullivan wrote:

levels of difference that pass an ABX, then decreasing the difference. Eventually a threshold level is reached where the detection rate is no better than chance. We don't see detection rate *increasing* as level difference goes down, for example. If blind testing was so grossly inappropriate and unreliable a way to test for audible difference, we would.

I'm glad you are making an argument here and not just bashing my original point. I don't quite follow you, however. It seems to me that you are saying ABX provides data that is consistent with a model of the ear, i.e. "greater signal in" provides "greater signal out." However, anything I've suggested about normal listening is consistent with that model.

Normal listening is consistent when differences are objectively 'gross' -- e.g., when two different songs are played. But there comes a point for every listener where differences enter the range where psychological noise competes with signal. At this point 'sighted' reports of difference start to become no more reliable than guessing. To filter out the noise, we have to thwart the psychological sources of the noise. This is called double blind testing. You might have heard of it.

Whereas with sighted tests, we can of course rather easily influence the listener to score 'differences' where *none* exist... or even to have them report differences getting *larger* when they are in fact decreasing. And we can rather easily reverse the bias of the responses too. The responses can be *demonstrably* divorced from the sound. We can do this by letting the listener 'know' what they are listening to by some means *other than* just listening.

I don't put much faith in sighted listening, nor have I ever.

Your skepticism of blind testing is unsupported by anything *except* sighted results.

--
-S
It's not my business to do intelligent work. -- D. Rumsfeld, testifying before the House Armed Services Committee
#7
Steven Sullivan wrote:
Michael Mossey wrote:

Steven Sullivan wrote:

levels of difference that pass an ABX, then decreasing the difference. Eventually a threshold level is reached where the detection rate is no better than chance. We don't see detection rate *increasing* as level difference goes down, for example. If blind testing was so grossly inappropriate and unreliable a way to test for audible difference, we would.

I'm glad you are making an argument here and not just bashing my original point. I don't quite follow you, however. It seems to me that you are saying ABX provides data that is consistent with a model of the ear, i.e. "greater signal in" provides "greater signal out." However, anything I've suggested about normal listening is consistent with that model.

Normal listening is consistent when differences are objectively 'gross' -- e.g., when two different songs are played. But there comes a point for every listener where differences enter the range where psychological noise competes with signal. At this point 'sighted' reports of difference start to become no more reliable than guessing. To filter out the noise, we have to thwart the psychological sources of the noise. This is called double blind testing. You might have heard of it.

Hang on-- double blind testing doesn't remove psychological noise. I don't think that's the right way to say it. I think what you mean is that it removes the *bias* from sighted results. In double-blind tests, people still sometimes end up guessing randomly. The noise is still there.

In fact, during my recent blind tests I observed that expectation can form during blind testing. Expectation is a kind of psychological noise. In fact, I hypothesize that a double-blind test which reduces internal "expecting" during the test would lead to a more accurate result.

Sighted reports are probably much *more* reliable than guessing. They reliably tell you which component looks more impressive.

Whereas with sighted tests, we can of course rather easily influence the listener to score 'differences' where *none* exist... or even to have them report differences getting *larger* when they are in fact decreasing. And we can rather easily reverse the bias of the responses too. The responses can be *demonstrably* divorced from the sound. We can do this by letting the listener 'know' what they are listening to by some means *other than* just listening.

I don't put much faith in sighted listening, nor have I ever.

Your skepticism of blind testing is unsupported by anything *except* sighted results.

My skepticism is not about *blind* testing--it is about specific techniques. The skepticism *originates* in intuition and introspection about the experience of listening to something blind. Last I checked, intuition was a fine place for skepticism to *originate*. Testing comes next. If I win the lottery or something, I'll do all the necessary tests.

-Mike
#8
Michael Mossey wrote:
Steven Sullivan wrote:

Michael Mossey wrote:

Steven Sullivan wrote:

levels of difference that pass an ABX, then decreasing the difference. Eventually a threshold level is reached where the detection rate is no better than chance. We don't see detection rate *increasing* as level difference goes down, for example. If blind testing was so grossly inappropriate and unreliable a way to test for audible difference, we would.

I'm glad you are making an argument here and not just bashing my original point. I don't quite follow you, however. It seems to me that you are saying ABX provides data that is consistent with a model of the ear, i.e. "greater signal in" provides "greater signal out." However, anything I've suggested about normal listening is consistent with that model.

Normal listening is consistent when differences are objectively 'gross' -- e.g., when two different songs are played. But there comes a point for every listener where differences enter the range where psychological noise competes with signal. At this point 'sighted' reports of difference start to become no more reliable than guessing. To filter out the noise, we have to thwart the psychological sources of the noise. This is called double blind testing. You might have heard of it.

Hang on-- double blind testing doesn't remove psychological noise. I don't think that's the right way to say it. I think what you mean is that it removes the *bias* from sighted results. In double-blind tests, people still sometimes end up guessing randomly. The noise is still there.

Bias is a form of psychological noise. If subjects are 'consciously' guessing randomly, then they should stop the test right there. They are no longer hearing a difference, so what's there to test?

In fact, during my recent blind tests I observed that expectation can form during blind testing. Expectation is a kind of psychological noise. In fact, I hypothesize that a double-blind test which reduces internal "expecting" during the test would lead to a more accurate result.

You can do that by randomizing the trials, and by not presenting results until the end of the test.

Sighted reports are probably much *more* reliable than guessing. They reliably tell you which component looks more impressive.

Sighted reports tend to be internally consistent, but for substantiation of real difference, they tend to stink.

Whereas with sighted tests, we can of course rather easily influence the listener to score 'differences' where *none* exist... or even to have them report differences getting *larger* when they are in fact decreasing. And we can rather easily reverse the bias of the responses too. The responses can be *demonstrably* divorced from the sound. We can do this by letting the listener 'know' what they are listening to by some means *other than* just listening.

I don't put much faith in sighted listening, nor have I ever.

Your skepticism of blind testing is unsupported by anything *except* sighted results.

My skepticism is not about *blind* testing--it is about specific techniques. The skepticism *originates* in intuition and introspection about the experience of listening to something blind. Last I checked, intuition was a fine place for skepticism to *originate*. Testing comes next. If I win the lottery or something, I'll do all the necessary tests.

I'm skeptical of your intuition.

--
-S
It's not my business to do intelligent work. -- D. Rumsfeld, testifying before the House Armed Services Committee
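The suggestion above (randomize the trials, and hold back results until the end) can be sketched as a small session loop. This is a simulation only, not how PCABX or any other tool is implemented: `run_abx_session` and `simulated_listener` are hypothetical names, and the "listener" is a stand-in that hears the difference with a chosen probability and otherwise flips a coin. A real test would play audio and collect a keypress where `respond` is called.

```python
import random


def run_abx_session(n_trials: int, respond, rng: random.Random) -> int:
    """Run ABX trials and return the score only after the last trial.

    X is drawn fresh and independently for every trial, so earlier trials
    give no hint about later ones, and nothing is reported mid-session.
    """
    correct = 0
    for _ in range(n_trials):
        x = rng.choice("AB")      # hidden identity of X for this trial
        answer = respond(x)       # listener answers "A" or "B"
        if answer == x:
            correct += 1          # tallied silently, no per-trial feedback
    return correct


def simulated_listener(p_heard: float, rng: random.Random):
    """Hypothetical listener: right when the difference is heard, else a coin flip."""
    def respond(x: str) -> str:
        if rng.random() < p_heard:
            return x
        return rng.choice("AB")
    return respond


if __name__ == "__main__":
    rng = random.Random(2005)     # fixed seed only so the demo is repeatable
    for p in (1.0, 0.5, 0.0):
        score = run_abx_session(16, simulated_listener(p, rng), rng)
        print(f"p_heard={p:.1f}: {score}/16 correct")
```

Because the score is returned once at the end, a listener cannot build up the kind of mid-test expectation described earlier in the thread from running feedback.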
#9
On 10 Apr 2005 18:16:33 GMT, Steven Sullivan wrote:
Michael Mossey wrote:

Steven Sullivan wrote:

Bias is a form of psychological noise. If subjects are 'consciously' guessing randomly, then they should stop the test right there. They are no longer hearing a difference, so what's there to test?

Well, it's always possible that they are deluding themselves into believing they hear no difference and are merely guessing when in fact they actually are hearing a difference and are not really guessing. After all, people like me who believe that wires all sound the same might still actually be hearing a difference but allowing our biases to convince us that we weren't. I see no harm in continuing the test on the admittedly slight possibility that this is the case.

We do have instances where people who were brain damaged so that they could not consciously see one half of their visual field were nevertheless highly accurate when asked to simply guess what the object in question might be. They thought they were guessing, but the evidence showed that they weren't.

Ed Seedhouse,
Victoria, B.C.