#81
The end of the DBT debate?
"watch king" wrote in message
... I have to totally disagree with MKuller and since I have had a number of direct emails to me that seem to totally misunderstand (or actually distort and twist) many of the things I've said, I'd like to clarify what I have been saying. There is a factor which should be considered when discussing testing. Double blind testing doesn't require a number of comparative switches back and forth, back and forth or forth and back, to make the choice as to which test item is better. The reason the shorter test comparison times should be used is that it magnifies ANY difference of quality, subtle or otherwise, that exist between one test item and another. By charting the statistical significance of variations in listener results, relatively short comparison periods are also the most sensitive test method that can be used to determine when there was NO audible difference between one test item and another. When designing blind tests, the least useful method of blind testing was the "unlimited time", one day to the next method. In this method the listener was only presented with the volume control and nothing else. They could listen for as long as they chose one day and then using the same program material in the same sequence, they would listen to the other item being tested (or perhaps the same item over again). What happened was when the length of the test was increased, the ability to hear subtle differences between items was reduced. So the ability of the listeners to hear subtle differences when listeners compared two items for a minute or two 50 times in a row was much greater than if the listeners were able to listen for 5 minutes or 50 minutes or all day and then had exactly the same amount of time to listen to the next item in the test for an equivalent amount of time at equivalent loudness levels. With a thoughtful choice of high quality program material and accurate level balancing it is far easier for any ear, trained or otherwise to pick out even the most subtle difference between one audio product and another with these relatively shorter listening comparison times (5-15 seconds for each product). With challenging material the switch time between products can be as short as 5 seconds, 5 seconds, 5 seconds etc. and audiences trained or untrained, marveled at their ability to hear even the tiniest of differences between products. I never cared one way or the other which style of testing was used to show that one items sounded better than another. I was willing to run the tests whichever way showed the ability to highlight the subtle differences better because this would allow for the most differences to be heard. It just turned out that during many tests used to design the final tests to be used under certifiable conditions, more differences and more subtle differences were always more audible using totally blind tests and relatively short switches with challenging audio material. Testing professionals who were able to take as long as they wanted to determine whether one product sounded better than another, the results were always similar but the test listeners were able to be more certain of quality differences and they were able to make their determinations more quickly using comparisons in the 5-15 second range, than any other time interval. All the testing showed that professionally trained listeners or untrained listeners had less certainty in their decisions the longer the comparison period was extended. 
In fact many of the most biased listeners became frustrated and tried to force themselves to point out differences even when those differences weren't there, because the products tested one day were the same as the day before. So it all comes down to the blindness factor. This is the part that reduces ego and listening bias the most. No matter how long listeners want to make comparisons (even up to a week if they so choose), the reality is that they will be less likely to make clear choices the longer the comparison periods run. This difficulty in making choices, unless they can color their judgement by knowing which device is playing, is what forces biased listeners to require knowing which item is playing in order to choose the product their ego investment requires. It was incredibly embarrassing and irritating to golden-eared audiophiles that they would often choose the product they had publicly called poorer, or could make no distinction at all, when they couldn't know which items were being tested. The only control the test listeners could use during these long-term tests was the volume control. And the loudness levels through the test program were recorded, so that for the rest of the tests the volume levels were kept the same during the same sections of the material.

Not only are double, triple or more blind tests the best way to determine differences in quality between one audio product and another, it seems this kind of testing is the only way to detect the truly subtle differences. The only thing knowing which item is playing helps with is biasing the results to back up what the listener's emotional ego investment says the choice should be. This is why credible listening comparisons of wire are so maddening for those who claim there are differences in the sound of two wires of equal gauge (absent frequency-response-modifying devices). Run any such test for 6 days. Let the listeners choose comparison times as long or as short as they like. Make sure to run product A against itself some of the time, and the same for B; that is, the sequence should sometimes compare A to A and B to B. There have been many listening tests run with comparison periods ranging from 10 seconds per presentation to a full day with one item and then a day with another (or perhaps the same) item, and the listeners' scoring was always random. In other words, if the listeners aren't shown how to be biased in favor of one product over another, they usually don't hear a difference with wire. It is only when they are forced to make sighted decisions to back up their own pronouncements that they choose the product they claimed was better.

It is peculiar that anyone can find anything I've written in this forum that supports the value of any kind of listening test except blind listening tests. True, the tests have to be designed well. A lot of test music and natural sound has to be reviewed to find passages that will most easily identify the differences between products. The listening rooms have to be made as neutral as possible. There are many factors which go into a well designed and well run listening test. But the only way to get useful listening test results is if the products are tested totally blind.
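A protocol like the one just described--randomized short comparisons salted with A-vs-A and B-vs-B "null" trials--is easy to make concrete. Here is a minimal Python sketch of a schedule generator; the 25% null-trial rate and the 10-second presentation time are invented for illustration, not taken from the actual tests:

import random

def make_schedule(n_trials: int, null_fraction: float = 0.25, seed: int = 1):
    """Build a randomized blind-comparison schedule with occasional null trials."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_trials):
        if rng.random() < null_fraction:
            item = rng.choice(["A", "B"])
            schedule.append((item, item))   # same item twice: no real difference
        else:
            pair = ["A", "B"]
            rng.shuffle(pair)               # randomize presentation order
            schedule.append(tuple(pair))
    return schedule

for first, second in make_schedule(10):
    print(f"play {first} for 10 s, then {second} for 10 s, then vote")

Listeners who reliably report differences on the null pairs are exhibiting exactly the bias the blind protocol is meant to expose.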
There is no reason to force listeners to make choices using relatively small time increments, but all but the most biased listeners agree, after trying various time increments, that the shorter the comparison time feasible, the better the test is at highlighting clear differences between products, which listeners can then "vote for" with the greatest confidence. I hope that now there are no misunderstandings about what I have said concerning the tens of thousands of listening tests I've supervised. Sighted tests are always the worst way to do listening tests, and blind tests are the only ones of any value, except to those who have an ego stake in the results to the point where they will deny the results of tests that don't match their own claims.

Having worked with many, many manufacturers over the years, and many consumers up to super-consumers, the best designers of equipment and of systems were always those who were totally dispassionate about whether their product or someone else's sounded better. That let these designers market only products that were cost effective, while throwing away most of the other products they designed that were needed for "product line completion". Sometimes designers would lose sight of this reality because they began to believe all the things the sycophants around them were saying, and that is about the time when their products stopped being cost effective. Until their egos got in the way, or they became too proud to accept criticism or reality, most great audio product designers could accept that their product might not sound as good as someone else's, as long as it sold for less money. And these same designers could also admit that their product might not be worth marketing at all if it was more expensive but did not demonstrate an audible improvement over what was in the market already.

The problem occurs when designers who don't care whether their products are better still need to sell them. Worse yet, of course, are designers who sell products they know are inferior but more expensive. These people have to sell their products to pay mortgages, or even buy food for their families, but they likely don't care about consumers at all, considering exploitable consumers to be only "pockets with cash" which needs to be extracted. I've often heard manufacturers say that they are such good salespeople that even if there is no real audible benefit to their product, they can make enough consumers believe there is a difference to keep their businesses afloat. If enough well designed and well managed blind listening tests were run, more than half of the audio equipment manufacturers in today's market would be forced out of business: either their products wouldn't be worth what they charge, or the quality claims they've made for long periods of time would be proven false. This is the major reason why many manufacturers stoke the "flames of confusion" in the minds of the audio buying public. Their quality claims are nonsense, and they just don't want this proven to the buying public.

When Disney hired me, a year after the ESS listening tests, to run tests that would be a factor (33%) in choosing loudspeakers for EPCOT and Tokyo, they were certain I really didn't care whose products came out sounding the best. Disney thoroughly tested every product many manufacturers made, and some of the products from a few specialty manufacturers.
Often the listening and scientific test results went totally against market myth and manufacturer hype. Disney used a number of lesser known items and some very popular items. In every case the result was that EPCOT sounded better, to both engineers and the untrained listening public, than any theme park had ever sounded. Tokyo Disneyland sounded much better than the original in some cases, and exactly the same as Anaheim Disneyland in cases where the sound couldn't be improved. I never cared who won any of the listening tests I've managed. Not knowing the participants' identities in these tests (blind testing) was the only way the best sounding equipment could be determined when there were quality differences, and also the only way to accurately determine when there was no difference in the way a group of products sounded.

What turned out to be quite unfortunate for ESS was that one of their less expensive speakers sounded much better than almost any other speaker in the world, including their more expensive speakers. This turned out to be exactly the kind of information that a super-consumer like Disney wanted to find. Eventually many products were tested this way. That's because this kind of blind testing highlights only which products sound better. What gets sifted out is which company has better salespeople, or which company has more media allies. It eliminates the possibility that by making something look impressive, a listener can be biased into believing that the product sounds better. And blind testing is the only way to find out which products are truly overpriced compared to others. So for any audiophile who wants the best sound and doesn't have an unlimited budget to buy everything on the market, blind listening tests are the only way to spend a given budget most effectively.

Watchking
"Listening isn't a competitive activity, buying equipment is." We don't get enough sand in our glass.

Did you really do "tens of thousands of tests"? By my calculations, if you ran three tests a day, that would be 750 in a typical working year, or about thirteen years' worth of non-stop three-times-a-day testing. Certainly possible, but an awful lot. Can you comment on how the tests accumulated? Are you talking about the organization's testing, or testing you personally designed and supervised? More importantly, giving the benefit of the doubt that you've run lots and lots of tests, could you please provide a little more info. Were these tests a-b preference tests? It sounds like it. Round-robin among competing units/brands? Or were they "evaluative" tests, e.g. which had silkier highs, which had deeper bass, etc.? What statistical criteria were used to determine significance, how large was the typical test panel, and how many "trials" were typically used? Was "better" a criterion you established beforehand, given the purpose of the piece of gear (e.g. which speaker had the most even dispersion in the listening area)? Or was it open-ended, whatever emerged as "better" to the listening panel? When testing speakers, how did you handle the physical placement of the speakers to assure comparability? How did you handle the listening environment for different types of speakers?
#82
The end of the DBT debate?
#83
Yet another DBT post
#84
The End of the DBT Debate?
Nousaine wrote:
"Harry Lavo" wrote: ...snip..... However, if i did want to do a comparative blind test, I would want it to be an a-b, not an abx. And I would want it to follow hard on the heels of several hours of warm-up listening, where I had firmly reestablished those signatures in mind before "going blind". And I would want to use the same music I had just been listening to and control the switching. And I would want to do it alone with no chance of cheating built into the test. This kind of implies that you need to "re-learn" to sound of your system from day to day; doesn't it? What I find totally fascinating is the idea that you need several hours of warm-up testing before you can identify "signatures" or differences, even though it is your own set-up. What about all the previous careful note-taking? Isn't that supposed to help you pinpoint those signatures or differences? And what about those "wife in the kitchen can hear it immediately" kind of differences, which we have seen many audiophiles claim when cables have been switched? If it indeed takes several hours of listening before differences can be established, that runs counter to just about every equipment review I have read, where reviewers typically notice those changes or differences immediately. Also, it would make the consumers' lives a lot easier: everything sounds the same if you never listen for more than a couple of hours in one sitting! |
#86
Yet another DBT post
"Harry Lavo" wrote in message
... For open ended evaluation, you don't know initially what you are looking for. It may take days for things to gel so that "a" sounds somewhat this way, and "b" sounds somewhat more that way, from extended, evaluative listening and non-quick switching. Then a tentative conclusion is drawn. Now you know what you are listening "for". It may be something subtle and perceptual, such as "imaging". Once you have firmly grasped in mind what the signature of "a" is and how it might vary from "b", quick switching can help, precisely because it "interrupts" the perception you have grasped and alters it slightly (or not) over the flow of music. We are talking about open-ended component evaluation. If I simply give you two components and force a choice of "different" or "same", or "is it a" or "is it b", quick switching works against you, because you haven't yet really been able to determine what it is you are listening for in audio terms. "Same" or "different" are not audio terms. They are "sound artifact" terms on one or two simple dimensions. Under quick switching under these circumstances, the brain seems to "panic", in that it can't sort audio patterns quickly and has no frame of reference; this by itself creates anxiety, which in turn creates even more confusion and panic. I believe this is why audiophiles cite stress and fatigue in trying to do this kind of testing when dealing with very subtle, perceptual factors, and why the test favors a "null conclusion" unless we are dealing with straightforward factors that the sensate function can handle without much need for the intuitive or emotional functions (volume, frequency response).

All of the preceding is pure, baseless conjecture.

Do I know this for sure? No. But it is reasonable and verifiable.

Reasonable, it seems to me, only to those who either are unfamiliar with, or choose to ignore, all of the published research into human hearing perception, which offers no evidence for any of it.

That is why I proposed a control test that is double-blind, relaxed, evaluative, and leisurely.

I presume by this you mean something like the test protocol used by Oohashi to "confirm" the existence of his alleged "hypersonic effect," except that you would allow longer listening times than he used.

Along with testing of the same respondents using sighted, evaluative listening, and at another time relatively short, terse, comparative ("same"/"different") double-blind testing as is traditionally recommended here. If the control test gave results similar to traditional dbt/abx, it would verify that traditional dbt/abx testing was a valid "shortcut" to evaluative testing. If the control test gave results similar to sighted open-ended evaluative testing, then it would suggest that evaluative testing, even though sighted, was a more encompassing and valid approach for component evaluation.

I don't think this test, as you envision it, is really possible to perform in the real world. You would need to run the same "evaluative" test multiple times on the same subject (at least twice--once sighted and once blind--even if you used multiple subjects). The problem is that his answers on the first trial would influence him in any subsequent trials (i.e., he'll be looking for the same set of characteristics he has already "heard" once). And if his answers aren't independent, any comparison of those answers would be meaningless. So while I understand what you're trying to get at, and what you hope to prove, you can't get there from here.
So if you really want to stop the "jaw flapping" and try to resolve the differences of the two camps, first you have to acknowledge the possibility that we might have a point, and that it is worth trying to resolve somehow.

Well, no, I don't have to acknowledge any such thing. There is no possibility that you have a point, because there is absolutely no support for any of your conjectures anywhere in the voluminous research on human hearing perception. If you can find such support, I will then be prepared to concede that you may have a point. Also, if you can provide any direct evidence for any of your conjectures, I will concede that you may have a point. But I'm not holding my breath, on either score.

bob
#87
Yet another DBT post
(Nousaine) wrote:
Let's see; exactly "what" do we hear in nature? We hear timbre (frequency) and loudness (volume) over time. What else is there? With recordings we manipulate pitch and level over time to fool our natural hearing. But those three factors are all there is to "hear."

That's like saying red, blue and yellow are the *only* colors we see, because that's all there is. You are focusing on a single dimension - loudness or frequency response - and ignoring everything that is more complex - multi-dimensional, pattern-recognition - because single-dimensional phenomena appear to be all your test can show in the way it's used. What about more complex audible differences? How about dynamic range? How about imaging and soundstage reproduction? What about the *quality* of high frequency or bass reproduction, rather than the *quantity*? How about all of those things above together at one time? We hear those things in nature, do we not?

Regards, Mike
#88
The End of the DBT Debate?
"chung" wrote in message
... Nousaine wrote: "Harry Lavo" wrote: ...snip..... However, if I did want to do a comparative blind test, I would want it to be an a-b, not an abx. And I would want it to follow hard on the heels of several hours of warm-up listening, where I had firmly reestablished those signatures in mind before "going blind". And I would want to use the same music I had just been listening to, and control the switching. And I would want to do it alone, with no chance of cheating built into the test.

This kind of implies that you need to "re-learn" the sound of your system from day to day, doesn't it? What I find totally fascinating is the idea that you need several hours of warm-up listening before you can identify "signatures" or differences, even though it is your own set-up. What about all the previous careful note-taking? Isn't that supposed to help you pinpoint those signatures or differences? And what about those "wife in the kitchen can hear it immediately" kind of differences, which we have seen many audiophiles claim when cables have been switched?

I am talking about getting that sound "in-your-head" firmly enough to serve as a remembered reference through a series of tests that can add confusion if that reference is not firmly there. There is a difference.

If it indeed takes several hours of listening before differences can be established, that runs counter to just about every equipment review I have read, where reviewers typically notice those changes or differences immediately. Also, it would make consumers' lives a lot easier: everything sounds the same if you never listen for more than a couple of hours in one sitting!
#89
Yet another DBT post
"Bob Marcus" wrote in message
news:%JfUb.180300$nt4.779028@attbi_s51... ...snip.....

I presume by this you mean something like the test protocol used by Oohashi to "confirm" the existence of his alleged "hypersonic effect," except that you would allow longer listening times than he used.

yep.

...snip.....

I don't think this test, as you envision it, is really possible to perform in the real world. You would need to run the same "evaluative" test multiple times on the same subject (at least twice--once sighted and once blind--even if you used multiple subjects). The problem is that his answers on the first trial would influence him in any subsequent trials (i.e., he'll be looking for the same set of characteristics he has already "heard" once). And if his answers aren't independent, any comparison of those answers would be meaningless.
So while I understand what you're trying to get at, and what you hope to prove, you can't get there from here.

No problem if the sighted, open-end test is done first...let's say with 16 trials if we are talking one person. I'd actually prefer 100 people doing it once. Then three months later the test is done blind, again with 16 trials or preferably 100 people. Nobody would know or remember the exact scores they gave on the evaluative criteria, and in the second blind test they wouldn't know which was which, so it would hardly be relevant if they did.

...snip.....

Well, no, I don't have to acknowledge any such thing. There is no possibility that you have a point, because there is absolutely no support for any of your conjectures anywhere in the voluminous research on human hearing perception. If you can find such support, I will then be prepared to concede that you may have a point. Also, if you can provide any direct evidence for any of your conjectures, I will concede that you may have a point. But I'm not holding my breath, on either score.

Somebody always has to be first! :-) A fair amount of science was first postulated by "crazies". Maybe crazy, maybe. Or maybe, however remotely, right. But you don't have to think I am right. Enough people have raised similar issues on this forum over the years...enough that anybody really seeking the truth should at least design a control test to knock down the objections. Why do you suppose Oohashi chose that particular form of testing, and why do you suppose he found statistical correlation where conventional theory suggested it shouldn't exist? Does not a better listening test technique suggest itself as a possibility? Oohashi himself makes reference to earlier tests that showed no difference using conventional techniques; that is why his group set the listening test up the way they did, because they suspected that might be one of the factors getting in the way. And the results don't dispute the possibility that he was right.
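For what it's worth, the comparison Harry proposes is straightforward to analyze once the scores are in. Here is a minimal Python sketch, with invented placeholder scores, using a Welch t statistic as one plausible way to ask whether a panel's evaluative ratings shift once listening goes blind; none of this is specified in the thread itself:

import math
import statistics

sighted = [7, 8, 8, 9, 7, 8, 9, 8, 7, 8]   # hypothetical sighted "imaging" scores (1-10)
blind   = [6, 7, 6, 7, 8, 6, 7, 7, 6, 7]   # hypothetical blind-panel scores, months later

def welch_t(x, y):
    """Welch's t statistic for two independent samples with unequal variances."""
    mx, my = statistics.mean(x), statistics.mean(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (mx - my) / math.sqrt(vx / len(x) + vy / len(y))

print(f"Welch t = {welch_t(sighted, blind):.2f}")   # large |t| suggests sighted bias

A result near zero would support Harry's position that the sighted, evaluative judgments survive blinding; a large shift would support Bob's.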
#90
The end of the DBT debate
Well Harry, to begin with, the ESS test itself was sponsored and certified
by various departments inside the test-site universities, like Georgia Tech (for example, Dr. Patronis, psychoacoustics), Univ of Wisc, Univ of Washington & UCLA. There were 10 tests per day of approximately 40 minutes each, with a 3 minute verbal intro and written intros handed to each participant while they waited in line. There were 20 individual test comparisons per test session, comparing one loudspeaker to another. There were three back-and-forth switches for each required decision of X (some number between 1 and 9, randomly chosen each time) and Y (same), or "no difference". There were full seats (12-15, depending on the listening neutrality of the room) at every session after the first morning session, which often had invited or celebrity guests and/or departmental auditors sitting in, often with measurement tools. The tests ran for an average of 6 days at each university site. This makes for over 10,000 individual man/listening tests.

This does not count the numerous pretest set-ups used in Sacramento to shake down the test's ability to function. There was also a preliminary "test" test at UC Davis where the loudspeaker test was run over and over with the same sequences of speakers, but other parts of the system were changed out, like amplifiers, and out of 20 possible musical and voice programs, 10 were chosen for the final testing on the road. There were probably another 3-4,000 individual man/listening tests done to make the test as meaningful as possible, with the broadest variety of music (pop, rock, dance, female vocal, male vocal, symphonic, quartet and show music [orchestral movie themes]). On set-up day in each actual test location there were perhaps 500 individual comparison tests run so that each seat in the listening area could be ear-tested and measured with a full house. Listeners were only given the choice of whether they wanted to sit in the louder seats, average loudness seats or quieter seats. Being "bookshelf" speakers, all speakers were located above head height when seated, for equal low frequency enhancement and the need to "look up" for all listening. During a 2 minute break mid-session, speakers were moved to better align competitors for the rest of the head-to-head tests.

While there were many comparisons that did not produce statistically significant findings, this was because the quality differences between many of the competitors were insignificant. In fact the imaging in every seat using master tapes or other first generation material was incredible. Unfortunately that meant that some people were experiencing some of the best sound they had ever heard, and they could "forget" to record their test comparison results on some musical passages. Statistically, "blanks" didn't count the same as "no difference". A preference rate of at least 3-2 was considered significant, because it meant that 50% more listeners preferred one item over another. Out of dozens of different combination comparisons with 32 loudspeaker models, only about 40 of the comparison tests had results that were statistically significant, although if one standard deviation was used as the decisive criterion, a 4-3 preference made 50 test comparisons significant. The total number of listener tests with a real choice (after eliminating "no difference") had to be at least 171 decisions, and the number of "no difference" votes could never be more than 30%.
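As a rough check on those thresholds: a 3-to-2 split over the stated minimum of 171 real decisions does clear a conventional two-sided binomial test against a 50/50 null. The following Python sketch is my own back-of-envelope version, not the auditing firm's actual method:

from math import comb

def binom_p_two_sided(k: int, n: int) -> float:
    """Exact two-sided binomial p-value for k of n votes under a 50/50 null."""
    k = max(k, n - k)                        # fold to the larger tail
    one_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * one_tail)

n = 171                                      # minimum number of real decisions cited above
k = round(n * 3 / 5)                         # a 3-to-2 preference split: 103 of 171
print(f"{k}/{n} preferences: two-sided p = {binom_p_two_sided(k, n):.4f}")   # ~0.007

So at that sample size the 3-2 rule is comfortably stricter than the usual p < 0.05 criterion; with far fewer decisions the same split would not be.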
No test seats were put outside the claimed dispersion area of any loudspeaker, although celebrity sit-ins often sat without test scoring sheets in outer seats, so everyone could experience even absorption. Test staff also used these non-test seats. The auditing accounting firm was in charge of the actual numerical "significance analysis", although as a former statistics teaching assistant I understood what "significance" meant. 100 preferences vs 70 might be meaningful, but 100 preferences vs 95 likely meant nothing. Obviously 100 preferences vs 25 would make an irrefutable statement.

Since this group of tests was set up to test loudspeakers, that's all that was tested. This helped keep most of the other parameters fixed and unable to vary the results. But in later blind listening tests to sort out equipment for EPCOT or Tokyo Disneyland, hundreds of listeners compared various amplifiers and loudspeakers. Since these items were to be heard inside of "something", this was the best "blind" situation possible. Even theater systems are behind screens, and so were tested that way. While the opinions of the Imagineers were considered most important, the listeners representing the typical visitor to the parks were asked to make comments in addition to direct votes. Every listener was treated like their opinion was valued (which it was), and every listener took their task seriously (which can be difficult when listening to the sound of dancing pixies or whatever). One helpful part of the test was to have portions introduced by people whose voices were then played back through the systems being tested. Luckily a movie company has many professional narrators on staff.

While the total number of direct comparisons might have nearly totaled 20,000 in these two situations, other companies also hired me to run product development tests with certain market segments. The total number of tests run there might have been between 500-1000, and rarely were more than 3 listeners listening at a time.

I don't want to give the impression that I was ever the only engineer managing and designing the tests I speak of. At ESS there was another engineer from inside the company, and one brought in from outside to assist in designing the tests. At Disney I worked on a team of 8 professional engineers. In other cases there were never fewer than 3 engineers involved. It is always better to have multiple engineers involved in designing the tests and setting them up, and the involvement of a music teacher usually helps too, especially in sorting out program material.

The general public with an interest in music, who categorized themselves as "interested enough" in hearing sound equipment, made up the "panel of listeners" used for the ESS tests. Desire definitely played a big part in who took the tests at all. The only "payoff" was a drawing for a free cassette tape each round. People often waited in line for hours. The weather was nice sometimes, but cold or rainy other times (especially during Homecoming week at Univ Wisc Madison). The age of 80% of the campus test group was between 18 and 24 (pre-Walkman and concert deafness syndrome), so the test ears were pretty sharp in this case. The Disney tests had multiple levels of listeners taking tests.
The Imagineers were all trained audio engineers with theater, studio or specialized audio experience (mine being musical sound reinforcement and testing), and the project designers often had years of experience listening to Disney's other attractions, as well as live music, as their reference. But many of the test listeners were "Middle Americans", because these were the people who were Disney's target market.

I would have enjoyed interviewing every listener in the campus tests at length, but this just wasn't feasible. But every day I tried to have 1 or 2 small group discussion sessions to get a handle on what college listeners used as criteria for making their judgments. Music students were the most articulate in this regard. Often they could easily pin down why a certain test using a female voice was so revealing, or how the piano on one loudspeaker sounded more revealing than on another. As I mentioned before, musical preference could make a big difference. More results could have been used, except that some listeners heard little or no difference between test products except with certain types of program material. In other words, sometimes when the test listeners didn't care about certain music, they just didn't want to listen closely.

Of course engineers and theme park fanatics were very articulate about what they liked or didn't like about the products under test for EPCOT. Writing in the test sheet margins was very common. This helped the people who were choosing the EPCOT amps, loudspeakers, custom transformers, and preamp signal processors. Overhead crowd-control speakers were judged more on vocal clarity, whether they were used for musical playback most of the time or not. According to the job the product did, the test results could be judged with proper weighting. But when (as was the case with the French theater whose sound system I was able to design) overall sonic quality was the standard, the best sounding competitor, using the actual program that would be used in the park, was 90% assured of being chosen.

Here are the quality criteria for "home sound" that I have picked up from test subjects before and after they do listening tests. In people's homes, most non-electro-produced music should create a sharp image that extends at least to the outside edges of the left and right loudspeakers, and possibly somewhat beyond. Imaging is often a description of the "conductor's position" in listening to orchestral and choral music: the conductor has to be able to identify clearly any single performer in order to make corrections or changes to balance the totality of the work. In pop music, images are created artificially, but they can be just as clear. Next for home sound, and nearly as important as imaging, the vocals should always be as clear and undistorted as possible, and, with the exception of the image, the voices (especially female voices) should be as natural as possible. Humans are conditioned more by the sound of the female voice during their mental, physical and emotional formation from years 0 to 7 than by any other sound source. To make a voice sound natural in a room, there needs to be almost no hint of spurious sound from a loudspeaker cabinet, internal resonance, or microphonics from tubes. In addition the loudspeaker should have very wide and constant directivity, as well as correct phase alignment in the voice band. These are hard goals to meet.
Beyond those two criteria, sharp imaging and total realism in vocals, people's musical interests weight the other quality criteria differently. Pop listeners, surprisingly, prefer low distortion sound at fairly loud levels, so as not to induce listener fatigue. Some other listeners prefer organ and/or driving rock or other bass guitar oriented music. Extended frequency response with low distortion and low background noise may be more important than ultimate loudness for jazz or folk music listeners, whose instruments are usually acoustic. Etc., etc. Often these criteria can conflict with each other, but not always. Usually it is when speakers are designed "up" to widen bandwidth that they produce poorer images and less vocal realism. But this is only a comment on testing, not a condemnation of "mammoth" speaker designs.

Watchking
"Listening isn't a competitive activity, buying equipment is." We don't get enough sand in our glass.
#91
Yet another DBT post
"Harry Lavo" wrote :
...snip.....

"Bob Marcus" wrote: All of the preceding is pure, baseless conjecture.

Because it doesn't seem to agree with your ideas.

"Harry Lavo" wrote: Do I know this for sure? No. But it is reasonable and verifiable.

"Bob Marcus" wrote: Reasonable, it seems to me, only to those who either are unfamiliar with or choose to ignore all of the published research into human hearing perception, which offers no evidence for any of it.

That is very difficult to believe. The evidence is there - you have chosen to ignore it because it might not fit in with your *beliefs* about blind testing and audible differences. It's pretty clear the objectivists do not have all of the answers about DBTs, or this debate would not be continuing. Convincing yourself is one thing - convincing us sceptics is another - it requires more than continued denial. The dichotomy between evaluating/judging and single-dimension/multiple-dimension testing makes sense to anyone who has ever tried open-ended component evaluation sighted and then the same test blind. If you try paying attention to what is going on in your own head during the process - it's called *self-awareness* - Harry's description above is the best explanation I've seen so far. It also explains why the DBTs which are positive deal with one dimension (loudness, or frequency response with more than 2dB difference) and all the rest are null. The test, in the way it is applied, seems to interfere with identifying the differences you are testing for. You can either continue denying it, or admit there might be a problem and help find a solution with your claimed familiarity with "all of the published research into human hearing perception".

Regards, Mike
#92
Yet another DBT post
Mkuller wrote:
(Nousaine) wrote: Let's see; exactly "what" do we hear in nature? We hear timbre (frequency) and loudness (volume) over time. What else is there? With recordings we manipulate pitch and level over time to fool our natural hearing. But those three factors are all there is to "hear."

That's like saying red, blue and yellow are the *only* colors we see, because that's all there is. You are focusing on one single dimension - loudness or frequency response - and ignoring everything that is more complex - multi-dimensional, pattern-recognition - because single-dimensional phenomena appear to be all your test can show in the way it's used. What about more complex audible differences? How about dynamic range?

That's just the difference between the loud passages and the noise floor of the system. DBT's are excellent for discriminating background noise, and peak loudness before clipping.

How about imaging and soundstage reproduction?

Imaging is one of the easiest things to DBT. When playing A, you should be able to place the sound source (a vocal, or a particular instrument), using a familiar recording. Switch to B, and you should sense either the same, or a different, location for those sources. Just compare X to A and B and pick the one that's closest. Frankly, saying that DBT's cannot discriminate imaging is an excuse.

What about the *quality* of high frequency or bass reproduction, rather than the *quantity*?

What about it? Those are simple frequency response effects. It should be really easy to pinpoint tonal differences of instruments in a DBT/ABX test. Remember all the anecdotes about how something immediately stands out when a component is changed?

How about all of those things above together at one time?

All you need is to match X to A or B on any one of these effects, if you noticed previous differences in sighted listening. How hard is it? Of course, it would be hard if your hard-held beliefs are on the line...

We hear those things in nature, do we not?

Regards, Mike
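The "compare X to A and B" procedure chung describes is simple enough to sketch in a few lines of Python. Everything here is illustrative--the 16-trial count is a common convention, not something specified in the thread, and the simulated listener just guesses:

import random

def simulate_listener(x: str, rng: random.Random) -> str:
    # Placeholder: a listener who genuinely hears no difference can only guess.
    return rng.choice(["A", "B"])

def run_abx(n_trials: int = 16, seed: int = 7) -> int:
    """Run an ABX session: X is secretly A or B each trial; count correct matches."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        x = rng.choice(["A", "B"])          # hidden assignment of X for this trial
        answer = simulate_listener(x, rng)  # the listener's "X sounds like..." vote
        correct += (answer == x)
    return correct

print(run_abx(), "correct out of 16")

Twelve or more correct out of 16 happens by guessing alone less than 4% of the time, so a listener who can really hear an imaging difference should clear that bar easily.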
#93
The End of the DBT Debate?
Harry Lavo wrote:
"chung" wrote in message ... Nousaine wrote: "Harry Lavo" wrote: ...snip..... However, if i did want to do a comparative blind test, I would want it to be an a-b, not an abx. And I would want it to follow hard on the heels of several hours of warm-up listening, where I had firmly reestablished those signatures in mind before "going blind". And I would want to use the same music I had just been listening to and control the switching. And I would want to do it alone with no chance of cheating built into the test. This kind of implies that you need to "re-learn" to sound of your system from day to day; doesn't it? What I find totally fascinating is the idea that you need several hours of warm-up testing before you can identify "signatures" or differences, even though it is your own set-up. What about all the previous careful note-taking? Isn't that supposed to help you pinpoint those signatures or differences? And what about those "wife in the kitchen can hear it immediately" kind of differences, which we have seen many audiophiles claim when cables have been switched? I am talking about getting that sound "in-your-head" firmly enough to serve as a remembered reference through a series of tests that can add confusion if that reference is not firmly there. There is a difference. In a ABX/DBT, you have unlimited time to get the sound "in your head" when you first listen to A and B. And you are saying that after months of open-ended evaluative listening, you still don't have that sound "in your head" firmly enough? How about all that note-taking? And you must have found some passages where the differences (or signatures) are miximized, no? If it indeed takes several hours of listening before differences can be established, that runs counter to just about every equipment review I have read, where reviewers typically notice those changes or differences immediately. Also, it would make the consumers' lives a lot easier: everything sounds the same if you never listen for more than a couple of hours in one sitting! |
#94
The end of the DBT debate
"watch king" wrote in message
... ...snip.....
Beyond those two criteria, sharp imaging and total realism in vocals, people's musical interests weight the other quality criteria differently. Pop listeners, surprisingly, prefer low-distortion sound at fairly loud levels, so as not to induce listener fatigue. Some other listeners prefer organ and/or driving rock or other bass-guitar-oriented music. Extended frequency response with low distortion and low background noise may be more important than ultimate loudness for jazz or folk listeners, whose instruments are usually acoustic. And so on. These criteria can conflict with each other, but not always. Usually it is when speakers are designed "up" to widen bandwidth that they produce poorer images and less vocal realism. But this is only a comment on testing, not a condemnation of "mammoth" speaker designs. Watchking

Thank you for the more detailed description. I and the others here now better understand how and why you tested, and what standards were applied. As for your findings on what people "reacted to" in a home environment, they square completely with my experience, and I would guess with that of most folks here. As a member of that "jazz, acoustic instrument" group, I have never found loudness per se important; I do find a lack of obvious distortion up to reasonably loud living-room levels helpful in preventing fatigue. Imaging and dimensionality are much more important to me, even though I now have a surround system. That is why I always bow out of recommendations to rock lovers and organ fans... I simply do not have the same frame of reference.
#95
Yet another DBT post
(Nousaine) wrote:
Let's see; exactly "what" do we hear in nature? We hear timbre (frequency) and loudness (volume) over time. What else is there? With recordings we manipulate pitch and level over time to fool our natural hearing. But those three factors are all there is to "hear."

Mkuller wrote: That's like saying red, blue and yellow are the *only* colors we see, because that's all there is. You are focusing on a single dimension - loudness or frequency response - and ignoring everything that is more complex - multi-dimensional, pattern-recognition - because single-dimensional phenomena appear to be all your test can show in the way it's used. What about more complex audible differences? How about dynamic range?

chung wrote: That's just the difference between the loud passages and the noise floor of the system. DBT's are excellent for discriminating background noise, and peak loudness before clipping.

No, what you are describing is the *potential* dynamic range. The actual dynamic range is the difference between the softest passage and the loudest. Audio equipment varies in its ability to reproduce a wide dynamic contrast between these two. Have you ever seen *this* differentiated in a DBT? How about imaging and soundstage reproduction?

Imaging is one of the easiest things to DBT. When playing A, you should be able to place the sound source (a vocal, or a particular instrument), using a familiar recording. Switch to B, and you should sense either the same or a different location for those sources. Just compare X to A and B and pick the one that's closest. Frankly, saying that DBT's cannot discriminate imaging is an excuse.

Have you ever seen a DBT differentiate the imaging capabilities of two audio components? What about the *quality* of high-frequency or bass reproduction, rather than the *quantity*?

What about it? Those are simple frequency response effects. It should be really easy to pinpoint tonal differences of instruments in a DBT/ABX test. Remember all the anecdotes about how something immediately stands out when a component is changed?

No, they are not *simple frequency response effects*; that would be "quantity," not "quality." The same goes for *transparency* and *resolution of inner detail*. These are multi-dimensional effects that I have never seen differentiated in a DBT. Have you? How about all of those things above together at one time?

All you need is to match X to A or B on any one of these effects, if you noticed previous differences in sighted listening. How hard is it? Of course, it would be hard if your hard-held beliefs are on the line...

Have you ever seen it done? Who has hard-held beliefs here? What about other things besides loudness, tone and frequency response? We have phase, timing, transient attack, note decay, etc., all multi-dimensional attributes which have never been demonstrated to differentiate audio components in a DBT, but which nonetheless are real audible effects.
Regards,
Mike
#96
Yet another DBT post
"Mkuller" wrote in message
news:iEwUb.229879$xy6.1167040@attbi_s02...

"Harry Lavo" wrote: For open-ended evaluation, you don't know initially what you are looking for. It may take days for things to gel such that "a" sounds somewhat this way, and "b" sounds somewhat more that way - from extended, evaluative listening and non-quick switching. Then a tentative conclusion is drawn. Now you know what you are listening "for". It may be something subtle and perceptual, such as "imaging". Once you have firmly grasped what the signature of "a" is and how it might vary from "b", quick switching can help, precisely because it "interrupts" the perception you have grasped and alters it slightly (or not) over the flow of music. We are talking about open-ended component evaluation. If I simply give you two components, say "different" or "same", or "is it a" or "is it b", and force a choice, quick switching works against you because you haven't yet really been able to determine what it is you are listening for in audio terms. "Same" or "different" are not audio terms. They are "sound artifact" terms on simple one or two dimensions. Under quick switching under these circumstances, the brain seems to "panic" in that it can't sort audio patterns quickly and has no frame of reference; this by itself creates anxiety, which in turn creates even more confusion and panic. I believe this is why audiophiles cite stress and fatigue in trying to do this kind of testing when dealing with very subtle, perceptual factors, and why the test favors a "null conclusion" unless we are dealing with straightforward factors that the sensate function can handle without much need for the intuitive or emotional functions (volume, frequency response).

"Bob Marcus" wrote: All of the preceding is pure, baseless conjecture.

Because it doesn't seem to agree with your ideas.

No. It's because it doesn't agree with a lot of things that we know about perception, cognition, consumer behavior, and, I would add, with a lot of things that we know about psychoacoustics, not to mention electrical engineering. And the differences are not only theoretical, but also methodological. Let's suppose that Harry's "theory" makes sense (which it doesn't). The test that he came up with to test his "theory" is, there is no other word for this, H-O-R-R-I-B-L-E from a scientific point of view. Basically, it's full of confounds whose origins lie in tons of biases (perceptual, judgmental, processing, decision-related) that he has no controls for. How do I know that? Well, you can take my word for it, or you can go to a local library and start reading. Really, there is no other way.

"Harry Lavo" wrote: Do I know this for sure? No. But it is reasonable and verifiable.

"Bob Marcus" wrote: Reasonable, it seems to me, only to those who either are unfamiliar with or choose to ignore all of the published research into human hearing perception, which offers no evidence for any of it.

That is very difficult to believe. The evidence is there - you have to choose *not* to ignore it because it might not fit in with your *beliefs* about blind testing and audible differences. It's pretty clear the objectivists do not have all of the answers about DBTs, or this debate would not be continuing.

This is not the point at all. Nobody has all the answers about anything, but some people have some answers about some things. It would be beneficial if subjectivists made an effort to familiarize themselves with those answers. Basically, what subjectivists are saying is: since 50 (100?) years of multidisciplinary, peer-reviewed research doesn't agree with our unreliable, biased perceptions (and perceptions are *by definition* unreliable and biased), then all those researchers are wrong and we are right. Well, it doesn't work that way. If someone wants to come up with a new theory, his theory has to accommodate the existing theory first and build from there. I'm pretty sure that someone is not going to be a layperson.

Convincing yourself is one thing - convincing us sceptics is another - it requires more than continued denial.

No, no, no... In this story, subjectivists are the believers, and objectivists are the sceptics.

The dichotomy between evaluating/judging

judgment = evaluation!!!!! You think that you can evaluate things without judging them? This is getting really interesting.

and single-dimension/multiple-dimension testing makes sense to anyone who has ever tried open-ended component evaluation sighted and then the same test blind. If you try paying attention to what is going on in your own head during the process - it's called *self-awareness* - Harry's description above is the best explanation I've seen so far.

Theoretically, Harry would like to have it "both ways," so to speak. He would like to use consciously monitored "true" emotions, arising from spontaneous (intuitive, as he would say) gestalt perception of a piece of music, as proxies for evaluation of aural stimuli (sound). At the same time, he needs a lengthy period of time (weeks, months) to do all kinds of "piecemeal" processing of sound (imaging, soundstaging, air, "midband plasticity," bass extension, slam, speed... I am sure you can add many more properties here) before he "arrives," cognitively speaking, at the gestalt. First of all, this is totally impossible, again cognitively speaking. If you feel any emotions after all this time and all the processing you've done, they will be representative of about a zillion things, and not of the sound quality alone. Second, all this processing is, more or less, biased to a very high degree for a large number of reasons (and Harry has no single control for any of them). I have described this in some detail in one of my previous posts, but none of the subjectivists felt the urge to respond with any plausible arguments. I do agree on one thing with Harry -- testee anxiety. However, the cause of that anxiety is not the brain "panicking," but let's not go there. The value of a DBT ABX test is that it has the necessary confound controls, yet it is a very sensitive test given *what we know about judgment and decision making, perception and cognitive psychology in general, as well as what we know about psychoacoustics and the properties of signals*. Do we know everything about all these things? Of course not, but when somebody makes theoretical progress in the field of perception of auditory signals, he will be using ABX DBT (or something similar), and nothing like Harry's test. Also, when trying to build a plausible theory of perception (and cognition), it is really desirable not to start with behaviorists and Jung. Really. This is a non-starter in 2004 and speaks volumes about such an attempt. No disrespect.

It also explains why the DBTs which are positive deal with one dimension (loudness, or frequency response differing by more than 2 dB) and all the rest are null. The test, in the way it is applied, seems to interfere with identifying the differences you are testing for. You can either continue denying it, or admit there might be a problem and help find a solution with your claimed familiarity with "all of the published research into human hearing perception."

You have this the other way around, too. The theory and empirical evidence (both of a multidisciplinary nature) totally predict "why DBTs which are positive deal with one dimension (loudness or frequency response of more than 2dB difference) and all the rest are null." If you think there is something wrong with this picture, the burden of proof is on you and on the other subjectivists. However, you will have to do a much better job than Harry.
#97
Yet another DBT post
Mkuller wrote:
(Nousaine) wrote: Let's see; exactly "what" do we hear in nature? We hear timbre (frequency) and loudness (volume) over time. What else is there? With recordings we manipulate pitch and level over time to fool our natural hearing. But those three factors are all there is to "hear."

Mkuller wrote: That's like saying red, blue and yellow are the *only* colors we see, because that's all there is. You are focusing on a single dimension - loudness or frequency response - and ignoring everything that is more complex - multi-dimensional, pattern-recognition - because single-dimensional phenomena appear to be all your test can show in the way it's used. What about more complex audible differences? How about dynamic range?

chung wrote: That's just the difference between the loud passages and the noise floor of the system. DBT's are excellent for discriminating background noise, and peak loudness before clipping.

No, what you are describing is the *potential* dynamic range. The actual dynamic range is the difference between the softest passage and the loudest. Audio equipment varies in its ability to reproduce a wide dynamic contrast between these two. Have you ever seen *this* differentiated in a DBT?

DBT can clearly differentiate *actual* noise floors and clipping levels. You do agree it is easy to tell dynamic range differences in DBT's, right? You don't see this differentiated in DBT's of cables because competent cables all behave the same way. Cables do not add to or subtract from dynamic range. I am sure there are amps that can be differentiated in DBT's if the actual dynamic range is different. Compare a 20W amp vs. a 200W amp: you can tell them apart in a DBT.

How about imaging and soundstage reproduction?

Imaging is one of the easiest things to DBT. When playing A, you should be able to place the sound source (a vocal, or a particular instrument), using a familiar recording. Switch to B, and you should sense either the same or a different location for those sources. Just compare X to A and B and pick the one that's closest. Frankly, saying that DBT's cannot discriminate imaging is an excuse.

Have you ever seen a DBT differentiate the imaging capabilities of two audio components?

No, because speakers (and vinyl gear) determine imaging, not electronics. If you have evidence otherwise, let's hear it. But you do agree that it is easy to tell imaging in DBT's, right?

What about the *quality* of high-frequency or bass reproduction, rather than the *quantity*?

What about it? Those are simple frequency response effects. It should be really easy to pinpoint tonal differences of instruments in a DBT/ABX test. Remember all the anecdotes about how something immediately stands out when a component is changed?

No, they are not *simple frequency response effects*; that would be "quantity," not "quality."

It's the relative quantity that determines the perceived quality. Really, you're making it much harder than it is. If the tonality or timbre is different, why can't you tell them apart blind?

The same goes for *transparency* and *resolution of inner detail*. These are multi-dimensional effects that I have never seen differentiated in a DBT. Have you?

Transparency has to do with dynamic range, distortion and frequency response. So does "inner detail." Why wouldn't you detect differences in these properties in DBT's? (Hint: because all competent cables sound alike.)

How about all of those things above together at one time?

All you need is to match X to A or B on any one of these effects, if you noticed previous differences in sighted listening. How hard is it? Of course, it would be hard if your hard-held beliefs are on the line...

Have you ever seen it done? Who has hard-held beliefs here? What about other things besides loudness, tone and frequency response? We have phase, timing, transient attack, note decay, etc., all multi-dimensional attributes which have never been demonstrated to differentiate audio components in a DBT, but which nonetheless are real audible effects.

So why can't they be differentiated in DBT's, if they can be determined sighted? Phase? Can you determine phase in sighted testing? Timing and transient attack are tied to frequency response. Note decay is tied to frequency response and dynamic range. I would think those could easily be pinpointed if differences exist. In the DBT's of cables as such, the result is negative because cables do not affect these so-called properties. Signal-to-noise ratio, frequency response and distortion can be thought of as multi-dimensional; as you said, they can be readily differentiated in DBT's.
Regards,
Mike
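Since the thread keeps arguing over what an ABX trial actually asks of the listener, here is a minimal sketch of the bookkeeping (hypothetical Python, not any particular hardware comparator): X is secretly assigned to A or B on each trial, the listener's calls are tallied, and the run is scored against what guessing alone would produce.

import random
from scipy.stats import binomtest

def simulate_abx(p_correct, n_trials=16, seed=0):
    """Simulate one ABX run for a listener who calls X correctly with
    probability p_correct (0.5 = pure guessing, 1.0 = always hears it)."""
    rng = random.Random(seed)
    correct = sum(rng.random() < p_correct for _ in range(n_trials))
    # One-sided question: is the score better than chance could explain?
    p_value = binomtest(correct, n_trials, p=0.5, alternative='greater').pvalue
    return correct, p_value

print(simulate_abx(0.5))  # indistinguishable items: around 8/16, large p
print(simulate_abx(0.9))  # clearly audible difference: around 14/16, small p

The sketch only shows that "same"/"different" scoring reduces to a counting problem; whether short snippets or long evaluative sessions produce better counts is exactly what the two camps are disputing.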
#98
Yet another DBT post
"josko" wrote in message
... "Mkuller" wrote in message news:iEwUb.229879$xy6.1167040@attbi_s02... "Harry Lavo" wrote : For open ended evaluation, you don't know initially what you are looking for. It make days for things to gel that "a" sounds somewhat thisway, and "b" sounds somewhat more thatway. From extended, evaluative listening and non-quick switching. Then a tentative conclusion is drawn. Now you know what you are listening "for". It may be something subtle and perceptual, such as "imaging". Once you have it firmly grasped in mind what the signature is of "a" and how it might vary from "b", quick switching can help precisely because it "interupts" the perception you have grasped and altered it slightly (or not) over the flow of music. We are talking about open-ended component evaluation. If I simply give you two components, say "different" or "same", or "is it a" or "is it b" and force a choice quick switching works against you because you haven't yet really been able to determine what it is you are listening for in audio terms. "Same" or "different" are not audio terms. They are "sound artifact" terms on simple one or two dimensions. Under quick switching under these circumstances, the brain seems to "panic" in that it can't sort audio patterns quickly and has no frame of reference; this by itself creates anxiety, which in turn creates even more confusion and panic. I believe this is why audiophiles cite stress and fatigue in trying to do this kind of testing when dealing with very subtle, perceptual factors and why the test favors a "null conclusion" unless we are dealing with straightforward factors that the sensate function can handle without much need for the intuitive or emotional functions (volume, frequency response). "Bob Marcus" wrote: All of the preceding is pure, baseless conjecture. Because it doesn't seem to agree with your ideas. No. It's because it doesn't agree with a lot of things that we know about perception, cognition, consumer behavior, and I guess with a lot of things that we know about psychoacoustics, not to mention electrical engineering. And the differences are not only theoretical, but also methodological. Let's suppose that Harry's "theory" makes sense (which it doesn't). The test that he came up with to test his "theory" is, there is no other word for this, H-O-R-R-I-B-L-E from a scientific point of view. Basically, it's full of confounds whose origins are in tons of biases (perceptual, judgmental, processing, decision-related) that he has no controls for. How do I now that? Well, you can take my word for it, or you can go to a local library and start reading. Really, there is no other way. Well, I'd be interested in where you think the lack of controls are. Since it is a *highly* controlled sequence of tests, with statistical rigor built in. Let's see if you really understand what I proposed, or if perhaps you have had a H-O-R-R-B-L-E misundertanding of same. :-) "Harry Lavo" wrote: Do I know this for sure? No. But it is reasonable and verifiable. "Bob Marcus" wrote: Reasonable, it seems to me, only to those who either are unfamiliar with or choose to ignore all of the published research into human hearing perception, which offers no evidence for any of it. That is very difficult to believe. The evidence is there - you have to chose *not* to ignore it because it might not fit in with your *beliefs* about blind testing and audible differences. It's pretty clear the objectivists do not have all of the answers about DBTs or this debate would not be continuing. 
#99
Yet another DBT post
Dear Debaters,
Is it really necessary to quote entire multi-screen posts in order to embed or add one line of riposte? I think not. And I apologize, in advance and in retrospect, for doing this myself or ever having done it.
Yrs.,
--
-S.
"They've got God on their side. All we've got is science and reason." -- Dawn Hulsey, Talent Director
#100
Yet another DBT post
"Harry Lavo" wrote in message
news:yPzUb.100288$U%5.493205@attbi_s03... "josko" wrote in message

snip

Well, I'd be interested in where you think the lack of controls is, since this is a *highly* controlled sequence of tests with statistical rigor built in. Let's see if you really understand what I proposed, or whether perhaps you have had a H-O-R-R-I-B-L-E misunderstanding of same. :-)

Maybe I did misunderstand you. I know that you posted the details of your test more than once on this forum. So, before I look at them again, could you please tell me the name of the thread, or the actual post, where I'll find the best/cleanest version of your proposal? Sorry for the tone of my post above; it had nothing to do with your idea, but with something else.
#101
Yet another DBT post
"Harry Lavo" wrote:
We are talking about open-ended component evaluation. If I simply give you two components, say "different" or "same", or "is it a" or "is it b", and force a choice, quick switching works against you because you haven't yet really been able to determine what it is you are listening for in audio terms. "Same" or "different" are not audio terms. They are "sound artifact" terms on simple one or two dimensions.

snip

As I said last week, it appears Harry's post was the last of the real debate/discussion.

"josko" wrote: It's because it doesn't agree with a lot of things that we know about perception, cognition, consumer behavior, and I guess with a lot of things that we know about psychoacoustics, not to mention electrical engineering. And the differences are not only theoretical, but also methodological.

snip

How do I know that? Well, you can take my word for it, or you can go to a local library and start reading. Really, there is no other way.

snip

"Bob Marcus" wrote: Reasonable, it seems to me, only to those who either are unfamiliar with or choose to ignore all of the published research into human hearing perception, which offers no evidence for any of it.

josko: Basically, what subjectivists are saying is: since 50 (100?) years of multidisciplinary, peer-reviewed research doesn't agree with our unreliable, biased perceptions (and perceptions are *by definition* unreliable and biased), then all those researchers are wrong and we are right. Well, it doesn't work that way.

When the second objectivist invokes the condescending "what you are saying doesn't agree with 100s of years of research on psychometrics," and they stop answering direct questions, they're really saying the "debate" is over and they have nothing new to add.

snip

judgment = evaluation!!!!! You think that you can evaluate things without judging them? This is getting really interesting.

One final point - until you apply a value scale, it is only "evaluating," not "judging." Everyone knows that, come on...
Regards,
Mike
#105
Yet another DBT post
"Nousaine" wrote in message
news:BqGUb.189768$nt4.805097@attbi_s51...

"Harry Lavo" wrote: "Bob Marcus" wrote in message ...snips.....

"Same" or "different" are not audio terms. They are "sound artifact" terms on simple one or two dimensions.

Same and Different are clearly un-misunderstandable terms, like A and B and X. They have no "sound artifact" dimensions. They have no "dimension"; two items can't sound the same tonally but different spatially and still be "same." They are either "same" in all respects or "different" in one, more, or all respects.

Yes, but they have nothing to do with open-ended evaluation or critical listening to music. They can be done just fine with noise artifacts, but that says nothing about the dynamic quality of the equipment when reproducing music, or elusive factors such as transparency.

Under quick switching under these circumstances, the brain seems to "panic" in that it can't sort audio patterns quickly and has no frame of reference; this by itself creates anxiety, which in turn creates even more confusion and panic.

"Panic?" "I can't tell them apart" causes panic? It's a pretty simple decision. There could ONLY be stress if you had decided beforehand that they sounded different, and now, when asked to make a choice based on sound alone, you can't make up your mind. Furthermore, in the typical blind test, scores are only known after the experiment is concluded. The only possible stress mechanism is "doubt," when you can't "hear" all those imaginary differences that were so apparent until you were required to decide based on sound and sound alone. Rational humans will simply report what they "hear" and not what they "want to hear." "Panic?" Please.

This response indicates you really don't understand what I wrote about the need for a "musical reference" to be established first, before a comparison can legitimately be made.

I believe this is why audiophiles cite stress and fatigue in trying to do this kind of testing when dealing with very subtle, perceptual factors, and why the test favors a "null conclusion"

A bias-controlled test "favors" no result. That's why it's called "blind." If bias controls cause "stress," it can only be due to bias-masking.

Until a control test is done, this is an 'assertion' as far as open-ended evaluation of musical reproduction is concerned.

unless we are dealing with straightforward factors that the sensate function can handle without much need for the intuitive or emotional functions (volume, frequency response).

These items are related to acoustical sound. They are also the only things we "physically" hear. Exactly how does "emotion" play into it, unless you get emotionally stressed when you can't "hear" non-acoustical differences when required to limit yourself to acoustical causes?

But music isn't just about what we physically "hear". It is about how the brain interprets what we physically "hear". As has been pointed out here many times.

All of the preceding is pure, baseless conjecture.

Do I know this for sure? No. But it is reasonable and verifiable. Your serve.

Reasonable, it seems to me, only to those who either are unfamiliar with or choose to ignore all of the published research into human hearing perception, which offers no evidence for any of it.

That is why I proposed a control test that is double-blind, relaxed, evaluative, and leisurely.

I presume by this you mean something like the test protocol used by Oohashi to "confirm" the existence of his alleged "hypersonic effect," except that you would allow longer listening times than he used.

Yep. Along with testing of the same respondents using sighted, evaluative listening and, at another time, relatively short, terse, comparative ("same"/"different") double-blind testing as is traditionally recommended here. If the control test gave results similar to traditional dbt/abx, it would verify that traditional dbt/abx testing was a valid "shortcut" for evaluative testing. If the control test gave results similar to sighted, open-ended evaluative testing, then it would suggest that evaluative testing, even though sighted, was a more encompassing and valid approach for component evaluation.

I don't think this test, as you envision it, is really possible to perform in the real world. You would need to run the same "evaluative" test multiple times on the same subject (at least twice - once sighted and once blind - even if you used multiple subjects). The problem is that his answers on the first trial would influence him in any subsequent trials (i.e., he'll be looking for the same set of characteristics he has already "heard" once). And if his answers aren't independent, any comparison of those answers would be meaningless. So while I understand what you're trying to get at, and what you hope to prove, you can't get there from here.

No problem if the sighted, open-ended test is done first... let's say with 16 trials if we are talking about one person. I'd actually prefer 100 people doing it once. Then three months later the test is done blind, again with 16 trials or preferably 100 people. Nobody would know or remember the exact scores they gave on the evaluative criteria, and in the second, blind test they wouldn't know which unit was which, so it would hardly be relevant if they did.

I'd say that every enthusiast would have an internal flag on what they scored on an open test.

How would an "internal flag" (whatever that is) disrupt a blind test where the testee has no knowledge of which unit is which? Unless he can identify the two units by this "flag". If he can, then that is proof that the units are different and can be heard as different in a blind test, isn't it?

So if you really want to stop the "jaw flapping" and try to resolve the differences of the two camps, first you have to acknowledge the possibility that we might have a point, and that it is worth trying to resolve somehow.

Well, no, I don't have to acknowledge any such thing. There is no possibility that you have a point, because there is absolutely no support for any of your conjectures anywhere in the voluminous research on human hearing perception. If you can find such support, I will then be prepared to concede that you may have a point. Also, if you can provide any direct evidence for any of your conjectures, I will concede that you may have a point. But I'm not holding my breath, on either score.

Somebody always has to be first! :-) A fair amount of science was first postulated by "crazies". Maybe crazy; or maybe, however remote the possibility, right.

All Audio Myths are propagated by crazies. Legitimate science is no longer propagated by nuts. The method is to conduct replicable experiments that can be verified.

Let's see. Some scientific breakthroughs are postulated by "crazies". All audio myths are propagated by "crazies". Therefore, all "crazy" theories are audio myths? Examine your boolean logic again, Tom.

None of the current Audio Urban Legends has ever been replicated, even those of 30 years' maturity.

Funny, things like isolation feet and tube damping rings are in widespread use today. Mass delusion that they are effective, I suppose. But you don't have to think I am right. Enough people have raised similar issues on this forum over the years... enough that anybody really seeking the truth should at least design a control test to knock down the objections.

Why not verify your hypotheses first?

It is you who are asserting your test works for open-ended component evaluation testing. It is up to you to provide the evidence via a control test before you can expect everybody else to "buy it". Why do you suppose Oohashi chose that particular form of testing, and why do you suppose he found statistical correlation where conventional theory suggested it shouldn't exist?

I'd say that either you're misinterpreting the response, or he was choosing a bias-introducing mechanism, or he was looking for a result. But why hasn't everybody else jumped on the bandwagon IF this is seriously important?

Who says they aren't, or won't in the future? Trends do not happen overnight. Does not a better listening-test technique suggest itself as a possibility?

Even IF this method were superior, why hasn't the subjectivist community gotten on the bandwagon and delivered confirmation of amp/cable sound? Why haven't you?

I am talking about a difference in test technique. Oohashi himself makes reference to earlier tests that showed no difference using conventional techniques; that is why his group set the listening test up the way they did - they suspected that might be one of the factors getting in the way.

Which one?

As far as I can determine, short-snippet comparative testing short-circuiting the physiological pleasure response. And the results don't dispute the possibility that he was right.

OK; as I said before... you're in the batter's box.

I'll be posting over the weekend.
#106
The end of the DBT debate
"Nousaine" wrote in message
news:BSFUb.189677$nt4.804788@attbi_s51... "Harry Lavo" ...snips....

As a member of that "jazz, acoustic instrument" group, I have never found loudness per se important; I do find a lack of obvious distortion up to reasonably loud living-room levels helpful in preventing fatigue.

As a person who doesn't find traditional jazz a particularly interesting musical experience, I do attend jazz performances (particularly 'festivals' such as Elkhart, Indiana) because they are a great way to see and hear acoustic instruments in a small-room (even club) environment. I have reference on-location recordings where ORTF microphones were placed within 2 feet of my head position. And if there is one single factor that seems most important in providing the best sense of realism, it is the playback system's capability to deliver a loudness level similar to that experienced on location. For drums and trumpets that is often very, very loud, especially at seats where you get a good "look" at the performance - much louder than one might think. Next is timbral naturalness. Finally, spatiality. Of course, to a great degree these are intertwined. But IME the largest limitation on 'realism' is the dynamic limitations of the playback system.

Imaging and dimensionality are much more important to me, even though I now have a surround system. That is why I always bow out of recommendations to rock lovers and organ fans... I simply do not have the same frame of reference.

I'm a big spatial-rendition fan, but without adequate dynamic capability, realism can be compromised. For example, I often hear systems that have plausible horizontal 'placement' of acoustic images, but dynamic limitations can change the size (and often the depth, front-to-back) of instruments and a realistic sense of envelopment.

I think you will find that this is one of the main advantages of full-range surround systems. The ability to swell and envelop dynamically, especially in the bass, is enhanced by the multiple amps (electrical) and speakers (acoustical, especially in the bass). In addition, the ambience itself is part of the phenomenon, and surround (if recorded that way) provides that as well. Don't get me wrong, I listen (sometimes) at levels approaching club levels, but it just doesn't take that much power in a full-range surround system in a 20' x 12' x 7.5' room.
#108
The End of the DBT Debate?
"Steven Sullivan" wrote in message
... Harry Lavo wrote: "Buster Mudd" wrote in message ... "Harry Lavo" wrote:

For open-ended evaluation, you don't know initially what you are looking for. It may take days for things to gel such that "a" sounds somewhat this way, and "b" sounds somewhat more that way - from extended, evaluative listening and non-quick switching. Then a tentative conclusion is drawn. Now you know what you are listening "for". It may be something subtle and perceptual, such as "imaging". Once you have firmly grasped what the signature of "a" is and how it might vary from "b", quick switching can help, precisely because it "interrupts" the perception you have grasped and alters it slightly (or not) over the flow of music.

Would you suppose that after having spent the requisite days/weeks/months of extended, evaluative listening and non-quick switching, after having drawn tentative conclusions, and after having firmly grasped in mind what the signature of "a" is and how it might vary from "b"... that *THEN* you could pass a conventional ABX double-blind test between "a" and "b"?

No need; I'd already have the answer without ever having to make a conscious choice... it would have grown organically out of the listening.

You have 'an' answer... but you also have an inescapable question mark, from the POV of established perceptual research practice, unless you verify under blind conditions. Since you've already agreed that sighted evaluation is inherently flawed in a way that a blind comparison can resolve, and you appear to be a dedicated audiophile, I can't see why the 'answer' from extended evaluative *sighted* listening would satisfy you.

However, if I did want to do a blind confirmation, I would do it in an evaluative fashion, using the same music I had been listening to, and identifying/rating the components on a scale designed to get at the factors I had grown to identify as distinguishing. I would do a one- to two-hour sighted "warm-up" before going blind for each trial. And I would do fifteen or twenty of those trials over a pretty long period of time. And then apply statistical analysis. It would never be a conventional a-b or a-b-x comparative test.

But it *would* be a blind A-B or ABX test.

However, if I did want to do a comparative blind test, I would want it to be an a-b, not an abx. And I would want it to follow hard on the heels of several hours of warm-up listening, in which I had firmly re-established those signatures in mind before "going blind". And I would want to use the same music I had just been listening to, and to control the switching. And I would want to do it alone, with no chance of cheating built into the test.

Whatever, Harry. The key question is: would you believe the results if they contradicted your sighted perceptions?

Why is this posted again? I've already answered it. I would, if the "blind testing" had been validated by appropriate control tests. If such validation were not done, then I wouldn't even take the conventional blind comparative test, as I would be buying a pig in a poke.
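For what it's worth, the "statistical analysis" step in a run of fifteen to twenty blind trials has a concrete threshold. A small sketch (my arithmetic, not anything Harry specified) of the minimum number of consistent outcomes needed before guessing becomes an implausible explanation:

from math import comb

def min_significant(n_trials, alpha=0.05):
    """Smallest score whose one-sided exact binomial tail (p = 0.5) is < alpha."""
    for k in range(n_trials + 1):
        # Probability of k or more consistent outcomes by pure guessing
        tail = sum(comb(n_trials, i) for i in range(k, n_trials + 1)) / 2**n_trials
        if tail < alpha:
            return k, tail
    return None

print(min_significant(16))  # (12, ~0.038): 12 of 16 trials must agree
print(min_significant(20))  # (15, ~0.021)

So a 16-trial run needs 12 consistent calls, and the 100-listener variant mentioned above needs only about 59 of 100 pointing the same way; the group version trades per-person repetition for a much gentler per-listener burden.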
#109
Yet another DBT post
Imaging is one of the easiest things to DBT. When playing A, you should be able to place the sound source (a vocal, or a particular instrument), using a familiar recording. Switch to B, and you should sense either the same or a different location for those sources. Just compare X to A and B and pick the one that's closest. Frankly, saying that DBT's cannot discriminate imaging is an excuse.

Have you ever seen a DBT differentiate the imaging capabilities of two audio components?

I agree with Mr. Chung on this one. One may argue that using music when listening for audible differences can be problematic in time-synced ABX DBTs, because the signal may be changing more than the potential difference imposed upon it by the components being tested, but... even with the ever-changing sound of the music, if the imaging changes it should be easy to identify in a time-synced, quick-switching ABX DBT. The music may change in time, but the imaging should not.
#110
Yet another DBT post
chung wrote:
Frankly, saying that DBT's cannot discriminate imaging is an excuse.

mkuller wrote: Have you ever seen a DBT differentiate the imaging capabilities of two audio components?

(S888Wheel) wrote: I agree with Mr. Chung on this one. One may argue that using music when listening for audible differences can be problematic in time-synced ABX DBTs, because the signal may be changing more than the potential difference imposed upon it by the components being tested, but... even with the ever-changing sound of the music, if the imaging changes it should be easy to identify in a time-synced, quick-switching ABX DBT. The music may change in time, but the imaging should not.

Both of you missed the point. Yes, we all agree that imaging, dynamic contrasts, etc. are real audible phenomena (multi-dimensional). The question is not whether these things *should* be audible as differences in a DBT with audio components, but whether they *ever actually have been shown* as differences in a DBT. I'm not aware of any DBTs that have shown these as audible differences - my question is why not? Is it a problem with the sensitivity of the DBT, or with the types of *single-dimensional* differences (loudness and gross frequency response only) that DBTs show? I believe the test is the problem.
Regards,
Mike
#111
Yet another DBT post
Mkuller wrote:
chung wrote: Frankly, saying that DBT's cannot discriminate imaging is an excuse.

snip

Both of you missed the point. Yes, we all agree that imaging, dynamic contrasts, etc. are real audible phenomena (multi-dimensional). The question is not whether these things *should* be audible as differences in a DBT with audio components, but whether they *ever actually have been shown* as differences in a DBT. I'm not aware of any DBTs that have shown these as audible differences - my question is why not? Is it a problem with the sensitivity of the DBT, or with the types of *single-dimensional* differences (loudness and gross frequency response only) that DBTs show? I believe the test is the problem.

Hmmm, maybe you missed my response in an earlier post: "No, because speakers (and vinyl gear) determine imaging, not electronics. If you have evidence otherwise, let's hear it. But you do agree that it is easy to tell imaging in DBT's, right?" The test is not the problem. The fact is that speakers dominate imaging, and cables as well as competent amps simply do not change imaging. That's why DBT's rarely show imaging differences: DBT's are very rarely done on speakers.
#112
Yet another DBT post
"chung" wrote in message
news:hLAVb.6230$032.22901@attbi_s53...

snip

The test is not the problem. The fact is that speakers dominate imaging, and cables as well as competent amps simply do not change imaging. That's why DBT's rarely show imaging differences: DBT's are very rarely done on speakers.

Chung, you keep repeating that only speakers create differences in imaging, as if it were somehow settled fact. It simply isn't. Let me relay a tale, anecdotal as it is. When I was living on Long Island, my listening room was a finished attic that was almost perfect acoustically. The roof sloped up and then back. The long wall opposite the listening sofa was "ledged," so it sat at two different depths, and it had bookcases to diffuse sound. The speakers stood in the middle of the room (almost), about 10' from the couch, with no wall within twenty feet on either side of them. There were no standing waves to speak of and no reflections or bounces; four 3' x 3' squares of acoustic foam above the couch, where the ceiling rose, got rid of the only reflection. It was a simply beautiful listening room.

When my beloved ARC D90B amp needed new caps, I decided to sell it instead (honestly, with full disclosure) and invest in a new power amp. I brought three in for a week at a time on loan. That was the only change made in the system - not so much as a 1/4" change in anything else.

1) Amp 1, transistor, made by one of the leading names in the industry. Flat, constricted sound, grainy on top, a bit murky in the upper bass. Imaging and sound totally unrealistic.

2) Amp 2, transistor, made by another leading name in the industry. Beautiful imaging, soundstages out beyond the speakers, instruments and voices with a dimensional 3D quality usually associated with tubes. Yet not quite right: the voices and instruments sounded as though they were wrapped in gold foil. They were 3D, but all you heard was the "surface" of the image; there didn't seem to be depth or weight.

3) Amp 3, tube. The same beautiful imaging, soundstage, and dimensionality. Only now the 3D images were fleshed out and palpable. You literally felt you could reach out and touch the performers.

Now, not a *thing* had changed except those amps. Nor was I the only one who could hear it. My female partner at the time walked into my listening sessions and accurately caught/described the character of the sound of each with no prompting from me. There is more to it than speakers, Chung.
#114
Yet another DBT post
Mkuller wrote:
chung wrote: Frankly, saying that DBT's cannot discriminate imaging is an excuse.

snip

Both of you missed the point. Yes, we all agree that imaging, dynamic contrasts, etc. are real audible phenomena (multi-dimensional). The question is not whether these things *should* be audible as differences in a DBT with audio components, but whether they *ever actually have been shown* as differences in a DBT.

I am sure that DBT's of speakers, done at places like H-K or Paradigm, to name a few manufacturers, routinely show differences in imaging.

I'm not aware of any DBTs that have shown these as audible differences - my question is why not?

Since your awareness is not sufficiently comprehensive, the question is a moot one.

Is it a problem with the sensitivity of the DBT, or with the types of *single-dimensional* differences (loudness and gross frequency response only) that DBTs show? I believe the test is the problem.
Regards,
Mike
#115
Yet another DBT post
Mkuller wrote:
Both of you missed the point. Yes, we all agree that imaging, dynamic contrasts, etc. are real audible phenomena (multi-dimensional). The question is not whether these things *should* be audible as differences in a DBT with audio components, but whether they *ever actually have been shown* as differences in a DBT. I'm not aware of any DBTs that have shown these as audible differences - my question is why not? Is it a problem with the sensitivity of the DBT, or with the types of *single-dimensional* differences (loudness and gross frequency response only) that DBTs show? I believe the test is the problem.

chung wrote: Hmmm, maybe you missed my response in an earlier post: "No, because speakers (and vinyl gear) determine imaging, not electronics. If you have evidence otherwise, let's hear it. But you do agree that it is easy to tell imaging in DBT's, right?" The test is not the problem. The fact is that speakers dominate imaging, and cables as well as competent amps simply do not change imaging. That's why DBT's rarely show imaging differences: DBT's are very rarely done on speakers.

Once again, your opinion stated as fact. I have clearly heard, many, many times, electronics (both tubed and solid state) affect imaging and soundstaging. Certainly it helps if you start with good loudspeakers capable of demonstrating these effects, and a room that doesn't interfere. Here's a link to pictures of my room: http://img.audioasylum.com/cgi/view....19321&session=

I'm talking about preamps, amplifiers, and CD players. You say they don't affect imaging and claim DBTs as your proof. That's a pretty circular argument, since DBTs have never been shown to be capable of differentiating imaging differences - of anything...
Regards,
Mike
#116
Yet another DBT post
Harry Lavo wrote:
1) Amp 1, transistor, made by one of the leading names in the industry. Flat, constricted sound, grainy on top, a bit murky in the upper bass. Imaging and sound totally unrealistic.

2) Amp 2, transistor, made by another leading name in the industry. Beautiful imaging, soundstages out beyond the speakers, instruments and voices with a dimensional 3D quality usually associated with tubes. Yet not quite right: the voices and instruments sounded as though they were wrapped in gold foil. They were 3D, but all you heard was the "surface" of the image; there didn't seem to be depth or weight.

3) Amp 3, tube. The same beautiful imaging, soundstage, and dimensionality. Only now the 3D images were fleshed out and palpable. You literally felt you could reach out and touch the performers.

Now, not a *thing* had changed except those amps. Nor was I the only one who could hear it. My female partner at the time walked into my listening sessions and accurately caught/described the character of the sound of each with no prompting from me. There is more to it than speakers, Chung.

Maybe you missed the part about "competent"? Examples of things that can change imaging: (a) mismatches in frequency response between the L and R channels, (b) mistracking of gain between the L and R channels as the volume control is adjusted, (c) mismatches in L/R output impedances that are significantly large (like those found in SET's), (d) mismatches of phono equalization between the L/R channels, (e) a balance control that is not centered, (f) loudness-contour mismatches between the L/R channels, and (g) any gross frequency response errors or insufficient S/N (found in SET's, e.g.). I can find no examples of imaging differences caused by cables or interconnects, unless they are grossly defective.

In any event, if those differences are so real and "palpable" in your sighted testing that your female partner at the time had no trouble identifying them, do you think you would have trouble differentiating them in a DBT? Would you still need extensive evaluative sessions before you could compare?
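Item (a) in that list is easy to quantify with the standard stereophonic "tangent law" panning approximation. A rough sketch (the 30-degree speaker half-angle and the panning law are my assumptions, not anything chung specified) of how far a small interchannel gain mismatch pulls a centered image:

from math import atan, tan, radians, degrees

def image_shift(mismatch_db, half_angle_deg=30.0):
    """Perceived azimuth of a centered source when the left channel is
    mismatch_db hotter than the right (tangent-law approximation)."""
    g_left = 10 ** (mismatch_db / 20)  # linear gain of the hotter channel
    g_right = 1.0
    ratio = (g_left - g_right) / (g_left + g_right)
    return degrees(atan(ratio * tan(radians(half_angle_deg))))

for db in (0.5, 1.0, 2.0):
    print(f"{db:.1f} dB mismatch -> image pulled ~{image_shift(db):.1f} deg off center")

On these assumptions, a 1 dB channel imbalance moves a centered vocal roughly two degrees, which is audible as a lateral shift but says nothing about the depth and palpability effects described above; that gap is much of what the two sides are arguing about.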
#117
Yet another DBT post
Mkuller wrote:
>>> Both of you missed the point. Yes, we all agree that imaging, dynamic
>>> contrasts, etc. are real audible phenomena (multi-dimensional). The
>>> question is not whether these things *should* be audible as
>>> differences in a DBT with audio components, but whether they *ever
>>> actually have been shown* as differences in a DBT. I'm not aware of
>>> any DBTs that have shown these as audible differences - my question is
>>> why not? Is it a problem with the sensitivity of the DBT, or with the
>>> *single-dimensional* types of differences (loudness and gross
>>> frequency response only) that DBTs show? I believe the test is the
>>> problem.
>
> chung wrote:
>> Hmmm, maybe you missed my response in an earlier post: "No, because
>> speakers (and vinyl gear) determine imaging, not electronics. If you
>> have evidence otherwise, let's hear it. But you do agree that it is
>> easy to tell imaging differences in DBTs, right?"
>>
>> The test is not the problem. The fact is that speakers dominate
>> imaging, and cables as well as competent amps simply do not change
>> imaging. That's why DBTs rarely show imaging differences: DBTs are
>> very rarely done on speakers.
>
> Once again, your opinion stated as a fact. I have clearly heard, many,
> many times, electronics (both tubed and solid state) affect imaging and
> soundstaging. Certainly it helps if you start with good loudspeakers
> capable of demonstrating these effects and a room that doesn't
> interfere. Here's a link to pictures of my room:
> http://img.audioasylum.com/cgi/view....19321&session=
>
> I'm talking about preamps, amplifiers, and CD players. You say they
> don't affect imaging and claim DBTs are your proof. That's a pretty
> circular argument, since DBTs have never been shown to be capable of
> differentiating imaging differences - of anything...

I didn't use DBTs as proof. I am saying that competent amps and cables do
not change imaging, which is determined by speakers and their placement.
If you don't agree, then list the amps/cables that have different
imaging, and we can see if we can set up a sighted test and repeat your
observations.

I am also saying that if there is an imaging difference between two
pieces of equipment (like two amps or two cables), it should be very easy
to identify them, blind or sighted. Why would a DBT hide imaging
differences?

You are the one who is providing a circular argument. You are saying that
there have to be imaging differences between two amps/cables, since you
can tell them apart in sighted testing. Then you make the *assumption*
that you cannot tell these differences in DBTs, and then use that
assumption as the reason why DBTs don't work.

> Regards,
> Mike
#118
Yet another DBT post
chung wrote:
snip

>> Now, not a *thing* had changed except those amps. Nor was I the only
>> one who could hear it. My female partner at the time walked into my
>> listening sessions and accurately caught/described the character of
>> the sound of each with no prompting from me. There is more to it than
>> speakers, Chung.

snip

> In any event, if those differences are so real and "palpable" in your
> sighted testing that your female partner at the time had no trouble
> identifying them, do you think you would have trouble differentiating
> them in a DBT? Would you still need extensive evaluative sessions
> before you can compare?

More to the point, as described (i.e., "My female partner at the time
walked into my listening sessions and accurately caught/described the
character of the sound of each with no prompting from me"), she was
*easily* able not only to distinguish a difference (that all-confounding
*decision* you keep attributing such nefarious capabilities to), but to
comprehensively characterize the differences between the amps
*essentially BLIND* - unless the "no prompting from me" is a complete
mischaracterization.

So, Mr. Lavo, by your own 'anecdote', "blindness" confers no disadvantage
in differentiating amps. Clearly, the process of merely codifying the
test protocol and applying statistical treatments to the data, post-test,
can't possibly be a hindrance in detecting *real* sonic differences. At
least for your 'female partner', in any event.

Keith Hughes
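["Statistical treatments of the data, post-test" sounds more exotic than
it is. In a typical ABX-style DBT each trial is a forced choice, and the
listener's score is compared against coin-flipping with a one-sided
binomial test. A minimal sketch in Python; the 16-trial, 12-correct
numbers are hypothetical, chosen only to show the arithmetic.

    from math import comb

    def abx_p_value(correct: int, trials: int) -> float:
        """Probability of getting at least `correct` of `trials` right
        by pure guessing (p = 0.5 per trial)."""
        return sum(comb(trials, k)
                   for k in range(correct, trials + 1)) / 2 ** trials

    # Hypothetical run: 12 correct identifications in 16 trials.
    print(f"p = {abx_p_value(12, 16):.3f}")  # ~0.038, significant at 5%

The same arithmetic shows why very short runs prove little: 4 correct out
of 5 gives p of roughly 0.19, well short of the usual 5% criterion.]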
#119
Yet another DBT post
Harry Lavo wrote:
"chung" wrote in message news:3xUVb.11981$032.40528@attbi_s53... Harry Lavo wrote: 1) Amp 1, transistor, made by one of the leading names in the industry. Flat constricted sound, grainy on top, a bit murky in upper bass. Imaging and sound totally unrealistic. 2) Amp 2, transistor, made by another leading name in industry. Beautiful imaging, soundstages out beyond speakers, instruments and voices had a dimensional 3D quality usually associated with tubes. Yet not quite right, as the voices and instruments sounded like they were wrapped in gold foil. The were 3D but all you heard was the "surface" of the image...there didn't seem to be depth or weight. 3) Amp 3, tube. Same beautiful imaging, soundstage, dimensionality. Only now the 3D images were fleshed out and palpable. You literally felt you could reach out and touch the performers. Now, not a *thing* had changed except those amps. Nor was I the only one who could hear it. My female partner at the time walked into my listening sessions and accurately caught/described the character of the sound of each with no prompting from me. There is more to it than speakers, Chung. Maybe you missed the part about competent? Examples of things that can change imaging: (a) mismatches in frequency response between L and R channels, (b) mistracking of gain between L and R channels as volume control is adjusted, (c) mismatches in L/R output impedances that are significantly large (like those found in SET's), (d) mismatches of phono-equalization between L/R channels, (e) balance control not centered, (f) loudness contour mismatches between L/R channels, and (g) any gross frequency response errors or insufficient S/N (found in SET's, e.g.). These were power amps, Chung Which of the above apply. All of them, except (d), (e) and (f) apply to power amps, if they are not properly designed. Plus, they were from large and respected companies, not boutiques. Incompetence is not limited to boutiques. So what do you think caused the imaging differences? Were you listening at equal loudness levels? There is no examples that I can found of imaging differences caused by cables or interconnects, unless they are grossly defective. I do not recall that Mike was talking about wires in this thread. A lot of the DBT's he took as producing negative results were on cables. You think he believes that cables do not cause imaging differences? In any event, if those differences are so real and "palpable" in your sighted testing that your female partner at the time had no trouble identifying them, you think you have trouble differentiating them in a DBT? Would you still need extensive evaluative sessions before you can compare? If a DBT could prove to reveal these types of differences (which a control test is needed for), then I'm sure one would reveal these differences. By proof you mean agreement with sighted testing? So you are saying if both sighted and blind tests show imaging differences, then you think DBT can show imaging differences? Hmmm, you cover all the bases here . Mike thinks a DBT would not; I honestly don't know. He may be right. What!!! Your female partner (who was implied to be not an audiophile) could hear the palpable differences immediately without any prompt from you, and you think you may not be able to identity them, blind? This is an absolutely amazing admission from a subjectivist. You are saying that once we don't tell you what you are listening to, your hearing ability as an audiophile is much worse than the average person! |
#120
Yet another DBT post
"Keith Hughes" wrote in message
...
> chung wrote:
>
> snip
>
>>> Now, not a *thing* had changed except those amps. Nor was I the only
>>> one who could hear it. My female partner at the time walked into my
>>> listening sessions and accurately caught/described the character of
>>> the sound of each with no prompting from me. There is more to it than
>>> speakers, Chung.
>
> snip
>
>> In any event, if those differences are so real and "palpable" in your
>> sighted testing that your female partner at the time had no trouble
>> identifying them, do you think you would have trouble differentiating
>> them in a DBT? Would you still need extensive evaluative sessions
>> before you can compare?
>
> More to the point, as described (i.e., "My female partner at the time
> walked into my listening sessions and accurately caught/described the
> character of the sound of each with no prompting from me"), she was
> *easily* able not only to distinguish a difference (that
> all-confounding *decision* you keep attributing such nefarious
> capabilities to), but to comprehensively characterize the differences
> between the amps *essentially BLIND* - unless the "no prompting from
> me" is a complete mischaracterization.
>
> So, Mr. Lavo, by your own 'anecdote', "blindness" confers no
> disadvantage in differentiating amps. Clearly, the process of merely
> codifying the test protocol and applying statistical treatments to the
> data, post-test, can't possibly be a hindrance in detecting *real*
> sonic differences. At least for your 'female partner', in any event.
>
> Keith Hughes

She knew I had swapped in another amp...but we hadn't discussed "how it
sounded" since she had just walked in and heard it for the first time. So
it wasn't blind in the sense discussed here. Nor was it comparative; it
was evaluative. She described how it sounded to her. Sorry, no points,
Keith. :-)