Oliver Costich

On Thu, 17 Jan 2008 17:44:27 -0600, MiNe 109
wrote:

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 12:56:23 -0500, Walt
wrote:

wrote:
On Jan 16, 10:52 am, John Atkinson wrote:
http://online.wsj.com/article/SB1200...?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed people at the
show -- like John Atkinson and Michael Fremer of Stereophile
Magazine -- easily picked the expensive cable."

So will you be receiving your $1 million from Randi anytime soon?

Don't count on it. From TFA: "But of the 39 people who took this test,
61% said they preferred the expensive cable." Hmmme. 39 trials. 50-50
chance. How statistically significant is 61%? You do the math.
(HINT: it ain't.)

Here's the math: Claim is p (proportion of correct answers) > .5. Null
hypothesis is p=.5. The null hypothesis cannot be rejected (and the
claim cannot be supported) at the 95% significance level.

Welcome to the group! Out of curiosity, what significance level does 61%
support?

Stephen

First you have to find out where the 61% came from. In this case, I
presume it is 24 out of 39. From the sample data and the claim about
the population proportion, you can compute a number called the
P-Value, not to be confused with the population probability in the
claim, usually denoted "p". To be able to support a claim that more
than half of the population can do better than guessing, you need the
P-Value for p=.5, which in this case is .07477. To support the claim
that p > .5 at the 95% confidence level, you need the P-Value to be less
than (1-significance level). So for 95%, you need a P-Value of less
than .05, for 93% you need a P-Value less than .07. Looks like 24 out
of 39 supports the claim at the 92% level.
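
For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes the .07477 figure comes from a one-sided normal approximation to the binomial with no continuity correction; that is an inference from the number itself, not something stated in the post. The function name is made up for illustration.

```python
import math

def one_sided_p_value(successes, trials, p0=0.5):
    """One-sided P-value for H0: p = p0 against H1: p > p0.

    Assumption: normal approximation, no continuity correction.
    """
    p_hat = successes / trials
    std_err = math.sqrt(p0 * (1 - p0) / trials)
    z = (p_hat - p0) / std_err
    # Upper-tail area of the standard normal, 1 - Phi(z)
    return 0.5 * math.erfc(z / math.sqrt(2))

p_value = one_sided_p_value(24, 39)
print(f"P-value for 24 of 39: {p_value:.4f}")
print("Below .05?", p_value < 0.05)
```

Under that assumption the result lands at roughly .0748, which is above the .05 cutoff for the 95% level.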

However, that's not how you do statistics. You don't compute the
P-Value and then fish around for a significance level that supports
your claim (or rejects it, depending on what side of the argument you are
on).

And of course this doesn't even address the single-blind nature of the
test. See http://en.wikipedia.org/wiki/Clever_Hans

The data from badly designed experiments is useless for analysis. I
would have thought that was obvious.

John Atkinson[_2_]

On Jan 18, 8:23 am, "Arny Krueger" wrote:
"John Atkinson" wrote in

Remind me again how many times Arny Krueger has been
quoted in the Wall Street Journal?

This is not logical discussion or even just rhetoric, this is abuse.

Er, no. It is a straightforward question, Mr. Krueger. How many times
have you been quoted in the WSJ?

At least he has stopped claiming that his neglected, rarely updated,
almost-never-promoted websites get as much traffic as Stereophile's...

No argument from Mr. Krueger about this, at least. :-)

or that his recordings are as commercially available as my own. :-)

Nor this, though I do note that he continues to argue with professional
recording engineer Iain Churches that his own work is somehow
comparable. BTW, Mr. Krueger, my most-recent choral recording --
see http://www.stereophile.com/news/121007cantus/ -- was No.9
in NPR's Top Next-Generation Classical CDs of 2007. Even if I
am unaware of truncated reverb tails, as you mistakenly claim in
another thread. How are your own choral recordings doing?

John Atkinson
Editor, Stereophile
"Well Informed" - The Wall Street Journal

Oliver Costich

On Fri, 18 Jan 2008 08:21:43 -0500, "Arny Krueger"
wrote:

"MiNe 109" wrote in message

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 12:56:23 -0500, Walt
wrote:

wrote:
On Jan 16, 10:52 am, John Atkinson
wrote:
http://online.wsj.com/article/SB1200...?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed
people at the show -- like John Atkinson and Michael
Fremer of Stereophile Magazine -- easily picked the
expensive cable."

So will you be receiving your $1 million from Randi
anytime soon?

Don't count on it. From TFA: "But of the 39 people who
took this test, 61% said they preferred the expensive
cable." Hmmme. 39 trials. 50-50 chance. How
statistically significant is 61%? You do the math.
(HINT: it ain't.)

Here's the math: Claim is p (proportion of correct
answers) > .5. Null hypothesis is p=.5. The null
hypothesis cannot be rejected (and the claim cannot be
supported) at the 95% significance level.

Welcome to the group! Out of curiosity, what significance
level does 61% support?

You haven't formed the question properly. 61% is statistically significant or
not, depending on the total number of trials.

Yes. 61% of 39 is not, but 61% of 50 is.

Oliver Costich

On Fri, 18 Jan 2008 07:43:13 -0600, MiNe 109
wrote:

In article ,
"Arny Krueger" wrote:

"MiNe 109" wrote in message

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 12:56:23 -0500, Walt
wrote:

wrote:
On Jan 16, 10:52 am, John Atkinson
wrote:
http://online.wsj.com/article/SB1200...?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed
people at the show -- like John Atkinson and Michael
Fremer of Stereophile Magazine -- easily picked the
expensive cable."

So will you be receiving your $1 million from Randi
anytime soon?

Don't count on it. From TFA: "But of the 39 people who
took this test, 61% said they preferred the expensive
cable." Hmmme. 39 trials. 50-50 chance. How
statistically significant is 61%? You do the math.
(HINT: it ain't.)

Here's the math: Claim is p (proportion of correct
answers) > .5. Null hypothesis is p=.5. The null
hypothesis cannot be rejected (and the claim cannot be
supported) at the 95% significance level.

Welcome to the group! Out of curiosity, what significance
level does 61% support?

You haven't formed the question properly. 61% is statistically significant or
not, depending on the total number of trials.

Okay, in 39 trials, what level of significance does 61% indicate?

Stephen

About 92%

Oliver Costich

On Fri, 18 Jan 2008 08:45:24 -0600, MiNe 109
wrote:

In article ,
"Arny Krueger" wrote:

"MiNe 109" wrote in message

In article ,
"Arny Krueger" wrote:

"MiNe 109" wrote in message

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 12:56:23 -0500, Walt
wrote:

wrote:
On Jan 16, 10:52 am, John Atkinson
wrote:
http://online.wsj.com/article/SB1200...?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed
people at the show -- like John Atkinson and Michael
Fremer of Stereophile Magazine -- easily picked the
expensive cable."

So will you be receiving your $1 million from Randi
anytime soon?

Don't count on it. From TFA: "But of the 39 people
who took this test, 61% said they preferred the
expensive cable." Hmmme. 39 trials. 50-50 chance.
How statistically significant is 61%? You do the
math. (HINT: it ain't.)

Here's the math: Claim is p (proportion of correct
answers) > .5. Null hypothesis is p=.5. The null
hypothesis cannot be rejected (and the claim cannot be
supported) at the 95% significance level.

Welcome to the group! Out of curiosity, what
significance level does 61% support?

You haven't formed the question properly. 61% is
statistically significant or not, depending on the total
number of trials.

Okay, in 39 trials, what level of significance does 61%
indicate?

In this case nothing, because the basic experiment seems to be so flawed.

In a perfectly designed test with 39 trials, what level of significance
does 61% indicate?

Stephen

Still about 92% and Generalissimo Franco is still dead.

Oliver Costich

On Thu, 17 Jan 2008 15:57:58 -0800 (PST), "Shhhh! I'm Listening to
Reason!" wrote:

On Jan 17, 5:25 pm, Oliver Costich wrote:

Don't count on it. From TFA: "But of the 39 people who took this test,
61% said they preferred the expensive cable." Hmmme. 39 trials. 50-50
chance. How statistically significant is 61%? You do the math.

Why is this important to you, so much so that you have blasted so many
posts in this thread?

Is this really, really important?

Only if you want to understand what the test tells you, and lots of people got
it wrong.

(HINT: it ain't.)

To you, and I don't care much either. I do care about such bad logic.

OK then. ;-)

Clyde Slick

On 18 Jan, 18:19, John Atkinson wrote:
On Jan 18, 8:23 am, "Arny Krueger" wrote:

How are your own choral recordings doing?

Singing the blues.

Bill Riel

In article 191df265-d5b6-4ce5-a4ef-
,
says...
On 18 Jan, 14:20, "Arny Krueger" wrote:

One of the inspirations for the development of double blind
testing was my wife

I am touched that you find your wife to be such an inspirational
experience.
BTW, just what other woman did you double blind test her against?

LOL! Very good.

--
Bill

Oliver Costich

On Fri, 18 Jan 2008 00:48:21 -0800, "JBorg, Jr."
wrote:

Oliver Costich wrote:
Walt wrote:
vinylanach wrote:

So will you be receiving your $1 million from Randi anytime soon?

Don't count on it. From TFA: "But of the 39 people who took this
test, 61% said they preferred the expensive cable." Hmmme. 39
trials. 50-50 chance. How statistically significant is 61%? You do
the math. (HINT: it ain't.)

Here's the math: Claim is p (proportion of correct answers) > .5. Null
hypothesis is p=.5. The null hypothesis cannot be rejected (and the
claim cannot be supported) at the 95% significance level.

Well yes, Mr. Costich, the test results aren't scientifically valid, but they
didn't disprove that the sound differences heard by participants
physically exist.

Of course not. Certainty is not in the realm of statistical analysis.
Let's say you want to claim that a certain coin is biased to produce
heads when flipped. That you flip it 39 times and get 24 heads is not
sufficient to support the claim at a 95% confidence level. If you
lower your standard or do a lot more flips and still get 61%, the
conclusion will change.
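
A rough Python sketch of that last point, for the curious. It keeps the share of heads fixed at 24 in 39 and scales up the number of flips. The normal approximation and the helper name are assumptions chosen for illustration, not anything taken from the WSJ test.

```python
import math

def p_value_more_than_half(heads, flips):
    """One-sided P-value for 'the coin favors heads' (H0: p = .5).

    Assumption: normal approximation without continuity correction.
    """
    z = (heads / flips - 0.5) / math.sqrt(0.25 / flips)
    return 0.5 * math.erfc(z / math.sqrt(2))

# Same 24-in-39 proportion of heads, with 1x, 2x and 4x as many flips
for scale in (1, 2, 4):
    heads, flips = 24 * scale, 39 * scale
    p_value = p_value_more_than_half(heads, flips)
    verdict = "supported" if p_value < 0.05 else "not supported"
    print(f"{heads:3d} heads in {flips:3d} flips: P = {p_value:.4f}, claim {verdict} at 95%")
```

The percentage never moves, but the P-value shrinks as the flips pile up, so the same 61% goes from "not enough" to "enough" at the 95% level.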

I'm sure there are audible differences. The issue is whether they are
enough to make consistent determinations. A bigger issue for those of
us who just listen to music is whether the differences are detectable
when you are emotionally involved in the music and not just playing
"golden ears".

George M. Middius

Clyde Slick said:

Remind me again how many times Arny Krueger has been
quoted in the Wall Street Journal?

This is not logical discussion or even just rhetoric, this is abuse.

HUH?????

Krooger is practicing his martyr shtick for church. Only two days till the
next roast.

George M. Middius

MiNe 109 said:

In a perfectly designed test with 39 trials, what level
of significance does 61% indicate?

It is on the web - do your own research.

Thanks! You've been a big help in formulating the correct question.

Stephen, please stop abusing the Krooborg.

Oliver Costich

On Fri, 18 Jan 2008 01:02:47 -0800, "JBorg, Jr."
wrote:

Oliver Costich wrote:
Walt wrote:
John Atkinson wrote:

Remind me again how many times Arny Krueger has been
quoted in the Wall Street Journal?

Ok. So you've been quoted in the WSJ. So have Uri Geller and Ken
Lay.

What's your point?

So has Osama Bin Laden. The point is that he's devoid of a sound
argument.

Mr. Costich, there is no sound argument to improve upon a strawman
argument. It just doesn't exist.

Agreed.

//Walt

Incidentally Mr. Costich, how well do you know Arny Krueger if you
don't mind me asking so.

I only know of his existence from the news group, if that's his real
name:-) BTW, I don't necessarily agree with much of his opinion.

Oliver Costich

On Fri, 18 Jan 2008 01:19:57 -0800, "JBorg, Jr."
wrote:

Oliver Costich wrote:

I think that the Nobel Prize also pays a million bucks. I'd go for the
double play:-)

What sort of test should one have in mind for this type of opportunity
to ensure success, Mr. Costich ?

The proof is in the pudding:-)

George M. Middius

John Atkinson said:

How are your own choral recordings doing?

No calls to the plumber in the last three months, thank you very much.

Oliver Costich

On Thu, 17 Jan 2008 15:54:42 -0800 (PST), "Shhhh! I'm Listening to
Reason!" wrote:

On Jan 17, 5:15 pm, Oliver Costich wrote:

In other words, that 61% of a sample of 39 got the correct result
isn't sufficient evidence that in the general population of listeners
more than half can pick the better cable.

So, I'd say "that's hardly that".

I'm curious what percent of the "best informed" got. I mean, you could
mix in hot dog vendors, the deaf, people who might try to fail just to
be contrary, you, and so on, and get different results. Apparently JA
and MF did better than random chance.

However "random chance" is defined. To make a valid statement about
the abilities of the "best informed", you'd have to define that
population and do the experiment on them. If 24 of them got it right
out of 39, then you'd still not be able to support the claim at the
95% confidence level.

The real issue to me is "who cares". People who want expensive cables,
wires, cars, clothes, or whatever, will buy them. People who want to
tell other people what they should or shouldn't buy will come out of
the woodwork to bitch about it. ;-)

This seems to have really gotten your dander up. Why?

I don't care much about it either. If people want to buy overpriced
stuff based on bogus claims that's fine with me. What bugs me is that
they try to support the claims based on bogus experiments and bad
analysis. I spend way too much time in classrooms trying to
communicate the importance of critical thinking to today's college
students (and it ain't easy) to just let this sloppy logic pass.

By the way, I don't use lamp cord or Home Depot interconnects in my
system.

Oliver Costich

On Fri, 18 Jan 2008 01:21:16 -0800, "JBorg, Jr."
wrote:

Shhhh! wrote:
Oliver Costich wrote:

In other words, that 61% of a sample of 39 got the correct result
isn't sufficient evidence that in the general population of listeners
more than half can pick the better cable.

So, I'd say "that's hardly that".

I'm curious what percent of the "best informed" got. I mean, you could
mix in hot dog vendors, the deaf, people who might try to fail just to
be contrary, you, and so on, and get different results.

Well asked.

What population of listeners was the claim made for and how was it
defined? My guess is that however it's constructed, it's a lot bigger
than 39.

Oliver Costich

On Fri, 18 Jan 2008 00:36:18 +0000, Eeyore
wrote:

Oliver Costich wrote:

Back to reality: 61% correct in one experiment fails to reject that
they can't tell the difference.

61% is statistically close enough as doesn't matter to pure 50-50% random choice.

Try flicking coins and see if you get a perfect 50-50 distribution for any given
sample size. You WON'T. In fact pure 50-50 would be the exception by a mile.

No, 61% is as good as proof that there's NO difference. Which there ISN'T of
course. Copper is copper is copper. High pricing, alleged magic and phoney
marketing doesn't make it any different.

Graham

You have to be more specific. 61% may or may not be significant
depending on the sample size and significance level desired. It's also
possible that the hypothesis based on the claim can be rejected when
it's true and you can fail to reject it when it's false. That's the
nature of statistical analysis.
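
To put rough numbers on those two kinds of error, here is a small Python sketch using exact binomial tails for a 39-trial test at the 5% level. The "true" rate of .61 in the second calculation is purely a hypothetical picked for illustration, and the helper names are invented here.

```python
from math import comb

def upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_count(n, alpha=0.05, p0=0.5):
    """Smallest count k such that P(X >= k | p0) <= alpha."""
    for k in range(n + 2):
        if upper_tail(k, n, p0) <= alpha:
            return k

n = 39
k_crit = critical_count(n)

# Chance of rejecting p = .5 when guessing really is all that is going on
false_reject = upper_tail(k_crit, n, 0.5)

# Chance of failing to reject when the true rate is the hypothetical .61
false_accept = 1 - upper_tail(k_crit, n, 0.61)

print(f"Need at least {k_crit} correct out of {n}")
print(f"Chance of rejecting a true null: {false_reject:.3f}")
print(f"Chance of missing a real .61 effect: {false_accept:.3f}")
```

Both errors are possible by design: the first is capped by the chosen level, while the second depends on how big the real effect is and how many trials were run.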

Based on the data given, even accepting the design of the experiment
(a different issue), at the 95% significance level, you can't support
the claim. That does not mean it is false. It just means there's not
enough evidence.

Sometimes guilty people get acquitted.

Oliver Costich

On Fri, 18 Jan 2008 07:54:54 -0800 (PST), Clyde Slick
wrote:

On 18 Jan, 00:15, Oliver Costich wrote:
On Wed, 16 Jan 2008 10:52:40 -0800 (PST), John Atkinson

wrote:
http://online.wsj.com/article/SB1200...?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed people at the
show -- like John Atkinson and Michael Fremer of Stereophile
Magazine -- easily picked the expensive cable."

So that's that, then. :-)

John Atkinson
Editor, Stereophile

From the article: Using two identical CD players, I tested a $2,000,
eight-foot pair of Sigma Retro Gold cables from Monster Cable, which
are as thick as your thumb, against 14-gauge, hardware-store speaker
cable. Many audiophiles say they are equally good. I couldn't hear a
difference and was a wee bit suspicious that anyone else could. But of
the 39 people who took this test, 61% said they preferred the
expensive cable.

Back to reality: 61% correct in one experiment fails to reject that
they can't tell the difference. If the claim is that listeners can
tell the better cable more than half the time, then to support that you
have to be able to reject that, in the population of all audio-interested
listeners, the correct guesses occur half the time or less.
61% of 39 doesn't do it. (Null hypothesis is p=.5, alternative
hypothesis is p > .5. The null hypothesis cannot be rejected with the
sample data given.)

In other words, that 61% of a sample of 39 got the correct result
isn't sufficient evidence that in the general population of listeners
more than half can pick the better cable.

So, I'd say "that's hardly that".

you seem to be mixing difference with preference, you reference both,
for the same test.

For the purpose of statistical analysis it makes no difference.

And just what is the general population of
listeners.

You tell me. I presume that those who attend CES would be a good
one to use. What would you use and how would you construct a simple
random sample from it?

Are you testing the 99% who don't give a rat's
ass anyway? If so, so what. Or are you testing people who actually
care.

Oliver Costich

On Fri, 18 Jan 2008 08:30:20 -0500, "Arny Krueger"
wrote:

"JBorg, Jr." wrote in message

Arny Krueger wrote:
JBorg, Jr. wrote
Arny Krueger wrote:

More proof that single blind tests are nothing more
than defective double blind tests.

From this article, the author wrote, "... the expensive
cables sounded roughly 5% better. Remember, by
definition, an audiophile is one who will bear any
burden, pay any price, to get even a tiny improvement
in sound." Only 5% ?

Even so, it was probably 100% imagination.

How can that be so? From the article, it said, "... 39
people who took this test, 61% said they preferred the
expensive cable."

At what percentage do you consider it imagination, and
when it is not.

Well Borg, this post is more evidence that ignorance of basic statistics is
a common problem among golden ears. It's not a well-formed question. It's
not the percentage of correct answers that defines statistical significance,
it's both the percentage of correct answers and the total number of trials.

And the predetermined level of significance.

And, that's all based on the idea that the basic experiment was well-designed.

The most fundamental question is whether the experiment was well-designed.

Somehow, this showdown at the CES looked like a
DBT sans blackbox.

Nope. This comment is even more evidence that ignorance of basic
experimental design is a common problem among golden ears. The basic rule
of double blind testing is that no clue other than the independent variable
is available to the listener. In this alleged test, the person who
controlled the cables interacted with the listeners. In a proper DBT, nobody
and nothing that could possibly reveal the identity of the object chosen
for comparison is accessible in any way to the listener.

Harry Lavo

"Oliver Costich" wrote in message
...
On Fri, 18 Jan 2008 09:10:59 -0500, "Harry Lavo"
wrote:

"Arny Krueger" wrote in message
...
"Oliver Costich" wrote in
message

On Thu, 17 Jan 2008 07:32:07 -0500, "Arny Krueger"
wrote:

"Harry Lavo" wrote in message

Somewhere in
your college education, you skipped the class in logic,
I guess.

In my several years of graduate school in mathematics, I
skipped neither the logic nor the statistics classes.

Nor did I. I did extensive undergraduate and postgraduate work in math and
statistics. One of the inspirations for the development of double blind
testing was my wife who has a degree in experimental psychology. Another
was a friend with a degree in mathematics.

Logic is on the side of not making decisions about human
behavior without sufficient testing using good design of
experiment method and statistical analysis.

4 of the 6 ABX partners had technical degrees ranging from BS to PhD.

Very little of the claims about people being able to
discern differences in cables is supported by such
testing.

When it comes to audible differences between cables that is not supported
by science and math, which is what this thread is about, none of it is
supported by well-designed experiments.

Well, then rather than "braying and flaying" why don't you communicate the
statistics.

As reported 61% of 39 people chose the correct cable. That according to my
calculator was 24 people.

According to my Binomial Distribution Table, that provides less than a 5%
chance of error...in other words the percentage is statistically
significant. In fact, it is significant at the 98% level....a 2% chance of
error.

I did in other posts but here's a summary. Hypothesis test of claim
that p > .5 (p is the probability that more than half of listeners can do
better than guessing). Null hypothesis is p=.5. The P-value is .0748
but would need to be below .05 to support the claim at the 95%
Confidence Level.

You rounded off .054 to .05. You would need to get a probability of
less than .05 to assert the claim, and NO, .054 isn't "close enough"
for statistical validity. I don't know where you got the 98% from.

I saw your previous post, found it hard to believe with a sample of 39, and
so checked it myself. I used a professionally published 100x100 Binomial
Distribution Table with the correct P value for every combination of
right/total sample. Without checking your math, I'm not about to yield to
your numbers.

And I didn't round off anything...the probabilities are right out of the
table: .020 for 24/39 and .008 for 25/39.

Had one more chosen correctly, the error probability would have been less
than 1%, or "beyond a shadow of a doubt".

If it had been 25 instead of 24 it would have supported the claim at
the 95% level but not at 97% or higher. But that's the point. You
don't get to wiggle around the numbers so you get what you want. If it
had been one less, would you still make the claim? What about if 39 more
people did the experiment and only 20 got it right. You can only draw
so much support for a claim from a single sample.

And nothing that can only be tested statistically is "beyond a shadow
of a doubt" unless you mean "supported at a very high level of
confidence" which isn't the case here, even with another correct
"guess". Statistics can only be used to support a claim up to the
probability (1-confidence level) of falsely supporting an invalid
conclusion.

The underlying model for determining whether binary selection is
random is tossing a coin. Tossing a coin 39 times and getting 24 heads
doesn't mean the coin is biased towards heads.
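
The coin model can also be counted out exactly instead of read from a table. A minimal Python sketch, assuming a fair coin and a one-sided "at least this many heads" question; the function name is invented for illustration.

```python
from math import comb

def prob_at_least(heads, flips):
    """Exact chance of `heads` or more heads in `flips` tosses of a fair coin."""
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2**flips

for heads in (24, 25):
    print(f"P(at least {heads} heads in 39 flips) = {prob_at_least(heads, 39):.4f}")
```

By this exact count the chance of 24 or more heads is close to .10, a bit higher than the .0748 quoted above. That gap would be consistent with the .0748 coming from a normal approximation, though the posts do not say how it was computed.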

I understand all that...I will hope you intended this for others.

So presumably John and Michael did at least this well to be singled out by
the reporter.

Who obviously was deeply knowledgeable about statistics.

Perhaps not, and so he could be wrong. But presumably the test designer
would have corrected him if he were wildly so.

Is this why you are desperately flaying at the test, Arny...inventing
"possibilities" without a single shred of evidence to support your
conjectures? Because you know (if you truly do know math and statistics)
that the test statistics hold up (but don't have the integrity to say so)?

Shhhh! I'm Listening to Reason!

On Jan 18, 9:26 am, Walt wrote:
Shhhh! I'm Listening to Reason! wrote:

On Jan 17, 5:25 pm, Oliver Costich wrote:

Don't count on it. From TFA: "But of the 39 people who took this test,
61% said they preferred the expensive cable." Hmmme. 39 trials. 50-50
chance. How statistically significant is 61%? You do the math.

Why is this important to you, so much so that you have blasted so many
posts in this thread?

I've blasted "so many posts"? WTF?

I count three. This will make four. You must have me confused with
somebody else.

Please look at who this post was in response to. I think you are
confused about who you are.;-)

Shhhh! I'm Listening to Reason!

On Jan 18, 9:31 am, "Arny Krueger" wrote:
"Walt" wrote in message

Shhhh! I'm Listening to Reason! wrote:
On Jan 17, 5:25 pm, Oliver Costich
wrote:
Don't count on it. From TFA: "But of the 39 people
who took this test, 61% said they preferred the
expensive cable." Hmmme. 39 trials. 50-50 chance. How statistically
significant is 61%? You do the
math.

Why is this important to you, so much so that you have
blasted so many posts in this thread?

I've blasted "so many posts"? WTF?

I count three. This will make four. You must have me
confused with somebody else.

I think I counted 7 posts to this thread from ****R. The interesting
question about the Middiot Clique is which of them is less self-aware.

;-)

Right now Stephen, Jenn, ****R and the Middiot himself are duking it out for
the dishonor. ;-)

The difference being, GOIA, is that I did not post the same response
to multiple posts.

We get the fact that you don't consider the methodology valid. Oliver
made double-damned *sure* we knew that he didn't (which was my point).
What you and the others have not responded to is whether that "proves"
no difference existed, or (even more importantly) why it matters to
you in the least.

As I've said before, GOIA, I'm actually in your camp when it comes to
wires and cables. I just don't see why "you people" go so bonkers when
somebody doesn't agree with you. I know, I know, you're just trying to
save them money. But it's theirs to spend as they see fit, isn't it?

LOL!

Oliver Costich

On Fri, 18 Jan 2008 09:08:37 -0800 (PST), Clyde Slick
wrote:

On 18 Jan, 18:00, Oliver Costich wrote:
On Fri, 18 Jan 2008 09:10:59 -0500, "Harry Lavo"
wrote:

"Arny Krueger" wrote in message
...
"Oliver Costich" wrote in
message news:7eovo350khiqqsqqk5iisucqn7s7d1pd8s@4ax.com

On Thu, 17 Jan 2008 07:32:07 -0500, "Arny Krueger"
wrote:

"Harry Lavo" wrote in message

Somewhere in
your college education, you skipped the class in logic,
I guess.

In my several years of graduate school in mathematics, I
skipped neither the logic nor the statistics classes.

Nor did I. I did extensive undergraduate and postgraduate work in math and
statistics. One of the inspirations for the development of double blind
testing was my wife who has a degree in experimental psychology. Another
was a friend with a degree in mathematics.

Logic is on the side of not making decisions about human
behavior without sufficient testing using good design of
experiment method and statistical analysis.

4 of the 6 ABX partners had technical degrees ranging from BS to PhD.

Very little of the claims about people being able to
discern differences in cables is supported by such
testing.

When it comes to audible differences between cables that is not supported
by science and math, which is what this thread is about, none of it is
supported by well-designed experiments.

Well, then rather than "braying and flaying" why don't you communicate the
statistics.

As reported 61% of 39 people chose the correct cable. That according to my
calculator was 24 people.

According to my Binomial Distribution Table, that provides less than a 5%
chance of error...in other words the percentage is statistically
significant. In fact, it is significant at the 98% level....a 2% chance of
error.

I did in other posts but here's a summary. Hypothesis test of claim
that p > .5 (p is the probability that more than half of listeners can do
better than guessing). Null hypothesis is p=.5. The P-value is .0748
but would need to be below .05 to support the claim at the 95%
Confidence Level.

You rounded off .054 to .05. You would need to get a probability of
less than .05 to assert the claim, and NO, .054 isn't "close enough"
for statistical validity. I don't know where you got the 98% from.

Had one more chosen correctly, the error probability would have been less
than 1%, or "beyond a shadow of a doubt".

If it had been 25 instead of 24 it would have supported the claim at
the 95% level but not at 97% or higher. But that's the point. You
don't get to wiggle around the numbers so you get what you want. If it
had been one less, would you still make the claim? What about if 39 more
people did the experiment and only 20 got it right. You can only draw
so much support for a claim from a single sample.

And nothing that can only be tested statistically is "beyond a shadow
of a doubt" unless you mean "supported at a very high level of
confidence" which isn't the case here, even with another correct
"guess". Statistics can only be used to support a claim up to the
probability (1-confidence level) of falsely supporting an invalid
conclusion.

The underlying model for determining whether binary selection is
random is tossing a coin. Tossing a coin 39 times and getting 24 heads
doesn't mean the coin is biased towards heads.

As a practical matter as a "CONSUMER", I don't really care
whether or not a statistically relevant number of people,
from a sample of people I care nothing about, heard differences,
or had a preference. What matters to me, as a "CONSUMER",
is what my particular preference is.

That's fine and as it ought to be. But this thread was about an
experiment that some would like to make claims about.

Harry Lavo

"Oliver Costich" wrote in message
...
On Thu, 17 Jan 2008 17:44:27 -0600, MiNe 109
wrote:

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 12:56:23 -0500, Walt
wrote:

wrote:
On Jan 16, 10:52 am, John Atkinson
wrote:
http://online.wsj.com/article/SB1200...?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed people at the
show -- like John Atkinson and Michael Fremer of Stereophile
Magazine -- easily picked the expensive cable."

So will you be receiving your $1 million from Randi anytime soon?

Don't count on it. From TFA: "But of the 39 people who took this test,
61% said they preferred the expensive cable." Hmmme. 39 trials. 50-50
chance. How statistically significant is 61%? You do the math.
(HINT: it ain't.)

Here's the math: Claim is p (proportion of correct answers) > .5. Null
hypothesis is p=.5. The null hypothesis cannot be rejected (and the
claim cannot be supported) at the 95% significance level.

Welcome to the group! Out of curiosity, what significance level does 61%
support?

Stephen

First you have to find out where the 61% came from. In this case, I
presume it is 24 out of 39. From the sample data and the claim about
the population proportion, you can compute a number called the
P-Value, not to be confused with the population probability in the
claim, usually denoted "p". To be able to support a claim that more
than half of the population can do better than guessing, you need the
P-Value for p=.5, which in this case is .07477. To support the claim
that p > .5 at the 95% confidence level, you need the P-Value to be less
than (1-significance level). So for 95%, you need a P-Value of less
than .05, for 93% you need a P-Value less than .07. Looks like 24 out
of 39 supports the claim at the 92% level.

However, that's not how you do statistics. You don't compute the
P-Value and then fish around for a significance level that supports
your claim (or rejects it, depending on what side of the argument you are
on).

The fact is, there is nothing magical about 95%, except that it has been
widely accepted in the scientific community to meet their standards of
"probably so". It gives odds of 19:1 that the null hypothesis is invalid.

A 93% value gives odds of 13:1.

A 99% value gives odds of 99:1.

See, it's all a level of the amount of risk you are willing to take in being
wrong. For me personally, I'd be happy with 90% when it came to making an
audio choice...it's not a life or death decision, and I'd happily accept odds
in my favor of 9:1.
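
The odds figures quoted here follow from simple arithmetic on the confidence level. A short Python sketch of just that conversion; the function name is made up, and the sketch takes no position on how such odds should be interpreted.

```python
def odds_in_favor(confidence_pct):
    """Turn a confidence level in percent into 'x:1' odds: pct / (100 - pct)."""
    return confidence_pct / (100 - confidence_pct)

for level in (90, 93, 95, 99):
    print(f"{level}% -> about {odds_in_favor(level):.1f}:1")
```

The 93% case works out to a little over 13:1, which the post rounds down to 13:1.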

In the food business, we typically used the 95% confidence level, but
sometimes set the standard to 99% if the consequences of being wrong were
severe. Coca-Cola may never have launched "New Coke" if they had been that
careful.

And of course this doesn't even address the single-blind nature of the
test. See http://en.wikipedia.org/wiki/Clever_Hans

The data from badly designed experiments is useless for analysis. I
would have thought that was obvious.

Except that nobody has presented any evidence that this was a badly designed
test. On the face of it it was apparently a decently-designed single-blind
test. And single blind tests are not automatically invalid. They just have
a potential weakness that must diligently be guarded against.

Oliver Costich

On Fri, 18 Jan 2008 12:08:44 -0600, MiNe 109
wrote:

In article ,
Oliver Costich wrote:

On Fri, 18 Jan 2008 08:45:24 -0600, MiNe 109
wrote:

In article ,
"Arny Krueger" wrote:

"MiNe 109" wrote in message

In article ,
"Arny Krueger" wrote:

"MiNe 109" wrote in message

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 12:56:23 -0500, Walt
wrote:

wrote:
On Jan 16, 10:52 am, John Atkinson
wrote:
http://online.wsj.com/article/SB1200...ml?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed
people at the show -- like John Atkinson and Michael
Fremer of Stereophile Magazine -- easily picked the
expensive cable."

So will you be receiving your $1 million from Randi
anytime soon?

Don't count on it. From TFA: "But of the 39 people
who took this test, 61% said they preferred the
expensive cable." Hmmme. 39 trials. 50-50 chance.
How statistically significant is 61%? You do the
math. (HINT: it ain't.)

Here's the math: Claim is p (proportion of correct
answers) > .5. Null hypothesis is p=.5. The null
hypothesis cannot be rejected (and the claim cannot be
supported) at the 95% significance level.

Welcome to the group! Out of curiosity, what
significance level does 61% support?

You haven't formed the question properly. 61% is
statistically significant or not, depending on the total
number of trials.

Okay, in 39 trials, what level of significance does 61%
indicate?

In this case nothing, because the basic experiment seems to be so flawed.

In a perfectly designed test with 39 trials, what level of significance
does 61% indicate?

Stephen

Still about 92% and Generalissimo Franco is still dead.

How about Suharto?

Stephen

Not yet. The time to his death is not normally distributed:-)

Harry Lavo

"Oliver Costich" wrote in message
...
On Fri, 18 Jan 2008 07:43:13 -0600, MiNe 109
wrote:

In article ,
"Arny Krueger" wrote:

"MiNe 109" wrote in message

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 12:56:23 -0500, Walt
wrote:

wrote:
On Jan 16, 10:52 am, John Atkinson
wrote:
http://online.wsj.com/article/SB1200...?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed
people at the show -- like John Atkinson and Michael
Fremer of Stereophile Magazine -- easily picked the
expensive cable."

So will you be receiving your $1 million from Randi
anytime soon?

Don't count on it. From TFA: "But of the 39 people who
took this test, 61% said they preferred the expensive
cable." Hmmme. 39 trials. 50-50 chance. How
statistically significant is 61%? You do the math.
(HINT: it ain't.)

Here's the math: Claim is p (proportion of correct
answers) > .5. Null hypothesis is p=.5. The null
hypothesis cannot be rejected (and the claim cannot be
supported) at the 95% significance level.

Welcome to the group! Out of curiosity, what significance
level does 61% support?

You haven't formed the question properly. 61% is statistically significant
or not, depending on the total number of trials.

Okay, in 39 trials, what level of significance does 61% indicate?

Stephen

About 92%

This is wrong, according to my binomial table...should be 98% instead.

Oliver Costich

On Fri, 18 Jan 2008 12:06:54 -0600, MiNe 109
wrote:

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 17:44:27 -0600, MiNe 109
wrote:

In article ,
Oliver Costich wrote:

On Thu, 17 Jan 2008 12:56:23 -0500, Walt
wrote:

wrote:
On Jan 16, 10:52 am, John Atkinson wrote:
http://online.wsj.com/article/SB1200...?mod=hpp_us_in...

Money quote: "I was struck by how the best-informed people at the
show -- like John Atkinson and Michael Fremer of Stereophile
Magazine -- easily picked the expensive cable."

So will you be receiving your $1 million from Randi anytime soon?

Don't count on it. From TFA: "But of the 39 people who took this test,
61% said they preferred the expensive cable." Hmmme. 39 trials. 50-50
chance. How statistically significant is 61%? You do the math.
(HINT: it ain't.)

Here's the math: Claim is p (proportion of correct answers) > .5. Null
hypothesis is p=.5. The null hypothesis cannot be rejected (and the
claim cannot be supported) at the 95% significance level.

Welcome to the group! Out of curiosity, what significance level does 61%
support?

Stephen

First you have to find out where the 61% came from. In this case, I
presume it is 24 out of 39. From the sample data and the claim about
the population proportion, you can compute a number called the
P-Value, not to be confused with the population probability in the
claim, usually denoted "p". To be able to support a claim that more
than half of the population can do better than guessing, you need the
P-Value for p=.5, which in this case is .07477. To support the claim
that p > .5 at the 95% confidence level, you need the P-Value to be less
than (1-significance level). So for 95%, you need a P-Value of less
than .05, for 93% you need a P-Value less than .07. Looks like 24 out
of 39 supports the claim at the 92% level.

Thanks!

However, that's not how you do statistics. You don't compute the
P-Value and then fish around for a significance level that supports
your claim (or rejects it, depending on what side of the argument you
are on).

Can you fish around for confidence levels?

Confidence levels, significance levels - two sides of the same coin.
E.g., 95% confidence is the same as 5% significance.

And of course this doesn't even address the single-blind nature of the
test. See http://en.wikipedia.org/wiki/Clever_Hans

The data from badly designed experiments is useless for analysis. I
would have thought that was obvious.

Which wire did Clever Hans prefer?

Stephen

Shhhh! I'm Listening to Reason!

On Jan 17, 6:36 pm, Eeyore
wrote:

No, 61% is as good as proof that there's NO difference.

That's not true, of course. I'd have to believe that even good old
insane Arns would disagree with this statement.

For one thing, if a test design is not valid to prove a difference
exists, it is certainly not valid to prove one doesn't.

George M. Middius

Shhhh! said:

On Jan 17, 5:25 pm, Oliver Costich wrote:
Why is this important to you, so much so that you have blasted so many
posts in this thread?

I've blasted "so many posts"? WTF?

Please look at who this post was in response to. I think you are
confused about who you are.;-)

'Borgs are interchangeable, note.

Oliver Costich

On Fri, 18 Jan 2008 13:01:00 -0800 (PST), "Shhhh! I'm Listening to
Reason!" wrote:

On Jan 18, 9:31 am, "Arny Krueger" wrote:
"Walt" wrote in message

Shhhh! I'm Listening to Reason! wrote:
On Jan 17, 5:25 pm, Oliver Costich
wrote:
Don't count on it. From TFA: "But of the 39 people
who took this test, 61% said they preferred the
expensive cable." Hmmme. 39 trials. 50-50 chance. How
statistically significant is 61%? You do the math.

Why is this important to you, so much so that you have
blasted so many posts in this thread?

I've blasted "so many posts"? WTF?

I count three. This will make four. You must have me
confused with somebody else.

I think I counted 7 posts to this thread from ****R. The interesting
question about the Middiot Clique is which of them is less self-aware.

;-)

Right now Stephen, Jenn, ****R and the Middiot himself are duking it out for
the dishonor. ;-)

The difference being, GOIA, is that I did not post the same response
to multiple posts.

We get the fact that you don't consider the methodology valid. Oliver
made double-damned *sure* we knew that he didn't (which was my point).
What you and the others have not responded to is whether that "proves"
no difference existed, or (even more importantly) why it matters to
you in the least.

As I've said before, GOIA, I'm actually in your camp when it comes to
wires and cables. I just don't see why "you people" go so bonkers when
somebody doesn't agree with you. I know, I know, you're just trying to
save them money. But it's theirs to spend as they see fit, isn't it?

LOL!

My point was that even if the test was well designed (another thread
for another day, and I won't be the-) that the data don't support
the claim that people can distinguish one of these cables from the
other. That doesn't mean that they can't. It means that this test
doesn't support it.

If the data don't support the claim, it doesn't matter what the design
was.

Oliver Costich

On Fri, 18 Jan 2008 09:19:48 -0800 (PST), John Atkinson
wrote:

On Jan 18, 8:23 am, "Arny Krueger" wrote:
"John Atkinson" wrote in

Remind me again how many times Arny Krueger has been
quoted in the Wall Street Journal?

This is not logical discussion or even just rhetoric, this is abuse.

Er, no. It is a straightforward question, Mr. Krueger. How many times
have you been quoted in the WSJ?

And what would you hope to discern from that number?

At least he has stopped claiming that his neglected, rarely updated,
almost-never-promoted websites get as much traffic as Stereophile's...

No argument from Mr. Krueger about this, at least. :-)

or that his recordings are as commercially available as my own. :-)

Nor this, though I do note that he continues to argue with professional
recording engineer Iain Churches that his own work is somehow
comparable. BTW, Mr. Krueger, my most-recent choral recording --
see http://www.stereophile.com/news/121007cantus/ -- was No.9
in NPR's Top Next-Generation Classical CDs of 2007. Even if I
am unaware of truncated reverb tails, as you mistakenly claim in
another thread. How are your own choral recordings doing?

John Atkinson
Editor, Stereophile
"Well Informed" - The Wall Street Journal

Shhhh! I'm Listening to Reason!

On Jan 18, 11:44 am, Oliver Costich
wrote:
On Thu, 17 Jan 2008 15:54:42 -0800 (PST), "Shhhh! I'm Listening to
Reason!" wrote:
On Jan 17, 5:15 pm, Oliver Costich wrote:

In other words, that 61% of a sample of 39 got the correct result
isn't sufficient evidence that in the general population of listeners
more than half can pick the better cable.

So, I'd say "that's hardly that".

I'm curious what percent of the "best informed" got. I mean, you could
mix in hot dog vendors, the deaf, people who might try to fail just to
be contrary, you, and so on, and get different results. Apparently JA
and MF did better than random chance.

However "random chance" is defined. To make a valid statement about
the abilities of the "best informed", you'd have to define that
population and do the experiment on them. If 24 of them got it right
out of 39, then you'd still not be able to support the claim at the
95% confidence level.
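
As a rough check on that, here is a Python sketch that finds the smallest number of correct picks out of 39 that would reach the 95% level against pure guessing. It uses the exact one-sided binomial tail, which is one reasonable choice but not necessarily the method used elsewhere in this thread; the function names are illustrative.

```python
from math import comb

def upper_tail(successes, trials, p0=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p0)."""
    return sum(
        comb(trials, k) * p0**k * (1 - p0)**(trials - k)
        for k in range(successes, trials + 1)
    )

def min_successes_needed(trials, alpha=0.05, p0=0.5):
    """Smallest count whose one-sided P-value falls below alpha."""
    for successes in range(trials + 1):
        if upper_tail(successes, trials, p0) < alpha:
            return successes
    return None  # only reached if even a perfect score is not significant

needed = min_successes_needed(39)
print(needed, f"{needed / 39:.0%}")
```

By this calculation 26 of 39 (about 67%) would be needed, so 24 falls short.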

The claim I was basing that question on was the statement about how
the author was impressed with "how easily" JA and MF and the other
"best informed" picked the more expensive cable. Your question would
have to be answered by the author, as I do not know.

The real issue to me is "who cares". People who want expensive cables,
wires, cars, clothes, or whatever, will buy them. People who want to
tell other people what they should or shouldn't buy will come out of
the woodwork to bitch about it. ;-)

This seems to have really gotten your dander up. Why?

I don't care much about it either. If people want to buy overpriced
stuff based on bogus claims that's fine with me. What bugs me is that
they try to support the claims based on bogus experiments and bad
analysis. I spend way too much time in classrooms trying to
communicate the importance of critical thinking to today's college
students (and it ain't easy) to just let this sloppy logic pass.

Fair enough. If you really want to have some fun, read virtually any
post by "ScottW". His sloppy thinking and poor communication will
certainly catch your attention.:-)

What do you teach?

By the way, I don't use lamp cord or Home Depot interconnects in my
system.

I do not use expensive wires or cables in my system. I just don't
really care if others do.

Walt

Shhhh! I'm Listening to Reason! wrote:

Why is this important to you, so much so that you have blasted so many
posts in this thread?

I've blasted "so many posts"? WTF?

Please look at who this post was in response to. I think you are
confused about who you are.;-)

You replied to Oliver's post, but since you snipped everything he wrote
and responded only to what I had written I assumed you were talking to me.

Anyway, as for why it's important to me, well, in 20+ years of following
the cable debate this is the first instance I've seen of a blind test
indicating that differences in speaker cables are audible. So, some
questions about the methodology and statistical analysis are in order.

Maybe JA can really hear the difference between $2k Monster cable and 14
gauge zipcord. If that's actually the case, I'm interested.

//Walt

John Atkinson[_2_]

On Jan 18, 4:51 pm, Walt wrote:
Maybe JA can really hear the difference between $2k Monster cable
and 14 gauge zipcord. If that's actually the case, I'm interested.

If it was printed in a newspaper, it must be true, right? :-)

John Atkinson
Editor, Stereophile
"Well-informed" - The Wall Street Journal

George M. Middius

Shhhh! said:

No, 61% is as good as proof that there's NO difference.

That's not true, of course. I'd have to believe that even good old
insane Arns would disagree with this statement.

For one thing, if a test design is not valid to prove a difference
exists, it is certainly not valid to prove one doesn't.

Unless one happens to "know" that all alleged differences are nonexistent,
in which case contrary "test" results are prima facie "wrong" and
conforming "test" results are "proof" that the received "knowledge" is
correct and true.

You seem oddly lacking in the faith necessary to despise high-end audio.
Have you even learned to hate music yet?

JBorg, Jr.[_2_]

Arny Krueger wrote:
JBorg, Jr. wrote
Arny Krueger wrote:
JBorg, Jr. wrote
Arny Krueger wrote:

More proof that single blind tests are nothing more
than defective double blind tests.

From this article, the author wrote, "... the expensive
cables sounded roughly 5% better. Remember, by
definition, an audiophile is one who will bear any
burden, pay any price, to get even a tiny improvement
in sound." Only 5% ?

Even so, it was probably 100% imagination.

How can that be so? From the article, it said, "... 39
people who took this test, 61% said they preferred the
expensive cable."

At what percentage do you consider it imagination, and
when is it not?

Well Borg, this post is more evidence that ignorance of basic
statistics is a common problem among golden ears. It's not a
well-formed question. It's not the percentage of correct answers that
defines statistical significance, it's both the percentage of correct answers
and the total number of trials. And, that's all based on the idea that
the basic experiment was well-designed.
The most fundamental question is whether the experiment was
well-designed.

I made no claim that the test results were based upon a
well-designed scientific experiment. What I asked about is your
contention that the 61% who preferred the sound
produced by expensive cables did so perhaps based on their
imagination.

Somehow, this showdown at the CES looked like a
DBT sans blackbox.

Nope. This comment is even more evidence that ignorance of basic
experimental design is a common problem among golden ears.

From what I understand, the participants were not informed
what was playing, or when it was playing.

The basic rule of double blind testing is that no clue other than the
independent variable is available to the listener. In this alleged
test, the person who controlled the cables interacted with the
listeners. In a proper DBT, no person or thing that could possibly
reveal the identity of the object chosen for comparison is accessible
in any way to the listener.

I reread the article and it seems to be SBT.

JBorg, Jr.[_2_]

Oliver Costich wrote:
JBorg, Jr. wrote:
Oliver Costich wrote:

Here's the math: Claim is p (proportion of correct answers) > .5.
Null hypothesis is p=.5. The null hypothesis cannot be rejected (and
the claim cannot be supported) at the 95% significance level.

Well yes, Mr. Costich, the test results aren't scientifically valid,
but they didn't disprove that the sound differences heard by
participants physically exist.

Of course not. Certainty is not in the realm of statistical analysis.

Right. Why then do Arny and his ilk consistently use statistical
analysis in audio testing to claim proof that the sound
differences heard by audiophiles are based on their fevered
imagination?

Let's say you want to claim that a certain coin is biased to produce
heads when flipped. That you flip it 39 times and get 24 heads is not
sufficient to support the claim at a 95% confidence level. If you
lower your standard or do a lot more flips and still get 61%, the
conclusion will change.
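
To see how the conclusion changes with more flips, here is a small Python sketch that holds the share of heads at 24/39 and scales up the number of flips. It uses a one-sided normal approximation, which appears to be what is behind the .07477 figure quoted elsewhere in this thread (an assumption), and the sample sizes are arbitrary illustrations.

```python
from math import erf, sqrt

def one_sided_p_value(heads, flips, p0=0.5):
    """Normal-approximation P-value for H0: p = p0 against H1: p > p0."""
    z = (heads / flips - p0) / sqrt(p0 * (1 - p0) / flips)
    return 0.5 * (1 - erf(z / sqrt(2)))

# Keep the share of heads at 24/39 (about 61%) while scaling up the flips.
for multiplier in (1, 2, 4, 8):
    flips = 39 * multiplier
    heads = 24 * multiplier
    p_value = one_sided_p_value(heads, flips)
    verdict = "supports" if p_value < 0.05 else "does not support"
    print(f"{heads}/{flips}: P-value {p_value:.4f} {verdict} the claim at 95%")
```

With 24 of 39 the P-value stays above .05, but the same proportion over 78 flips already drops below it, and it keeps shrinking as the number of flips grows.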

Ok.

I'm sure there are audible differences. The issue is whether they are
enough to make consistent determinations. A bigger issue for those of
us who just listen to music is whether the differences are detectable
when you are emotionally involved in the music and not just playing
"golden ears".

Well then, you agreed that subtle differences do exist.

JBorg, Jr.[_2_]

Oliver Costich wrote:
JBorg, Jr. wrote:
Oliver Costich wrote:

Very little of the claims about people being able to discern
differences in cables is supported by such testing.

I take it you don't recommend testing for such purposes.
Ok then...

I don't recommend badly designed tests and I don't recommend
making statistically invalid claims based on any kind of test.

But the only way to statistically support (or reject) claims about
human behavior is through well designed experiments and real
statistical analysis.

May I interject, then, based on what you said above, that audio
testing such as SBT and ABX/DBT are poorly designed experiments
and will fail to disprove that sound differences heard by audiophiles
physically exist?

JBorg, Jr.[_2_]

Oliver Costich wrote:
JBorg, Jr. wrote:
Oliver Costich wrote:
Walt wrote:
John Atkinson wrote:

Remind me again how many times Arny Krueger has been
quoted in the Wall Street Journal?

Ok. So you've been quoted in the WSJ. So have Uri Geller and Ken
Lay.

What's your point?

So has Osama Bin Laden. The point is that he's devoid of a sound
argument.

Mr. Costich, there is no sound argument to improve upon a strawman
argument. It just doesn't exist.

Agreed.

Ok.

Incidentally, Mr. Costich, how well do you know Arny Krueger, if you
don't mind my asking?

I only know of his existence from the news group, if that's his real
name:-)

He made claims that he had submitted peer-reviewed papers to the AES.
He also claims to be an audio engineer and well educated concerning
statistical analysis in well-designed audio experiments. To be honest,
Mr. Costich, he is the worst offender of common sense and has been
pestering this group for a long, long time.

BTW, I don't necessarily agree with much of his opinion.

I am very happy to hear that.

JBorg, Jr.[_2_]

Oliver Costich wrote:
JBorg, Jr. wrote:
Shhhh! wrote:
Oliver Costich wrote:

In other words, that 61% of a sample of 39 got the correct result
isn't sufficient evidence that in the general population of
listeners more than half can pick the better cable.

So, I'd say "that's hardly that".

I'm curious what percent of the "best informed" got. I mean, you
could mix in hot dog vendors, the deaf, people who might try to
fail just to be contrary, you, and so on, and get different results.

Well asked.

What population of listeners was the claim made for and how was it
defined? My guess is that however it's constructed, it's a lot bigger
than 39.

No information was provided for that. Still, valid parameters for such
a test should exclude participants with personal biases and preferences
and those lacking extended listening experience, as examples.
