July 23rd 05, 03:30 AM
If Stereophile objects to this, which I have posted here under Fair
Use, I will email the posting party and get it pulled from their
archive. However, since it is on the publicly accessible section of
their website, I believe they won't mind.
Robert Harley's book has been slammed, as you know, so I think this
sheds some light on the discussion:
http://www.stereophile.com/asweseeit/182/index7.html
Value judgments & experiments
Editor: As a lifelong lover of serious music and the author of more
than 50 scientific papers, I am well acquainted with both "subjective"
and "objective" approaches to knowledge. I also have the good fortune
to be married to a professional musician, a violinist, and have
witnessed many times the manner in which musical judgments are made.
Thus I was much interested in Robert Harley's thoughtful piece on the
evaluation of audio equipment (Stereophile, Vol.13 No.7). Herewith a
few comments stimulated by Harley's remarks:
First of all, I was surprised that Harley did not attach the most
obvious meaning to Prof. Lipshitz's reply to John Atkinson. Needless to
say, I did not overhear this seminal conversation, nor for that matter
have I ever met Lipshitz or heard him speak. Nevertheless, the context
strongly suggests that when the professor asked, "Ah, but how do you
know what is good?" he merely left unsaid the (to him) self-evident
qualifying clause, "unless of course you measure it."
In this light the question is no more than a rhetorical device. I doubt
that Lipshitz had any intention of opening a deep philosophical
inquiry; he was merely reaffirming the objectivist's habitual mistrust
of raw, unquantified sensory evidence. Harley may well argue that such
skepticism is inappropriate in realms demanding refined aesthetic
judgment, but it is nevertheless a cornerstone of the scientific
method, and as reflexive as a knee-jerk among scientists. Perhaps it is
fortunate for all of us that Harley overlooked (or at least neglected
to mention) this simple interpretation of Lipshitz's question;
otherwise we might have been deprived of the inquiry it provoked.
In reply, Harley takes his text from Robert Pirsig's Zen and the Art of
Motorcycle Maintenance. I agree that Zen is a memorable book, with
valuable things to say about self-discovery and self-knowledge. But I
am not aware that it has anything to say about the design of
experiments, which is the true subject of Harley's piece. If reading
assignments are to be made, let me recommend instead a classic paper on
experimental design, "Mathematics of a Lady Tasting Tea," by Sir Ronald
Fisher, one of the founding fathers of modern statistical theory. Here
is a paper that ought to be required reading for all audio equipment
reviewers. The original publication is not easy to find, but it has
been reprinted in James R. Newman's anthology The World of Mathematics,
which in turn has recently reappeared in paperback.
The paper concerns a lady who asserts that her surpassing delicacy of
taste permits her to tell whether the tea or the milk was first added
to the cup when her tea was brewed. (Parallels to the claims made by
certain reviewers will immediately suggest themselves.) How shall her
claim be tested? In a mere 10 pages Fisher lays out with lapidary
clarity the principles which underlie the design of experiments,
establishes a test protocol suitable to this case, examines the
significance of all possible outcomes, and discusses various
modifications and elaborations of the test procedure. No mathematical
skills beyond elementary arithmetic are required to follow the
argument.
One of the points which Fisher emphasizes most strongly is that only an
exact hypothesis can be tested. The hypothesis in this instance (Fisher
calls it the "null hypothesis") is that the lady lacks the power of
discrimination she claims, in which case the number of teacups she
successfully identifies will eventually and inevitably approach the
number attainable by chance alone. This is, of course, a limiting
operation, and demands in principle an experiment of infinite duration.
The hypothesis can be disproved, however, in relatively few trials, by
the attainment of a score sufficiently remote from a chance outcome.
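
To make "sufficiently remote from a chance outcome" concrete, here is a
minimal sketch in Python. It assumes the eight-cup design usually
associated with Fisher's paper (four cups milk-first, four tea-first,
the lady asked to pick out the four milk-first cups); the letter itself
does not give these numbers, and the function names and defaults are
illustrative only.

```python
from math import comb

def chance_probability(correct, cups=8, milk_first=4):
    """Probability of picking exactly `correct` milk-first cups by pure
    guessing, when `milk_first` cups are chosen out of `cups`.
    This is the hypergeometric distribution under the null hypothesis."""
    wrong = milk_first - correct
    ways = comb(milk_first, correct) * comb(cups - milk_first, wrong)
    return ways / comb(cups, milk_first)

def p_value(correct, cups=8, milk_first=4):
    """Probability of a score at least this good by chance alone."""
    return sum(chance_probability(k, cups, milk_first)
               for k in range(correct, milk_first + 1))

if __name__ == "__main__":
    # Assumed design: 8 cups, 4 of each kind.
    for k in range(5):
        print(k, round(chance_probability(k), 4), round(p_value(k), 4))
```

Under these assumptions a perfect score turns up by guessing once in 70
tries (about 1.4%), while three out of four correct turns up about 24%
of the time, so in so small a trial only a perfect score could count
against the null hypothesis.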
The point which almost all lay persons (and I dare say many scientists
as well) fail to grasp is that if the null hypothesis is disproved, its
opposite is not thereby proved. This appears to contravene common
sense; surely if the lady makes a highly improbable number of correct
identifications, she is likely to possess some power of discrimination.
Indeed she probably does, but this is an inexact hypothesis and
therefore admits at most a statistical interpretation, not a proof.
The only other exact hypothesis is that she possesses unfailing power
of discrimination, and it is once again clear that this hypothesis can
be disproved by a single error of judgment, but can never be proved by
any finite amount of experimentation. Einstein clearly illustrated this
principle when he said, "No amount of experimentation can ever prove me
right. A single experiment at any time can prove me wrong." It is the
everlasting falsifiability of hypotheses which distinguishes genuine
science from, say, creationism.
It is worth noting that only extremely simple judgments are involved in
the foregoing example, those with answers which contain at most a few
bits of information. Some questions in the audio business are of this
type ("Do amplifier A and amplifier B sound the same?"), but
most, including those of greatest importance, are not ("Is amplifier
A or amplifier B a better amplifier?").
I wish Harley had drawn this distinction more clearly, because the two
types of question demand very different procedures for arriving at an
answer. In particular, blind testing, which Harley deplores, is clearly
essential to answer questions of the first type, but may or may not be
appropriate in answering questions of the second type. On the other
hand, Harley makes a point too often ignored, which is that comparative
value judgments enter every stage of the recording business, from the
choice of performers and venue to the choice of processing plant, and
it seems inconsistent (or at least needlessly restrictive) to condemn
them when applied to the choice of playback equipment.
I can contribute some anecdotal evidence which supports Harley's views
on the importance of "subjective" reviewing. Over the years I have
watched my wife's progress from her original mass-produced Mittenwald
student violin to her present Italian master violin, a 1762 Carlo
Antonio Testore. I can attest that, to a professional musician, the
choice of a performing instrument dwarfs all other decisions in life
except possibly the choice of a spouse. The process takes many weeks.
The candidate violins are tested first in the luthier's workshop, then
at home, then in concert, then at home again. They are tested with
scales and finger exercises, then with Vivaldi and Bach, then with
Mozart and Paganini, then with Tchaikovsky, Bruch, and Berg. Strings
are replaced, bridges are exchanged, sound posts are tweaked this way
or that. Variations in humidity, temperature, ambience, mood, and
fatigue are taken into account. Other musicians are solicited for their
opinions. Agonies of vacillation and indecision are suffered until at
last, with trembling heart and crossed fingers, a final choice is made
and the prospective purchaser turns to the task of obtaining a second
mortgage on the house.
What I find remarkable about this process is the extent to which it
resembles the evaluation of a major high-end component or system.
Substitute Krell for Cremona and you have a fairly accurate description
of the behavior of an obsessed audiophile. The comparison is not
intended to disparage the audiophile; on the contrary, to my mind it
legitimizes or validates his behavior. He is behaving exactly as a
musician would under the circumstances.
It might be thought that the musician has no alternative, that there
exist no "objective" criteria for the evaluation of violins, but this
is not true. Thanks to the researches of the Catgut Acoustical Society
and others, one can distinguish good violins from bad with near-perfect
certainty in the laboratory. Nevertheless, it is inconceivable that a
musician would choose an instrument solely on the basis of laboratory
measurements, without having heard it "under the ear." Between a very
good violin and a superlative one there exist differences which
measurement cannot yet reveal. The trained ear is capable of levels of
discrimination far exceeding anything that can be caught in the coarse
net of available diagnostic techniques.
This does not, however, excuse us from the task of endeavoring to
refine those techniques. Much more needs to be said about this, and
Harley scarcely touches upon it. Forty years ago, when I was learning
electroacoustics at the feet of F. V. Hunt and B. B. Drisko, it was
indelibly impressed on me that music lives in its transients, and
nothing I have seen or heard since has caused me to change that
opinion. In the absence of the initial "ictus" or consonant of speech,
one can scarcely distinguish a softly blown trumpet from a flute. Yet
almost all the laboratory tests contained in a typical "objective"
review are performed in the steady-state and presented in the frequency
domain. Only rarely is anything done in a transient mode and presented
in the time domain. To be sure, a plot of impulse response has become a
more or less standard feature of loudspeaker reviews, but usually only
as a stepping stone to the derivation of frequency response by Fourier
transform. Similarly, photos of squarewave response often accompany
amplifier reviews, but only to illustrate the behavior under reactive
load in qualitative fashion.
In all essential respects the repertoire of laboratory measurements is
no larger today than it was half a century ago, when a pair of 2A3s
represented the acme of high-fidelity amplification and the attainment
of flat response to the extremes of human hearing was the principal
goal of designers. In the realm of measurement at least, audio
engineers have proved almost as resistant to change as automotive
engineers, who still employ the system of units established by James
Watt in 1770.
What accounts for this reluctance to devise newer and more revealing
diagnostic techniques? I don't know, but I recall a time when it was
not so. Around 1950, when the McIntosh amplifiers first appeared (the
celebrated 50W-2 and 20W-2 on inverted chassis), they quickly drove the
competition from the marketplace because of their evident superiority.
A mere glance at the McIntosh patent showed how cleverly McIntosh had
solved the problem of attaining adequate output-transformer bandwidth,
the bane of plate-coupled vacuum-tube amplifiers. (Remember that in
this era the majority of prospective purchasers knew how to read a
circuit diagram.) It was a "technically sweet" solution, and instantly
recognizable as such. In consequence, there ensued a sort of
underground contest among amplifier designers to contrive a test which
the McIntosh would fail, or at least one on which it would perform
badly. Improvement of the breed had nothing to do with this effort. The
contest was motivated purely by professional envy and the stakes were
competitive advantage; ie, the prospect of running full-page ads saying
"Try this with your McIntosh!" One of the fruits of this contest was
the interrupted sinewave test: four cycles of sinewave followed by an
equal period of zero input. During the nominally silent period a
sensitive recorder was gated on, and the RMS output of the amplifier
was recorded. Frequency and amplitude of the input were the independent
variables. I do not in fact recall whether McIntosh amplifiers
performed well or badly under this test, but the test itself would seem
to be the perfect tool for the quantification of intertransient
silence (a concept much in vogue these days) and for the
investigation of transient bias shifts under dynamic load, the plague
of vacuum-tube and transistor amplifiers alike. Why has it vanished
from the armamentarium of the technical reviewer?
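
For anyone who wants to try the idea in software, the interrupted
sinewave test described above can be sketched roughly as follows. This
is only an illustration under stated assumptions: the sample rate, the
number of bursts, and all function names are made up here, and
`toy_amplifier` is a hypothetical stand-in with an invented bias-shift
behavior, not a model of any McIntosh or other real amplifier.

```python
import math

def interrupted_sine(freq_hz, amplitude, sample_rate=48000, cycles=4, bursts=8):
    """Build the test signal: `cycles` cycles of sine followed by an equal
    period of zero input, repeated `bursts` times. Returns (samples, gate),
    where gate[i] is True during the nominally silent periods."""
    n_on = round(cycles * sample_rate / freq_hz)
    samples, gate = [], []
    for _ in range(bursts):
        for n in range(n_on):
            phase = 2 * math.pi * freq_hz * n / sample_rate
            samples.append(amplitude * math.sin(phase))
            gate.append(False)
        samples.extend([0.0] * n_on)   # equal period of zero input
        gate.extend([True] * n_on)     # recorder gated on here
    return samples, gate

def gated_rms(output, gate):
    """RMS of the output taken only while the gate is open."""
    gated = [y for y, g in zip(output, gate) if g]
    return math.sqrt(sum(y * y for y in gated) / len(gated))

def toy_amplifier(samples, memory=0.999):
    """HYPOTHETICAL device under test: unity gain plus a slowly decaying
    offset driven by signal level, a crude stand-in for a bias shift."""
    out, bias = [], 0.0
    for x in samples:
        bias = memory * bias + (1 - memory) * 0.05 * abs(x)
        out.append(x + bias)
    return out

if __name__ == "__main__":
    signal, gate = interrupted_sine(freq_hz=1000, amplitude=1.0)
    print(gated_rms(signal, gate))                 # ideal wire: no residual
    print(gated_rms(toy_amplifier(signal), gate))  # residual during "silence"
```

Sweeping `freq_hz` and `amplitude`, the two independent variables
mentioned above, and plotting the gated RMS would give the kind of
result the original test produced, at least for a simulated device.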
Despite these comments, I find Harley's efforts to reconcile the two
schools of evaluation praiseworthy on the whole. Although I do not
agree with him on every point, he has consistently tried to contribute
light rather than heat to the discussion, a quality all too rare among
the hypertrophied egos in this field. I hope his article marks the
beginning of a continuing dialog on the fundamentals of the reviewer's
art in Stereophile.
Edward A. Fagen, Newark, DE
This is one of the very few books I have ever returned to the bookstore
(and I have bought thousands of books).
Why don't I like this book?
1) It contains many, many factual errors. These errors would be easily
spotted by any freshman physics student, and should have been spotted
by the publisher. For example, the author Robert Harley apparently
doesn't understand the difference between electrical current and
voltage.
2) It doesn't actually explain things. To me, an explanation shows how
something works in terms of basic principles. Mr. Harley simply states
"facts", e.g., an outboard D/A converter will improve your sound,
without explaining how or why.
3) Many photos and diagrams have mistaken or even irrelevant captions,
leading me to conclude that Mr. Harley doesn't understand his own
diagrams. For example, a diagram of an amplifier that uses feedback is
used to "illustrate" a point about amplifiers that don't use feedback.
This last is the most serious point to me, because it makes me suspect
that much of the technical-looking stuff in the book is included to
impress the reader, not to actually explain things. In other words, it
creates the impression of dishonesty.
To the people who defend the book as not intended for technical
readers, I say this: even a non-technical book should be written by
someone who understands the technical issues, so he or she can explain
things clearly and truthfully. If it turns out that the author doesn't
know the technical stuff, why should we read the book?
I might add that Robert Harley has a very poor reputation among
respected audio engineers and other commentators in the field. Some
audio manufacturers (but not all) pander to him apparently because he
edits a high-end audio magazine, and his reviews can make or break a
product.
>>>
Much of the information in this book is good, but a couple of issues
need to be addressed. Tomlinson Holman, a good engineer, is the man
behind both the THX professional cinema house certification and the
Home THX certification program for home theater components. I have a
substantial issue with the latter in particular, because it consists of
a secret set of parameters, which are divulged only to licensees under
nondisclosure. Because the requirements are themselves secret, how can
anyone judge their validity, or the comparative value of the
certification?
Mr. Harley, on the other hand, is no engineer at all, nor even a
hands-on amateur, but a promoter. He combines occasionally astute
observations with technical nonsense, so that even when his conclusions
appear to make sense you have no idea how he got there. Simply put, he
often either doesn't know what he's talking about, or he does and is
simply writing what equipment vendors and the gullible want said.