Just for Ludovic

Since we all know you are in desperate need of more information on how and
why DBTs are used by Sean Olive and Floyd Toole, I include the following for
your edification.

Audio - Science in the Service of Art
by
Floyd E. Toole, Ph.D.
Vice President Engineering, Harman International
Senior Vice President Acoustical Engineering, JBL and Infinity
INTRODUCTION
Audio products must sound good. That is a given. However, the determination
of what constitutes "good
sound" is a matter that has been controversial. Some assert that it is a
matter of personal taste, that our opinions
of sound quality are as variable as our tastes in "wine, persons or song".
This would place audio manufacturers
in the category of artists, trying to appeal to a varying public "taste".
Others, like the author, take a more
pragmatic view, namely that artistry is the domain of the instrument makers
and musicians and that it is the role
of audio devices to capture, store and reproduce their art with as much
accuracy as technology allows. The audio
industry then becomes the messenger of the art. Interestingly, though, this
process has created new "artists", the
recording engineers, who are free to editorialize on the impressions of
direction, space, timbre and dynamics of
the original performance, as perceived by listeners through their audio
systems. Other creative opportunities
exist at the point of reproduction, as audiophiles tailor the fundamental
form of the sound field in listening rooms
by selecting loudspeakers of differing timbral signatures and directivities,
and by adjusting the acoustics of the
listening space with furnishings or special acoustical devices.
To design audio products, engineers need technical measurements.
Historically, measurements have been
viewed with varying degrees of trust. However, in recent years, the value of
measurements has increased
dramatically, as we have found better ways to collect data, and as we have
learned how to interpret the data in
ways that relate more directly to what we hear. Measurements inevitably
involve objectives, telling us when we
are successful. Some of these design objectives are very clear, and others
still need better definition. All of them
need to be moderated by what is audible. Imperfections in performance need
not be unmeasurably small, but
they should be inaudible. Achieving this requires knowledge of
psychoacoustics, the relationship between what
we measure and what we hear. This is a work in progress, but considerable
gains have been made.
This paper reviews scientific work performed by the author and his
colleagues, that aimed to determine
the extent to which listeners agreed on their preferences in sound quality
and, beyond that, to identify
relationships between listener preferences and measurable performance
parameters of loudspeakers. Given that
loudspeakers, listeners and rooms form a complex acoustical system, some of
the effort was necessarily devoted
to identifying the aspects of performance that maximize the performance of
the entire system in real world
circumstances.
ART, SCIENCE AND AUDIO
Science has allowed us to do some remarkable things. It has enabled us to
travel safely back and forth to
the moon, to drive around Mars, to harness atomic energy for peace and to
unleash it for war. Its principles
allow engineers to build tall buildings, immense bridges, and vehicles that
roll on the earth and those that fly
through air and space. These and most other aspects of scientific
accomplishment are associated with "things",
or theoretical concepts far removed from our everyday lives. However, that
same foundation of scientific
knowledge, principles, and methodology has given us the ability to enjoy the
excitement, emotions and sheer
beauty of music, whenever or wherever the mood strikes us.

Music, itself, is art, pure and simple. The composers, performers and the
creators of the musical
instruments are artists and craftsmen. Through their skills, we are the
grateful recipients of sounds that can
create and change moods, that can excite and animate us to dance and sing,
and that form an important
component of our memories. Music is part of all of us and of our lives.
However, in spite of its many capabilities, science cannot describe music.
Beyond the crude notes on a
sheet of music, science has no dimensions to measure the evocative elements
of a good tune. It cannot
technically describe why Pavarotti's tenor voice is so revered, or why the
sound of a Stradivarius violin is held as
an example of how it should be done. Nor can science distinguish, by
measurement, the mellifluous qualities of
trumpet intonations by Wynton Marsalis, and those of a music student who
simply hits the notes. Those are
distinctions that must be made subjectively, by listening. Some scientific
effort has gone into musical
instruments and, as a result, we are getting better at imitating the
desirable aspects of superb instruments in less
expensive ones. We are also getting better at electronically synthesizing
the sounds of acoustical instruments.
However, the determination of what is aesthetically pleasing remains firmly
based in subjectivity and the arts.
Our audio industry is based on a seemingly simple sequence of events. We
capture a musical
performance with microphones, whose outputs are blended into an electronic
message stored on tape or disc,
which is subsequently reproduced through two or more loudspeakers. This
simple description disguises a
process that is enormously complicated. We know from experience that, in
some ways, the process is
remarkably good. For decades we have enjoyed reproduced music of all kinds
with fidelity sufficient to, at
times, bring tears to the eyes, and send chills down the spine. Still,
critics of audio systems can sometimes point
to timbral characteristics that are not natural, that change the sound of
voices and instruments. They point to
noises and distortions that were not in the original sounds, rendering even
the most eloquent tunes in a brittle and
harsh fashion. They note that closing the eyes does not result in a
perception that the listener is involved in the
performance, enveloped in the acoustical ambiance of a concert hall or jazz
club. They point out that stereo, as
we have known it, is an antisocial system - only a single listener can hear
the reproduction as it was created.
For all of these criticisms there are solutions, some here and now, and some
under development. All of
the solutions are based on science.
How can science, a cold and calculating endeavor if ever there were one,
help with delivering the
emotions of great music? Because, in the space between the performers and
the audience, music exists as sound
waves. Sound waves are physical entities, subject to physical laws, amenable
to technical measurement and
description and, in most important ways, predictable. The physical science
of acoustics allows us to understand
the behavior of sound waves as they travel from the musician to the
listener, whether the performance is "live" or
recorded.
To capture those sound waves, with all of the musical nuances intact, we
need transducers to convert the
variations in sound pressure (the sound waves) that impinge on them into
exact electrical analogs. Microphone
design uses the science of electroacoustics, a blend of mechanical and
electrical engineering with a dose of
physics. Once the signal is in the electrical domain, the science and
engineering expertise in electronics is
brought to bear in preserving the integrity of the musical signal.
Contamination by noise and distortions of any
kind must be avoided, as it is prepared for storage and when it is recalled
from storage during playback.
Emerging from the tape or disc recorder, radio or TV, the musical signal is
too feeble to be of any
practical use, so we amplify it (more electronics), giving it the power to
drive loudspeakers or headphones (more
electroacoustics) that convert the electrical signal back into sound. Again,
this all must be done without adding
to or subtracting from the signal, or else it will not sound the way it
should.
Finally, the sound waves radiating from the loudspeakers propagate through
the listening room to our
ears. If there is a seriously weak link in this entire process, this is
probably it. Rooms in our homes are different
from any that would have been used for live performances, or for monitoring
the making of a recording. Our
eyes and our ears both recognize that. The acoustical properties of rooms,
large and small, are in the scientific
domain of architectural acoustics, and from that we could predict that there
would be problems of the kind that
we experience. The complex interactions between room boundaries and speaker
directivity at middle and high
frequencies, and speaker and listener position at low frequencies are
powerful influences in what we hear. With
careful acoustical design and electronic signal manipulations, we are
finding ways to make speakers more
"friendly" to rooms, both in recording studios and in homes.
THE STEREO PRESENT AND THE MULTICHANNEL FUTURE
In stereo there are only two loudspeakers to reconstruct an illusion of the
complex three-dimensional
sound field that existed in the original hall, club or studio. The choice of
two channels was based on the
limitations, then, of what could be stored in the groove of an LP. Even at
that time, it was known that two
channels were insufficient to recreate truly convincing illusions of a
three-dimensional acoustical performance.
The old argument that we only have two ears is valid only in the context of
binaural recordings, where it is
necessary to send specific sounds to each ear independently.
The excitement and realism of the reproduction is enhanced if we have more
channels and more
loudspeakers. This is a relatively new development in music recording and,
understandably, it will take some
time for artists and recording engineers to learn how to use the new format.
Examples from the abortive
quadraphonics era, and from some current efforts, show that not everyone has
learned how to merge
multichannel technology with good taste. However, properly used, the
combination of digital discrete
multichannel technology, and the appropriate loudspeakers can provide the
basis for some amazing sound
experiences - both realistic and contrived.
And, what might the appropriate loudspeakers be? Two-channel stereo, as we
have known it, is not a
"system" of recording and reproduction. The only "rule" is that there are
two channels. At the recording end of
the chain, there are many quite different theories and practices of miking
and mixing the live performance -
ranging from the purist simplicity of two coincident microphones to
multi-microphone, multi-track, pan-potted
and electronically-reverberated mono. They all can be great fun, some of
them even very good, but they are all
different.
At the reproduction end of things, loudspeakers have taken many forms:
forward facing, bipolar
(bidirectional in phase), dipolar (bidirectional out-of-phase),
omnidirectional, and a variety of multi-directional
variants. These, and various sum/difference and delay devices, have been
employed in attempts to coax, from a
spatially-deprived medium, a rewarding sense of space and envelopment. In
the two-channel world, therefore,
the artists could not anticipate how their performances would sound in
homes. It was left to the end user to
create something pleasant. Stereo, therefore, is not an encode/decode
system, but a basis for individual
experimentation.
Nowadays we have elegant active-matrix technologies, such as Lexicon's Logic
7, and Citation's 6-Axis
that allow us to play conventional stereo recordings through multichannel
systems, to introduce us to being truly
enveloped in sound. Whether the experience is realistic, or even tasteful,
will depend on how the recording was
made - remember, there are no standards in stereo.
All of this will get much better when recordings are created to be
reproduced through multiple channels.
Then, if we do not like what we hear, we can legitimately complain to the
artist. Among the discoveries in this
new era, may be that some of the loudspeaker designs that were flattering to
stereo recordings, will be less
appropriate for multichannel sound. For example, many listeners have come to
favor loudspeakers having wide
dispersion, or even multi-directional radiating characteristics, for stereo
recordings. The principle at work here is
that the reflected sounds in the listening room embellish the sounds from
the two loudspeakers, pleasantly
enhancing the impressions of acoustical warmth, depth and spaciousness. This
is as it should be.
Multichannel recordings are created with the knowledge that there are
loudspeakers positioned around the
room in locations optimized to create different directional and spatial
illusions. This is a powerful advantage, in
that it permits the artist to create an enormous range of effects: first, in
movies:
.. close-up speech, intimate whispers (listener in a strong direct sound
field),
.. being surrounded by reverberant space or cheering fans at a game or
concert (multidirectional discrete
sounds),
.. fly-overs, drive-bys, ricochets, etc. (multidirectional discrete sounds),
and in musical
performances:
.. being in a superb concert hall or club (direct sounds from an orchestra in
front, with reflected and
uncorrelated sounds from beside and behind), and
.. being in the middle of a band (multidirectional direct sounds).
These will be convincing only if the loudspeakers can deliver the
appropriate sounds to the listeners' ears.
If this "spatial dynamic range", as I call it, is to be achieved in normal -
i.e. not acoustically treated - listening
rooms, we will very likely need loudspeakers of differing directivities in
different locations and, if we are truly
fussy, we will need more than five channels.
We have begun a voyage of discovery and, depending on how we approach it, it
can be long and tortuous,
or short and sweet. If it is approached with an appropriate blend of art and
science, it could be the latter. It
would be wonderful if the music industry could agree on a standard
methodology for multichannel sound.
However, based on experience, that is extremely unlikely.
At this point it is necessary to introduce the science of psychoacoustics,
the study of relationships
between physical sounds and the perceptions that result from them.
Psychoacoustics allows us to understand and
interpret measurements in ways that relate to what we hear. Such knowledge
is absolutely necessary if we are to
make significant progress in designing better products, especially when
price is a concern.
THE SCIENCE OF AUDIO
Individual points of view are a part of human nature. They enrich our lives
in many ways. It would be a
boring world if we all were attracted to the same music, food, wine and
people. However, it is a serious
disadvantage if one is trying to design products that will be attractive to
the majority of the consumer population.
A point of view, commonly expressed, is that sound is "subjective", that we
all "hear differently", and
therefore not all of us prefer the same loudspeakers. It is also alleged
that different nationalities, and regions
have different preferences in sound. I have always regarded these assertions
with suspicion because, if they were
true, it would mean that there would be different pianos for each of these
regions, different trumpets, bassoons
and kettle drums. Vocalists would change how they sang when they were in
Germany, Britain, and the U.S. I
wonder what Pavarotti's Japanese timbre sounds like? I don't believe it ...
and it doesn't happen that way.
The entire world enjoys the same musical instruments and voices in live
performance, and the recording
industry sends the same recordings throughout the world. True, from time to
time, there have been regional
influences that have made differences. I can recall the "east coast / west
coast" sounds associated with some
powerful brands in those locations in the USA. In Britain, the British
Broadcasting Corporation (BBC) provided
loudspeaker designs that became the paradigm for a few years. Everywhere,
there are magazines and reviewers
that are influential. All of these factors change with time. In truth, they
are really minor variations on a common
theme. Underlying it all is a powerful attraction to reproduce sounds as
accurately as possible.
Now, what about those individual preferences in sound quality? This
interesting issue was settled by
conducting many, many listening tests, using many, many listeners and many,
many loudspeakers. To reduce the
influences of price, size and style - factors that we absolutely know can
reveal individuality - all of the tests
were conducted blind. Other physical and psychological factors known to be
sources of bias were also well
controlled [refs. 1-3]. The results were surprisingly clear. When the data
were compiled, it turned out that most
people, most of the time, liked and disliked the same loudspeakers. There
were also differences in the way
listeners performed; most were remarkably consistent in repeated evaluations
of the same product, while some
others changed their opinions of the same product at different listening
sessions.
[Figure 1 graphic. Scale: FIDELITY, 0 to 10. Labels: HIGH FIDELITY AND STUDIO
MONITOR SPEAKERS; ONE STANDARD DEVIATION FOR EXPERIENCED LISTENERS WITH
NORMAL HEARING (3% - 5%).]

FIGURE 1: Judged on a FIDELITY scale of 0 to 10, terrible to perfect, this
figure shows results of many
listeners evaluating many loudspeakers from different categories. To the
same scale are shown, symbolically,
the variability of repeated judgements by experienced listeners with hearing
levels close to audiometric zero, and
those who exhibited broadband hearing loss. Also shown is the variability of
data combined from several
listeners with normal hearing.
Figure 1 shows, as might be expected, a rather large range in performance of
the loudspeakers
themselves. It also shows that, in repeated judgements, listeners with
normal hearing exhibited standard
deviations small enough for high statistical significance to be associated
with small (about 0.5 scale unit) rating
differences between products. This important observation is
paralleled by another, perhaps even more important one: that
groups of such listeners closely agreed with each other. The
finding that hearing loss is a factor is not surprising - listeners
who cannot hear all of the sound must make less reliable
judges. What is surprising is that the deterioration in
performance is so rapid. It is well defined within hearing
losses that would not be regarded as alarming by conventional
audiometric criteria, see Figure 2. This difference, one
assumes, is related to the difficulty of the task: judging sound
accuracy, as opposed to understanding the spoken word. The
hearing loss, in this case, was defined by the average threshold
elevation at frequencies below 1 kHz. Those exhibiting this
form of hearing loss, also tended to have loss at high
frequencies. High-frequency loss, by itself, was not a clearly
correlated factor. Reference 4 covers this in detail.
FIGURE 2: JUDGMENT VARIABILITY plotted against BROADBAND HEARING LOSS (dB), 0 to 30 dB.
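
As a rough illustration of the point about small standard deviations, the
sketch below applies Welch's t-test to two sets of repeated ratings whose
means differ by 0.5 scale unit. The ratings are invented for the example, and
the choice of Welch's test is an assumption; the paper does not say which
statistical procedure was actually used.

```python
import math
import statistics


def welch_t(ratings_a, ratings_b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t-test."""
    mean_a = statistics.mean(ratings_a)
    mean_b = statistics.mean(ratings_b)
    # Squared standard error of each mean (sample variance / sample size).
    se2_a = statistics.variance(ratings_a) / len(ratings_a)
    se2_b = statistics.variance(ratings_b) / len(ratings_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(ratings_a) - 1) + se2_b ** 2 / (len(ratings_b) - 1)
    )
    return t, dof


# Hypothetical repeated FIDELITY ratings (0 to 10 scale) for two loudspeakers.
speaker_a = [7.5, 7.9, 7.1, 7.6, 7.4, 7.8, 7.2, 7.5]
speaker_b = [7.0, 7.4, 6.6, 7.1, 6.9, 7.3, 6.7, 7.0]

t_stat, dof = welch_t(speaker_a, speaker_b)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.1f}")
```

The resulting t statistic can be compared against a t table; with 14 degrees
of freedom the two-sided 5% critical value is about 2.14. Noisier ratings
would shrink t, which is why listener consistency matters so much here.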
CHOOSE YOUR LISTENERS WITH CARE
FIGURE 3: A comparison of loudspeaker FIDELITY evaluations performed by two
groups, one with low variability in their judgments and the other with high
judgment variability. (Scale: FIDELITY, 0 to 10; groups: listeners with low
judgment variability / normal hearing, and listeners with high judgment
variability / hearing loss.)

Listeners with hearing loss not only exhibit high judgment variability, they
can also exhibit strong
individualistic biases in their judgments. This comes as no surprise, since
such individuals are really in search of
a "prosthetic" loudspeaker that somehow compensates for their disability.
Since the disabilities vary
enormously, so do the biases.
The evidence of Figure 3 is that the group of normal-hearing listeners
substantially agree in their ratings.
Interestingly, the second group shares the opinion of the truly good
speakers, A and B. However, speakers C and
D exhibit characteristics that are viewed as problems by the normal group,
but about which the second group has
substantially no opinion. Based on individual opinions, either C or D could
be the best or worst speaker in the
world. It appears that their disabilities prevented some of the listeners
from hearing certain of the deficiencies.
Sadly, some listeners who fall into the problem category are talented and
knowledgeable musicians or audio
professionals whose vocations may have contributed to their condition.
However articulately their opinions are
enunciated, their views are of value only to them, personally.
The conclusion is clear. If there is any desire to extrapolate the results
of a listening evaluation to the
population at large, it is essential to use representative listeners. In
this context, it appears to be adequate to
employ listeners with broadband hearing levels within about 20 dB of
audiometric zero. According to some
large surveys, this is representative of about 75%, or more, of the
population - an acceptable target audience for
most commercial purposes. This is not an "elitist" criterion.
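
A minimal sketch of such a screening step follows. It assumes an audiogram
stored as a mapping from test frequency (Hz) to hearing level (dB), and it
assumes that the 20 dB criterion is applied to the average threshold
elevation below 1 kHz described earlier. The function names, test
frequencies and listener data are hypothetical.

```python
def low_frequency_loss_db(audiogram_db_hl):
    """Average threshold elevation (dB) over test frequencies below 1 kHz."""
    low = [level for freq_hz, level in audiogram_db_hl.items() if freq_hz < 1000]
    if not low:
        raise ValueError("audiogram has no test frequency below 1 kHz")
    return sum(low) / len(low)


def is_representative(audiogram_db_hl, limit_db=20.0):
    """True if the low-frequency average is within limit_db of audiometric zero."""
    return low_frequency_loss_db(audiogram_db_hl) <= limit_db


# Hypothetical audiograms: frequency in Hz -> hearing level in dB.
listener_1 = {125: 5, 250: 10, 500: 5, 1000: 10, 4000: 35}
listener_2 = {125: 25, 250: 30, 500: 25, 1000: 20, 4000: 40}

for name, audiogram in (("listener_1", listener_1), ("listener_2", listener_2)):
    print(name, round(low_frequency_loss_db(audiogram), 1), is_representative(audiogram))
```

In this sketch listener_1 passes despite a 35 dB loss at 4 kHz, in keeping
with the earlier remark that high-frequency loss, by itself, was not a
clearly correlated factor.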
PRACTICE MAKES PERFECT - USE TRAINED LISTENERS
Listeners in the early tests gained much of their experience "on the job",
while performing the tests.
Some listeners were musicians, and others had professional audio experience,
but most were simply audio
enthusiasts. Probably the single most apparent deficiency of novice
listeners was the lack of a vocabulary to
describe what they heard. Without such descriptions, most listeners found it
difficult to be analytical in forming
their judgments, and to remember how various test products sounded. It was
also clear that, without the
prompting of a well-designed questionnaire, not all listeners paid attention
to all perceptual dimensions, resulting
in judgments that were highly selective.
As the relationships between technically-measurable parameters and their
audible importance became
clearer, it was possible to design training sessions that focused on
improving the ability of listeners to hear and to
identify specific classes of problems in loudspeakers. With the aid of
computers, this training has been refined to
a self-administered procedure, which keeps track of the student's progress
[5]. From this we have also been able
to identify program material that is most revealing of the defects that are
at issue, thus improving the efficiency
and effectiveness of the tests.
BLIND vs. SIGHTED TESTS - SEEING IS BELIEVING
Knowledge of the products that are being evaluated is generally understood
to be a powerful source of
psychological bias. In scientific tests of many kinds, and even in wine
tasting, considerable effort is expended to
ensure the anonymity of the devices or substances being subjectively
evaluated. In audio, though, things are
more relaxed, and otherwise serious people persist in the belief that they
can ignore such factors as price, size,
brand, etc. In some of the "great debate" issues, like amplifiers, wires,
and the like, there are assertions that
disguising the product identity prevents listeners from hearing differences
that are in the range of extremely
small to inaudible. That debate shows no signs of slowing down. In the
category of loudspeakers and rooms,
however, there is no doubt that differences exist and are clearly audible.
To satisfy ourselves that the additional
rigor was necessary, we tested the ability of some of our trusted listeners
to maintain objectivity in the face of
visible information about the products.
The results are very clear, and strongly
supportive of the scientific view. Figure 4 shows that,
in subjective ratings of four loudspeakers, the
differences in ratings caused by knowledge of the
products are as large as or larger than those attributable to
the differences in sound alone. The two left-hand
striped bars are scores for loudspeakers that were large,
expensive and impressive looking, the third bar is the
score for a well designed, small, inexpensive, plastic
three-piece system. The right-hand bar represents a
moderately expensive product from a competitor that
had been highly rated by respected reviewers.
When listeners entered the room for the sighted
tests, their positive verbal reactions to the big beautiful
speakers and the jeers for the tiny sub/sat system
foreshadowed dramatic ratings shifts - in opposite
directions. The handsome competitor's system got a
higher rating; so much for employee loyalty.
FIGURE 4: A comparison of blind vs. sighted evaluations of the same products
by the same group of listeners. (Vertical axis: FIDELITY; bar groups: BLIND,
SIGHTED.)
Other variables were also tested, and the results
indicated that, in the sighted tests, listeners substantially ignored large
differences in sound quality attributable to
position in the listening room and to program material. In other words,
knowledge of the product identity was at
least as important a factor in the tests as the principal acoustical
factors. Incidentally, many of these listeners
were very experienced and, some of them thought, able to ignore the
visually-stimulated biases [6].
At this point, it is correct to say that, with adequate experimental
controls, we are no longer conducting
"listening tests", we are performing "subjective measurements".
"ZOOMING IN" ON THE PROBLEMS
Inevitably, as we make progress in improving products, the listening tests
no longer include examples of
bad performance. On the 0 to 10, junk to jewels, "fidelity" scale,
therefore, one ends up listening exclusively to
devices that score in the top ten percent, or so. When reminders of bad
sound are removed from the tests, an
interesting thing happens: listeners spontaneously expand the scaling of
their responses to fill more of the range.
In the absence of reminders about how bad things can really be, we become
more critical of the relatively good
sounds we are evaluating.
[Figure 5 graphic: three FIDELITY scales, each running from 1 to 10.]

FIGURE 5: What happens when listeners evaluate products that are closely
ranked at the top of a range of
products. Without any "anchor products" to remind them of how things sound
at lower ratings, the response
scale expands to fill a "comfortable" range.
A consequence of the phenomenon illustrated in Figure 5 is that, when products
are auditioned in isolation,
listeners tend to exaggerate the importance of small differences. If one is
focusing attention on the small
differences, that is a good thing. If, however, one is attempting to arrive
at an evaluation of a phenomenon in a
manner that relates to its importance in a total system, it is a distorted
judgment. Real-world examples of this
occur when, for example, one does comparisons of devices that are
fundamentally similar. Many electronic
devices fall into this category, as do loudspeakers that differ only in
minor details. The listener impressions may
be that there is an undisputed winner, a 9 versus a 6, but the reality is
that, if there were any other variables in the
test, the ratings may differ only by fractions of a point, and may even be
in a different order.
Because of this, another response-scaling method is used when we are
interested only in the relative
performance of devices. It is a preference scale.
[Figure 6 graphic: PREFERENCE RATING scale from 0 to 10, with the labels
REALLY LIKE, LIKE, NEITHER LIKE NOR DISLIKE, DISLIKE and REALLY DISLIKE;
at right, rating differences labelled SLIGHT PREFERENCE, MODERATE
PREFERENCE and STRONG PREFERENCE.]
FIGURE 6: The preference rating scale that is used when comparing sounds on
a relative, rather than an
absolute, basis. On the right are the suggestions for rating differences
according to the strength of the
preference.
The function of this kind of response scaling is to establish how listeners
respond to differences between
sounds. Since the scale is not "anchored" by reference to sounds known to be
very good and very bad, there is
nothing to indicate where they stand in any absolute sense.
TECHNICAL MEASUREMENTS
SUBJECTIVE MEASUREMENTS
With statistically reliable, repeatable, numerical subjective ratings of
loudspeakers in hand, it is an
unavoidable temptation to look for orderly relationships with technical data
on the same products. Figure 7
shows that the axial response of a loudspeaker needs to be very smooth, flat
and wide-band in order to achieve
high subjective ratings. One can rightly conclude that this is associated
with a perception of timbral neutrality - a
lack of coloration. But, there is more to the story. Since loudspeakers
function in rooms, it is necessary that the
sounds radiated in other directions also be well behaved, so that the ears
receive similarly neutral sounds after
single and multiple reflections from floor, walls, etc. The extension to the
requirement, therefore, is that the off-axis frequency responses, including
sound power, also be well-behaved [7,8]. The design target, therefore, is a
smooth, flat, axial response, with a constant directivity as a function of
frequency. Just as axial behavior, by itself, is an insufficient criterion,
so is sound power.
FIGURE 7: A simplistic view of the relationship between the
spatially-averaged axial frequency response of
loudspeakers and their subjective ratings. This is a necessary, but not
sufficient, criterion of excellence.
Since absolute perfection in transducers and enclosures is still a remote
possibility, it is important to be
able to identify, in measurements, the presence and the audibility of
defects such as resonances. In this way
designers can choose to build the best product possible or, by making
appropriate compromises, the best product
at a given price level. The first step in this process is to develop a
measurement method that allows the eyes to
identify the presence and magnitude of defects that are audible. The second
step is to identify the measured level
at which the defect is audible - the detection threshold below which the
defect ceases to be a problem.
FIGURE 8: A simple form of spatial averaging, in which
measurements on several axes can be viewed individually, or in
combination. (Axes shown: 0, 15L, 15R, 15UP, 15DN.)
A frequency response curve contains evidence of both
resonances and acoustical interference. It is important to be able
to identify which detailed features in the curve are attributable
to each phenomenon. Why? Because, in a room, resonances
will be easy to hear, and interference will be perceptually
attenuated.
Spatial averaging is a simple way to identify resonances
since, in the data for each of the microphone positions, the
resonances will be relatively unchanged, while evidence of
acoustical interference will change as a function of microphone
position. When the collection of frequency response curves is
averaged, the visual evidence of resonances remains, while that
for the interference is diminished. It is a very simple, but very effective
analytical method.
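
The idea can be sketched with synthetic data. Everything in the example is
an assumption made for illustration: the resonance shape, the cosine model
of interference ripple, the five path delays standing in for five microphone
positions, and the choice to average the curves directly in dB.

```python
import math


def resonance_db(freq_hz, f0=800.0, q=10.0, peak_db=3.0):
    """Hypothetical resonance bump, identical at every microphone position."""
    x = q * (freq_hz / f0 - f0 / freq_hz)
    return peak_db / (1.0 + x * x)


def interference_db(freq_hz, delay_s, depth_db=2.0):
    """Hypothetical interference ripple whose pattern depends on path delay."""
    return depth_db * math.cos(2.0 * math.pi * freq_hz * delay_s)


def spatial_average(curves):
    """Point-by-point mean of several frequency response curves (in dB)."""
    return [sum(values) / len(values) for values in zip(*curves)]


# 200 Hz to 1.6 kHz in 1/24-octave steps.
freqs = [200.0 * 2 ** (i / 24) for i in range(73)]
# One hypothetical path delay (seconds) per microphone position.
delays = [0.0005, 0.0008, 0.0011, 0.0014, 0.0017]
curves = [[resonance_db(f) + interference_db(f, d) for f in freqs] for d in delays]
averaged = spatial_average(curves)

# Ripple left over once the known (synthetic) resonance is subtracted.
ripple_single = max(abs(v - resonance_db(f)) for f, v in zip(freqs, curves[0]))
ripple_averaged = max(abs(v - resonance_db(f)) for f, v in zip(freqs, averaged))
peak_index = min(range(len(freqs)), key=lambda i: abs(freqs[i] - 800.0))

print(f"ripple, one position: {ripple_single:.2f} dB")
print(f"ripple, averaged:     {ripple_averaged:.2f} dB")
print(f"level at resonance after averaging: {averaged[peak_index]:.2f} dB")
```

With only five positions the ripple is reduced rather than removed, while
the resonance stays close to its original height; more positions, or less
regular spacing, would be expected to suppress the interference further.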

FIGURE 9: A frequency response measurement before (top curve) and after
(bottom curve) spatial averaging.
Data processing of multiple measurements, as in Figure 9, allows us to see,
in high resolution, the
amplitude and "Q" of resonances in complex mechanical and acoustical
systems. These data were gathered in an
anechoic chamber, where it is possible to measure with high resolution over
the entire frequency range.
A practical problem is that many measurements these days are made with FFT
or TDS systems that time-window
the data so that anechoic measurements can be made in normal rooms. A result
of the time windowing
is that the frequency response data have limited frequency resolution, most
noticeable at the low frequency end
of the spectrum. As a result, it is not possible to see high-Q phenomena at
low and middle frequencies.
FIGURE 10: Identical high-Q resonances (Q=50) at equal intervals between
20 Hz and 20 kHz, adjusted to the threshold of audibility, as measured by an
FFT measurement system having a time window of 17 ms, corresponding to a
frequency resolution of 60 Hz. (Plot annotations: "True Level" and "CANNOT
MEASURE WHAT WE HEAR"; vertical axis: dB; horizontal axis: FREQUENCY (Hz),
20 Hz to 20 kHz.)
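The 17 ms and 60 Hz figures follow from the reciprocal relationship between
window length and frequency resolution. The sketch below works through that
arithmetic. Comparing the resolution with the bandwidth of a resonance
(center frequency divided by Q) is a rule of thumb assumed here for
illustration, and the function names are hypothetical.

```python
def frequency_resolution_hz(window_s):
    """Approximate frequency resolution of a time-windowed measurement."""
    return 1.0 / window_s


def resonance_bandwidth_hz(center_hz, q):
    """Bandwidth of a resonance, from the definition Q = center / bandwidth."""
    return center_hz / q


window_s = 0.017  # the 17 ms window quoted in the Figure 10 caption
resolution = frequency_resolution_hz(window_s)
print(f"resolution: {resolution:.1f} Hz")

for center_hz in (100.0, 1000.0, 10000.0):
    bandwidth = resonance_bandwidth_hz(center_hz, 50.0)
    # Assumed rule of thumb: a feature narrower than the resolution is smeared.
    status = "resolved" if bandwidth >= resolution else "smeared"
    print(f"Q=50 at {center_hz:.0f} Hz: bandwidth {bandwidth:.0f} Hz, {status}")
```

With this window a Q=50 resonance is narrower than the resolution at 100 Hz
and 1 kHz, which matches the statement that high-Q phenomena cannot be seen
at low and middle frequencies.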
Resolution limitations of the kind shown in Figure 10, and worse, are common
among loudspeaker
designers and reviewers, because of the scarcity and expense of anechoic
chambers, and the practical difficulties
in measuring outdoors, in nature's own anechoic space. It means, simply,
that many commonly-used and
published measurements simply cannot reveal visual evidence of certain kinds
of audible problems falling within
a critical portion of the frequency range - that of the human voice.
Another common measurement is one in which the audible frequency range is
divided into equal fixed-percentage
bandwidths, such as 1/3 octaves, or in which a high-resolution measurement
is heavily smoothed, or
spectrally averaged, on a continuous basis. These spectral-averaging devices
have extremely limited utility in the
design and evaluation process. Spatial averaging adds information, whereas
spectral averaging removes spectral
details, making curves look smoother and prettier. The sound, though,
remains unchanged. Is it any wonder that
some people mistrust measurements?
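To put a number on how coarse a 1/3-octave band is, the sketch below computes
the band edges and the "equivalent Q" of one band (center frequency divided
by band width). It assumes base-2 band edges, and the function names are
hypothetical.

```python
def fractional_octave_band(center_hz, fraction=3):
    """Lower and upper edge of a 1/fraction-octave band (base-2 edges assumed)."""
    half_band_ratio = 2 ** (1.0 / (2 * fraction))
    return center_hz / half_band_ratio, center_hz * half_band_ratio


def equivalent_q(fraction=3):
    """Center frequency divided by bandwidth for a 1/fraction-octave band."""
    lower, upper = fractional_octave_band(1.0, fraction)
    return 1.0 / (upper - lower)


lower_hz, upper_hz = fractional_octave_band(1000.0)
print(f"1/3-octave band at 1 kHz: {lower_hz:.1f} Hz to {upper_hz:.1f} Hz")
print(f"equivalent Q: {equivalent_q():.2f}")
```

A 1/3-octave band is about 23% of its center frequency wide, an equivalent Q
of a little over 4, so a Q=10 or Q=50 resonance is much narrower than the
band that is supposed to display it.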
Serious loudspeaker manufacturers need to be able to see, in measurements,
anything that might result in
an audible defect. In terms of the measurement of frequency response, it
means expensive anechoic chambers,
facilities for gathering data over an entire sphere, and elaborate
post-processing of the data. Doing this quickly
and accurately is neither inexpensive nor simple.

THE AUDIBILITY OF RESONANCES
Why is it that resonances are so important? Because they are the fundamental
building blocks of almost
all of the sounds we are interested in hearing. High-Q resonances define the
pitches of voices and instruments.
Medium- and low-Q resonances define the timbres of sounds, allowing us to
distinguish between different voices
and instruments. It is subtle differences in the resonant structure of
sounds that are responsible for the nuances
and shading of tone in musical sounds. Our ears are very highly attuned to
the detection and evaluation of
resonances, and it is therefore no surprise that listeners zero in on them
as unwanted "editorializing" when they
appear in loudspeakers.
In order to be effective, design engineers need to know when a resonance is
present, and if it is audible.
The techniques described above help to identify the presence of resonances.
Reference 9 describes, in detail, how the audibility of resonances is related
to measurements.
FIGURE 11: The amplitude response measured in an otherwise perfect system,
after a single resonance has been added at the threshold of detection, when
listening to the most revealing signal, pink noise. (Curves labeled Q=50,
Q=10 and Q=1; signal: PINK NOISE; vertical axis: dB; horizontal axis:
FREQUENCY (Hz), 100 Hz to 10 kHz.)
In that study, resonances of different Q, at different frequencies, were
added to an otherwise excellent system, and the amplitudes at which they
were just detected by listeners were determined by experimentation.
Different kinds of program material yielded different thresholds, as did
different listening environments.
Resonances reveal themselves in both the frequency-domain (amplitude and
phase vs. frequency) and the
time domain (impulse response / transient response). As defined by the
Fourier Transform, if there is
misbehavior in one domain, there will be misbehavior in the other, so we
have two ways to look for problems.
It is a matter of fact that high-Q resonances exhibit prolonged ringing in
the time domain, and that low-Q
resonances exhibit little ringing. The irony of this finding is that, as
represented in conventional steady-state
frequency response measurements, Figure 11, the low-Q resonances were
detectable at much lower amplitudes
than resonances of higher Q. If this is so, it means that a treasured belief
is in jeopardy. The popular belief is
that prolonged ringing, by itself, is a reliable indicator of an audible
problem. To test this, we performed the
measurements in the time domain, with the following results.
FIGURE 12: Pulse responses of an otherwise perfect system (top curve) to
which resonances have been added at the thresholds of detection. (Trace
labels surviving from the original plot: INPUT SIGNAL, Q = 50, Q = 10.)
Looking at Figure 12, it is important to note that, in terms of audible
changes in timbre, these responses are all equal. To the eyes, though, the
long tail of the Q=50 resonance is quite alarming, and the Q=1 response
appears almost perfect. How can this be so?
One important factor would appear to be related to
the portion of the spectrum influenced by a resonance.

High-Q resonances are very narrow-band phenomena. In order for one of these
to be energized by music, a
sound would need to be closely centered on the frequency, and remain there
long enough to impart significant
energy to the resonant system. The higher the Q, the longer is the
"build-up" period. In music, pitches are
constantly changing, and voices and instrumental sounds frequently have
vibrato, a fluctuating pitch. Such
sounds would more often drive a low or medium Q resonance to maximum output,
than they would a high-Q
resonance. Statistically, therefore, with music and speech, lower-Q
resonances would be heard more frequently
than those of higher Q.
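The build-up argument can be made concrete with a small calculation. The
sketch below assumes a simple second-order resonance driven from rest by a
steady tone exactly at its center frequency, so that its output envelope
grows roughly as 1 - exp(-t/tau) with tau = Q / (pi * f0). That model is an
assumption for illustration (and only a rough one at very low Q). It is not
taken from the paper, and the function names are hypothetical.

```python
import math


def time_constant_s(center_hz, q):
    """Envelope time constant of a second-order resonance: Q / (pi * f0)."""
    return q / (math.pi * center_hz)


def buildup_fraction(duration_s, center_hz, q):
    """Fraction of steady-state output reached after driving for duration_s."""
    return 1.0 - math.exp(-duration_s / time_constant_s(center_hz, q))


# A hypothetical 50 ms note sitting exactly on a 500 Hz resonance.
for q in (50.0, 10.0, 1.0):
    fraction = buildup_fraction(0.050, 500.0, q)
    print(f"Q={q:.0f}: reaches {100 * fraction:.1f}% of steady-state output")
```

Under these assumptions the Q=50 resonance is still short of its steady-state
level after 50 ms, while the Q=10 and Q=1 resonances have essentially reached
theirs. A note with vibrato would spend even less time on the center
frequency.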
The apparent contradiction between perception and measurement, then, begins
with the observation that,
as conventionally measured, the frequency response is a "steady-state"
measurement, showing the resonance
outputs at their maximum amplitude. With music, high-Q resonances are rarely
driven to their maximum
outputs, and so are less audible than the measurement indicates. The problem
is not that the measurements are
wrong, or irrelevant, it is that they are non-linearly related to the
perceptual mechanism in humans, and therefore
must be interpreted. Time-domain measurements are similarly problematic,
since they suggest audibility in the
prolongation of ringing. Such phenomena are wonderfully visible in pulse
responses, tone-burst responses and
the highly ornamental "waterfall" diagrams that digital measurement systems
permit. Truth is that, without
careful interpretation, these are just as misleading as the frequency
responses. The conclusion is that, in practical
situations, if a resonance does not make itself apparent in an accurate,
high-resolution spatially-averaged,
frequency-response measurement, then it is probably not audible. If a
resonance is visible in a frequency
response measurement, its audibility must be assessed by comparison with
data from Figure 11, or better, from
the detailed analysis in reference 9.
An interesting fact now emerges: that the conventional method of specifying
the excellence of frequency
response, ± x dB, is almost useless unless the tolerance is very, very
small. For equal audibility, high-Q
phenomena could be ± 5 dB, while moderate-Q resonances could be ± 3 dB and
low-Q and other broadband
deviations could be ± 0.5 dB. Clearly, frequency response curves must be
interpreted; there is no simple "catch-all"
kind of tolerance specification that is truly meaningful. Such is life.
Having identified the presence of a problematic resonance, it is the task of
the design engineer to
diagnose the cause, and to prescribe a remedy. All the while, it is
essential to keep a close eye on costs. In this
task, other specialized tools are needed, since the origins can be both
acoustical and mechanical, and they can be
associated with either the drivers or the enclosure. It is a curious
phenomenon that the perception of "boxiness"
in sound may have nothing to do with the box itself. It is not uncommon for
the offending resonance to have
another origin.
In the analysis of resonances, a laser interferometer/vibrometer is a
powerful ally, in that it can show the
complete vibratory behavior of a surface, such as a loudspeaker diaphragm or
an entire surface of a loudspeaker
enclosure. Vibration is important only if sound is radiated. Surfaces do not
move uniformly, as a piston, at all
frequencies, so measurement at a single point can be misleading. In
practice, portions of the surface move in
opposite directions, simultaneously. Consequently, some vibratory modes
radiate sound very effectively, others
less so, and some not at all. It is important not to go chasing after modes
that simply cannot be audible. The
combination of finite element analysis, in the design of diaphragms and
enclosures, and modal analysis, after the
prototype is built, allows engineers to be intelligent about the way they use
shapes, materials, braces, etc. in
reducing the audibility of resonances. The traditional method of playing
safe has been to build enclosures from
massive, dense and stiff materials. If cost, size and weight are not
considerations, this method works well.
However, enclosures with excellent acoustical performance can be built from
mundane materials, at much lower
cost, with due regard to modal prediction and analysis. In lower cost wooden
enclosures, and especially with
plastic enclosures, this becomes normal engineering practice.

LOUDSPEAKERS AND ROOMS
As important as the enclosures behind loudspeaker diaphragms are, it is the
ones in front that, as often as
not, give us serious problems. The listening room is the final audio
component, and it is the one over which the
loudspeaker manufacturer has little or no control. The first important step
is to design loudspeakers so that they
have a reasonable chance of sounding good in a room.
In a room, listeners hear the direct sound first, followed quickly by early
reflections from floor, ceiling
and sidewalls. Then arrive the multitudes of sounds from many reflecting
surfaces after several reflections - the
reverberation. Ideally, all of these should reinforce a similar timbral
signature in the mind of a listener. This can
only happen if the loudspeakers are designed to radiate similar sounds in
all of the relevant directions. In
technical terms, this reduces to a requirement for constant directivity as a
function of frequency.
The directivity itself can take different forms for different applications,
but it should not change with
frequency. This is a difficult requirement to fill, and it requires special
measurement techniques to allow
engineers to judge their success as products are developed. In brief, it
requires that we know:
- the nature of sounds radiated in the direction of the listener (the
  on-axis/listening-window performance),
- the nature of sounds that will be reflected from adjacent room surfaces
  (vertical and lateral "early-reflection" performance), and
- the nature of sounds that will generate the diffuse reverberant sound
  field (the sound power).
From these the directivity index can be calculated. It is the combination of
these that matters; any one by
itself is insufficient data.
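For readers unfamiliar with the directivity index, it is the ratio, in dB, of
the on-axis intensity to the intensity averaged over the whole sphere. The
sketch below is a minimal version at a single frequency. It assumes RMS
pressure magnitudes measured at equal distance, with optional weights for the
area of the sphere that each measurement point represents. The function name
and the sample values are hypothetical, and this is not a description of the
author's measurement procedure.

```python
import math


def directivity_index_db(on_axis_pressure, sphere_pressures, weights=None):
    """Directivity index in dB at one frequency.

    sphere_pressures are RMS pressure magnitudes sampled over a sphere at a
    fixed distance. weights give the relative area each sample represents;
    equal weighting is assumed when none are supplied.
    """
    if weights is None:
        weights = [1.0] * len(sphere_pressures)
    weighted_sum = sum(w * p * p for w, p in zip(weights, sphere_pressures))
    mean_square = weighted_sum / sum(weights)
    return 10 * math.log10(on_axis_pressure ** 2 / mean_square)


# An omnidirectional source radiates equally in all directions: DI = 0 dB.
omni_di = directivity_index_db(1.0, [1.0, 1.0, 1.0, 1.0])

# A source that radiates into only half of the sphere: DI is about 3 dB.
half_space_di = directivity_index_db(1.0, [1.0, 1.0, 0.0, 0.0])
print(round(omni_di, 2), round(half_space_di, 2))
```

Repeating the calculation at each frequency gives a directivity index curve.
The requirement stated above is that this curve should be as flat as
possible.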
Loudspeakers that meet these requirements also tend to be the ones that win
listening tests, and there can
be no better reinforcement for a methodology than that. One of the factors
in listening tests is imaging, and that
raises an issue for which there is not a complete answer at the moment: what
is the ideal directivity?
LOUDSPEAKERS FOR STEREO
As discussed earlier, 40-some years with two-channel stereo have yielded
nothing in the way of a clear
direction. Although the majority of loudspeakers sold are traditional
forward-facing "cone and dome" designs, it
is also true that the majority of listeners are not very critical about the
imaging of their systems. Among those
who are, the "high end audiophiles", those designs figure strongly in their
preferences. However, so do designs
of very different kinds, such as dipole, bipole and directional horns. The
perceptual consequences of speakers this
diverse are not subtle. Bipole designs have approximately omnidirectional
sound radiation properties, and
therefore produce energetic reflected sound fields in listening rooms.
Conventional forward-facing systems, and
horn-loaded systems will place the listener in a sound field in which the
direct sound is more prominent. The
more directional the system, the more dominant will be the direct sound.
It is probably correct to say that the majority of listeners find stereo to
be pleasantly embellished if the
room reflections are energetic. The sound tends to be open and spacious,
with a good sense of depth, but the
specific images can be rather vague - in other words, not unlike real
concerts. A positive effect of this vagueness
is that the stereo listening region is enlarged.
However, there is also a category of listeners who respond unfavorably to
this kind of reproduction, and
prefer to have a very specific, almost pinpoint, sense of image position.
Interestingly, this category includes
many recording engineers who, in their studios, require that they be able to
hear, very precisely, the results of
their manipulations. Consequently, recording studios are often acoustically
rather dead, and the loudspeakers
directional (often horn loaded), or placed very close (so-called near-field
listening). However, these same
people, at home, frequently revert to the more spacious version of stereo.
So, go figure.
FIGURE 13: Forward-firing (left), bidirectional in-phase, or bipole
(center), and bidirectional out-of-phase, or dipole (right), are just some of
the very different directional patterns that are used in loudspeakers for
stereo systems. The differences in imaging precision, spaciousness, and
soundstage depth are not subtle. Once selected, though, the characteristics
are applied to all kinds of music, whether it is appropriate or not. (Labels
surviving from the original diagram: polarity markings "+ -" and "+ +"; LOW
BASS, MID FREQUENCIES, HIGH TREBLE.)
LOUDSPEAKERS FOR MULTICHANNEL AUDIO
With the introduction of multichannel audio, things are no less complicated.
Multichannel sound should
mean that things become more controlled, that listeners stand a better
chance of hearing the senses of direction
and space that the artists created. In film applications, this is actually
attempted. The film industry has had
basic standards for playback systems and environments for many years. THX
improved on this, and attempted to
translate it into the home. For the Dolby Surround films of that era, it was
moderately successful. However,
things now are much more confused, with Dolby Digital and DTS sound tracks
and music recordings, and the
Audio DVD around the corner. One senses an impending free-for-all in which
anything goes.
If it were approached logically, however, it seems that there is a scheme
that makes sense. The purpose
of multiple channels with loudspeakers located around the listeners, is to
allow for a large variety of predictable
localization and spatial effects. Since one of these is a sense of intimacy,
wherein the sound from the front
loudspeakers does not energize the listening room reverberant field, there
is a requirement for front loudspeakers
that are predominantly forward firing, with directional control in both
vertical and horizontal planes. The well-known
THX requirement has been for some directional control in the vertical plane
only. Many of the
implementations have been somewhat less than ideal; simple vertical arrays
of drivers cause severe lobing at the
off-axis angles at which floor and ceiling bounces occur. If we are to
address this issue properly, we need to
incorporate horizontal directional control as well - our ears are in the
horizontal plane, and it is horizontal
reflections that are primarily responsible for the impressions of
spaciousness [10]. We also need to focus on how
well the speakers behave at the off-axis angles of importance - the adjacent
boundary reflections and sound
power.
Predictable directional control is possible with horns, waveguide-loaded
tweeters or complex two-dimensional
arrays, all of which can deliver excellent sound, using today's technology.
The alternative,
unattractive in most practical situations, is to move quantities of sound
absorbing material into the room and
cover large areas with it. The lack of attraction has two components, visual
and acoustical. Visually, areas of
sound absorber run contrary to popular themes of interior décor.
Acoustically, absorbers dissipate sound energy
that one has paid good money to create, thus making the speakers work even
harder.
In a multichannel system, the impressions of space and envelopment are
provided by loudspeakers
positioned to the sides of the listeners. Here we enter hostile territory,
with monopoles, dipoles, tripoles and
quadrupoles attempting to be the perfect solutions. Truth is that, while all
of these names have meaning in the
physics of sound, the products bearing them are only crude approximations to
these forms of acoustic behavior.
What really is at issue here is the matter of whether the listener should be
in a predominantly direct or
predominantly reverberant sound field from the surround loudspeakers. There
cannot be a single correct answer
for the loudspeaker configuration until there is agreement at the production
end of the process.
From the perceptual point of view, impressions of ambiguous localization and
spaciousness exist when
sounds arriving at the two ears are uncorrelated, containing many
reflections. If the decorrelation is in the
recording, or is added electronically, spaciousness can be perceived in
headphone listening. It can also be
convincing through conventional forward-facing surround speakers. However,
much (most?) existing recorded
material is deficient in this respect, and additional decorrelation tends to
be beneficial. Using multidirectional (or
multiple) surround speakers is a good method of adding decorrelation through
multiple reflected sounds. The
actual performance, however, is dominated by the geometry and reflective
properties of the room.
At present, movies are still made assuming that audiences will experience a
multiplicity of speakers down
the sides of the cinema, even though the digital discrete surround channels
are all wideband and full range. At
home, this argues for multidirectional surround speakers, strong electronic
decorrelation, or multiple direct-radiating
monopoles - i.e. business as usual. However, the notion of five identical
loudspeakers is gaining
favor. Multichannel music, still very much in its infancy, also supports the
idea of five identical loudspeakers.
DTS, in its multichannel music demos, goes a step further, and has been
promoting the idea of moving the
surround speakers to the rear of the room, mirroring the left and right
fronts, and delivering sounds of individual
musicians to these locations.
At this point the audience divides into two camps - those who like having
musicians behind them, and
those who do not . . . and the feelings can be very strong. Some see this as
a cause to criticize multichannel
audio. This is where we must invoke the old adage: if you don't like the
message, don't shoot the messenger.
Multichannel audio systems are just delivery systems [11].
SPEAKERS FOR MUSIC AND SPEAKERS FOR MOVIES
Are there fundamental differences between loudspeakers designed for movies
and music? This oft-repeated
concern is really the wrong question. The real question is: are there
fundamental differences between
loudspeakers designed for stereo and those designed for multichannel
systems? The answer is that there may be.
Can multichannel delivery systems work equally well for music and films?
They must or we will have
consumers up in arms, and for very good reasons. A good loudspeaker is a
good loudspeaker, and there is no
practical reason for different standards of performance in loudspeakers
intended for films and music. There are
reasons for differing directivities and, depending on how the loudspeakers
are used, listeners will experience
different imaging and spatial effects depending on the manner in which
loudspeakers radiate their sound into
rooms. That has always been true for stereo, and it remains so for
multichannel systems. Thus, the choice of
loudspeaker directivity should not alter one's expectations of inherent sound
quality.
GOOD BASS IN ROOMS - A BASIC PROBLEM
At low frequencies the room interactions take on a special flavor because of
the way acoustical standing
waves (resonances) dominate what we hear. The quantity and quality of bass
sound is determined as much, or more,
by the room and how it is set up as by the speakers
themselves [12,13,14,15].
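As a simple illustration of why the room dominates at low frequencies, the
sketch below lists the first few axial standing-wave frequencies along one
room dimension, using the textbook relation f = n * c / (2 * L). The 5 m
dimension is hypothetical, the speed of sound is assumed to be 343 m/s, and
real rooms also support tangential and oblique modes that this sketch
ignores.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def axial_mode_frequencies(dimension_m, count=4, speed_m_s=SPEED_OF_SOUND_M_S):
    """First `count` axial mode frequencies (Hz) along one room dimension."""
    return [n * speed_m_s / (2.0 * dimension_m) for n in range(1, count + 1)]


# Hypothetical room length of 5 m.
modes = axial_mode_frequencies(5.0)
print([round(f, 1) for f in modes])
```

For a room of this size the first modes are tens of hertz apart, so
individual resonances stand out clearly in the bass range.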
This is an enormous frustration for speaker manufacturers. Once the product
is in a box, on a truck, we
have lost control of how the speaker will sound at low frequencies. Gaining
some control has been the objective
of several projects and products over the years. Short of hiring an
acoustical consultant, and being willing to
rearrange the furniture and possibly the walls, what can be done?
The short answer is "equalize". At this point, some people's hackles are
rising, I am sure. Equalization
has acquired a bad reputation over the years. To some proponents, it is a
"cure all"; set up a microphone, follow
the dancing lights of a "real-time analyzer", reach for the trusty
multifilter equalizer and create a pretty curve.
Such an exercise is almost certainly doomed to disappoint. Steady-state room
curves are "dumb" measurements,
in that the microphone simply adds up all of the incoming sounds, from
whatever direction, after whatever time.

One-third octave analysis, which is typical of these devices, is a crude
measurement. Two ears and a brain are
much more sensitive and analytical, directionally, temporally and
spectrally.
To be successful, equalization should address the problems it has the
potential to remedy. There are some
things equalization can and cannot do:
1. It cannot make a poor speaker sound good. With laboratory measurements to
work from, it might make a
good speaker sound better. However, there is nothing that an average
consumer can measure in a listening
room that would reveal problems in a speaker that can be repaired by
equalization. If the speaker has been
competently designed, it should probably be left alone at frequencies above
about 300 to 500 Hz, whatever
the room-curves look like.
2. At frequencies below about 300 to 500 Hz, the system performance is
dictated by the shape and size of the
room, and the position of the speaker and listener within it. Three factors
are interactively operational here:
- solid-angle gains (the proximity of the speaker and listener to adjacent
  room boundaries: walls and floor). Addressing this requires a broadband
  "tone-control" kind of equalization.
- room resonances, which can cause strong peaks and dips, some with quite
  high "Q". Before these can be addressed, the resonances must be identified
  within a complicated confusion of peaks and dips caused by acoustical
  interference. It is always unwise to try to fill dips - acoustical
  cancellations can be "bottomless pits". Prominent peaks can be
  individually addressed with parametric filters set to the appropriate
  center frequency, Q, and gain/loss. Identifying those parameters
  accurately requires high-resolution measurements; much more detail than is
  revealed by the traditional 1/3-octave "real-time analyzers".
- acoustical interference caused by the interaction of many reflected sounds
  within the room. These are non-minimum-phase phenomena, and they cannot be
  addressed with equalization. Fortunately, our ears are less sensitive to
  their effects than our measurement systems, so we really need to find a
  measurement process that diminishes their visibility.
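To show what "a parametric filter set to the appropriate center frequency, Q,
and gain/loss" means in practice, here is a minimal sketch using the widely
published Audio EQ Cookbook peaking-filter formulas (R. Bristow-Johnson).
This is one common digital realization and not necessarily the one used in
any product mentioned here. The 48 kHz sample rate, the 60 Hz, Q=8, -6 dB
example and the function names are assumptions.

```python
import cmath
import math


def peaking_coefficients(center_hz, q, gain_db, sample_rate_hz=48000.0):
    """Biquad coefficients (b, a) for a peaking filter, Audio EQ Cookbook form."""
    amplitude = 10 ** (gain_db / 40.0)
    w0 = 2 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amplitude, -2 * math.cos(w0), 1 - alpha * amplitude]
    a = [1 + alpha / amplitude, -2 * math.cos(w0), 1 - alpha / amplitude]
    return b, a


def gain_db_at(freq_hz, b, a, sample_rate_hz=48000.0):
    """Magnitude response of the biquad, in dB, at one frequency."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    numerator = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    denominator = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20 * math.log10(abs(numerator / denominator))


# Hypothetical room peak: 6 dB high at 60 Hz with a Q of 8.
b, a = peaking_coefficients(center_hz=60.0, q=8.0, gain_db=-6.0)
for freq_hz in (30.0, 60.0, 120.0, 1000.0):
    print(f"{freq_hz:6.0f} Hz: {gain_db_at(freq_hz, b, a):6.2f} dB")
```

The filter removes 6 dB at the center frequency and leaves frequencies well
away from it essentially untouched, which is why the three parameters have to
be identified accurately before the filter is set.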
Successful equalization begins with good measurements of what is happening
in the room. Broadband
spectral trends are easily identified even in elementary real-time analyzer
(RTA) measurements. Separating
resonances from acoustical interference requires spatial averaging, just as
it does in speaker design. For this one
needs to make measurements at several different locations throughout the
listening area, and they need to be
averaged. This requires data acquisition and post-processing; it cannot be
done with several microphones
plugged into a simple mixer. The measurements need to have high resolution
in the frequency domain, if there is
to be any hope of identifying the key parameters of problematic resonances.
For this job, the RTA fails.
Successful equalization requires a good equalizer. Ideally, this would be a
multiple parametric-filter type.
The JBL Synthesis home theater systems incorporate all of the required
characteristics for successful
equalization. The dedicated digital controller has 95
individually-configurable parametric filters distributed
among the 5.1 channels. In the laboratory, based on high-resolution
spatially-averaged anechoic measurements,
some filters are preset to address small residual problems in the speakers
themselves. The equalizer has helped to
create better speakers. Once the system is installed in the customer's home,
a trained installer arrives with a
custom measurement system to adapt the system to match the room at low
frequencies. The system employs five
microphones connected to a multiplexer, coupled to a laptop computer which,
in turn, is coupled to the digital
processor. In a carefully controlled sequence, the appropriate test signals
are sent through each of the channels,
the measurements are compared to predetermined "target functions", and a set
of filters is automatically designed
that will allow the system performance to approach the target with minimum
error. Built into the system are
safeguards that attempt to prevent it from trying to equalize the
unequalizable. Manual override is always an
option, so human intervention can modify a bad decision by the computer, or
accommodate customer preferences
in spectral balance.
At present, this is an expensive and cumbersome system. However, we are
learning from our experiences
in the real world. We are learning what the optimum target functions are,
how best to instruct the adaptive
equalization process, and what we may be able to leave out of a cost-reduced
version of the system. Ideally, this
should become a standard feature in popularly-priced active woofers and
subwoofers.
A LITTLE FOURIER ANALYSIS
Before leaving this topic, let us address one of the common
misunderstandings of equalization. It is
frequently asserted that equalizers, because they are filters, add ringing,
in the time domain. That is a fact. The
assertion continues that, in addressing a frequency response problem, one
may damage the transient behavior of
the system. That may or may not be the case.
Resonances in speakers and rooms also ring. If the filter is used to correct
a resonance, both the problem
and solution are minimum-phase phenomena. If the measurement of the problem
has been made accurately, and
the parametric filter solution has been designed to match the problem, then
the ringing of the filter is equal and
opposite to the ringing of the problem resonance. The ringing is gone, just
as the measured peak in the frequency
response has been eliminated.
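The cancellation argument can be checked numerically. The sketch below models
the problem resonance as a +6 dB peaking biquad (Audio EQ Cookbook form) and
the correction as a peaking biquad with the same center frequency and Q but
-6 dB gain. Modeling the resonance this way, the 48 kHz sample rate and the
parameter values are all assumptions made for illustration.

```python
import math


def peaking_coefficients(center_hz, q, gain_db, sample_rate_hz=48000.0):
    """Biquad coefficients (b, a) for a peaking filter, Audio EQ Cookbook form."""
    amplitude = 10 ** (gain_db / 40.0)
    w0 = 2 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amplitude, -2 * math.cos(w0), 1 - alpha * amplitude]
    a = [1 + alpha / amplitude, -2 * math.cos(w0), 1 - alpha / amplitude]
    return b, a


def run_biquad(signal, b, a):
    """Filter a list of samples with a biquad (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    output = []
    for x in signal:
        y = (b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2) / a[0]
        x2, x1 = x1, x
        y2, y1 = y1, y
        output.append(y)
    return output


impulse = [1.0] + [0.0] * 999

# The "problem": a resonance that rings after the impulse.
resonance_b, resonance_a = peaking_coefficients(100.0, 10.0, +6.0)
ringing = run_biquad(impulse, resonance_b, resonance_a)

# The "solution": a matched filter with equal and opposite gain.
filter_b, filter_a = peaking_coefficients(100.0, 10.0, -6.0)
corrected = run_biquad(ringing, filter_b, filter_a)

tail_before = max(abs(v) for v in ringing[1:])
tail_after = max(abs(v) for v in corrected[1:])
print(f"largest tail sample before: {tail_before:.2e}, after: {tail_after:.2e}")
```

With an exact match the corrected response collapses back to the original
impulse, apart from rounding error. If the filter's frequency, Q or gain is
mismatched, some ringing remains, which is why accurate measurement of the
resonance matters.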
The real trick in this is trying to ensure that the filters are not used to
"fix" something that cannot, and
perhaps need not, be fixed by this means. In the days of RTAs and
1/3-octave multifilter equalizers, success of
this kind was simply not possible. With today's elegant technology, it is
possible, but difficult.
SUBJECTIVISM vs. OBJECTIVISM - IN CONCLUSION
In this lengthy summary we have covered a lot of topics. Much of it was
matter-of-factly technical,
driven by data and the need to measure, and much of it was subjective,
driven by the desire to understand what
we can hear. All of it was oriented towards creating loudspeakers that sound
better.
The literature of audio continues to be sprinkled with letters and articles
debating the merits of science in
audio. The subjectivist stance is that "to hear is to believe", and that is
all that matters. Some of the arguments
conjure images of white-coated engineers with putty in their ears, designing
audio equipment, and not caring
how it sounds, only how it measures. I have never met such a person in my 30
years in audio science and
engineering.
The simple fact is that, without science, there would be no audio as we know
it. Without extensive and
meticulous subjective evaluation, there would be no audio science as we know
it. Without audio science, audio
engineering reverts to trial and error. So, where does this leave us?
Clearly, to be successful in this business,
one must be actively involved with both the objective and subjective
sides.
A faith in the scientific method is not a blind faith. It is a faith built
on a growing trust that measurements
can guide us to produce better sounding products at every price level, for
every application. The proof, as
always, is in the listening, and one MUST listen.
The Harman International loudspeaker companies, JBL, Infinity, and Revel,
have invested heavily in
measurement facilities that allow them to take the fullest advantage of
existing audio science. They have
invested in talented engineers who understand and respect the scientific
method, good sound and great music.
They have invested in elaborate listening rooms where they can enjoy and
criticize the fruits of their labors.
There are people on staff with many years of experience in successfully
probing the frontiers of knowledge in
product design and audio science, and they are equipped to continue those
investigations, to push those frontiers.
The arrival of multichannel audio for films required some adjustments in the
performance objectives of
speakers, certainly at the high end. Multichannel music is another, as yet
ill-defined, challenge. More speakers
in rooms means less consumer tolerance for large boxes. Merging
loudspeakers with rooms is not easy, and it is
the one remaining large challenge for our industry. We are working on all of
these fronts. Stay tuned.

REFERENCES
1. "Listening Tests, Turning Opinion Into Fact", F.E. Toole, J. Audio Eng.
   Soc., vol. 30, pp. 431-445 (1982 June).
2. "Listening Tests - Identifying and Controlling the Variables", F.E.
   Toole, Proceedings of the 8th International Conference, Audio Eng. Soc.
   (1990 May).
3. "Subjective Evaluation", F.E. Toole, in J. Borwick, ed., "Loudspeaker and
   Headphone Handbook - Second Edition", chap. 11 (Focal Press, London,
   1994).
4. "Subjective Measurements of Loudspeaker Sound Quality and Listener
   Performance", F.E. Toole, J. Audio Eng. Soc., vol. 33, pp. 2-32 (1985
   January/February).
5. "A Method for Training of Listeners and Selecting Program Material for
   Listening Tests", S.E. Olive, 97th Convention, Audio Eng. Soc., Preprint
   No. 3893 (1994 November).
6. "Hearing is Believing vs. Believing is Hearing: Blind vs. Sighted
   Listening Tests and Other Interesting Things", F.E. Toole and S.E. Olive,
   97th Convention, Audio Eng. Soc., Preprint No. 3894 (1994 November).
7. "Loudspeaker Measurements and Their Relationship to Listener
   Preferences", F.E. Toole, J. Audio Eng. Soc., vol. 34, pt. 1, pp. 227-235
   (1986 April), pt. 2, pp. 323-348 (1986 May).
8. "Loudspeakers and Rooms for Stereophonic Sound Reproduction", F.E. Toole,
   Proceedings of the 8th International Conference, Audio Eng. Soc. (1990
   May).
9. "The Modification of Timbre by Resonances: Perception and Measurement",
   F.E. Toole and S.E. Olive, J. Audio Eng. Soc., vol. 36, pp. 122-142 (1988
   March).
10. "The Detection of Reflections in Typical Rooms", S.E. Olive and F.E.
    Toole, J. Audio Eng. Soc., vol. 37, pp. 539-553 (1989 July/August).
11. "The Future of Stereo", F.E. Toole, Part 1, Audio, vol. 81, no. 5, pp.
    126-142 (1997 May), Part 2, Audio, vol. 8, no. 6, pp. 34-39 (1997 June).
12. "Perception of Reproduced Sound in Rooms: Some Results from the Athena
    Project", P.L. Schuck, S. Olive, J. Ryan, F.E. Toole, S. Sally, M.
    Bonneville, E. Verreault, Kathy Momtahan, Proceedings of the 12th
    International Conference, Audio Eng. Soc., pp. 49-73 (1993 June).
13. "The Detection Thresholds of Resonances at Low Frequencies", S.E. Olive,
    P. Schuck, J. Ryan, S. Sally, M. Bonneville, J. Audio Eng. Soc., vol.
    45, no. 3 (1997 March).
14. "The Variability of Loudspeaker Sound Quality Among Four Domestic-Sized
    Rooms", S.E. Olive, P. Schuck, J. Ryan, S. Sally, M. Bonneville,
    presented at the 99th AES Convention, preprint 4092 K-1 (1995 October).
15. "The Effects of Loudspeaker Placement on Listener Preference Ratings",
    S.E. Olive, P. Schuck, S. Sally, M. Bonneville, J. Audio Eng. Soc., vol.
    42, pp. 651-669 (1994 September).
5/8/98 rev. 8/19/99


Harman International Industries, Incorporated 8500 Balboa Blvd., PO Box
2200, Northridge, CA 91329 (818) 893-8411
A New Laboratory for Evaluating
Multichannel Audio Components and Systems
SEAN E. OLIVE, AES Fellow, BRIAN CASTRO, AES Member, AND
FLOYD E. TOOLE, AES Fellow
R&D Group, Harman International Industries Inc., 8500 Balboa Blvd.,
Northridge, CA, 91329, USA
ABSTRACT
The design criteria, features and acoustic measurements of a new listening
laboratory
designed specifically for listening tests on multichannel loudspeakers and
components
are described. Among its features is a novel automated speaker shuffler that
eliminates
loudspeaker position effects or allows the variable to be efficiently
tested. Other features
include complete computer control of experimental design, control and
collection of
listener data, making listening tests more reliable and efficient.
1.0 INTRODUCTION
Listening tests are the final arbiter for determining whether an audio
product sounds
good, and they play a critical role in the research and development of new
products.
Designing and conducting listening tests that produce reliable and accurate
data is,
however, no simple task. Many variables other than those under test can
seriously bias the results unless they are removed or controlled [1-9]. Two
of the more difficult variables to control are the listening room [5], [7],
[9] and the position(s) of the loudspeakers under test [5], [9], both of
which can significantly influence the sounds that arrive at listeners' ears
and listeners' perceptions of them.
Recently we had the opportunity to design and construct a new
state-of-the-art listening
laboratory to be used for developing and subjectively testing multichannel
loudspeakers
and other components. The goal from the outset was to build and equip a
listening
laboratory that could generate subjective measurements as accurate,
efficient and free of
bias as possible. To meet these goals, a large effort went into developing
hardware and
software that would automate the design and control of experiments,
including the
collection, storage and statistical analysis of listener data. Included in
the design is a
novel automated speaker shuffler that performs positional substitution of 9
loudspeakers
so that positional biases can be eliminated or efficiently tested. By
eliminating position as
a variable, the speaker shuffler has reduced the length of a typical
multiple-loudspeaker listening test by a factor of 24:1, making product
development faster and less costly.
Another notable feature of the room is that the acoustics can be easily
varied from almost
hemi-anechoic to semi-reverberant by adding removable reflective panels to
the walls
and ceiling.
This paper describes the rationale, features and measurements of the new
listening
facility, which we call the Multichannel Listening Laboratory (MLL).
Finally, the results
are compared with several current international standards that recommend
performance
criteria for listening rooms intended for critical listening.
1.1 Listening Room Standards
Several standards recommend values for various acoustic parameters that
define listening
room performance. The goal of these standards is to facilitate the
replication of listening
evaluations in different rooms under the same test conditions. This is
particularly
important for radio and television broadcast corporations, audio production
facilities,
large audio equipment manufacturers, and international standards and
research
organizations, all of whom have multiple facilities in which critical
judgments are made
on the same program material or equipment. Ideally, if the listening rooms
and test conditions in which these judgments are made are sufficiently
similar, and the listeners have normal hearing and are properly trained,
then a consensus of opinion should be possible. If not, then there is likely
something wrong with the test procedure itself.
In reviewing these various standards, a serious problem common to many is
that while
they define tolerances for specific acoustic parameters, they do not
adequately define
how the parameter is to be measured. For example, IEC is the only standard
that specifies
how reverberation time should be measured, even though it has been shown
that RT60
can vary widely depending on the technique used. Unfortunately, this rather
defeats the
purpose of defining a standard in the first place! It is conceivable that
one measurement
method may show the room meets the standard, while another measurement
method may
not. Added to this is the belief, held by some authorities, that in small
rooms,
reverberation time is a parameter of little or no value.
A very good discussion and summary of standards as they relate to the design
of multichannel listening rooms intended for loudspeaker listening tests is
given by Jarvinen et al. in [9].
The current standards that recommend listening room performance include:
1. IEC Publication 268-13: Sound System Equipment, Part 13: Listening Tests
on Loudspeakers (1985) [10]
2. NR-12 A, Technical Recommendation: Sound Control Rooms and Listening
Rooms, 2nd Edition, The Nordic Public Broadcasting Corporation (1992) [11]
3. ITU-R Recommendation BS.1116: Methods for Subjective Evaluation of Small
Impairments in Audio Systems Including Multichannel Sound Systems, 2nd
Edition (1997) [12]
4. ITU-R Recommendation BS.775: Multichannel Stereophonic Sound System with
and without Accompanying Picture (1994) [13]
5. EBU Tech 3276 (2nd Edition, 1997) [14]
6. AES20-1996: Recommended Practice for Professional Audio - Subjective
Evaluation of Loudspeakers (1996) [15]
The standards fall into two general groups according to the intended
application of the listening room. The AES and IEC standards were intended
for monophonic and stereophonic testing of loudspeakers in typical domestic
listening rooms. Both these standards are now quite old, and the recommended
room sizes are too small to allow multiple comparisons of multichannel
systems.
The EBU, ITU and NR standards were drafted primarily by broadcasters and
allow for
much larger control rooms that can accommodate several listeners at a time.
Only the
AES, IEC and ITU standards include recommendations for listening test
methodology.
At the design stage, we did not intentionally set out to meet any of the
above standards. However, in a post-hoc examination we have found that our
listening room meets both the ITU and EBU standards in its current
configuration, in which we have added reflective and diffractive surfaces to
both the ceiling and walls.
In the following sections we show measurements made in the MLL and compare
these with various acoustical properties recommended in the above standards.
These properties include dimensions, floor area, volume, proportions,
reverberation time and background noise. Table 1 compares the values
measured for the MLL with the values recommended by each standard, and shows
that the MLL meets both the ITU and EBU recommendations.
2.0 MULTICHANNEL LISTENING LAB (MLL)
2.1 Room Dimensions
The listening room itself consists of a double-wall shell constructed by
Industrial Acoustics Corporation (IAC). The dimensions of the MLL were
largely dictated by our requirements to be able to evaluate up to 3
different 5.1 or 7.1 channel systems at a time and accommodate 1-6
listeners. The room also had to be sufficiently large to accommodate our
automated 9-loudspeaker shuffler, which requires a space of approximately
9 m (L) x 1.5 m (W) x 1 m (D). This resulted in the following dimensions:
MLL Dimensions
Length        9.14 m
Width         6.58 m
Height        2.59 m
Floor Area    60.20 m²
Volume        155.92 m³
As shown in Table 1, the MLL satisfies the recommended volume and floor area
values specified in the ITU, EBU and N 12-A standards. The MLL's volume
exceeds the IEC and AES recommended limits of 110 m³ and 120 m³
respectively, because those standards were intended for small domestic
stereo listening rooms.
2.2 Room Proportions
The most problematic performance issue in small listening rooms is
non-uniform low
frequency reproduction caused by standing waves that produce large pressure
peaks and
nulls in the lower 3-4 octaves of the audio range. The distribution and
frequencies at
which these peaks and notches occur are directly related to the geometry of
the room. If
the ratio of the room dimensions is carefully chosen, a more uniform
response is possible.
Walker from the BBC [16] has created a room geometry criterion that has been
adopted
by both the EBU and ITU standards. The "Walker" criterion defines the limits
of the
ratios for length (l), width (w) and height (h) as:
$$1.1\,\frac{w}{h} \le \frac{l}{h} \le 4.5\,\frac{w}{h} - 4 \qquad (1)$$
As shown in Table 1, the ratio of dimensions for the MLL meets the "Walker"
criterion and therefore satisfies the EBU and ITU standards. The relatively
large size of the MLL also benefits uniform frequency response in the lower
octaves, since the first order width and length modes are below 25 Hz.
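As a quick check of the room-proportion claim, the short sketch below
evaluates the Walker criterion for the MLL dimensions listed above. It
assumes the criterion reads 1.1 (w/h) <= l/h <= 4.5 (w/h) - 4; the function
name is purely illustrative and does not come from the paper.

```python
def meets_walker(length_m: float, width_m: float, height_m: float) -> bool:
    """Return True if 1.1*(w/h) <= l/h <= 4.5*(w/h) - 4 (assumed form)."""
    l_over_h = length_m / height_m
    w_over_h = width_m / height_m
    return 1.1 * w_over_h <= l_over_h <= 4.5 * w_over_h - 4.0


# MLL dimensions as listed in the paper: 9.14 m x 6.58 m x 2.59 m.
mll_ok = meets_walker(9.14, 6.58, 2.59)
print(mll_ok)  # True

# A cube fails, since l/h = 1 is below the lower bound 1.1*(w/h) = 1.1.
print(meets_walker(3.0, 3.0, 3.0))  # False
```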
2.3 Background Noise
Accurate and repeatable subjective measurements require a listening room
with low background noise so that listeners are able to reliably judge the
quality of low-level signals. The perception of timbre, nonlinear
distortion, loudness and spatial qualities is influenced by the presence and
masking effects of background noise.
Minimizing background noise in the MLL was carefully considered during the
design and
construction. The IAC double-wall shell itself is located in a large room
that has limited
access to both people and noisy equipment. No part of the shell touches the
structural
walls of the building except the floor, which is mechanically floated.
The inner walls and ceiling of the double-wall IAC shell are made of heavy
gauge steel panels separated by 10 cm and filled with fiberglass. The inner
surfaces are perforated with 2.34-mm openings to provide substantial sound
absorption inside the room.
The inner
walls are entirely floated and separated from the outer wall of the shell by
a 10 cm space
to minimize mechanical and acoustic transmission of noise.
The room has its own dedicated HVAC system with ventilation silencers and
acoustically
lined ducts that create a comfortable and quiet environment. For experiments
that require
extremely low background noise the room can be cooled and the HVAC can be
completely shut off during the test. The room requires minimal lighting
during the test itself (i.e. one halogen light), which means that noise from
lights is not an issue. All audio equipment, other than the required
amplifiers, is located outside the room, which also helps to minimize
electrical noise.
In an effort to simulate the construction of floors found in many homes, a
carpeted "squeak-free" plywood floor was laid on 5 cm x 15 cm wooden joists
spaced 41 cm apart. The joists are mounted on 6.4 mm neoprene pads for
isolation from the concrete floor beneath. The rationale for constructing
this floor is to allow
transmission of low
bass from the loudspeaker through the floor to the listeners' feet, since
the perception of
bass depends on what is felt, as well as what is heard. The front and middle
sections of
the floor can be removed to allow easy access of audio, video and data
cables that run
underneath the floor to access panels both inside and outside the room.
In reviewing the various listening room standards there is a wide range of
recommended
levels for background noise. The most stringent requirements are specified
by the EBU and ITU standards, which call for a background noise level of
NR10, not to exceed NR15. These rather demanding requirements are likely
justified in broadcast environments where listeners are frequently required
to evaluate small-signal linearity, for example in relation to codecs.
At the other extreme, the AES and IEC standards both have a rather liberal
recommended background noise limit of 35 dBA measured using a slow time
constant. The AES
standard has an additional limit of 50 dB C-weighted for low frequency
noise. The less
stringent requirements are likely justified on the basis that they are aimed
at loudspeaker
evaluations in typical domestic environments where background noise levels
are typically
higher.
Figure 1 shows the background noise measured in the MLL with the air
conditioner
turned both on and off. Also plotted are the NR curves 0 through 15. The MLL
noise curves each represent an average of four measurements taken at four
different locations around the listening area. Each measurement was averaged
over 64 s.
The measurement was taken using a Bruel & Kjaer 4179, 1 inch microphone, a
Bruel &
Kjaer preamp Type 2660, and a Bruel & Kjaer real-time analyzer. The low
noise
microphone and preamp allow accurate measurement of sound pressure levels
below the
threshold of hearing, which is necessary at higher frequencies for measuring
rooms below
NR20. Figure 1 shows that with the air conditioning turned off, the MLL
meets NR5,
thus meeting the requirements of the EBU and ITU specification. With the air
conditioning turned on the noise increases to NR15.
2.4 Reverberation Time
The reflected sounds and reverberation time in a room have been shown to
have an
important influence on the perception of loudness, timbre and spatial
qualities and speech
intelligibility in both live and reproduced sound. While this is a complex
phenomenon, the acoustics community sees fit to summarize it all in a T60
measurement.
Both the EBU and the ITU standards specify values for the average
reverberation time in
the room. ITU and EBU recommend the value (within a tolerance of ± 0.05 s)
be
determined using the following equation:
$$T_m = 0.25 \left( \frac{V}{V_{ref}} \right)^{1/3} \ \text{s} \qquad (2)$$
where Tm is the average reverberation time between 200 Hz and 4 kHz, V is
the volume of the room, and Vref is the reference volume of 100 m³. The EBU
also puts limits on the range of values, specifying that Tm should lie
between 0.2 s and 0.4 s.
The IEC standard specifies a Tm of 0.3 - 0.6 seconds, which is very similar
to the AES standard that recommends 0.45 s (±0.05 s). The N 12-A standard
specifies that Tm be measured in 1/3 octaves between 200 Hz and 2.5 kHz and
be determined as a function of the floor area using the following equation:
$$T_m = 0.35\,\frac{S}{S_{ref}} \pm 0.05 \ \text{s} \qquad (3)$$
where S is the floor area of the room and Sref is the reference area of
60 m².
In addition to specifying the average reverberation time, most of the
standards
recommend that Tm be relatively independent of frequency within a certain
bandwidth
and tolerance. For ITU and EBU standards, the Tm value for each octave band
between 200 Hz and 3.5 kHz should vary no more than ±0.05 s from the
calculated optimum value. Below 200 Hz, Tm is allowed to increase
monotonically as frequency decreases, up to 0.3 s above the optimum value.
Above 3.5 kHz, the tolerance is increased to ±0.1 s from the optimum value.
By substituting the volume of the MLL (155.92 m³) into equation (2), we
calculate that Tm should be 0.29 s to meet ITU and EBU standards. According
to N 12-A, the Tm for the MLL should be 0.35 s.
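The ITU/EBU figure can be reproduced with a few lines of arithmetic. The
sketch below assumes equation (2) has the form Tm = 0.25 (V / Vref)^(1/3)
seconds with Vref = 100 m³, and applies the ±0.05 s tolerance quoted in the
text; the function names are illustrative only.

```python
def optimum_tm(volume_m3: float, v_ref_m3: float = 100.0) -> float:
    """Optimum average reverberation time in seconds (assumed eq. 2 form)."""
    return 0.25 * (volume_m3 / v_ref_m3) ** (1.0 / 3.0)


def tolerance_band(tm_s: float, tol_s: float = 0.05) -> tuple:
    """Lower and upper limits of the +/- tolerance around the optimum."""
    return (tm_s - tol_s, tm_s + tol_s)


tm_opt = optimum_tm(155.92)          # MLL volume from the paper
low, high = tolerance_band(tm_opt)
print(round(tm_opt, 2))              # 0.29
print(round(low, 2), round(high, 2)) # 0.24 0.34
```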
The Tm of the MLL was measured using a MLSSA system from DRA laboratory. The
microphone was a Bruel & Kjaer 4134. The sound source consisted of four JBL
Synthesis satellite loudspeakers crossed over at 80 Hz to a JBL Synthesis
Two subwoofer located in the corner of the room. Each of the four satellites
was
located
approximately 2 m apart and aimed at a different corner in an attempt to
create a diffuse
sound field. The measurement shown in Figure 2 represents a spatial average
of four
microphone locations. The average Tm value for the MLL is about 0.23 s,
which is
slightly below the calculated ITU and EBU optimal value of 0.29 s. However,
the curve
falls within the minimum recommended value, and is quite uniform with
frequency, only
rising slightly below 125 Hz.
2.5 Control of Early Reflections
With the advent of 5.1 and 7.1 multichannel and 3D audio playback systems,
there is a
trend among professional and home theater listening room designs towards
lower
reverberation times and the control of early reflections. There are sound
scientific reasons
for doing this, since strong early reflections are known to influence the
perceived spatial
and timbral qualities of reproduced sound [7], [17]. In the new generation
of
multichannel recordings and video disks, the additional center and surround
channels
allow the producer and recording artist to create much more realistic and
spatially enriched environments than ever before. There is less need to use
the room's
boundaries
and the loudspeakers' directional characteristics to compensate for the
obvious spatial
deficiencies inherent to stereo.
The EBU standard recommends that all reflections within the first 15 ms
after the arrival of the direct sound be at least 10 dB below the level of
the direct sound from each sound source. With multichannel setups the early
sound field is rather complex, given that there are between 5 and 7
loudspeakers and several boundaries. For example, with 5 loudspeakers and 6
boundaries there are 30 first order reflections and 150 second order
reflections.
Measuring and separating out these reflections is no trivial task. The
reflections from the
floor are particularly problematic to treat since in most facilities, the
floor surfaces must
be hard and reflective to facilitate the movement of people and equipment.
Nonetheless,
several organizations [18], [19] are building such rooms that meet this
reflection-free part
of the specification with the exception of the floor bounce.
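The reflection counts quoted above (30 first order, 150 second order) can be
reproduced with a simple tally. The sketch below assumes that a second order
path strikes two different boundaries in sequence, so each first order path
has (boundaries - 1) continuations; this is one counting convention that
matches the quoted figures, not necessarily the one the authors used.

```python
def reflection_counts(n_speakers: int, n_boundaries: int) -> tuple:
    """Return (first_order, second_order) path counts under the assumption
    that a second order path hits two different boundaries in sequence."""
    first_order = n_speakers * n_boundaries
    second_order = first_order * (n_boundaries - 1)
    return first_order, second_order


first, second = reflection_counts(n_speakers=5, n_boundaries=6)
print(first, second)  # 30 150
```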
In the MLL room, the only significant first order reflections are from the
floor, and these
are attenuated at higher frequencies by the carpet. At listener-loudspeaker
distances
greater than 2 m any reflection with a path length greater than 6.34 m will
be attenuated
10 dB by spreading loss [18]. This effectively eliminates all second order
reflections
since their path length exceeds this value. For front channel sources, first
order
reflections from the side walls will also be sufficiently delayed beyond the
15 ms time
gap. The main culprits are reflections from the front and back walls, and
the ceiling.
Fortunately these surfaces can be made absorptive by simply removing the
reflective
panels so that the absorptive surface is exposed. To reduce flutter echoes
from reflective
surfaces and to increase reverberation, 120 RPG Skylines, an omnidirectional
primitive
root number theory 2D diffusor, are placed on the reflective panels located
on the walls,
as well as on the ceiling and areas behind the loudspeaker as shown in
Figures 4 and 5.
These light-weight diffusors are easily removed or relocated, and help
reduce any other
specular reflections that may arrive after the direct sound.
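The 10 dB figure quoted for a 6.34 m reflected path follows from ordinary
inverse-square spreading. The sketch below assumes a point source in a free
field and a 2 m direct path, and ignores any absorption at the reflecting
surface; the function name is illustrative.

```python
import math


def spreading_loss_db(direct_path_m: float, reflected_path_m: float) -> float:
    """Level of a reflection relative to the direct sound, in dB of
    attenuation, from path-length difference alone (inverse-square law)."""
    return 20.0 * math.log10(reflected_path_m / direct_path_m)


loss = spreading_loss_db(2.0, 6.34)
print(round(loss, 1))  # 10.0
```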
2.6 Automated Speaker Mover
The position of a loudspeaker in a room has a significant impact on its
perceived sound
quality. Changing its position affects the way it couples to the standing
wave modes of
the room, and alters the physical characteristics of broadband reflections
that arrive at the
listener. In listening tests that involve multiple comparisons among
loudspeakers the
positional effects on listeners' ratings can be larger than the differences
between the
loudspeakers under test [8]. Unless these positional effects are controlled,
the results may
be contaminated by a nuisance variable.
For multiple comparison loudspeaker tests, asking human beings to sit behind
a double-blind screen and quickly and smoothly substitute the positions of
2-9 loudspeakers (some weighing upwards of 100 kg) on command presents an
obvious logistical problem.
Clearly the problem of positional substitution calls for an automated
solution. This
realization led to the development of our own custom-built speaker shuffler.
Prior to
having a speaker shuffler, the positional effects in loudspeaker tests had
been balanced by
testing each loudspeaker in each position. Any position-related bias would
be equally
distributed or balanced across each loudspeaker. More scientifically
rigorous designs go
even further and test all possible loudspeaker-position permutations so that
any possible
context effects between loudspeaker and position are also balanced.
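The two balancing strategies described above can be sketched as follows. The
first gives each loudspeaker one turn in each position by cyclic rotation,
which is one simple way to do it and is an assumption here; the second
enumerates every loudspeaker-position permutation. Loudspeaker names are
placeholders.

```python
from itertools import permutations


def rotated_layouts(speakers):
    """Cyclic rotations: each speaker occupies each position exactly once."""
    n = len(speakers)
    return [tuple(speakers[(pos + shift) % n] for pos in range(n))
            for shift in range(n)]


def all_layouts(speakers):
    """Every possible assignment of speakers to positions."""
    return list(permutations(speakers))


speakers = ["A", "B", "C", "D"]
print(len(rotated_layouts(speakers)))  # 4
print(len(all_layouts(speakers)))      # 24
```

The jump from 4 layouts to 24 for only four loudspeakers shows why the fully
balanced design becomes costly so quickly.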
The disadvantage of not having a speaker mover is that additional trials are
required to balance the variable position. This relationship is illustrated
in Figures 3(a)-(b), which compare the number of trials required to balance
the variable position in multiple comparison tests, with and without a
speaker mover. The number of trials is calculated using the following
equation:
$$\text{Number of Trials} = N_{\text{Speaker Positions}}! \times N_{\text{Programs}} \times N_{\text{Repeats}} \qquad (4)$$
where $N_{\text{Speaker Positions}}$ is the number of speaker positions in
the test, $N_{\text{Programs}}$ is the number of program selections being
used and $N_{\text{Repeats}}$ is the number of repeats. In Figure 3, the
experimental design has no repeats, that is, $N_{\text{Repeats}} = 1$.
The graphs clearly show that an automated speaker mover can drastically
reduce the length of the experiment because the variable
$N_{\text{Speaker Positions}}$ always equals 1, regardless of how many
loudspeakers are compared. In comparing the two graphs we see
that there
is a 2:1 advantage for paired comparisons, a 6:1 advantage for triple
comparisons, and a
24:1 advantage for comparisons among four loudspeakers. When you multiply
these
ratios by the number of programs and repeats used in the experimental
design, the
number of trials quickly escalates. For multiple comparisons between four
loudspeakers
using 4 programs with no repeats, a total of 96 trials are required without
a speaker
mover. Having a speaker mover reduces the experiment to 4 trials. This
enormous
difference provided the justification to design and build a custom speaker
shuffler, since
over the long-term, it could afford considerable savings in person-listening
hours and
product development time.
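The trial counts quoted above follow directly from equation (4), read as the
factorial of the number of speaker positions times the number of programs
times the number of repeats. A minimal sketch, with an illustrative function
name:

```python
import math


def number_of_trials(n_positions: int, n_programs: int, n_repeats: int = 1) -> int:
    """Trials needed to balance position: N_positions! * programs * repeats."""
    return math.factorial(n_positions) * n_programs * n_repeats


# Four loudspeakers, four programs, no repeats.
without_mover = number_of_trials(4, 4)  # four positions must be balanced
with_mover = number_of_trials(1, 4)     # the mover leaves a single position
print(without_mover, with_mover)        # 96 4
print(without_mover // with_mover)      # 24
```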
A custom-built floor at the front of the room allows us to perform
positional substitution
of up to 9 different loudspeakers. A photograph of the speaker mover set up
for an A/B
stereo loudspeaker comparison is shown in Figure 4. Figure 5 shows a
photograph of the
speaker mover set up for a single comparison of a 5.1 loudspeaker system.
For the purposes of the photograph, the front, side and rear listening
curtains have been retracted out of the way. Each loudspeaker is attached to
one of nine pallets that
move in 1-inch
increments over a range of 4 feet forwards and backwards while the entire
array moves 4
feet to the left and right of the listener. The movement of the floor can be
controlled
manually from a programmable logic controller (PLC), or from a computer that
is linked
serially to the PLC via RS232. This allows all positions of loudspeakers to
be
programmed, stored and recalled quickly. The movement of the floor is
extremely quiet,
repeatable to within 1 inch, and fast. Transit time between positions is no
greater than 3 s,
and most positional changes are under 2 s. The transit speed is also
programmable and
can be decreased or increased if desired. As a safety measure, a light fence
is installed in
front of the moving floor so that if anyone crosses the light beam the
speaker mover
automatically stops.
The speaker shuffler allows position-controlled loudspeaker comparisons in
mono (up to
4 different systems), stereo (4 different systems) or three different
left/center/right
channel loudspeakers. At this time, positional substitution of surround and
rear channel
speakers must be done manually for multichannel experiments. The speakers
can be
placed away from the side and rear boundaries on stands, or placed on
adjustable shelves
that are mounted on baffles made of high-density board, that slide in a
track along the
perimeter of the room.
The moving floor gives us an efficient means to eliminate the effects of
loudspeaker
position, or it can do the reverse, and allow us to test the interaction
effects between
loudspeaker and position. By statistically-averaging a loudspeaker's
performance over a
number of different positions we can assess its off-axis performance, and a
number of
other parameters that are position dependent. All of this becomes essential
as we aim to
design loudspeakers that are 'room friendly' and develop digital room
equalization
systems.
Finally, the speaker mover also allows us to efficiently randomize, between
trials, how each loudspeaker is identified to the listener (e.g. "A", "B",
"C"). This ensures that
listeners' judgments in each trial are statistically independent between
program
selections. Without a speaker mover, experimenters normally do not move the
loudspeakers behind the screen until a complete block of programs has been
rated. These
are not independent judgments since the listener knows they are rating the
same
loudspeaker(s) within each block. The extent to which this biases the
results has not yet
been reported.
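Per-trial relabelling of this kind is straightforward to express in
software. The sketch below is only an illustration of the idea, not the
laboratory's actual control code; the loudspeaker names and the seed are
made up.

```python
import random


def assign_labels(speakers, rng):
    """Map listener-facing labels ("A", "B", ...) to speakers in random order."""
    labels = [chr(ord("A") + i) for i in range(len(speakers))]
    shuffled = rng.sample(speakers, k=len(speakers))
    return dict(zip(labels, shuffled))


rng = random.Random(1999)  # fixed seed so a session can be reproduced
speakers = ["speaker_1", "speaker_2", "speaker_3"]
for trial in range(3):
    # A fresh mapping is drawn for every trial, so the label "A" does not
    # tell the listener which loudspeaker was "A" in the previous trial.
    print(trial, assign_labels(speakers, rng))
```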
2.7 Blind versus Sighted Listening Tests
It is generally accepted among scientists that psychometric experiments must
be
performed double blind. For audio tests, this means the identities of the
components under test cannot be made known to the listener, and the
experimenter cannot directly control or administer the actual test.
In 1996 Toole and Olive in [2] conducted some blind versus sighted
loudspeaker tests
that showed both experienced and inexperienced listeners' judgments were
significantly
influenced by factors such as price, brand name, size and cosmetics. In
fact, the effect of these biases in the sighted tests was larger than that
of any other significant factor found in the blind tests, including
loudspeaker, position and program interactions. These
experiments
clearly show that an accurate and unbiased measurement of sound quality
requires that
the tests be done blind.
To remove these biases from listening tests in the MLL an acoustically
transparent
curtain that is visually opaque is placed between the products and the
listeners so that
they do not know the identities of the products under test. All other
associated equipment
in the signal path is also out-of-sight and locked in an equipment rack,
since the
performance and paranoia of some listeners can be affected by simply having
knowledge
that a certain brand of interconnect or CD player is in the signal path.
The front screen consists of a black open-knit polyester cloth chosen for
its acoustic transparency and used as grille cloth in many of our
loudspeakers. The material is attached to a large automated curtain roller
so it can be easily lowered and raised with an
infrared remote control. Weights are attached to a seam in the bottom so the
cloth retains
its tautness when in use. Retractable curtains made of the same material
surround the
listeners to hide the identities of loudspeakers located at the sides and
rear of the listening
room. Figures 4 and 5 show the front, side and rear curtains fully retracted
when not in
use, and Figure 8 shows the curtains in place during an actual listening
test.
2.8 Video Playback
Video and audio are increasingly recorded, processed and distributed
together.
There is a growing interest among researchers in studying how the perceived
quality of
one affects the perception of the other. Although much research still needs
to be done,
evidence suggests there are bimodal interactions between the two that
influence listeners'
expectations and judgments of the quality of the audio, and vice versa.
Keeping this in
mind, we were careful in selecting a video playback system within our budget
that had
sufficient quality, so that it would not negatively impact listeners'
opinions of the sound
quality.
We selected a three-gun front projection CRT made by Audio Video Source for
its above-average picture quality and the additional advantage that it has
no fan. The picture is
projected on a 100 inch Stewart Microperf screen that is retractable so it
can be removed
for audio-only listening tests. The acoustical effect of the screen is
another factor that is
not completely understood, and will be a subject of investigation.
2.9 Automated Control, Collection and Analysis of Data
In designing the MLL, we wanted to automate as much as possible the design
and
running of experiments including the collection, storage and analysis of
data, in order to
reduce the time and costs of performing listening tests. Automation of
experiments has
the additional benefit of making listening tests more reproducible, largely
because it
reduces the risk of human errors and biases introduced by the experimenter.
Considerable
ongoing effort in software development is helping us to fulfill these goals.
Automation begins at the experimental design stage where all important
experimental
parameters and details are defined by the experimenter as a "*.exp" file
that is stored in a
database that resides on the Windows NT server.
The experiment file contains the following information:
- The name of the experiment and a brief description
- Detailed information related to the experimental design and protocol,
  including definition of scales and randomization of variables. Protocol
  choices include single or multiple comparisons, ABX, ABC (with hidden
  reference) and different threshold measurement protocols.
- Instructions to the listeners
- Equipment control information and operational parameters required by the
  audio switcher for level matching, switching and overall output level
- The file names or track information for each program selection. This
  information is sent to the appropriate signal source device.
- Information related to the position and movement of loudspeakers
- A list of trials which the software randomly selects
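To make the list above concrete, here is a rough sketch of how such an
experiment definition might be held in memory. The paper does not describe
the actual "*.exp" format, so every field name, the class name and the
trial-building rule (one trial per program per repeat, shuffled) are
assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ExperimentFile:
    """Hypothetical in-memory form of one experiment definition."""
    name: str
    description: str
    protocol: str                 # e.g. "multiple comparison" or "ABX"
    listener_instructions: str
    programs: list                # file names or track information
    speaker_positions: dict       # loudspeaker -> stored position name
    repeats: int = 1
    trials: list = field(default_factory=list)

    def build_trials(self, seed=None):
        """One trial per program per repeat, presented in random order."""
        rng = random.Random(seed)
        self.trials = [(program, repeat)
                       for repeat in range(self.repeats)
                       for program in self.programs]
        rng.shuffle(self.trials)
        return self.trials


exp = ExperimentFile(
    name="demo",
    description="four-way loudspeaker comparison",
    protocol="multiple comparison",
    listener_instructions="Rate each loudspeaker on the preference scale.",
    programs=["track1", "track2", "track3", "track4"],
    speaker_positions={"A": "front-centre", "B": "front-centre"},
)
print(len(exp.build_trials(seed=1)))  # 4
```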
The Windows NT server controls the running of the experiment including
control of all
associated equipment in the signal path. A block diagram of the equipment
and signal
path for the MLL is shown in Appendix 1. The lines that connect each block
as well as
the signal paths are color coded and typed according to whether the signals
are audio
(either analog or digital), video, infrared or RF control, computer data,
MIDI control or sent over PCI or serial bus. The signal sources are the
blocks on the top left of Appendix 1. They currently include DVD and Laser
Disc players, an 8-channel PCM digital
recorder, and an 8-channel PC-based hard disk recorder (Lexicon Studio) and
its
associated A/D and D/A I/O cards. The audio and video outputs of the DVD and
LD
players are sent to the Lexicon DC-1 which provides AC-3 and DTS decoding
when
required. The analog outputs are sent to the Spirit 328 digital mixer which
provides signal
switching and level matching (within 0.03 dB) for up to 16 analog or digital
inputs. The
8-channel sources are sent digitally to the Spirit mixer and remain digital
until they are converted by the Studer D/A converters ahead of the power
amplifiers.
All operational parameters of the Spirit mixer can be viewed, stored and
recalled from the
NT Server via MIDI control.
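Level matching of this kind amounts to measuring the level of each input and applying a gain trim until the difference falls inside the tolerance. The sketch below assumes RMS level as the matching criterion and uses hypothetical function names; the paper does not say how the mixer or the control software measures level.

```python
import math


def rms(samples):
    """Root-mean-square value of a non-empty, non-silent sample sequence."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def level_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    return 20.0 * math.log10(rms(samples))


def trim_db(reference, candidate):
    """Gain in dB to apply to `candidate` so its RMS level matches `reference`."""
    return level_db(reference) - level_db(candidate)


def is_matched(reference, candidate, tolerance_db=0.03):
    """True if the two signals are within the 0.03 dB tolerance quoted above."""
    return abs(trim_db(reference, candidate)) <= tolerance_db
```

A trim of `t` dB corresponds to multiplying the samples by `10 ** (t / 20)`.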
The input of listener data, feedback and status information is done using
laptops
connected to the NT Server through a LAN. For single listener experiments,
the listener
can control switching of the stimuli remotely from their laptop. A
photograph of a
listener entering data on the laptop connected to the NT Server is shown in
Figure 6. For
multiple listener experiments, the NT Server controls the switching either
manually or
through software automation. During the experiment, all changes in listener
response data
can be viewed in real-time on the NT Server which performs running
statistical averages
and graphs of the results.
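A running average can be updated one response at a time without storing or re-reading earlier data. The following is a minimal sketch using Welford's online algorithm; it is an assumed technique, since the paper does not describe how the server computes its running statistics, and it handles only newly added ratings, not revisions of earlier ones.

```python
class RunningStats:
    """Online mean and sample variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def add(self, rating):
        """Fold one new listener rating into the running statistics."""
        self.n += 1
        delta = rating - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (rating - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two ratings have arrived."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0
```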
Remote access to the NT Server and control of the equipment from inside the
listening
room is also possible through a wireless RF mouse, keyboard and a flat panel
display, all
of which are connected to the Server. This might be required during set up
or for
informal listening sessions or product demonstrations. The flat panel
display also shows
status information to the listener(s) indicating which stimulus (e.g. A, B
or C) is currently
selected, and any other necessary information.
Finally, all experimental data and information related to listeners (date
and time, name,
seat position, age) are stored in a relational database, which can be
formatted and imported
into the various statistical packages we use for analysis of results.
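A relational layout for such data might look like the sketch below, written against Python's built-in sqlite3 module. The table and column names are hypothetical; the paper names the stored fields (date and time, name, seat position, age) but not the schema or the database product.

```python
import sqlite3


def create_schema(conn):
    """Create two assumed tables: one row per listener, one per response."""
    conn.executescript("""
        CREATE TABLE listener (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            age  INTEGER
        );
        CREATE TABLE response (
            id          INTEGER PRIMARY KEY,
            listener_id INTEGER NOT NULL REFERENCES listener(id),
            recorded_at TEXT NOT NULL,     -- ISO 8601 date and time
            seat        INTEGER NOT NULL,  -- seat position in the room
            stimulus    TEXT NOT NULL,     -- e.g. 'A', 'B', 'C'
            rating      REAL NOT NULL
        );
    """)


def mean_rating_by_stimulus(conn):
    """Return {stimulus: mean rating}, ready to hand to an analysis tool."""
    rows = conn.execute(
        "SELECT stimulus, AVG(rating) FROM response "
        "GROUP BY stimulus ORDER BY stimulus"
    )
    return dict(rows.fetchall())
```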
Not shown in the block diagram is a video camera used for monitoring
subjects and to
detect and hopefully deter possible cheating. Also not shown is a two-way
intercom that
allows communication between the subject(s) and the experimenter.
3.0 CONTROL ROOM AND LISTENER TRAINING LAB
Outside the MLL is a lab area dedicated for audio and test equipment used
during the set
up, running and monitoring of listening experiments. Here a space is also
dedicated for
the training of listeners, which is done over headphones at computer audio
workstations.
Bech in [20] has shown that 6 trained listeners can provide data that is as
statistically
reliable as data gathered from 18 untrained listeners. Clearly, considerable
cost-savings in
time and money can be realized if listeners are trained before they
participate in formal
listening experiments. At Harman, listeners with normal hearing undergo a
listener
training program, which is self-administered through a computer and custom
software
developed in-house [21]. The software teaches listeners to identify
frequency response irregularities, and to rate them on different scales,
according to the center frequency,
amplitude and Q of the distortion. The graphical user interface of the
training software is
shown in Figure 8.
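A frequency response irregularity defined by center frequency, amplitude and Q can be simulated with a single peaking filter. The sketch below uses the widely published "Audio EQ Cookbook" biquad formulas as an assumed stand-in; the paper does not say how the training software generates its equalizations, and the function names are hypothetical.

```python
import cmath
import math


def peaking_coeffs(center_hz, gain_db, q, sample_rate_hz):
    """Biquad coefficients (b, a) for a peak or dip of `gain_db` at `center_hz`."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp)
    a = (1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp)
    return b, a


def gain_db_at(b, a, freq_hz, sample_rate_hz):
    """Magnitude response of the biquad in dB at one frequency."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))
```

For example, `peaking_coeffs(1000, 6, 2, 48000)` describes a 6 dB resonance at 1 kHz with a Q of 2; raising Q narrows the affected band while the peak height stays the same.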
The training focuses on frequency-related problems since these are the
most common and most
serious audible problems found in most loudspeaker-related listening tests,
which many
untrained listeners find difficult to describe. The training solves this
problem by teaching
listeners to describe these phenomena in technical terms that design
engineers can
understand and use to correct any problematic audible artifacts in product
designs.
The training software has proved to be a valuable tool for teaching
listeners how to
describe and scale the various dimensions of sound quality in meaningful
terms, and
allows their performance to be quantified in terms that allow us to
discriminate good
listeners from bad ones. An additional, indirect, benefit accrued from
training is that we
have learned which program selections are most revealing of typical
frequency-related
artifacts introduced during the training exercises, and we now use these in
our product
evaluations.
4.0 CONCLUSIONS
In summary, we have described a new facility designed to test multichannel
components
efficiently and as bias-free as possible. The facility includes acoustically
transparent
listening screens that hide the identities of all multichannel loudspeakers
and equipment
within the audio path. Particular attention has been paid to two of the
most
problematic variables in listening tests: the listening room and the
position(s) of the
loudspeaker. Through the use of a computer-automated speaker shuffler, we
have greatly
reduced the amount of time and effort required to set up and test multiple
comparisons
between loudspeakers by reducing the factor "position" to a variable with a
single dimension or
level. Typical loudspeaker evaluations should be reduced in length by a
factor of 24:1.
The listening room itself is capable of testing up to three different 5.1 or
7.1 channel
systems and can accommodate 1-6 listeners at a time. The measurements we have
shown in
this paper indicate its performance in its current form meets the very
highest standards set
out by the ITU and EBU recommendations, in terms of volume, geometry,
reverberation
time, and the control of early reflections. The acoustics of the room can be
easily altered
from hemi-anechoic to more typical domestic room conditions by adding
reflective
panels to the room's boundaries.
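One reading that is consistent with the 24:1 figure quoted above is that fully balancing four loudspeakers over four fixed positions takes 4! = 24 arrangements per program, while the shuffler brings every loudspeaker to the same position and so needs only one. This is an assumption offered as an illustration, not the design stated in the paper, and the function names are hypothetical.

```python
from math import factorial


def trials(n_positions, n_programs, with_mover):
    """Trial count under the assumed model.

    Without a mover: every ordering of n loudspeakers over n positions,
    repeated for each program. With a mover: position has a single level,
    so one trial per program.
    """
    per_program = 1 if with_mover else factorial(n_positions)
    return per_program * n_programs


def reduction_ratio(n_positions, n_programs=1):
    """How many times fewer trials the mover needs under this model."""
    return (trials(n_positions, n_programs, with_mover=False)
            // trials(n_positions, n_programs, with_mover=True))
```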
Finally, the experimental design, set up and control are computer-automated
so that
experiments can be easily repeated, and are less prone to human error. The
more time-consuming
and mundane tasks such as collection and analysis of data have also been
computer-automated, so that experiment report writing becomes a simple
cut-and-paste
operation.
5.0 ACKNOWLEDGEMENTS
The authors would like to thank Tom Roberts of Bruel & Kjaer for his
assistance and
loan of the equipment used to make the background noise measurements shown
in this
paper.
6.0 REFERENCES
[1] F.E. Toole, "Listening Tests - Identifying and Controlling the
Variables", Proceedings
of the 8th International Conference, Audio Eng. Soc. (1990 May).
[2] F.E. Toole and S.E. Olive, "Hearing is Believing vs. Believing is
Hearing: Blind vs.
Sighted Listening Tests and Other Interesting Things", 97th Convention,
Audio Eng.
Soc., Preprint No. 3894 (1994 Nov.)
[3] F.E. Toole, "Listening Tests, Turning Opinion Into Fact", J. Audio Eng.
Soc., vol. 30,
pp. 431-445 (1982 June).
[4] F.E. Toole, "Subjective Measurements of Loudspeaker Sound Quality and
Listener
Performance", J. Audio Eng. Soc., vol. 33, pp. 2-32 (1985 January/February).
[5] Soren Bech, "Perception of Timbre of Reproduced Sound in Small Rooms:
Influence
of Room and Loudspeaker Position", J. Audio Eng. Soc., vol. 42, no. 12, p. 999 (1994).
[6] S.E. Olive, P. Schuck, J. Ryan, S. Sally, M. Bonneville, "The
Variability of
Loudspeaker Sound Quality Among Four Domestic-Sized Rooms", presented at the
99th
AES Convention, preprint 4092 K-1 (1995 October).
[7] F.E. Toole, "Loudspeakers and Rooms for Stereophonic Sound Reproduction",
Proceedings of the 8th International Conference, Audio Eng. Soc. (1990
May).
[8] S.E. Olive, P. Schuck, S. Sally, M. Bonneville, "The Effects of
Loudspeaker
Placement on Listener Preference Ratings", J. Audio Eng. Soc., Vol. 42, pp.
651-669
(1994 September).
[9] Antti Jarvinen, Lauri Savioja, Henrik Moller, Veijo Ikonen, Anssi
Ruusuvuori,
"Design of a Reference Listening Room - A Case Study", AES 103rd Convention,
New
York, Preprint 4559, September 26-29, 1997.
[10] IEC Publication 268-13: Sound System Equipment, part 13. Listening
Tests on
Loudspeakers (1985)
[11] NR-12 A, Technical Recommendation: Sound Control Rooms and Listening
Rooms.
2nd Edition, The Nordic Public Broadcasting Corporation, 1992.
[12] ITU-R Recommendation BS.1116: Methods for Subjective Evaluation of
Small
Impairments in audio systems including multichannel sound systems, 2nd
Edition (1997)
[13] ITU-R Recommendation BS.775: Multichannel stereophonic sound system
with and
without accompanying picture (1994).
[14] EBU Tech 3276 (2nd Edition, 1997).
[15] AES20-1996: Recommended Practice for Professional Audio - Subjective
Evaluation of Loudspeakers (1996).
[16] Walker, R. "Optimum Dimension Ratios For Small Rooms". 100th AES
Convention.
Preprint 4191 (Copenhagen, Denmark, 1996).
[17] S.E. Olive and F.E. Toole, "The Detection of Reflections in Typical
Rooms", J.
Audio Eng. Soc., vol. 37, pp. 539-553 (1989 July/August).
[18] R. Walker, "A controlled-reflection listening room for multichannel
sound", AES
104th Convention, Amsterdam, The Netherlands, Preprint #4645, May 16-19, 1998.
[19] E. Arató Borsi, T. Póth, and A. Fürjes, "New Reference Listening Room
for Two-Channel and Multichannel Stereophonic", AES 104th Convention,
Amsterdam, The
Netherlands, Preprint #4732, May 16-19, 1998.
[20] Soren Bech, "Selection and Training of Subjects for Listening Tests on
Sound-Reproducing Equipment", J. Audio Eng. Soc., vol. 40, no. 7, p. 590 (1992).
[21] S. E. Olive, "A Method for Training of Listeners and Selecting Program
Material for
Listening Tests", 97th Convention, Audio Eng. Soc., Preprint No. 3893 (1994
November).
TABLE 1
Parameter Harman MLL ITU EBU N12-A IEC AES
Volume 155.92 60-110 50-120
(m^3) (80)
Floor area 60.20 20-70 40 60 ± 10 20
(m^2)
Height 2.59 2.3 - 3.0 rec. 2.1
h (m) 2.8
Length 9.14 = 6
l (m) rec. 6.7
Width 6.58 = 4
w (m) rec. 4.2
(1.1 w / h) 2.80
(l / h) 3.53
(4.5 w / h - 4) 7.44
Tm (s) 0.23 0.29 0.29 0.35 0.3 - 0.6 0.45 ± 0.15
± 0.05 0.4 ± 0.05
T 63 Hz Max 0.34 Tm (s) 0.2 - 0.4 0.35 0.8
(s)
Noise Level NR 5 NR 10; NR 10; NR 10 L pA L pA 35 dB
abs. max abs. max or L pA 35 dB and NR 15 NR 15 15 dB L pC 50 dB
Table 1: Dimensions and Acoustic Parameters of Harman MLL versus
Recommendations of Various Standards
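The three ratio rows and the 0.29 s reverberation entries in Table 1 can be checked with a short calculation. The sketch assumes the room-proportion criterion 1.1 w/h <= l/h <= 4.5 w/h - 4 and the nominal reverberation time Tm = 0.25 (V / V0)^(1/3) s with V0 = 100 m^3, as given in ITU-R BS.1116 and EBU Tech 3276; neither formula is spelled out in the text above, and the function names are hypothetical.

```python
def proportion_limits(width_m, height_m):
    """Lower and upper bounds on l/h for a room of the given width and height."""
    ratio = width_m / height_m
    return 1.1 * ratio, 4.5 * ratio - 4.0


def proportions_ok(length_m, width_m, height_m):
    """True if 1.1 w/h <= l/h <= 4.5 w/h - 4."""
    lower, upper = proportion_limits(width_m, height_m)
    return lower <= length_m / height_m <= upper


def nominal_tm(volume_m3, reference_volume_m3=100.0):
    """Nominal mid-band reverberation time in seconds for a given room volume."""
    return 0.25 * (volume_m3 / reference_volume_m3) ** (1.0 / 3.0)
```

With the MLL dimensions from Table 1 (l = 9.14 m, w = 6.58 m, h = 2.59 m, V = 155.92 m^3), the results agree with the tabulated 2.80, 3.53, 7.44 and 0.29 to within rounding.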
[Figure 1 plot: SPL (dB, -10 to 70) versus frequency (Hz), octave bands from 32 Hz to 8 kHz; curves: AC ON, AC OFF, NR0, NR5, NR10, NR15.]
Figure 1 A spatially-averaged measurement showing the background noise in
the MLL
with the air conditioning off (dotted) and turned on (dashed) compared to
the NR curves:
0, 5, 10 and 15.
[Figure 2 plot: T60 (seconds, 0 to 0.7) versus frequency (Hz), octave bands from 63 Hz to 8 kHz; curves: EBU & ITU Opt., EBU & ITU Max, EBU & ITU Min, MLL.]
Figure 2 The Tm (RT60) values measured in the MLL compared to the optimal,
maximum and minimum values recommended by the EBU and ITU standards.
[Figure 3(A) plot: "Minimum Number of Trials Without Speaker Mover" (0 to 120) versus number of loudspeaker positions compared (1 to 4); one line each for 1, 2, 3 and 4 programs.]
Figure 3(A) The above graph shows the number of trials required for a
multiple
comparison loudspeaker experiment as a function of the number of loudspeaker
positions
compared. The lines represent experiments in which 1-4 programs are used.
The design
balances all position and context effects and has no repeats.
[Figure 3(B) plot: "Minimum Number of Trials With Speaker Mover" (0 to 20) versus number of loudspeaker positions compared (1 to 4); one line each for 1, 2, 3 and 4 programs.]
Figure 3(B) The same experiment is shown as in Figure 3(A) above except here
a
speaker mover is used.
Figure 4 Shown is the automated speaker shuffler of the MLL set up for A/B
stereo testing of two stereo
loudspeakers. Here the front listening screen is pulled up.
Figure 5 A front-left wide-angle shot of the MLL with the listening screens
pulled back.
The automated speaker shuffler is in the foreground, set up for 5.1 playback.
Note the side
and rear channel speaker baffles in the background, and the audio and
computer data
control box on the back wall. The video projector is mounted on the ceiling
with a
retractable screen in front of the speaker mover.
Figure 6 A listener performing a test by entering their data on a laptop
computer that is
networked to the NT Server. In this test, video is displayed and the front,
side and rear
curtains are drawn to hide the identities of the 5.1 loudspeaker systems
under test.
Figure 7 Shown is the control room area outside the listening room, which
houses all audio
equipment and where experimental control and monitoring take place. Shown here are the
NT Server
on the left, and two listener training workstations on the right.
Figure 8: The GUI of the listener training software. The listeners' task is
to match the 4 different
equalizations, indicated by their frequency response curves, that are randomly
assigned to Buttons A-D.
Feedback is given on their responses. The "FLAT" button allows listeners to
audition the program
without any equalization added.
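The matching task described in the Figure 8 caption can be sketched as a random assignment followed by per-button feedback. The function names and the scoring rule (fraction of buttons matched correctly) are assumptions; the caption says only that the assignment is random and that feedback is given.

```python
import random


def assign_equalizations(equalizations, rng):
    """Randomly assign four equalization labels to buttons A-D."""
    buttons = ["A", "B", "C", "D"]
    shuffled = list(equalizations)
    rng.shuffle(shuffled)
    return dict(zip(buttons, shuffled))


def score(assignment, answers):
    """Return (per-button feedback, fraction correct).

    `answers` maps each button to the equalization the listener chose for it.
    """
    feedback = {button: answers.get(button) == eq
                for button, eq in assignment.items()}
    return feedback, sum(feedback.values()) / len(feedback)
```

Passing an explicit `random.Random(seed)` keeps a training session repeatable, which also makes it possible to compare listeners on identical trials.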
Figure 9: The GUI of the software used for a typical listening test or
training exercise. Listeners
enter their preference ratings for sounds A-D relative to a given reference
("REF"). Ratings are also
given on spectral balance and distortion. Relevant comments are optional.
Appendix 1: Block Diagram of Harman Multichannel Listening Laboratory (MLL)
showing key features, equipment and path for audio,
video, data and control signals.
[Block diagram. Blocks: Speaker Mover (controlled via RS-232); Proceed Amplifier (16 channels); Studer D/A Converter (16 channels); Sony PCM800; Curtain; Stewart Video Screen; DVD; Laser Disk; Flat Panel Display; Wireless Mouse + Keyboard; Lexicon DC-1; Listeners' Laptops; Computer Card; Spirit Digital 328; Faroudja Line Doubler; Front Projector; Lexicon I/O Box; MIDI Card; Ethernet Hub; Windows NT Server; Lexicon Studio Core PCI Card; RS232/IR/RF Controller. Signal types: Infrared, Analog Audio, Digital Audio, Video, Computer Data, MIDI Data.]