Gary Eickmeier

STEREO RAW

This article addresses a few misconceptions about what stereo is and then
offers some analogies to correct them.

"Stereo" is a generic term that means stereophonic, a field-type system of
auditory perspective using more than one channel from microphones to
playback speakers. It can be the legacy two channel system for commercial
recordings or any number of extra channels for surround sound or center
channel or even full peripheral, including above and below. It is reproduced
by loudspeakers in a room, placed in positions which are geometrically
similar to those of the instruments. Stereo is distinct from monophonic, a
single channel on one loudspeaker in a room, and from binaural, a
head-related system using a dummy head for recording and either headphones
or a crosstalk-cancelling circuit for the speakers that isolates each
channel to its respective ear. "Monaural" would be a single channel sent to
one ear on a headphone, "diotic" being a single channel sent to both ears
equally. In general the suffix "phonic" means on loudspeakers and "aural"
means on headphones, with the exception that "binaural" can be reproduced on
speakers as mentioned above, in which case the term is simply loudspeaker
binaural, because the signals are still isolated to each ear at the head of
the listener.

The first main point is that there are a few misconceptions about what
stereo is, rooted in some confusion between the head-related system of
binaural and the field-type system stereophonic. Some distinguished writers
still believe that stereo is a "two ears, two speakers" head-related system
intended to pipe the two recorded signals to the two ears, creating an
illusion of a panorama of sounds as heard by the microphones. They believe
that the recorded signals contain all of the spatial information necessary
to decode the original sound field by means of the binaural localization
mechanisms of human hearing, using the phase, amplitude, and timing cues in
the recording. Some believe that the system is based on the two ears and their
separation and pickup patterns, their pinnae effects, Interaural Cross
Correlation (IACC), and response curves, or transfer functions. Some have
stated outright that they believe that the "problem" with stereo is
interaural crosstalk.

I hope to show that the system (stereophonic) has nothing whatsoever to do
with the human hearing mechanism, the number of ears on our heads, their
separation, pinnae, or response curves.

We all know how stereo lateralization works, with a summing localization
between the two channels, the intensity or timing differences between
channels making it possible to perceive an auditory event anywhere along a
line between the speakers. With coincident miking techniques where there is
no timing difference between channels the summing localization is based only
on amplitude differences between channels. With some separation between
pickup microphones there are both intensity and timing differences.
Multi-miking techniques with spot mikes picking up various parts of the
sound field and pan-potted into the mix by means of intensity can also be
incorporated, either as the sole method of recording or as spot mikes for
certain important instruments or the human voice for soloists or small
groups.
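
As a rough illustration of intensity pan-potting, the sketch below uses a
constant-power pan law and the stereophonic tangent law to estimate where the
summed image falls between the speakers. The function names, the pan scale of
-1 to +1, and the 30 degree speaker half-angle are assumptions made for the
example, not taken from any particular console, and the tangent law is only
an approximation for a centered listener.

```python
import math


def pan_gains(position):
    """Constant-power pan law (an assumed, common choice).

    position runs from -1.0 (full left) to +1.0 (full right).
    Returns (left_gain, right_gain) with left**2 + right**2 == 1.
    """
    theta = (position + 1.0) * math.pi / 4.0  # 0 .. pi/2
    return math.cos(theta), math.sin(theta)


def phantom_angle_deg(gain_left, gain_right, half_angle_deg=30.0):
    """Tangent-law estimate of the phantom image angle.

    Positive angles are toward the right speaker. half_angle_deg is the
    angle of each speaker from straight ahead (assumed 30 degrees).
    """
    ratio = (gain_right - gain_left) / (gain_right + gain_left)
    half = math.radians(half_angle_deg)
    return math.degrees(math.atan(ratio * math.tan(half)))


if __name__ == "__main__":
    for position in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = pan_gains(position)
        angle = phantom_angle_deg(left, right)
        print(f"pan {position:+.1f}: L={left:.3f} R={right:.3f} "
              f"image at {angle:+.1f} degrees")
```

Only the gain ratio between the two channels enters the calculation, which is
the amplitude-only case described above for coincident or pan-potted sources.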

So where does this alleged confusion between binaural and stereophonic come
from? It could be from the innocent presumption that the use of two speakers
for playback has something to do with our having two ears, which in turn may
have arisen from the Blumlein patent and method of recording with two
coincident mikes. But meanwhile, at around the same time at Bell Labs,
researchers were experimenting with multiple channels and placing speakers
on the reproduction stage in positions similar to those of the instruments
and microphones used for pickup. One of their ideal but impractical methods of
reproduction was called the "curtain of sound" in which a line of many
microphones might record the performance and a similar line of speakers
would reproduce it on another stage, or playback space. They defined
binaural as a head-related system and stereophonic as a field-type system,
in which the idea is to place many speakers on a sound stage and reconstruct
the sound fields that existed in the original. Binaural, on the other hand,
was always and only a two channel head-related system based on the human
hearing mechanism and recorded with a dummy head, the idea being that the
headphones would introduce to the ears the identical signals that the dummy
head heard at the recording site. William Snow remarked that the binaural
system brought the listener to the original performance location, whereas
the stereophonic system brought the performers into our own listening rooms.
Bell Labs ended up recommending a three channel system, but practical
limitations reduced it to the two channel system that we know today.

So what is the major difference between a head-related and a field-type
system?

There are two fundamentally different ways to reproduce a sensory
experience. You can reproduce the sensory input directly, such as with
binaural, or you can reproduce the object itself, the sound fields produced
by the orchestra and let the subject's own sensory apparatus pick it up in
the normal way, just as it does with live sound. The sensory input system
depends heavily on attempting to pick up the sounds in the same way that our
own ears do, such as with the use of a dummy head shaped like our heads and
with the number of ears, ear spacing, and pinnae as much like ours as
feasible. But the stereophonic system has nothing whatsoever to do with the
number of ears on our heads, the spacing between them, their pinnae effects,
or their frequency response (transfer functions), and the whole recording
and reproduction process can be accomplished without any knowledge or
consideration of those factors - NONE. Compare it to the difference between
sculpture and 3D photography. If we want to reproduce the image of an
elephant, we could do it one of two ways. We could either take a 3D
photograph in color and introduce both halves of the image into our eyes, or
we could hire a sculptor to make a very real 3 dimensional model of an
elephant, even to the point of being life sized and placed in a background
such that we could walk all around it and each of us perceive it with our
native vision mechanism, the whole process accomplished with NO knowledge of
the human vision mechanism. In fact, all beings who can see in three
dimensions, such as animals or visitors from another planet, would behold
the model in the same way as they would the live elephant, even with no
knowledge of how they see, hear, or anything else, if we did the
reproduction as a model of the real thing rather than a direct sensory
input.

I hope to show that the system of stereophonic sound depends ONLY on our
knowledge and study of sound fields in rooms, and not upon knowledge of the
human hearing mechanism, except for the very fortunate psychoacoustic fact
of the summing localization being able to permit the simplification of the
number of channels to fewer than the number of instruments being reproduced.

The raw, base example for purposes of illustration would be a team of
researchers wishing to begin exploring the field-type system of auditory
perspective. They go into the recording studio with a battery of microphones
and a multi-channel recorder. They close-mike each instrument but also
include a small amount of the reverberance from the studio as would be heard
near the instrument. Some instruments such as the
piano or drum set might call for more than one mike to capture the extent of
the drum kit or the width of the piano.

On playback, we select a good sounding playback space and place the
speakers, possibly selected for a radiation pattern similar to their
instruments, in positions in the room that are geometrically similar to the
original. We now have a "they are here" system if no reverberance was
recorded, or modified a touch by the original hall sound if some was
recorded. Notice also that if some was recorded, and if we use a little of
the natural reflecting surfaces around the speakers in the same way that the
original hall's walls were used, the reflected sound from instruments on
the right side would reflect from the right wall of the playback room, and
so on, but
the instruments themselves would remain anchored where they belong by means
of the precedence effect. In total, this "model" of the original sound would
be 3 dimensional, having depth and width and appropriate ambience behind and
around, and you could literally walk all around the model and hear it from
various angles from anywhere in the room.
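
To make "geometrically similar" concrete, here is a minimal sketch that
scales a set of instrument positions down to fit a playback room while
keeping their relative layout. The function name, the coordinate convention
(x across the stage, y as depth from the front edge, both in meters), and the
example positions are hypothetical, chosen only for illustration.

```python
def scale_layout(positions, playback_width):
    """Uniformly scale stage positions to a given playback width.

    positions maps an instrument name to (x, y) in meters on the original
    stage. The layout is centered on x = 0 and scaled by one factor in both
    directions, so angles and proportions are preserved.
    """
    xs = [x for x, _ in positions.values()]
    original_width = max(xs) - min(xs)
    if original_width == 0:
        raise ValueError("layout has no width to scale")
    factor = playback_width / original_width
    center_x = (max(xs) + min(xs)) / 2.0
    return {
        name: ((x - center_x) * factor, y * factor)
        for name, (x, y) in positions.items()
    }


if __name__ == "__main__":
    # Hypothetical studio layout, 8 m wide, scaled to a 3 m wide stage.
    stage = {"violin": (-4.0, 2.0), "cello": (4.0, 2.0), "drums": (0.0, 6.0)}
    for name, (x, y) in scale_layout(stage, 3.0).items():
        print(f"{name}: x={x:+.2f} m, depth={y:.2f} m")
```

A single scale factor is used on purpose: scaling width and depth separately
would fit the room more tightly but would no longer be geometrically similar
to the original.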

This is the raw model for the stereophonic system. I would first point out
that the whole process was accomplished with NO knowledge or reference to
the human hearing mechanism and would be the same to all listeners, each one
hearing the model with his or her own hearing system. It was recorded and
reproduced with knowledge ONLY of sound fields in rooms, reconstructing them
in the new space as a model of the original.

I would then point out that this ideal system could be simplified down to
fewer and fewer channels for a more practical system without losing too much,
if we could only remember what it is that we are doing with the system and
not lose sight of the fact that it is a field-type system, a literal
reconstruction, or model, of the original, not a binaural system.

We first reduce the number of channels to as few as two, thanks to the
summing localization being able to place all of the instruments anywhere
along a line between the two speakers. We can then pull the speakers out
from the walls and place them with some geometrical similarity to the
original left and right positions of the orchestra. Finally, we can
customize the radiation patterns of the speakers to a lower direct to
reflected ratio because of our closeness to the speakers, relative to our
original distance from the orchestra. If we now treat the walls so that we
might get some of the reflected sound from the recording bouncing from the
left, center, and right walls of the listening room, we stand a chance of
having the various recordings make our playback rooms take on most of the
important characteristics of the original acoustics.
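
The trade-off between listening distance and radiation pattern can be
estimated with the standard statistical room acoustics approximation for
critical distance. This is a sketch under stated assumptions: a diffuse
reverberant field (a poor fit for many small rooms), metric units, and a
single directivity factor Q standing in for the whole radiation pattern. The
function names and the example room values are hypothetical.

```python
import math


def critical_distance(q, volume_m3, rt60_s):
    """Distance in meters where direct and reverberant levels are equal.

    Uses the common approximation Dc = 0.057 * sqrt(Q * V / RT60),
    which assumes a diffuse field and Sabine reverberation.
    """
    return 0.057 * math.sqrt(q * volume_m3 / rt60_s)


def direct_to_reverberant_db(distance_m, q, volume_m3, rt60_s):
    """Direct-to-reverberant ratio in dB at a given listening distance."""
    dc = critical_distance(q, volume_m3, rt60_s)
    return 20.0 * math.log10(dc / distance_m)


if __name__ == "__main__":
    # Hypothetical 60 cubic meter room with 0.4 s reverberation time.
    for q in (1.0, 4.0, 10.0):
        ratio = direct_to_reverberant_db(3.0, q, 60.0, 0.4)
        print(f"Q={q:>4}: D/R at 3 m is {ratio:+.1f} dB")
```

Under this estimate, sitting closer raises the direct-to-reflected ratio,
while a wider radiation pattern (lower Q) lowers it, which is the adjustment
suggested above.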

Finally, so what?

The answer is that this is a radical change in thinking about how the
process works, from a two ears, two speakers process achieved with the
direct sound output from two speakers to a 3 dimensional model of a typical
original sound field, a reconstruction of all aspects of the original within
the listening room rather than a direct sensory input from the speakers to
your ears. The paradigm to be sought is now sound fields in rooms rather
than the "accuracy" of getting the signals intact from the speakers to your
ears. The new model requires paying attention to the radiation patterns,
room positioning, and acoustical qualities of the whole playback system.
In-wall speakers, nearfield speakers, dead rooms, highly focused sound from
the speakers, all must be re-examined in light of the new theory.

The total acoustical situation that we are hearing when our ears are free to
hear it without any attempt to isolate the channels at the ears or from the
room can be described visually as the image model of the fields in the room,
whether it be the original concert hall or the playback room. What we are
hearing is the total acoustical situation, direct, early reflected, and late
reflected reverberant sound. All of these must be reproduced, which is to
say reconstructed in front of us, or else it will sound different from the
original. The preferred solution would be surround sound, but in any case
the sound patterns within the room must be honored and the goal changed to
realism rather than accuracy. We are not "doing" accuracy with stereophonic
recording, unless you want to hear the piano from underneath the lid, the
singer from a foot in front of her tonsils, or the perspective from 9 feet
above the head of the conductor. Rather, we are seeking realism as will be
displayed in the final result by the placement of the microphones and
speakers to display the sounds from a distance from us in the listening
room, with signal processing or extra channels all around us, and NOT from
the perspectives of any particular microphones.

Gary Eickmeier
