#1 - Ryan
Fourier Analysis, or, How the Orchestra Learned to Play "Jet Engine"

I'm looking to find out more about writing some software that will use
traditional classical instruments to emulate "natural" or "non-musical"
sounds. The software will perform some type of analysis on an audio
file; I imagine an FFT would be used at some point, but the problem with
an FFT is that it only tells you what "perfect" or pure sine-wave
frequencies are present in a sound. Besides the flute, not much else
in an orchestra has anything close to a sine-wave output. After this
analysis is done, the software will look through a library of sounds
made by traditional instruments. These sounds will include every
noise and playing style every traditional instrument can produce. The
software will then juggle the sounds around at various dynamic levels,
in various rhythms, etc., until it comes up with the closest
combination to the original sound. Perhaps a car-engine sound file
would yield three double basses, a flute or two in very quiet
irregular rhythms, and maybe a horn would be involved during gear
changes. I might not have to tell you that Gyorgy Ligeti's
"Atmospheres" and his "Mechanical Music" served as the chief
inspiration for this idea.

Has anybody ever heard of anything like this, or know where I might
start to look for info on this subject? I'm not looking for
programming help, but rather help with setting up the math. Are
there any scientific communities online that I could point my
questions to? Any books on this type of thing? I've heard Csound
might work for this, but I thought Csound was for composing, not for
analyzing existing sound files. I can't seem to come up with the
right keywords to get anything out of Google, but I hoped someone here
might be able to put me on the right path.
#2 - Karl Winkler

Ryan wrote:
I'm looking to find out more about writing some software that will use
traditional classical instruments to emulate "natural" or "non musical
sounds." The software will perform some type of analyses on an audio
file, I imagine FFT would be used at some point, but the problem with
FFT is that it only tells you what "perfect" or pure sine wave based
frequencies are present in a sound.

snip


Ryan,

You might want to revisit Fourier analysis... my understanding
is that with it, you are not just determining the fundamental
frequency of a sound, but all the other frequencies present in it as
well. The key is that any given sound IS a collection of sine waves,
at different intensities, with different relationships in *time*.

For example, a square wave is a sine wave at the fundamental, then a
series of harmonics (3rd, 5th, 7th, 9th, etc.) in diminishing
amplitude, in a specific arrangement. Fourier can describe this
arrangement.

What you are attempting to do with sounds reminds me of those posters,
where one large picture (say, of a person) is made up of hundreds of
smaller pictures. The movie poster for "The Truman Show" starring Jim
Carrey comes to mind. Perhaps some of the math or the code from that
system may work for what you are doing.

Regards,

Karl Winkler
Lectrosonics, Inc.
http://www.lectrosonics.com
#3 - Scott Dorsey

Ryan wrote:
I'm looking to find out more about writing some software that will use
traditional classical instruments to emulate "natural" or "non musical
sounds." The software will perform some type of analyses on an audio
file, I imagine FFT would be used at some point, but the problem with
FFT is that it only tells you what "perfect" or pure sine wave based
frequencies are present in a sound.


No. ANY arbitrary waveform can be decomposed into sine waves. When you
put the sines back together, you can reconstitute the original wave. This is
the WHOLE POINT of the Fourier series. The time domain and frequency domain
representations of the waveform are equivalent and you can convert from one
to the other and back with impunity.
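Scott's point about converting back and forth "with impunity" is directly testable. A minimal NumPy sketch (my own, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)        # any arbitrary waveform at all

spectrum = np.fft.rfft(signal)            # time domain -> frequency domain
reconstructed = np.fft.irfft(spectrum, n=len(signal))  # and back again

# The round trip reproduces the waveform to floating-point precision.
roundtrip_error = np.max(np.abs(signal - reconstructed))
```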

Besides the flute, not much else
in an orchestra has anything close to a sine wave output. After this
analysis is done, the software will look through a library of sounds
made by traditional instruments. These sounds will include every
noise and playing style every traditional instrument can produce. The
software will then juggle the sounds around at various dynamic levels
in various rhythms and etc until it comes up with the closest
combination to the original sound.


Why use a computer for this anyway? George Gershwin did a perfectly good
job of this by ear.
--scott
--
"C'est un Nagra. C'est suisse, et tres, tres precis."
#4 - philicorda

On Tue, 12 Oct 2004 07:09:24 -0700, Karl Winkler wrote:

snip
What you are attempting to do with sounds reminds me of those posters,
where one large picture (say, of a person) is made up of hundreds of
smaller pictures. The movie poster for "The Truman Show" starring Jim
Carrey comes to mind. Perhaps some of the math or the code from that
system may work for what you are doing.


There is a free program called "Soundmosaic" that does exactly this. It
sorta works.

http://thalassocracy.org/soundmosaic/

(Some people here may appreciate the demo of a George Bush speech
combined with a chimp screaming.)

And "Dissociated Studio", which does the same kind of thing, but within
a single audio file.

http://www.panix.com/~asl2/music/dissoc_studio/
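For the curious, the core of a Soundmosaic-style program can be sketched in a few lines. This is my own toy reconstruction of the general idea, not Soundmosaic's actual algorithm: chop the target into grains and replace each with the best-matching grain from a source recording.

```python
import numpy as np

def mosaic(target, source, grain=256):
    """Rebuild the target grain by grain from the best-matching
    (highest inner-product) grain found anywhere in the source."""
    out = np.empty_like(target)
    starts = range(0, len(source) - grain + 1, grain)
    for i in range(0, len(target) - grain + 1, grain):
        piece = target[i:i + grain]
        best = max(starts, key=lambda s: np.dot(piece, source[s:s + grain]))
        out[i:i + grain] = source[best:best + grain]
    return out

# Toy demo: a 500 Hz tone hidden at the end of a noise recording.
rng = np.random.default_rng(3)
fs = 8000
t = np.arange(2048) / fs
target = np.sin(2 * np.pi * 500 * t)     # 500 Hz: exactly 16 samples/period
source = np.concatenate([rng.standard_normal(4096), target])
rebuilt = mosaic(target, source)
```

Since the source here actually contains the target, the mosaic finds the matching grains and reproduces the tone; with real recordings it only "sorta works", as noted above.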

#6 - Ryan

Scott Dorsey wrote:


Hi Scott. How have you been? Heard any more Sonic Youth of late?

Ryan wrote:
I'm looking to find out more about writing some software that will use
traditional classical instruments to emulate "natural" or "non musical
sounds." The software will perform some type of analyses on an audio
file, I imagine FFT would be used at some point, but the problem with
FFT is that it only tells you what "perfect" or pure sine wave based
frequencies are present in a sound.


No. ANY arbitrary waveform can be decomposed down to sine waves. When you
put the sines back together, you can reconstitute the original wave. This is
the WHOLE POINT of the Fourier series. The time domain and frequency domain
representations of the waveform are equivalent and you can convert from one
to the other and back with impunity.


So what I have to do is perform an FFT on each of my sound "samples"
(the squeak of a violin played behind the bridge, a viol's "dry
string" sounds, regular arco, pizzicato, and so on for all the other
instruments), and then perform an FFT on any given sound file I'm
interested in emulating. After that, what kind of math would be used
to sort through all the samples and figure out what goes best where?

Samplitude features an FFT analysis window. It just looks like a
regular EQ analysis to me. Is it the case that if I take each
frequency as a sine wave and apply it at the given amplitude, I
will have achieved X's sound? Is there any way to simplify that? Even
the simplest natural sounds have about a 5 kHz range. Do I have to
create 5000 individual sine waves? The FFT graph only shows frequency
over time. How do I find out about the relationships between the
frequencies as far as timing? For example, say I put a sine wave at
2 kHz and 1 kHz. Obviously the 2 kHz oscillates twice as fast as the
1 kHz, but beyond that, the starting/ending points (where y=0) might
not sync up. The 2 kHz sine may start, say, 300ths of a second after
the 1 kHz. I don't think info like this can be found out by the FFT
window, can it?

Do I have this right at all, or am I still not grasping Fourier
transforms?
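(An aside on the timing question: the information is actually there. A magnitude-only FFT display discards it, but the complex FFT output carries a phase for every bin, and a start-time offset shows up as extra phase. A NumPy sketch, my own illustration:)

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs                         # a one-second window
delay = 0.0001                                 # 2 kHz component starts 0.1 ms late
x = np.sin(2*np.pi*1000*t) + np.sin(2*np.pi*2000*(t - delay))

spectrum = np.fft.rfft(x)                      # complex: magnitude AND phase
# With a one-second window, bin k sits at k Hz.
phase_1k = np.angle(spectrum[1000])
phase_2k = np.angle(spectrum[2000])

# An undelayed sine has phase -pi/2 in this convention; any extra
# phase on the 2 kHz bin encodes its start-time offset.
delay_recovered = -(phase_2k + np.pi/2) / (2 * np.pi * 2000)
```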

Besides the flute, not much else
in an orchestra has anything close to a sine wave output. After this
analysis is done, the software will look through a library of sounds
made by traditional instruments. These sounds will include every
noise and playing style every traditional instrument can produce. The
software will then juggle the sounds around at various dynamic levels
in various rhythms and etc until it comes up with the closest
combination to the original sound.


Why use a computer for this anyway? George Gershwin did a perfectly good
job of this by ear.
--scott


The hope is to use this as a learning tool and eventually stop using
it, not unlike training wheels on a bicycle. I could probably do a
decent job of this in a tonal 4/4 world, but most real life sounds
contain dissonant and microtonal intervals, as well as many
"co-rhythms" that work together to create larger aspects of the sound,
such as pulses, and trigonometric polynomials. Making something that
sounds like a train whistle is one thing. I imagine it would have
been rather difficult for a composer of even Gershwin's skill to
notate the sound of a babbling brook during a rain storm, with a
propeller airplane heard far off in the distance. It would
break my head, not to mention take a considerable amount of time for
me to do this by ear. Whereas with a system of this sort, I could run
twenty analyses and in a day know far more about this type of
orchestration than I would in a month if I did it all in my head. I
would gain a good overall knowledge that I can use as starting points
for future works, I would have a "feel for it". On the other hand,
doing this all by ear until I figure out how to make it work, is like
finding out the details first and only later getting the overall
picture--not the most efficient way of working. Like trying to
complete a jigsaw puzzle with no picture of what the finished puzzle
looks like. Learning the individual interactions between the parts
does not always lead to a good understanding of the whole. Anyway, I
learn best working from the outside in.
#7 - Scott Dorsey

Ryan wrote:
Scott Dorsey wrote:

Hi Scott. How have you been? Heard anymore Sonic Youth of late?


I'm listening to Toots and the Maytals as I type this...

No. ANY arbitrary waveform can be decomposed down to sine waves. When you
put the sines back together, you can reconstitute the original wave. This is
the WHOLE POINT of the Fourier series. The time domain and frequency domain
representations of the waveform are equivalent and you can convert from one
to the other and back with impunity.


So what I have to do is perform an FFT on each of my sound "samples"
(the squeak of a violin played behind the bridge, a viol's "dry
string" sounds, regular arco, pizzicato, and so on for all the other
instruments), and then perform an FFT on any given sound file I'm
interested in emulating. After that, what kind of math would be used
to sort through all the samples and figure out what goes best where?


I'm not sure this will really do what you want, but you can try it. You
could just do a standard correlation coefficient and see how close they
come.

Then again, you could probably just do a correlation coefficient on the
samples themselves. That might be fun to look at.

Samplitude features an FFT analysis window. It just looks like a
regular EQ analysis to me. Is it the case that if I take each
frequency as a sine wave and apply it at the given amplitude, I
will have achieved X's sound? Is there any way to simplify that? Even
the simplest natural sounds have about a 5 kHz range. Do I have to
create 5000 individual sine waves? The FFT graph only shows frequency
over time. How do I find out about the relationships between the
frequencies as far as timing? For example, say I put a sine wave at
2 kHz and 1 kHz. Obviously the 2 kHz oscillates twice as fast as the
1 kHz, but beyond that, the starting/ending points (where y=0) might
not sync up. The 2 kHz sine may start, say, 300ths of a second after
the 1 kHz. I don't think info like this can be found out by the FFT
window, can it?


No, you probably want a tool like MATLAB. How many terms you want to
calculate out to depends on how good an approximation you want. I think
that the number of terms that you're going to get is going to be larger
than the number of samples in the original file for most arbitrary sounds.
You can decide to reduce this by bandlimiting the original signal, though.
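Bandlimiting as a way of cutting down the number of terms can be sketched by zeroing FFT bins above a cutoff (NumPy, my own illustration):

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
rng = np.random.default_rng(2)
x = rng.standard_normal(fs)              # arbitrary one-second signal

spectrum = np.fft.rfft(x)                # len(x)//2 + 1 = 4001 terms
cutoff_hz = 500                          # keep only components below 500 Hz
limited = spectrum.copy()
limited[cutoff_hz + 1:] = 0              # 1 Hz per bin for a 1 s window

terms_before = np.count_nonzero(spectrum)
terms_after = np.count_nonzero(limited)  # far fewer sine terms to deal with
x_bandlimited = np.fft.irfft(limited, n=len(x))
```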
--scott
--
"C'est un Nagra. C'est suisse, et tres, tres precis."
#8 - Kurt Riemann

Symbolic Sound's KYMA does resynthesis in real-time.

Gotta buy the box, though . . .




Kurt Riemann


#9 - Ryan

Scott Dorsey wrote:
I'm listening to Toots and the Maytals as I type this...


Whoa. New stuff? I haven't even heard the title before. I'm kinda
weaning off the Youth a little bit. Nowadays I'm really into Ligeti.
Have you heard his "Atmospheres" or his "San Francisco Polyphony", or
his "Continuum (für Cembalo)"? My goodness! They're must-listens.
He has some of the most revolutionary music I have ever heard. I'm
sure you will understand my wanting this type of software
once you hear these pieces, if you haven't heard them already.

So what I have to do is perform an FFT on each of my sound "samples"
(the squeak of a violin played behind the bridge, a viol's "dry
string" sounds, regular arco, pizzicato, and so on for all the other
instruments), and then perform an FFT on any given sound file I'm
interested in emulating. After that, what kind of math would be used
to sort through all the samples and figure out what goes best where?


I'm not sure this will really do what you want, but you can try it. You
could just do a standard correlation coefficient and see how close they
come.

Then again, you could probably just do a correlation coefficient on the
samples themselves. That might be fun to look at.


A correlation for the whole sound file, or a correlation every set
number of seconds, or a type of GUI tool to set up the sections
you want to emulate? This would be good for monophonic reduction, but
more math would be involved if you wanted to reduce the sound file to,
say, 3 concurrent, or 13 concurrent, instruments, right?

Ideally this software would/could use both of these approaches.

snip

... I don't think info like this can be found out by the FFT
window, can it?


No, you probably want a tool like matlab. How many terms you want to
calculate out to depends on how good an approximation you want. I think
that the number of terms that you're going to get is going to be larger
than the number of samples in the original file for most arbitrary sounds.
You can decide to reduce this by bandlimiting the original signal, though.
--scott


Terms? As in how many instruments I want to end up with? Or by what
specs I will measure the original sound file? If the latter, do you
mean something like bit rate, sample rate, something else? Why would
the number of terms be greater than the sample rate? Is MATLAB an
audio tool? Probably just a math program, right? So I would enter the
PCM info, run the calculations, and then use the output to create a
PCM file? Sorry for so many questions.
#10 - Bob Cain



Ryan wrote:

snip


Ryan, most of what you are asking about is well beyond the
state of the art, the art being DSP. I would suggest that
you go to comp.dsp and set forth what it is you want, to get
more specific feedback about it.


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein


#11 - Ben Bradley

On Wed, 13 Oct 2004 17:01:59 -0700, Bob Cain wrote:



Ryan wrote:

snip


Ryan, most of what you are asking about is well beyond the
state of the art, the art being DSP.


I'm trying to follow the thoughts... it appears what he wants is a
computer program that does with an orchestra what one does with a
synthesizer to imitate the sound of a musical instrument ("imitative
synthesis"). I suppose nowadays you could write a program that scans a
digitized audio recording and makes a patch (or orchestral score) that
somewhat crudely approximates the sound, but it could surely be
tweaked by hand/ear to make it better, or perhaps a synthesist (person
making a synth patch) would just start over and make something that
sounds better/closer. I doubt that having it do a mathematical
operation such as a least-squares fit of the FFT would get it
anywhere near as close to the "original sound" as a person
experienced in doing these things would.
But to make "arbitrary sounds" with orchestral instruments ... the
only thing I've heard that's anything like this is on Peter Schickele's
"Upper West Side" where he says something about hearing Vivaldi one
more time. The strings play through the melody once, then they play
the beat of the melody with hip-hop record-scratching sounds. It was
hard to believe my ears. Is there a video? I'd like to SEE these
string players reproducing this speed-up-and-slow-down
record-scratching sound.


I would suggest that
you go to comp.dsp and set forth what it is you want to get
more specific feedback about it.


Like MIDI output of polyphonic audio input, this technology is not
quite (actually nowhere near) ready for prime time.





-----
http://mindspring.com/~benbradley
#12 - Ryan

Ben Bradley wrote:

I'm trying to follow the thoughts... it appears what he wants is a
computer program that does with an orchestra what one does with a
synthesizer to imitate the sound of a musical instrument ("imitative
synthesis"). I suppose nowadays you could write a program that scans a
digitized audio recording and makes a patch (or orchestral score) that
somewhat crudely approximates the sound, but it could surely be
tweaked by hand/ear to make it better, or perhaps a synthesist (person
making a synth patch) would just start over and make something that
sounds better/closer. I doubt that having it do a mathematical
operation such as a fit a least-squares match of the FFT would make it
anywhere near the "original sound" as would a person experienced in
doing these things.


Well, maybe, I don't really know. I'd be surprised if some type of
math couldn't be rigged up that would do as good a job as a human.
It's all analytical, and actually not too subjective. It will either
sound like a jet engine or not, and since the computer will "know"
what a jet engine sounds like thanks to the FFT and differential
analysis, it seems to me this should be as easy as asking a computer to
come up with a number that adds to 7 to make ten.

It doesn't matter if it's in real time or not to me. It could take an
hour to process a minute long soundfile for all I care. And once I
get something together I can tweak it for better results, and it
doesn't have to be perfect. Again, this will be mainly a learning
tool.

But to make "arbitrary sounds" with orchestral instruments ... the
only thing I've heard that's anything like this is on Peter Schickele's
"Upper West Side" where he says something about hearing Vivaldi one
more time. The strings play through the melody once, then they play
the beat of the melody with hip-hop record-scratching sounds. It was
hard to believe my ears. Is there a video? I'd like to SEE these
string players reproducing this speed-up-and-slow-down
record-scratching sound.


Hmmm, I've never heard this before. You're not talking about the
musical "West Side Story," I gather. Anyway, my guess is that it's a
simple dodecaphonic or maybe microtonal glissando performed with light
enough pressure on the bow/strings to emit that rosiny, scratchy sound.
I keep bringing up this guy's name, but if you haven't listened to
any Ligeti, you really owe it to yourself to. His "Atmospheres" and
his "Harmonies (for organ)" are good starting points. His music often
sounds like Arbitrary sounds, and it's always produced with
traditional instruments. "Harmonies" is especially interesting. The
organ has to be rigged up to change the inner air pressure so as to
play microtonally. The low powered organ sounds like a giant whoosh
of sound, or the kind of still wonder you might expect an astronaut to
hear in his head. It mesmerizes and twinkles like distant stars or
complex microscopic schools of glowing plankton in the ocean at night.
In fact, a small bit of "Atmospheres" was used in 2001: A Space
Odyssey. A lot of his music takes you into the moment, stops your
breath, and makes you question why no one else thought of it first.
He does this partially by emulating real world sound.



I would suggest that
you go to comp.dsp and set forth what it is you want to get
more specific feedback about it.


This is good advice.


Like MIDI output of polyphonic audio input, this technology is not
quite (actually nowhere near) ready for prime time.


If it was out and available on every supermarket endcap, I probably
wouldn't want anything to do with it! ;-) This interests me because,
as far as I know, it isn't really done that much (the orchestration,
not the software), and certainly not to the extent I want to take it to.





#13 - hank alrich

Ryan wrote:

I'd be surprised if some type of
math couldn't be rigged up that would do as good a job as a human.


If the math is required to make the assumptions you make in the next
few sentences, putting the calcs together is going to be tough.

It's all analytical, and actually not too subjective. It will either
sound like a jet engine or not, and since the computer will "know"
what a jet engine sounds like thanks to the FFT and differential
analysis, it seems to me this should be as easy as asking a computer to
come up with a number that adds to 7 to make ten.


Do all oboes sound the same? All violins? All trumpets? _All jet
engines_?

"Not too subjective" goes into the grist mill when a creative mind
chooses among available voicings for a given instrument.

--
ha
#14 - Paul Stamler

I think one of the things you'll find, investigating these real-world
sounds, is that most of them differ drastically from the sound made by most
musical instruments in that they are inharmonic; in other words, musical
instruments produce sound consisting mostly of a fundamental and harmonics,
at integer multiples of the fundamental frequency. Real-world noises, to a
great extent, have mixtures of frequencies that aren't integer multiples of
one another.
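Paul's harmonic-versus-inharmonic distinction is easy to quantify once you have a list of partial frequencies. A toy sketch (my own; the partial lists are invented examples, and the peak-picking step is assumed to have happened already):

```python
import numpy as np

def inharmonicity(partials_hz):
    """Mean deviation of each partial from the nearest integer
    multiple of the lowest partial (0.0 = perfectly harmonic)."""
    partials = np.asarray(partials_hz, dtype=float)
    ratios = partials / partials.min()
    return float(np.mean(np.abs(ratios - np.round(ratios))))

# Invented example data: an odd-harmonic (clarinet-like) series versus
# stretched, bell-like partials that fall between integer multiples.
harmonic_dev = inharmonicity([220, 660, 1100, 1540])
inharmonic_dev = inharmonicity([220, 487, 913, 1390])
```

A score near zero says "instrument-like"; real-world noises tend to score well away from zero, which is exactly the gap the scoring software would have to bridge.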

The implication of that, of course, is that in trying to score instruments
to sound like real-world noises, you'll have to suppress their natural
tendency to play with integer-multiple harmonic series. In other words,
you'll need to force them to stop behaving like musical instruments. Thus,
for example, the suggestion of the light-pressure bow producing extraneous,
"non-musical" sounds in the Schickele recording. Contemporary composers have
been doing things like this for a while, with varying degrees of success --
I think back to the string snaps in Bartok's Music for Strings, Percussion
and Celesta, in effect making the fiddles into percussion instruments.

Interesting project, and quite a challenge.

Peace,
Paul


#15 - Bob Cain



Ryan wrote:


Well, maybe, I don't really know. I'd be surprised if some type of
math couldn't be rigged up that would do as good a job as a human.


Prepare, then, to be surprised. Our mechanisms for feature
extraction and interpretation remain largely a mystery. The
process is highly algorithmic, and that is very different
from mathematical, although math can be employed in some
algorithmic processes.

It's all analytical, and actually not too subjective. It will either
sound like a jet engine or not, and since the computer will "know"
what a jet engine sounds like thanks to the FFT and differential
analysis, it seems to me this shoud be as easy as asking a computer to
come up with a number that adds to 7 to make ten.


An FFT doesn't begin to disclose what you are looking for in
and of itself. It's no more than a view of the same data
with a different independent axis. It contains no
information at all about when things happen.

In any event, the ear/brain does not do a Fourier analysis.
There are frequency-dependent mechanisms, but they are
totally ad hoc in terms of what nature found most useful for
subsequent analysis.

In a very real sense you are asking for an artificial ear
all the way through to the process of blind separation.
That problem remains a curiosity that researchers are
merely nibbling the edges of.

You might want to Google on "blind separation" to see how
much your problem involves that and how little progress has
been made.


Bob
--



#16 - hank alrich

Bob Cain wrote:

An FFT doesn't begin to disclose what you are looking for in
and of itself. It's no more than a view of the same data
with a different independent axis. It contains no
information at all about when things happen.


Or why things happen.

--
ha
#17 - Ryan

hank alrich wrote:
Ryan wrote:

I'd be surprised if some type of
math couldn't be rigged up that would do as good a job as a human.


If the math is required to make the assumptions you make in the next few
sentences putting the calcs together is going to be tough.

It's all analytical, and actually not too subjective. It will either
sound like a jet engine or not, and since the computer will "know"
what a jet engine sounds like thanks to the FFT and differential
analysis, it seems to me this should be as easy as asking a computer to
come up with a number that adds to 7 to make ten.


Do all oboes sound the same? All violins? All trumpets? _All jet
engines_?

"Not too subjective" goes into the grist mill when a creative mind
chooses among available voicings for a given instrument.


Well, I'm just starting to get my hands around this. I think I may be
suffering from "don't know how to ask the right questions" syndrome.
Just to clarify a bit: it is certainly true that no two oboes sound
the same; in fact, the very same oboe can sound different from day to
day or from climate to climate. I think we could approximate the sound
of a bassoon, and since this is only a learning tool, not intended
to produce a perfect final product, that would be good enough. On the
other hand, for this problem, there is only one sound of a jet engine,
and that sound would be whatever sound file I choose to feed to the
software. Although both sounds will have to be analyzed to produce
the desired effect, the file I seek to emulate, "the jet engine
sound," will never have to suffer from approximation. That's what I
meant by "the computer will know" what a jet engine sounds like.
#18 - Ryan

Bob Cain wrote:
Ryan wrote:


Well, maybe, I don't really know. I'd be surprised if some type of
math couldn't be rigged up that would do as good a job as a human.


Prepare, then, to be surprised. Our mechanisms for feature
extraction and interpretation remain largely a mystery. The
process is highly algorithmic and that is very different
than mathematical, although math can be employed in some
algorithmic process.

It's all analytical, and actually not too subjective. It will either
sound like a jet engine or not, and since the computer will "know"
what a jet engine sounds like thanks to the FFT and differential
analysis, it seems to me this should be as easy as asking a computer to
come up with a number that adds to 7 to make ten.


An FFT doesn't begin to disclose what you are looking for in
and of itself. It's no more than a view of the same data
with a different independant axis. It contains no
information at all about when things happen.


Is there any kind of analysis that does? I used the FFT because that's
the only one I've really ever heard of. What if I perform a different
FFT for every second of the sound file?
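An FFT per short block is, in fact, the standard answer: the short-time Fourier transform (STFT). A minimal NumPy sketch (my own illustration) shows how it localizes a frequency change in time:

```python
import numpy as np

def stft(x, frame_len=1024, hop=512):
    """FFT of successive overlapping windowed frames: one spectrum
    per moment in time instead of one for the whole file."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i*hop : i*hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)   # shape: (time, frequency)

fs = 8000
t = np.arange(2 * fs) / fs
# A 1 kHz tone for one second, then 2 kHz for the next second.
x = np.where(t < 1.0, np.sin(2*np.pi*1000*t), np.sin(2*np.pi*2000*t))

S = np.abs(stft(x))
# Each bin spans fs/frame_len = 7.8125 Hz, so 1 kHz -> bin 128,
# 2 kHz -> bin 256.
early_peak = S[1].argmax()               # dominant bin in an early frame
late_peak = S[-2].argmax()               # dominant bin in a late frame
```

The early frames peak at the 1 kHz bin and the late frames at the 2 kHz bin, which is exactly the "when" information a single whole-file FFT throws away.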


In any event, the ear brain does not do a Fourier analysis.
There are frequency dependant mechanisms but they are
totally ad hoc in terms of what nature found most useful for
subsequent analysis.


Was this a typo? I hope this doesn't offend, but every site I've
looked at about this says that indeed our ears do function as FFT
devices. If this is incorrect, I'd very much like to know the truth
about the matter.


In a very real sense you are asking for an artificial ear
all the way through to the process of blind separation.
That problem remains a curiousity that researchers are
merely nibbling the edges of.

You might want to Google on "blind separation" to see how
much your problem involves that and how little progress has
been made.


Bob


Is this what I'm asking for? I really don't know myself. It seems to
me FFT would work ideally if the only instruments I wanted to score
for were flutes. Flutes have an almost perfect sine wave output. And
since FFT is a breakdown of the sound into sine waves, I'd think this
would work quite well, except of course for the limited bass range of
the flute family. No?

Regardless, thanks for giving me some new info to go on.
  #19   Report Post  
Ryan
 
Posts: n/a
Default

"Paul Stamler" wrote in message ...

I think one of the things you'll find, investigating these real-world
sounds, is that most of them differ drastically from the sound made by most
musical instruments in that they are inharmonic; in other words, musical
instruments produce sound consisting mostly of a fundamental and harmonics,
at integer multiples of the fundamental frequency. Real-world noises, to a
great extent, have mixtures of frequencies that aren't integer multiples of
one another.


This is something I've always wondered about. I thought everything
obeyed the 1st harmonic, 2nd harmonic, etc., rules. Is it possible
for a sound to have no overtones? I thought that even computer-generated
sounds that have no harmonics on screen produce them
automatically when they come out of the speaker. I thought the
harmonic series was just part of the physics of sound. Yes, real-world
sounds often contain dissonant and unrelated intervals, but if
we broke the overall sound down into a set of sounds, wouldn't these
sounds in themselves produce the natural overtones?

The implication of that, of course, is that in trying to score instruments
to sound like real-world noises, you'll have to suppress their natural
tendency to play with integer-multiple harmonic series. In other words,
you'll need to force them to stop behaving like musical instruments.


How about microtones? I imagine the sound of an F#+ coming out of an
oboe would create some funny interactions with the harmonics. But I
could be wrong.


Thus,
for example, the suggestion of the light-pressure bow producing extraneous,
"non-musical" sounds in the Schickele recording. Contemporary composers have
been doing things like this for a while, with varying degrees of success --
I think back to the string snaps in Bartok's Music for Strings, Percussion
and Celesta, in effect making the fiddles into percussion instruments.

Interesting project, and quite a challenge.

Peace,
Paul

  #20   Report Post  
Bob Cain
 
Posts: n/a
Default



Ryan wrote:


Well, I'm just starting to get my hands around this. I think I may be
suffering from "don't know how to ask the right questions" syndrome.
Just to clarify a bit: it is certainly true that no two oboes sound
the same; in fact the very same oboe can sound different from day to
day or from climate to climate. I think we could approximate the sound
of a bassoon, and since this is only a learning tool, not intended
to produce a perfect final product, that would be good enough. On the
other hand, for this problem there is only one sound of a jet engine,
and that sound would be whatever soundfile I choose to feed to the
software. Although both sounds will have to be analyzed to produce
the desired effect, the file I seek to emulate, "the jet engine
sound," will never have to suffer from approximation. That's what I
meant by "the computer will know" what a jet engine sounds like.


You've got me confused now, what is it that you are wanting
to do that is different than a sampler?


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein


  #21   Report Post  
Bob Cain
 
Posts: n/a
Default



Ryan wrote:


An FFT doesn't begin to disclose what you are looking for in
and of itself. It's no more than a view of the same data
along a different independent axis. It contains no
information at all about when things happen.



Is there any kind of analysis that does? I used FFT because that's
the only one I've really ever heard of. What if I perform a different
FFT for every second of the soundfile?


Very good! You've just described the STFT, short time
Fourier transform. It does give information about when
things happen with no greater resolution than the length of
the FFT. They can be overlapped for better resolution.
There is also the variety of wavelet transforms which allow
you to trade off the resolution in frequency and in time
according to a principle similar to Heisenberg's. They are
tricky to use.
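Bob's description of the STFT is easy to sketch in code. The following is a minimal illustration, assuming NumPy; the frame length, hop size, and test signal are arbitrary choices for the sketch, not anything from the thread. A Hann-windowed FFT is taken over overlapping frames, so each row of the result is localized in time:

```python
import numpy as np

def stft(x, frame_len=1024, hop=256):
    """Short-time Fourier transform: windowed FFTs over overlapping frames.

    Row i of the result describes the signal around sample i * hop, so
    time information survives, at a resolution set by frame_len and hop."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

# A tone that jumps from 440 Hz to 880 Hz halfway through: one FFT of the
# whole file shows both peaks, but the STFT also shows *when* each occurs.
sr = 8000
t = np.arange(sr) / sr
x = np.where(t < 0.5, np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 880 * t))
spec = np.abs(stft(x))
freqs = np.fft.rfftfreq(1024, 1 / sr)
early = freqs[np.argmax(spec[2])]    # dominant frequency near the start
late = freqs[np.argmax(spec[-3])]    # dominant frequency near the end
```

Overlapping the frames (hop smaller than frame length) is what buys back the time resolution a single whole-file FFT throws away.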

The question remains to be answered in some detail what
information you want to obtain.



In any event, the ear-brain system does not do a Fourier analysis.
There are frequency-dependent mechanisms, but they are
totally ad hoc in terms of what nature found most useful for
subsequent analysis.



Was this a typo? I hope this doesn't offend, but every site I've
looked at about this says that our ears do indeed function as FFT
devices. If this is incorrect I'd very much like to know the truth
about the matter.


Nope. No offense taken. There is a _big_ difference
between an FT and an ad hoc, idiosyncratic feature
extraction mechanism that uses a very complicated organic
filter as part of its discrimination. The FT has a precise
mathematical formulation involving inner products with sine
and cosine signals at a precise set of frequencies. The ear
just doesn't do that. There is a gross similarity, but
that's about all.
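Bob's distinction can be made concrete: a DFT bin really is nothing more than an inner product of the signal with a cosine and a sine at one exact analysis frequency. A small sketch, assuming NumPy; the test signal and bin numbers are arbitrary:

```python
import numpy as np

# One DFT bin is literally an inner product of the signal with a cosine
# and a sine at one exact analysis frequency -- nothing more.
N = 256
n = np.arange(N)
x = np.sin(2 * np.pi * 5 * n / N) + 0.5 * np.cos(2 * np.pi * 12 * n / N)

def dft_bin(x, k):
    """Bin k of the DFT, computed directly as two inner products."""
    N = len(x)
    n = np.arange(N)
    cos_part = np.dot(x, np.cos(2 * np.pi * k * n / N))
    sin_part = np.dot(x, np.sin(2 * np.pi * k * n / N))
    return cos_part - 1j * sin_part   # same sign convention as np.fft.fft

# Agrees with the library FFT, bin for bin:
assert np.allclose(dft_bin(x, 5), np.fft.fft(x)[5])
```

The cochlea's frequency-dependent filtering has only a gross resemblance to this fixed, rigid set of inner products.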

The Ghost could address this in some detail if anyone could
get him to do something besides insult people. When he was
young he published with one of the pioneers in the field of
hearing research, someone who I believe got a Nobel Prize
for it.

Is this what I'm asking for? I really don't know myself.


I'm having trouble figuring that out exactly too. :-)

In case you've received any new information that might help
you frame it better, would you care to try again?
Refinement to specs from vague ideas is not an uncommon
process in the user/marketing/engineering cyclic process.


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
  #22   Report Post  
Paul Stamler
 
Posts: n/a
Default


"Ryan" wrote in message
om...
"Paul Stamler" wrote in message

...

I think one of the things you'll find, investigating these real-world
sounds, is that most of them differ drastically from the sound made by
most musical instruments in that they are inharmonic; in other words,
musical instruments produce sound consisting mostly of a fundamental
and harmonics, at integer multiples of the fundamental frequency.
Real-world noises, to a great extent, have mixtures of frequencies
that aren't integer multiples of one another.


This is something I've always wondered about. I thought everything
obeyed the 1st harmonic, 2nd harmonic, etc., rules. Is it possible
for a sound to have no overtones? I thought that even computer-generated
sounds that have no harmonics on screen produce them
automatically when they come out of the speaker. I thought the
harmonic series was just part of the physics of sound. Yes, real-world
sounds often contain dissonant and unrelated intervals, but if
we broke the overall sound down into a set of sounds, wouldn't these
sounds in themselves produce the natural overtones?


Not necessarily. Many noises contain a mixture of frequencies not at all
harmonically related. For that matter, sometimes even musical instruments
produce a sound that isn't perfectly harmonic -- in other words, the
harmonics aren't exact integer multiples. One of my guitars at the moment
needs its strings changed; they're no longer perfectly cylindrical (there
are dents in the windings where they go over the frets), and the harmonics
aren't quite perfect multiples of the fundamental anymore. Which is why it
sounds like crap, of course, and will continue to do so until I get off my
duff and change the strings.

Another example: play a guitar through a fuzzbox, or through an
amplifier cranked up enough to distort. Play two strings and, along
with the harmonic series of each individual note, you'll get a whole
raft of intermodulation products not part of that harmonic series at
all. That's the fuzz.
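Paul's intermodulation point is easy to verify numerically. In this sketch (assuming NumPy; the two tone frequencies and the square-law term are arbitrary stand-ins for a real fuzzbox), tones at 200 Hz and 310 Hz pick up sum and difference products at 510 Hz and 110 Hz after distortion, frequencies that belong to neither tone's harmonic series:

```python
import numpy as np

# Two clean tones pass through a crude asymmetric "fuzz" stage (a
# square-law term). The output gains intermodulation products at the
# sum and difference frequencies, 510 Hz and 110 Hz.
sr = 8000
t = np.arange(sr) / sr                    # one second -> 1 Hz FFT bins
clean = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 310 * t)
distorted = clean + 0.3 * clean ** 2      # hypothetical distortion law

clean_spec = np.abs(np.fft.rfft(clean))
dist_spec = np.abs(np.fft.rfft(distorted))
# Bins are 1 Hz wide, so index 110 is 110 Hz and index 510 is 510 Hz.
# The clean signal has essentially no energy there; the distorted one does.
```

A symmetric nonlinearity (pure hard clipping) would instead produce odd-order products such as 2f1-f2 and 2f2-f1; either way, the new frequencies are not harmonics of the inputs.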

Anyway, back to non-musical noises. I remember having to clean up a
recording made in a room with a large HVAC blower outside. It had a lot of
different frequencies in it, most of them not related to each other by any
simple ratios. Along with that was a heap of white noise.

No, not everything obeys the harmonic rules.

Peace,
Paul


  #23   Report Post  
Ryan
 
Posts: n/a
Default

Bob Cain wrote in message ...

The Ghost could address this in some detail if anyone could
get him to do something besides insult people. When he was
young he published with one of the pioneers in the field of
hearing research, someone who I believe got a Nobel Prize
for it.



The Ghost?

Is this what I'm asking for? I really don't know myself.


I'm having trouble figuring that out exactly too. :-)

In case you've received any new information that might help
you frame it better, would you care to try again?
Refinement to specs from vague ideas is not an uncommon
process in the user/marketing/engineering cyclic process.


Well, I think you had the right idea the first time, before I
attempted to be more concise and confused you. I will jot out a
basic algorithm for the software:

1. Analyze real instrument sound files. These files should include
every possible way every classical instrument can be played, from the
traditional to the avant-garde. For the viols, for example: from
plain-jane arco to Bartok's snapping strings to harmonics to different
bow pressures to playing behind the bridge to the tapping of fingers on
the body of the instruments. There should be files that represent the
instruments at all possible dynamic levels. There should be files
that feature the instruments playing in microtones where they can do
so. (Most classical instruments can.) Also, there should be analysis
of the instruments in "static form." By this I mean the part of the
sound after the initial attack, which can be looped over and over again
to give the impression the note is sustaining. This is done in
standard synthesis as well as in good sample libraries. It may take
quite a while to amass all these samples, but once collected, the
analysis of them only has to be done once.

2. Deduce from these analyses the prime aspects of these sounds.
If we only have, say, ten frequencies to represent this sound, which
ones would be the most useful? Or would some other type of info
about the file be more important than its frequencies? So now we have
a set of data instead of just a PCM sound file. We can call these
data sets "fingerprints." This is mainly to help speed up the math
performed later during step 4, though it will compromise the accuracy
of the final product. Ideally, the user should be able to select the
amount of data to be derived from the samples.

3. Analyze any given sound file. These would be the "real world"
sounds. Or anything at all. In fact, I was thinking last night that
the ultimate test for this software would be to feed it, say,
Beethoven's 9th, and see how close it could approximate it.

4. Run a differential or coefficient analysis on the "real world"
sound file, comparing it against all the "sound fingerprints" the
program created in step 2.

5. Create MIDI file. After the program has deduced the best
combination of instruments, in which playing styles, at what pitches
and dynamics, playing what kind of rhythmic figures, etc., the
program would simply create a multiple-staff MIDI file with all said
info scored on it.

Viola!
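For what it's worth, steps 2 and 4 can at least be prototyped at toy scale. The sketch below assumes NumPy, and everything in it is a hypothetical simplification, not a real solution: the "instruments" are synthetic stand-ins, the "fingerprint" is just the ten strongest spectral peak frequencies, and the similarity score is a plain distance between sorted peak lists:

```python
import numpy as np

def fingerprint(x, sr, n_peaks=10):
    """Steps 1-2 in miniature: reduce a sound to its n_peaks strongest
    spectral frequencies (one hypothetical notion of a 'fingerprint')."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), 1 / sr)
    return np.sort(freqs[np.argsort(spec)[-n_peaks:]])

def similarity(fp_a, fp_b):
    """Step 4 in miniature: higher score when the sorted peak
    frequencies of two fingerprints lie closer together."""
    return -np.mean(np.abs(fp_a - fp_b))

sr = 8000
t = np.arange(sr) / sr
library = {  # stand-ins for sampled instruments, not real recordings
    "flutelike": np.sin(2 * np.pi * 440 * t),                       # near-sine
    "reedlike": sum(np.sin(2 * np.pi * 220 * k * t) / k for k in (1, 2, 3)),
}
fps = {name: fingerprint(snd, sr) for name, snd in library.items()}

target = 0.9 * np.sin(2 * np.pi * 440 * t)   # the "real world" sound
best = max(fps, key=lambda name: similarity(fingerprint(target, sr), fps[name]))
```

This ignores time structure, dynamics, and polyphony entirely, which is exactly where the hard part of steps 4 and 5 lives.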
  #24   Report Post  
philicorda
 
Posts: n/a
Default

On Fri, 15 Oct 2004 22:20:06 -0700, Ryan wrote:

Bob Cain wrote in message
...

The Ghost could address this in some detail if anyone could get him to
do something besides insult people. When he was young he published
with one of the pioneers in the field of hearing research, someone who
I believe got a Nobel Prize for it.



The Ghost?

Is this what I'm asking for? I really don't know myself.


I'm having trouble figuring that out exactly too. :-)

In case you've received any new information that might help you frame
it better, would you care to try again? Refinement to specs from vague
ideas is not an uncommon process in the user/marketing/engineering
cyclic process.


Well, I think you had the right idea the first time, before I attempted
to be more concise and confused you. I will jot out a basic algorithm
for the software:

1. Analyze real instrument sound files. These files should include
every possible way every classical instrument can be played, from the
traditional to the avant-garde. For the viols, for example: from
plain-jane arco to Bartok's snapping strings to harmonics to different
bow pressures to playing behind the bridge to the tapping of fingers on
the body of the instruments. There should be files that represent the
instruments at all possible dynamic levels. There should be files that
feature the instruments playing in microtones where they can do so.
(Most classical instruments can.) Also, there should be analysis of the
instruments in "static form." By this I mean the part of the sound
after the initial attack, which can be looped over and over again to
give the impression the note is sustaining. This is done in standard
synthesis as well as in good sample libraries. It may take quite a
while to amass all these samples, but once collected, the analysis of
them only has to be done once.


Why not use mathematical models of the instruments? I would imagine the
number of samples required to cover all the sounds a violin can make
would be impossible (think of playing a false harmonic on all the strings
of a violin at every position, and with every bowing style). With a model,
you have defined the 'prime aspects of these sounds' in a very flexible
way. The computer could adjust the way the model is 'played' to find the
best fit to the sound you wish to analyse.

This would perhaps get nearer to fulfilling the interesting idea in your
original post-

"Perhaps a car engine sound file
would yield three Double Basses, a flute or two in very quiet irregular
rhythms, and maybe a horn would be involved during gear changes."

The computer could go through every single possible sound a violin could
make by iterating though all possible bow positions/angle/velocity, finger
positions etc until it found the combination that would most closely
approximate the sound you want to analyse.

If I were to pursue this, I would brutally simplify things to start with.
For example, make some simple rules for an experiment...

All music is played on a single instrument model that creates perfect sine
waves. Each note this instrument makes has a fixed decay to silence over a
period of one second. The only variables this instrument has are how loud
each note is and its pitch, fixed to a chromatic scale. The only
limitations of the 'player' of this instrument are that it can play twenty
notes per second, and as many as ten notes at once.

Then, take the sound file to be analysed and, every 20th of a second, try
each of the limited range of sounds this instrument can create until you
find the one that correlates most closely. (Literally by FFT correlation?)

Once that is done, you should have a performance on a very simple
instrument that has some relation to the file you wish to analyse.
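philicorda's stripped-down experiment can be sketched directly. This version (assuming NumPy) simplifies even further: it drops the one-second decay and the ten-note polyphony, keeping only a monophonic sine 'instrument' on a chromatic scale and a 'player' that chooses one note per 1/20 s frame by correlation against the target:

```python
import numpy as np

# A monophonic sine "instrument" limited to a chromatic scale, and a
# "player" that picks one note per 1/20 s frame by correlating each
# candidate note with the target audio for that frame.
sr = 8000
frame = sr // 20                              # samples per 1/20 s frame
pitches = 220.0 * 2 ** (np.arange(25) / 12)   # two chromatic octaves from A3

def best_note(target_frame):
    """Try every note the instrument can make; keep the best correlate."""
    t = np.arange(len(target_frame)) / sr
    scores = [abs(np.dot(target_frame, np.sin(2 * np.pi * f * t)))
              for f in pitches]
    return float(pitches[int(np.argmax(scores))])

t = np.arange(sr) / sr
target = np.sin(2 * np.pi * 440 * t)          # one second of a steady A4
performance = [best_note(target[i:i + frame]) for i in range(0, sr, frame)]
```

The brute-force inner loop is exactly the "iterate through every sound the model can make" search described above, and it already hints at how quickly the cost explodes as the model grows.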

Then the model could be made slightly more complex, i.e. this instrument
is an ideal Karplus-Strong string with a simple frequency-dependent loss
filter. It has the properties of the length of the string, where the
string is struck, and the amount of energy imparted. It is monophonic,
and can change pitch at a limited rate.

The disadvantages of this way of working would be: iterating through each
sound a model could create would be *very* time consuming once the models
became more realistic, and it's very hard to create good physical models
of real instruments.

The advantages would be -
It might actually work. Or at least provide a way to begin attacking
this interesting but extraordinarily difficult task. The model does not
just define a fixed set of sounds (samples) an instrument can create, but
also defines the limitations in how that instrument can be played.

I think you would have to create a model of the limitations of the
player, as well as of the instrument, anyway if you were using samples.
This would be very difficult if the computer does not 'understand' the
instrument the way a physical model does, as you would have to create a
large number of rules by hand for each sample.




  #25   Report Post  
Bob Cain
 
Posts: n/a
Default



Ryan wrote:
Bob Cain wrote in message ...


The Ghost could address this in some detail if anyone could
get him to do something besides insult people. When he was
young he published with one of the pioneers in the field of
hearing research, someone who I believe got a Nobel Prize
for it.




The Ghost?


Unimportant. If you don't know of him, you certainly don't
want to.

1. Analyze real instrument sound files. These files should include
every possible way every classical instrument can be played, from the
traditional to the avant-garde. For the viols, for example: from
plain-jane arco to Bartok's snapping strings to harmonics to different
bow pressures to playing behind the bridge to the tapping of fingers on
the body of the instruments. There should be files that represent the
instruments at all possible dynamic levels. There should be files
that feature the instruments playing in microtones where they can do
so. (Most classical instruments can.) Also, there should be analysis
of the instruments in "static form." By this I mean the part of the
sound after the initial attack, which can be looped over and over again
to give the impression the note is sustaining. This is done in
standard synthesis as well as in good sample libraries. It may take
quite a while to amass all these samples, but once collected, the
analysis of them only has to be done once.


And has yet to be done once. :-)

You aren't really defining an analysis, or even the features
you would like extracted and cataloged. "Every possible way
an instrument can be played" has no meaning until you very
specifically give it that. It is what my high school
writing teacher called a glittering generality. I'm sorry
if that is a bit brutal but so was she. :-)

How would the subjective characteristics that your brain is
very good at discerning be algorithmically characterized,
and what would be the form of the data the analysis
produced? You can't just describe it in subjective terms,
because we have yet to teach machines this level of
subjective classification and discernment. We are a _long_
way from that.

Don't just offer the term FFT. There is no new information
in an FFT, just a different view of it. What you are
imagining would employ transforms of some kind, undoubtedly,
but which ones and exactly how they could be used to get at
the far more complex information you want is not even a well
formulated problem much less a solved one.

Imagine asking for a machine that could analyze and
categorize smiles. What you are asking for is far more
difficult and open-ended.


2. Deduct from these analyzations the prime aspects of these sounds.


First you must very precisely characterize all of these
prime aspects via a, probably long, research program and
then figure out what processes must be applied to the data
to extract and classify them in those terms.

If we only have, say, ten frequencies to represent this sound, which
ones would be the most useful?


That particular "if" has no real connection to reality.

Or would some other type of info
about the file be more important than its frequencies?


Good question. Now you are getting to the heart of the matter.

So now we have
a set of data instead of just a pcm sound file.


Not quite yet we don't.

We can call these
data sets, "fingerprints."


What would be in these data sets?

This is mainly to help speed up the math
performed later during step 4, though it will compromise the accuracy
of the final product.


What math?

Ideally, the user should be able to select the
amount of data to be derived from the samples.


Cool.


3. Analyze any given sound file. These would be the "real world"
sounds. Or anything at all. In fact, I was thinking last night that
the ultimate test for this software would be to feed it, say,
Beethoven's 9th, and see how close it could approximate it.


Approximate it with what?


4. Run a differential or coefficient analysis on the "real world"
sound file, comparing it against all the "sound fingerprints" the
program created in step 2.


Each of your analyzed snippets would be a vector in a very
high-dimensional parameter space. Once you defined that
space and a way to deduce all the coordinates in it for a
particular fingerprint, you could then determine the
corresponding vectors for your "real world" sounds. Problem
is that once the dimensions of a space get large enough, any
arbitrary vector in it will almost certainly be nearly
orthogonal to any other. What this means is that they have
about as much in common as "left" and "wrong." Matching is
poorly defined in such situations.
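Bob's orthogonality remark is easy to check numerically: the cosine of the angle between two random vectors concentrates near zero as the dimension grows. A quick sketch, assuming NumPy; the dimensions and trial count are arbitrary:

```python
import numpy as np

# In a few dimensions, two random directions can be well aligned; in
# thousands of dimensions the cosine of the angle between random
# vectors concentrates near zero -- they are almost always nearly
# orthogonal, so nearest-match scores all start to look alike.
rng = np.random.default_rng(42)

def mean_abs_cosine(dim, trials=200):
    a = rng.standard_normal((trials, dim))
    b = rng.standard_normal((trials, dim))
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    return float(np.mean(np.abs(cos)))

low = mean_abs_cosine(3)        # typically around 0.5
high = mean_abs_cosine(10_000)  # typically under 0.01
```

The expected magnitude of the cosine shrinks roughly like 1/sqrt(dimension), which is why naive matching in a huge fingerprint space degrades.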


5. Create MIDI file. After the program has deduced the best
combination of instruments, in which playing styles, at what
pitches and dynamics, playing what kind of rhythmic figures,
etc., the program would simply create a multiple-staff MIDI
file with all said info scored on it.


Yeah, simply.

Viola!


What, you want to do all this synthesis with a single
instrument? :-)


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein


  #26   Report Post  
Ryan
 
Posts: n/a
Default

Well, I don't know, Phil. Your idea sounded interesting at first, but
then towards the end you describe how hard it would be to use
"realistic" models anyway, so you kind of defeat your own suggestion.
Plus, this would be even more comp sci and math I'd have to learn. I
do appreciate your ideas, however, and I thank you.

I was thinking maybe when I can afford it I would just spring for the
Vienna Symphonic Library Orchestral Cube. It purports to provide
samples of everything I want, recorded by world class players in
anechoic chambers. It would be ideal if it wasn't for the three
thousand dollar price tag.

Anyway, I'm starting to think maybe I should just do the work with my
ear instead of my computer. Most of the posters here tend to think a
software solution would be next to impossible. Might as well brush up
on my ear training and spend the time using my right side of the brain
instead of the left. Hell, from the looks of it I could spend three
years figuring out this software, it would probably only take me three
days to do a rough guess transcription. Maybe I'm finally figuring
out how much harder it is to find a lazy way of doing things.



  #27   Report Post  
philicorda
 
Posts: n/a
Default

On Sat, 16 Oct 2004 23:44:13 -0700, Ryan wrote:

Well, I don't know Phil. Your idea sounded interesting at first, but
then towards the end you describe how hard it would be to use
"realisitic" models anyway, so you kind of defeat your own suggestion.


Absolutely. It would perhaps be a more ideal method, though it's far more
complicated and messy. I wonder how well the simplest model would work?
A computer's 'interpretation' with a simple string-and-player model would
be interesting to hear, even though it might not bear much relationship
to the original music.

There are a number of programs out there that purport to do polyphonic
pitch detection -
http://www.music-notation.info/en/co...udio2midi.html

But they rely on differentiating the different instruments by their
range rather than by their harmonic content, and I have no idea how well
the polyphonic pitch detection works. Perhaps the two approaches could
be combined: their pitch detection plus your harmonic 'fingerprints' to
identify the instruments?

Plus, this would be even more comp sci and math I'd have to learn. I
do appreciate your ideas however, and I thank you.

I was thinking maybe when I can afford it I would just spring for the
Vienna Symphonic Library Orchestral Cube. It purports to provide
samples of everything I want, recorded by world class players in
anechoic chambers. It would be ideal if it wasn't for the three
thousand dollar price tag.

Anyway, I'm starting to think maybe I should just do the work with my
ear instead of my computer. Most of the posters here tend to think a
software solution would be next to impossible. Might as well brush up
on my ear training and spend the time using my right side of the brain
instead of the left. Hell, from the looks of it I could spend three
years figuring out this software, it would probably only take me three
days to do a rough guess transcription. Maybe I'm finally figuring out
how much harder it is to find a lazy way of doing things.


Laziness is the mother of invention.

  #28   Report Post  
The Ghost
 
Posts: n/a
Default

Bob Cain wrote in message ...

The Ghost could address this in some detail if anyone could
get him to do something besides insult people.


Speak for yourself, you arrogant asshole. You have no knowledge of or
appreciation for what I can address. Furthermore, based on the
historical record, you couldn't care less. Four years ago, before I
became aware that you were not a decent human being, I answered your
questions because I knew something about the subject matter of your
inquiry. Rather than being appreciative and thanking me for the
information that I provided, you insulted me and started a feud that
continues to this day.
  #30   Report Post  
Ryan
 
Posts: n/a
Default

Bob Cain wrote in message ...

You aren't really defining an analysis, or even the features
you would like extracted and cataloged. "Every possible way
an instrument can be played" has no meaning until you very
specifically give it that. It is what my high school
writing teacher called a glittering generality. I'm sorry
if that is a bit brutal but so was she. :-)


Yes, if I were writing for another audience I would have to address this
more specifically. But you know what I'm getting at. I don't want to
post a billion-word technical rubric.

How would the subjective characteristics that your brain is
very good at discerning be algorithmically characterized,
and what would be the form of the data the analysis
produced?


What would be in these data sets.


What math?


Approximate it with what?


Hell man, these are the questions I came looking for the answers to.
You were supposed to answer these!


Viola!


What, you want to do all this synthesis with a single
instrument? :-)


lol
How is it spelled? Voiola?


  #31   Report Post  
Bob Cain
 
Posts: n/a
Default



Ryan wrote:

Bob Cain wrote in message ...

You aren't really defining an analysis, or even the features
you would like extracted and cataloged. "Every possible way
an instrument can be played" has no meaning until you very
specifically give it that. It is what my high school
writing teacher called a glittering generality. I'm sorry
if that is a bit brutal but so was she. :-)


Yes, if I were writing for another audience I would have to address this
more specifically. But you know what I'm getting at. I don't want to
post a billion-word technical rubric.


:-) Aw, give it a shot.

How would the subjective characteristics that your brain is
very good at discerning be algorithmically characterized
and what would be the form of the data the analysis
produced?


What would be in these data sets?


What math?


Approximate it with what?


Hell man, these are the questions I came looking for the answers to.
You were supposed to answer these!


I hope you understand that my intent was to point out that
these aren't solved problems. There aren't even glimmers on
the horizon. You are defining a musical AI with an awesome
intelligence, processing capability and prodigious memory.

If you were to take this to a prospective Ph.D. advisor as
an area for a thesis, he'd look at you in amazement, shake
his head and, if he was kind, try to help you find one
little corner of it that might yield productive results if
you tugged on it for a few years.

There are people thinking and working on these kinds of
problems but I don't know where they congregate.

Viola!


What, you want to do all this synthesis with a single
instrument? :-)


lol
How is it spelled? Voiola?


:-) Voila!


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
  #32   Report Post  
hank alrich
 
Posts: n/a
Default

Ryan wrote:

Hell man, these are the questions I came looking for the answers to.
You were supposed to answer these!


He's asking you the questions for which you must provide clear answers
in order to approach your goal.

--
ha
  #33   Report Post  
hank alrich
 
Posts: n/a
Default

A phantasmagorical creature posted:

Speak for yourself, you arrogant asshole.


Bob Cain speaks for Bob Cain. A ghost, on the other hand, doesn't dare
unveil itself in the light of day, so no one knows for whom it attempts
to speak.

--
ha
  #34   Report Post  
Ryan
 
Posts: n/a
Default

Bob Cain wrote in message ...

Yes, if I were writing for another audience I would have to address this
more specifically. But you know what I'm getting at. I don't want to
post a billion-word technical rubric.


:-) Aw, give it a shot.


You know, if I could be assured that what I want to do is feasible, I
really would write something like this up. Till then though, I'm a
busy man and it seems like a huge waste of time if nothing could ever
come from it.


How would the subjective characteristics that your brain is
very good at discerning be algorithmically characterized
and what would be the form of the data the analysis
produced?


What would be in these data sets?


What math?


Approximate it with what?


Hell man, these are the questions I came looking for the answers to.
You were supposed to answer these!


I hope you understand that my intent was to point out that
these aren't solved problems. There aren't even glimmers on
the horizon. You are defining a musical AI with an awesome
intelligence, processing capability and prodigious memory.


Yes. You are quite good at the Socratic method. I guess I just thought
you knew these answers but wanted to see me "jump through some hoops"
first, not maliciously of course. But if what you're saying is that
the math, or system of maths, this would require hasn't even been
"invented" yet, then that's an altogether different type of thing.

Anyway, thanks for your time and input.
  #35   Report Post  
Tom Loredo
 
Posts: n/a
Default


Hi Ryan-

The sines and cosines that get used to build up a waveform in
Fourier analysis are the "basis functions" of the Fourier
transform. It is possible to decompose signals using many
different types of bases. The Fourier basis (sines and cosines,
harmonically related if the signal is of finite extent) has
some nice mathematical properties that make the decomposition
(and recomposition) simpler, mathematically, than it is with
many other bases. But that simplicity doesn't make the
Fourier basis "right" for all applications.

In your case, you want to use as "basis functions" the signals
played by standard instruments. These are much more complicated
than the sines and cosines in a Fourier basis. Besides the
fact that the sustained waveform from an instrument playing
a note has a non-sinusoidal shape, notes are transient (they
start and stop in time) and also dynamic (their pitch, volume,
and timbre vary in time, e.g., due to
tremolo, vibrato, etc.). Although it is mathematically possible
to represent signals with such dynamic, transient structure via
a Fourier transform, I don't think a Fourier decomposition
is well-suited to your problem.
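
To make the "basis functions" idea concrete, here is a minimal sketch
(Python with NumPy -- my choice, nothing in the thread mandates it) of
projecting a target signal onto a small set of non-sinusoidal basis
signals by least squares. The decaying square-wave "instruments" below
are hypothetical stand-ins for real samples:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024                       # samples in the signal
t = np.arange(n) / n           # normalized time axis

# A toy "instrument" basis: a few decaying, non-sinusoidal tones.
# Real instrument samples would replace these columns.
def tone(freq, decay):
    return np.exp(-decay * t) * np.sign(np.sin(2 * np.pi * freq * t))

basis = np.column_stack([tone(f, d) for f, d in
                         [(5, 1.0), (9, 2.0), (13, 0.5), (21, 3.0)]])

# Target signal: a hidden mix of the basis tones plus a little noise.
true_coeffs = np.array([0.7, 0.0, 1.2, -0.4])
target = basis @ true_coeffs + 0.01 * rng.standard_normal(n)

# Least-squares projection: the closest combination of basis signals.
coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
print(np.round(coeffs, 2))
```

The recovered coefficients are, loosely, the "score": how loudly each
basis signal must play to best reconstruct the target.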

One approach is to actually take samples of the instruments
you'll use, playing all the notes available, and use them (with
various durations) directly as your basis. This would be the most
accurate approach, but the calculations you'd need to do to find
the expansion coefficients (i.e., the score!) would probably
be extremely difficult computationally, and probably not
well-defined (the basis is likely neither complete nor
orthogonal). You'd be doing something like additive synthesis,
but with a much bigger basis than is usually used! Looking
up some of the math associated with additive synthesis might
provide you with some leads.
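
For a dictionary of instrument samples that is redundant and
non-orthogonal, one standard greedy scheme is matching pursuit:
repeatedly pick the sample most correlated with what is left of the
target and subtract it out. A rough sketch under those assumptions
(random unit-norm atoms stand in for actual recordings):

```python
import numpy as np

def matching_pursuit(target, atoms, n_iter=10):
    """Greedy decomposition: repeatedly pick the dictionary atom most
    correlated with the residual and subtract its projection."""
    residual = target.astype(float).copy()
    picks = []  # (atom index, coefficient) pairs -- a crude "score"
    for _ in range(n_iter):
        correlations = atoms.T @ residual      # atoms are unit-norm columns
        k = int(np.argmax(np.abs(correlations)))
        c = correlations[k]
        residual -= c * atoms[:, k]
        picks.append((k, c))
    return picks, residual

# Toy dictionary: unit-norm random atoms standing in for instrument samples.
rng = np.random.default_rng(1)
n, m = 256, 40
atoms = rng.standard_normal((n, m))
atoms /= np.linalg.norm(atoms, axis=0)

target = 2.0 * atoms[:, 3] - 1.5 * atoms[:, 17]   # secret two-atom mix
picks, residual = matching_pursuit(target, atoms, n_iter=8)
print(picks[:2], np.linalg.norm(residual))
```

The greedy choice does not guarantee the globally best combination
(that problem is, in general, intractable), but it is simple and often
close enough.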

A possible option that has the potential to be more computationally
tractable would be to use some kind of wavelet or other
time-scale or time-frequency transform rather
than a Fourier transform. Very roughly speaking, you can
think of such a transform as breaking up a signal into
*localized* pulses, i.e., notes! That is, where a Fourier
transform represents a signal as a sum of "eternal" sines
and cosines of specific frequencies, a time-frequency transform
breaks up the signal into separate parts that are localized both in
frequency *and* time. You might be able to find some way to
project a wavelet or other time-frequency transform of the sound
you are interested in onto the transforms of sounds from the
instruments you have available; this would give you the notes
and volumes needed to most closely match the desired signal.
This won't make any fundamental problems with the incompleteness
or redundancy of your basis (choice of instruments & notes) go
away, but use of such transforms might provide methods of
approximation that make the problem more tractable computationally.
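
As a taste of what "localized in frequency *and* time" means, here is a
bare-bones magnitude STFT (one common time-frequency transform) rolled
by hand in NumPy. A transient 440 Hz "note" lights up only the frames
during which it sounds -- illustrative only, not the analysis engine
itself:

```python
import numpy as np

def spectrogram(x, frame=256, hop=128):
    """Magnitude STFT: slice the signal into overlapping windowed frames
    and take the FFT of each -- energy localized in time AND frequency."""
    window = np.hanning(frame)
    frames = [x[i:i + frame] * window
              for i in range(0, len(x) - frame + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1))   # shape: (time, frequency)

fs = 8000
t = np.arange(fs) / fs                           # one second of "sound"

# A transient event: a 440 Hz burst that exists only from 0.25 s to 0.5 s.
note = np.where((t > 0.25) & (t < 0.5), np.sin(2 * np.pi * 440 * t), 0.0)

S = spectrogram(note)
loud_frames = np.where(S.max(axis=1) > 1.0)[0]   # frames where the note sounds
print(S.shape, loud_frames[0], loud_frames[-1])
```

A plain Fourier transform of the same signal would report energy at
440 Hz but say nothing about *when* the note starts and stops; the
frame index here carries exactly that timing information.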

A google search on "wavelets" and "music" will probably get you
started. This wavelet FAQ might also help:

http://www.math.ucdavis.edu/~saito/c...avelet_faq.pdf

Here's a review article on time-frequency analysis of sounds
from musical instruments---your basis functions, so to speak:

http://epubs.siam.org/sam-bin/getfil...cles/38228.pdf

If you want to learn more about Fourier expansions from
a musical point of view, see:

http://ccrma.stanford.edu/~jos/mdft/

Here's a reference that turned up in my own quick googling using
"time scale transform music" that may provide a starting point
for thinking along these lines, if you can find a copy:

Kronland-Martinet R., Grossmann A. "Application of time-frequency and
time-scale methods to the analysis, synthesis and transformation of
natural sounds." in "Representations of Musical Signals", C. Roads,
G. De Poli, A. Picciali Eds, MIT Press, october 1990.

Interlibrary loan may help you here!

A similar search using "time frequency transform music" turned up
"Musical Transformations using the Modification of Time-Frequency
Images" in a 1993 issue of *Computer Music Journal*:

http://mitpress.mit.edu/catalog/item...d=6768&ttype=6

This is just from some quick googling and these are probably not
the best or most recent references that may be relevant. Wavelet
and time-frequency analysis is now very mature and there are
entire textbooks and monographs on these topics. Good
luck with this.

Peace,
Tom Loredo

--

To respond by email, replace "somewhere" with "astro" in the
return address.


  #36   Report Post  
Bob Cain
 
Posts: n/a
Default



Ryan wrote:

Yes. You are quite good at the Socratic method. I guess I just thought
you knew these answers but wanted to see me "jump through some hoops"
first, not maliciously of course. But if what you're saying is that
the math, or system of maths, this would require hasn't even been
"invented" yet, then that's an altogether different type of thing.


It wasn't just to get you to jump through hoops, Ryan. I'm
truly interested in how a musically creative mind would
specify the problem in some detail. That's good input for
the more academic oriented folks who are working and
thinking at the computational level.

The biggest problem with all of this that I see is how to
specify in detail what's in the music that can be considered
features worth thinking about extracting algorithmically. If
a human can't get real down with that part then there is
little hope of implementing anything useful. Granted, for
the non-technically but strongly musically inclined it could
be a very frustrating experience to see how difficult it is
to reduce things that seem obvious to her to terms that have
any hope of an implementation, but you gotta start somewhere.


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
  #38   Report Post  
Karl Winkler
 
Posts: n/a
Default

(Ryan) wrote in message . com...
I'm looking to find out more about writing some software that will use
traditional classical instruments to emulate "natural" or "non musical
sounds." The software will perform some type of analyses on an audio
file, I imagine FFT would be used at some point, but the problem with
FFT is that it only tells you what "perfect" or pure sine wave based
frequencies are present in a sound. Besides the flute, not much else
in an orchestra has anything close to a sine wave output. After this
analysis is done, the software will look through a library of sounds
made by traditional instruments. These sounds will include every
noise and playing style every traditional instrument can produce. The
software will then juggle the sounds around at various dynamic levels
in various rhythms and etc until it comes up with the closest
combination to the original sound. Perhaps a car engine sound file
would yield three Double Basses, a flute or two in very quiet
irregular rhythms, and maybe a horn would be involved during gear
changes. I might not have to tell you that Gyorgy Ligeti's
"Atmospheres" and his "Mechanical Music" served as the chief
inspiration for this idea.

Has anybody ever heard of anything like this, or know where I might
start to look for info on this subject? I'm not looking for
programming help, but rather, help with setting up the math. Are
there any scientific communities online that I could point my
questions to? Any books on this type of thing. I've heard Csound
might work for this. I thought Csound was for composing, not for
analyzing existing sound files. I can't seem to come up with the
right keywords to get anything out of Google, but I hoped someone here
might be able to put me on the right path.


I know it's not what you had originally asked, but give a listen to
the first few bars of Mahler's 1st symphony, last movement. Closest
thing I've heard to an orchestra sounding like a jet engine, without
intentionally doing so.

-Karl
  #39   Report Post  
Ryan
 
Posts: n/a
Default

I'm looking to find out more about writing some software that will use
traditional classical instruments to emulate "natural" or "non musical
sounds." The software will perform some type of analyses on an audio
file, I imagine FFT would be used at some point, but the problem with
FFT is that it only tells you what "perfect" or pure sine wave based
frequencies are present in a sound. Besides the flute, not much else
in an orchestra has anything close to a sine wave output. After this
analysis is done, the software will look through a library of sounds
made by traditional instruments. These sounds will include every
noise and playing style every traditional instrument can produce. The
software will then juggle the sounds around at various dynamic levels
in various rhythms and etc until it comes up with the closest
combination to the original sound. Perhaps a car engine sound file
would yield three Double Basses, a flute or two in very quiet
irregular rhythms, and maybe a horn would be involved during gear
changes.


(The Ghost) wrote in message . com...
(Ryan) wrote in message . com...

Well, you and I have no bad blood between us, Ghost. What's your take
on this whole idea?


I don't have time at this moment to backtrack and read the entire
thread. So, if you have a specific question, please (re)state it in
as concise terms as possible, and I will answer it if I feel that I am
qualified to do so. If not, I will do my best to refer you to someone
who can.

  #40   Report Post  
Ryan
 
Posts: n/a
Default

Tom Loredo wrote in message ...
Hi Ryan-

The sines and cosines that get used to build up a waveform in
Fourier analysis are the "basis functions" of the Fourier
transform. It is possible to decompose signals using many
different types of bases. The Fourier basis (sines and cosines,
harmonically related if the signal is of finite extent) has
some nice mathematical properties that make the decomposition
(and recomposition) simpler, mathematically, than it is with
many other bases. But that simplicity doesn't make the
Fourier basis "right" for all applications...


Thank you for this copious amount of unsolicited information. It is
already proving useful.