#361
Posted to rec.audio.pro
Richard Webb[_3_]
Posts: 533
FLAC or other uncompressed formats, which is best?

On Thu 2012-May-24 21:00, Don Y writes:
Right, and, while you were affiliated with Kurzweil you
picked up on one thing Ray was very good at: he actually
reached out to those he was designing for; the end users
were an integral part of product development. That's why
when Ray first decided to work on OCR for the blind market
he sought out guidance from the National Federation of the
Blind.


Note National Federation of the Blind vs. American Federation FOR
the Blind! Get the wrong preposition in there and
you'll never hear the end of it!


Yep, I've been one of those to quibble over that one. Big
difference.

I used to spend a good deal of time with a guy ("Michael") who I
recall being tied to NFB somehow (I would stumble across
him in various places around the country instead of in
one single locality).

snip

Yeah, met him a few times. Michael Hingson, IIRC.
He just paused a bit and said, "That's a good question.
You know, Don, if you can tell me what it is like to SEE,
I'll tell you what it's like NOT to see!" I've still
not figured out how I could redo that conversation
and have either of us get any MORE information out of
it.


rotfl Can relate.

I never envied Ray his job. He had to move away from
engineering (why go to an engineering school if you want
to be a businessman?) to keep a business running. At
that time, we were very small. I'm willing to bet about
25 people, total. It was his job to make sure the money
kept coming in to keep us paid.


Yeah, I bet. Hard to take off the hat you've prepared your
life for and put on the other. Interesting story about the
crunch push when the money men wanted to see a working
product.
NFB helped bring OCR mainstream. Other good access
technology developers do much the same. Were I working for
somebody else and had to interact with their systems with
their choice of operating systems I could work with JAWS.

I find it cumbersome to use, but I could get work done with
it. But, that isn't the developer's fault so much as the
environment he had to work with, i.e. giving blind folks
access to mainstream office applications.


I believe he now has a cell-phone sized device that provides similar
functionality? I have a Personal Reader here
but it is much newer than the minicomputer-based generation
that I worked on. It also supports a "hand scanner" (in addition to
the flatbed scanner) which was an option that wasn't available on
the machine when I was familiar with it.


Yep, and it's marketed in partnership with NFB. Mike is
involved in that one. If I ever get some business things
caught up where I'd like them to be, one's in my future. Amazing
device. The ability to just sit down in a restaurant and
read the menu when they don't have one in braille ... a
liberating experience <g>.

Thing is, you get so many gadgets you almost need to carry a
Pelican case with foam inserts to keep all your gadgets safe <g>.

snip
Yes, and many of us see in just that market products
designed by people like him who don't think that we, the end
users know wtf, and they have all the answers. The
arrogance of ignorance i always call it. Good intentions,
but we all know about good intentions.


Surprisingly common. Regrettably. Regardless of the
market and user base targeted. These folks should be made to USE
some of the products they've developed!


I've said this about a lot of things over the years. Not
just use it, but use it in the manner that the end user is
likely to use it. Sometimes if you do that you'll end up
going back to the drawing board.

I've done a couple of simple station logging apps for ham
radio people doing comms for public service events such as
bike-a-thons, etc. Just because I was net manager for the
operation. I also have an older DOS laptop I bring along,
because I can run it off 13.8 VDC. So, once I've rolled the
thing, I sit it in front of my lady and say, here, play with
it. If she doesn't understand the menus or how to get what
she needs then I go back to the drawing board. Sometimes
what's intuitive to me isn't to her, and when it's not I
know that I need to rethink the idea.

If you've read
some of the writings of Malcolm Chisholm he asserts rather
strongly that many mixing console manufacturers and
developers never sat behind one and tried to do a session
<grin>. It wasn't that they went into the project intending
to design a mixing console that was ergonomically
unfriendly, it was just that they hadn't really grabbed any
working audio engineer types and said "here, use this, tell
us what you think."


Exactly. Hence the reason I pick the brains of everyone I
can. There's only so much you can "imagine" -- regardless
of how good your imagination might be!


Right, which is another reason I'm leery of a lot of the
digital mixing console offerings for my remote truck right
now. Were I working with the same act doing the same show,
or pretty close to it, I could save my preferred working setup
on whatever storage media it uses, and if it crumps during
the gig, all I've got to remember is the keystrokes to get
it to load it back up for me. But, a remote truck might be
working a variety of things, and every time it goes out is
different.

Then there's the old question of what I do if I've got two of
us working the console, one of us flying in an effects cue with an
aux send, and the other one working with the faders for the
percussion section. How do we decide who gets what menu
up? If they solve the ergonomics to my liking, though, I'd
sure rather run Cat5 from venue to truck, or even better,
fiber. Yes, part of that stumbling block is blindness
related (see other post) but it's a combination of factors,
the blindness, as well as the fluid working environment.

I recall buying a microwave oven for my mother in law
many years ago. New fangled gizmo! Being an engineer,
I liked the pushbutton keypad -- nothing mechanical that
would be likely to break, wipe clean finish, etc.


I went through that wrestling match a year ago. I like my
knob, point it at what I want, no guessing. Those are hard
to find these days.

But, my wife insisted that we buy the model with the
rotary knob! Ick! But, she knew her mother and what
her mother would more readily relate to -- so I
deferred to her judgement. Of course, she was right.


Yep, I looked all over the Memphis area, finally got the
last one they had at Kmart last summer. I've an oven on our
kitchen range right now I can't use because of the damned
flat panel controls. I don't like cooking on electric
anyway, I've always preferred gas, believe it or not. When I
cut the fire off under that skillet or pan I want that fire
off now!!! Also, I can feel whether I've got high or low
flame. When I adjust the electric, if I misjudge it's going
to be awhile before my adjustment manifests itself, and by
then it might be too far gone to salvage.

And now, to go put some pieces of beef on this charcoal, or
at least get the charcoal happening <g>.



Regards,
Richard
.... Love is being owned by a rottweiler!
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.
#362
Posted to rec.audio.pro
Don Y
Posts: 137

Hi Richard,

On 5/25/2012 4:42 PM, Richard Webb wrote:

Note National Federation of the Blind vs. American Federation FOR
the Blind! Get the wrong preposition in there and
you'll never hear the end of it!


Yep, I've been one of those to quibble over that one. Big
difference.


And a very emotional one! The whole role of each organization
can be summed up in those two prepositions!

I used to spend a good deal of time with a guy ("Michael") who I
recall being tied to NFB somehow (I would stumble across
him in various places around the country instead of in
one single locality).

snip

Yeah, met him a few times. Michael Hingson, IIRC.


Ha! Excellent! I don't know if I ever knew his last name.
But, a quick google for images turned up lots of photos
of him that I could believe to be "what he looks like,
35 years later!"

He just paused a bit and said, "That's a good question.
You know, Don, if you can tell me what it is like to SEE,
I'll tell you what it's like NOT to see!" I've still
not figured out how I could redo that conversation
and have either of us get any MORE information out of
it.


rotfl Can relate.


But I couldn't! It had never occurred to me just how
silly the question was -- until afterwards.

It's sort of like the "tastes like milk" commercial
(What's milk taste like? shrug What does this
taste like? Milk!)

I never envied Ray his job. He had to move away from
engineering (why go to an engineering school if you want
to be a businessman?) to keep a business running. At
that time, we were very small. I'm willing to bet about
25 people, total. It was his job to make sure the money
kept coming in to keep us paid.


Yeah, I bet. Hard to take off the hat you've prepared your
life for and put on the other. Interesting story about the
crunch push when the money men wanted to see a working
product.


There are a couple of amusing stories that went along with
this -- but probably not appropriate for discussion in a
public forum! :

I believe he now has a cell-phone sized device that provides similar
functionality?


Yep, and it's marketed in partnership with NFB. Mike is
involved in that one. If I ever get some business things
caught up where I'd like them to be, one's in my future. Amazing
device. The ability to just sit down in a restaurant and
read the menu when they don't have one in braille ... a
liberating experience <g>.


That was the feeling I would get whenever setting up a
new machine at a new site. You'd get things running
properly. Some representative from the client agency
would sit down to use the machine -- invariably visually
impaired -- and you'd watch them cringe as they tried to
make sense of that god-awful voice! Literally *squinting*
as if that would somehow improve their hearing skills!

But, you could tell the instant they understood the
dialect. Their eyes would literally go wide -- "Wow!
I can finally read my own personal mail without having
to rely on my secretary -- and having her aware of
things that are none of her business!"

Liberating is a good term.

Thing is, you get so many gadgets you almost need to carry a
Pelican case with foam inserts to keep all your gadgets safe <g>.


Exactly. Each does *one* thing. And, often, not well!

Surprisingly common. Regrettably. Regardless of the
market and user base targeted. These folks should be made to USE
some of the products they've developed!


I've said this about a lot of things over the years. Not
just use it, but use it in the manner that the end user is
likely to use it. Sometimes if you do that you'll end up
going back to the drawing board.


There is also the hazard of making a device that defines
how it must be used. Even if that is the way that 99%
of the user base is likely to use it, it forces 100% of
users to follow that prescription -- even if it isn't
a necessary condition for the device's operation!

"Why do I have to specify this parameter before that
parameter? They are independent yet you are forcing
me to pick a certain one before the other. How did you
decide that this is the only way it should be done?"

I'm reading _The Art of Choosing_, currently. It addresses
how people deal with choices -- among other things. Things
like how the number of choices can affect our satisfaction
with our eventual choice. Too many can be worse than too
few, for example. On the other hand, how choices are
presented to you can greatly affect how well you can make
a set of choices and how happy you can be with the result.

In one example, they demonstrated (through experiment) how
the order that choices are forced upon a user can lead to
increased or decreased satisfaction. Think about web
interfaces where you are forced to make certain choices
before you are presented with the next set of choices -- even
if the first set has no bearing on the second set!

In this case, they allowed real customers to specify the options
they wanted in the automobile they would be ordering. For one
set of customers, they presented the choices in order of
"most choices" to "least choices". E.g., there were more
choices for body paint color than engine size so body color
was "selected" first and, eventually, engine size. For another
set of customers, the order in which the choices were presented
was the exact opposite -- pick the engine, transmission, choice
of sound system, etc. and, finally, the COLOR of the vehicle.

The result of that experiment -- which might not be generalizable
to choice, in general -- was that people found it easier to
proceed from those options with FEW choices to those with
MORE choices than the other way around. I.e., once the user
had specified the engine, accessories, body style, etc., they
had a better image in their mind for how to specify the
remaining options -- like body color.

Wanna bet that most vendors just throw choices at the user
in whatever order is convenient for the vendor??! I.e., if
we know what file format he wants, then we can refine the
sample rates and data formats to those that are supported
*in* that file format. "Piece of cake!"

Why can't you let the user decide what is important to him
and *then* refine your offerings?! The technical problem
in implementing this is exactly the same! But, the
attitude conveyed to the user is entirely different!
*He* drives the device instead of the device driving *him*!
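
To make the point concrete, here is a minimal Python sketch of letting the user pick options in any order and refining the rest from whatever has been chosen so far. The catalogue contents, the option names, and the function name `remaining_options` are all made up for illustration; no particular product works this way as far as the thread says.

```python
from typing import Dict, List, Set

# Hypothetical catalogue: each row is one supported combination.
CATALOGUE: List[Dict[str, str]] = [
    {"file_format": "WAV", "sample_rate": "44100", "data_format": "16-bit PCM"},
    {"file_format": "WAV", "sample_rate": "96000", "data_format": "24-bit PCM"},
    {"file_format": "FLAC", "sample_rate": "44100", "data_format": "16-bit PCM"},
    {"file_format": "FLAC", "sample_rate": "96000", "data_format": "24-bit PCM"},
    {"file_format": "MP3", "sample_rate": "44100", "data_format": "lossy"},
]


def remaining_options(chosen: Dict[str, str]) -> Dict[str, Set[str]]:
    """For every option not yet chosen, list the values still compatible.

    The user may start from any option; nothing here assumes that the
    file format has to be picked first.
    """
    # Keep only the combinations that agree with everything chosen so far.
    matches = [row for row in CATALOGUE
               if all(row[key] == value for key, value in chosen.items())]
    options: Dict[str, Set[str]] = {}
    for row in matches:
        for key, value in row.items():
            if key not in chosen:
                options.setdefault(key, set()).add(value)
    return options


# Start from the sample rate instead of the file format.
print(remaining_options({"sample_rate": "96000"}))
```

The filtering work is the same whichever option comes first, which is the sense in which the technical problem does not change; only who sets the order does.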

Exactly. Hence the reason I pick the brains of everyone I
can. There's only so much you can "imagine" -- regardless
of how good your imagination might be!


Right, which is another reason I'm leery of a lot of the
digital mixing console offerings for my remote truck right
now. Were I working with the same act doing the same show,
or pretty close to it, I could save my preferred working setup
on whatever storage media it uses, and if it crumps during
the gig, all I've got to remember is the keystrokes to get
it to load it back up for me. But, a remote truck might be
working a variety of things, and every time it goes out is
different.


OK, I think I follow your reasoning -- though have no
firsthand experience in that application domain (so I
can't comment on how I would react when faced with
the same issues)

Then there's the old question of what I do if I've got two of
us working the console, one of us flying in an effects cue with an
aux send, and the other one working with the faders for the
percussion section. How do we decide who gets what menu
up? If they solve the ergonomics to my liking, though, I'd
sure rather run Cat5 from venue to truck, or even better,
fiber. Yes, part of that stumbling block is blindness
related (see other post) but it's a combination of factors,
the blindness, as well as the fluid working environment.


Can't multiple menus be displayed concurrently?
For example, there are many desktop GUIs that will
let you "pin" (think: thumbtack) a menu or a dialog
to the desktop so that it is "persistent". When you
want to remove the object, you remove the "pin"
and the object goes away.

So, you could open each thing that you wanted to access,
move them to appropriate parts of the desktop, then
pin them in place so they stay accessible/active.

You might also look into what are called "pie menus".
With these, you open the menu and find yourself in
the center of a circular "pie". From there, you pick
a direction to select a specific item from the menu.
Think of the menu as slices of a pie and you are just
deciding which slice you want -- always from the known
reference point in the center of the pie!
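
A rough sketch of how that selection could work, in Python. The function name `pick_slice`, the convention that the first item sits straight up with the rest following clockwise, and the sample menu are all assumptions for illustration, not any particular toolkit's behaviour.

```python
import math
from typing import List


def pick_slice(dx: float, dy: float, items: List[str]) -> str:
    """Map a movement away from the pie's centre to a menu item.

    dx is positive to the right, dy is positive upward. Slice 0 is
    centred on "straight up" and the slices proceed clockwise.
    """
    if not items:
        raise ValueError("menu has no items")
    if dx == 0 and dy == 0:
        raise ValueError("no direction chosen yet")
    # Angle measured clockwise from straight up, in the range [0, 360).
    angle = math.degrees(math.atan2(dx, dy)) % 360.0
    width = 360.0 / len(items)
    # Shift by half a slice so each slice is centred on its direction.
    index = int(((angle + width / 2) % 360.0) // width) % len(items)
    return items[index]


menu = ["cut", "copy", "paste", "undo"]
print(pick_slice(0, 1, menu))   # movement straight up
print(pick_slice(1, 0, menu))   # movement to the right
```

Only the direction matters, not the distance travelled, so the same gesture always lands on the same item.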

Of course, menus have to be designed to keep the number
of choices small. Much easier to pick from 6 or 8
"slices" than 16 or 18!
#363
Posted to rec.audio.pro
Don Y
Posts: 137

Hi Richard,

The asterisks spoke, because I've configured the screen
reader to speak them, as people use them to set text apart,
or the dreaded footnotes of course.


OK, so my use of them for emphasis detracts from your comprehension
instead of adding to it!

But, the all caps didn't, at least when just reading.

Now, were I editing, capital letters are spoken with a bit
of a raised inflection, but, when read as words they're not.

When just in the reader, not editing, for example, reading
usenet articles, a book that's text or similar I have most
punctuation disabled so sentences sound normal.


Understood. But, would you have any cues that there might
be some punctuation that you might want to see which has
been silenced? For example, the cartoonish way of showing
a pejorative as a jumbled sequence of ad hoc punctuation
marks like $(&*^@#$!

On a different note, I assume you are also victimized by
spelling errors? For example, I tend to end up transposing
pairs of letters simply because one finger finds its way
to a key before the other -- which should have preceded it.
Like teh instead of the.

Cutting to the chase on some of this, colors aren't spoken
at all. I can have the screen reader, and most screen
readers, monitor a portion of the screen, say a status line,
for either a change in text displayed there, or a change in
attributes.


Does it simply speak each change encountered? What if the
time to speak the information exceeds the time between
changes? For example, a timer counting down seconds remaining
until a task's completion. But, "one minute forty nine seconds"
takes longer to speak than the time for the display to change
to "one minute forty eight seconds". Likewise, is the
screen reader preoccupied with this task or can it also
let you wander around to other parts of the screen while
it is monitoring that section?

That's another reason I like ASAP. If, for
example, I want a bit of a different configuration which is
more compatible with a drop down menu in an app, we watch
for the status line to change. If it changes to x, load y
configuration, etc.
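
The "if it changes to x, load y" rule can be sketched in a few lines of Python. This is only an illustration of the idea: the class name `StatusWatcher`, the rule table, and the configuration names are invented, and it makes no claim about how any real screen reader is actually configured.

```python
from typing import Dict, Optional

# Hypothetical rules: status-line text -> name of the configuration to load.
RULES: Dict[str, str] = {
    "MENU": "menu_config",
    "EDIT": "editor_config",
}


class StatusWatcher:
    """Track one monitored screen region and say which configuration to load."""

    def __init__(self, rules: Dict[str, str]) -> None:
        self.rules = rules
        self.last_text: Optional[str] = None

    def poll(self, current_text: str) -> Optional[str]:
        """Return a configuration name if the text changed to a known value."""
        if current_text == self.last_text:
            return None  # nothing changed, so nothing to do
        self.last_text = current_text
        return self.rules.get(current_text)  # None when no rule matches


watcher = StatusWatcher(RULES)
for text in ["MENU", "MENU", "EDIT"]:
    print(text, "->", watcher.poll(text))
```

Acting only on a change, not on every poll, keeps the same configuration from being reloaded over and over while the status line sits still.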


Ah, OK.

You're right, it takes a bit of extra work to make screen
access technology play with what a person might just
download or use off the shelf. That's why I like things
with textual configuration scripting or guidance, and the
ability to make the program operate as I want it to operate
<g>.


Understood.

Oh, OK. Some software will note the "email context"
and try to actually keep track of this for you -- using
different voices for each party, etc. Though not their
ACTUAL voices!


No different voices, I just have the greater than symbol in
my punctuation exceptions for anything that's a mail or
usenet reader app, so it says "greater" then the line of
text.


OK. Obviously even the number of different voices has a small
upper limit. Keeping track of three different parties in
quoted text would probably leave you distracted by those
voices instead of aided.

So, folks who just quote entire posts and bottom post their
replies are just as bad as folks who top post. Each
is equally hard for you to put back into context (you have to
REMEMBER what was said and remember the reply and then thread them
together in your mind)


Yep, you read through all that, or turn off the filter
quoting and see three or four screens of quoted material for
a two liner reply <g>. One reason braille will always be
superior, the ability to skim.


This is directly analogous to reading printed text.
You can cherry pick through large amounts of information
with relatively little effort.

[jaws copy protection]

Yep, see my other post. My stumbling block came with it
mainly when I wanted to install a copy on mom's machine so I
could help her maintain it, a copy on the studio control
room machine, and one in the office. I'd *never* be using
all three simultaneously. Then when I had a system crash on
one system my install key floppy didn't play. That copy
protection has been hacked, but I refuse to play that game
for obvious ethical reasons. Others may, but I respect
intellectual property rights. Ted Henter worked a long time
to develop it, and though I may not like his protection
scheme, that doesn't give me the right. You know the drill.


Yup. This is the dark side of illegal copying. It forces
authors to waste effort protecting their works. And, screws
legitimate users out of the ability to use the product
"fairly".

For example, if I have three computers but only use
one at a time, the morally correct thing is to have one
license. But, how does the author ensure that I really
*am* using just one at a time? How does the author
ensure that the "second computer" isn't a friend's
computer?

The Mercator project (now defunct) tried to layer a speech
interface *under* (not ON TOP OF!) the GUI in UNIX. I.e.,
it replaced the standard GUI libraries with speech-enabled
ones with which the "screen reader" could interact. So,
it knew that "these buttons are part of a group of RADIO
BUTTONS governing this particular option choice", and
"this text box expects a numeric value that specifies
the age of the person", etc.


Yeah, there were a couple like that, had heard of that one,
or the "Speaqualizer" project. Both were failures in the
marketplace. Mercator may have never made it to market, but
they tried with Speaqualizer for awhile.


Mercator was an academic project. Yet another example of
people thinking that there would be a "simple" way to
address this problem.

The only simple way to address the problem is NOT to provide
a visual interface! So, applications ALL have to rely on
the same non-visual interface to interact with their users!

If you provide a nonvisual OPTION, then applications will
only give token support -- if any -- to it. On the other
hand, if the only way to get information out of a device is
via that option, then they don't have a choice! This is
the approach I have been taking, lately. Pick an output
modality that addresses everyone in the target audience
and force everything to use that single mechanism!

Festival -- a free package probably available under Linux -- has a
lot of "context modules" that try to alter the rules for
pronunciation based on context. I.e., so email addresses
are pronounced as "Richard dot Webb dot my dot foot at ..."
instead of some unpronounceable jumble of letters and symbols. For
example, the C C header would be pronounced as "carbon
copy", etc.
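
As a toy illustration of that kind of context rule, the Python below spells out the separators in an e-mail address so a synthesizer gets pronounceable words. It is a sketch only: the function name `speak_email` and the symbol table are assumptions, and it is not Festival's API or how its modules are written.

```python
def speak_email(address: str) -> str:
    """Expand an e-mail address into words a synthesizer can pronounce."""
    # Assumed spoken forms for the separators commonly found in addresses.
    spoken = {".": "dot", "@": "at", "-": "dash", "_": "underscore"}
    words = []
    token = ""
    for ch in address:
        if ch in spoken:
            if token:
                words.append(token)  # finish the word collected so far
                token = ""
            words.append(spoken[ch])
        else:
            token += ch
    if token:
        words.append(token)
    return " ".join(words)


print(speak_email("Richard.Webb@example.com"))
```

A real system would first have to recognise that a piece of text is an address at all, which is where the context detection comes in.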


Yep, which was I think why so many complained when the
National Weather Service went with DECtalk speech synths
for their VHF radio forecasts.


The backup speech synthesizer in one of my products has similar
quality issues as Klatt's DECtalk (he wrote it while a student
and DEC commercialized it). Its biggest advantage is that
it is pretty lean when it comes to resources -- which translates
directly to implementation costs and reliability.

I have a DECtalk DTC01 and a DECtalk Express. Plus a few
of the Artic/Votrax-based synthesizers. All have the same
basic advantage -- and the same robotic speech quality!

On the other hand, Festival has a huge footprint. And, is
considerably easier to crash than DECtalk. Where DECtalk
and the other "simple" synthesizers will take a stab at
pronouncing damn near anything you throw at them, Festival
will chew on it for a fair bit of time before committing
to a pronunciation -- which can be just as wrong as the
other products!

OK. Any particular reason why you're married to that machine?


I'd like to have the RAID array for the server, and yes, once
we've relocated, a net connected server is part of the battle plan.
RAID would be nice. I've got another box which is going to
be dedicated to firewall/router duties, but would like to
keep that one as server, which was what it did in its former life.


Does the chassis force you to use a certain type of
disk drives? E.g., because of disk carriers? Many
older RAID offerings require SCSI disks. Would you
be happy with RAID in some other form?

Interesting that you learned braille. I'd be lost without
it.


When I worked for Kurzweil, I was dealing with visually
impaired customers AT BEST! Seems disrespectful not to
learn to communicate in the form that THEY require.
I.e., doesn't do me much good to leave a handwritten note
telling them "I'll be back after lunch"!

[sightless V U meter]

Ah, OK. Clever.


Yep, for some plans and simple designs, go to ski.org and
download sktf.zip, it's about a 2 MB zip file, multiple
directories, but text files on lots of things, home brewing
adaptive VU solutions, soldering jigs, all sorts of stuff.


OK.

OK. So, this is "yet another DEVICE" that you have.
Like a tactile wristwatch, braille slate, talking
calculator, etc. I.e., it is designed for ONE PURPOSE.


Yep, used to have the talking calculator, but now just use a
little command line calculator I found on the net some years
ago. Or do a lot of math in my head when out and about.


Yeah, I had given a lot of thought to how you provide
a means for letting folks review their calculations
without being able to view a "tape".

Understood. But, this can be done with different approaches!
For example, one approach is to always reset things to
"the beginning" -- or some other known state. Another
approach is to leave things where you last left them
on the assumption that you will want to do the same sort
of thing, again.


Maybe, but some devices, such as Roland's sound modules, like
to remember where you were last time, and heck, it might be
a week before I want to delve into its menus again, and I
might not remember where I was last time.


Ah, OK. No, I think a device should remember what
you did "last time" -- but, only while you are actively
and continuously using it. If you want to be able
to return to a certain set of options some days later,
you should be able to save those options and explicitly
restore them. If you turn the device off and start
over tomorrow, then everything should revert to some
default -- perhaps even one that YOU have defined
instead of that which the manufacturer has defined.
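
That policy can be sketched in Python: settings persist while the device stays on, presets are saved and restored explicitly, and a power cycle falls back to a default the user may have redefined. The class name `DeviceSettings`, the setting names, and the factory values are hypothetical.

```python
from typing import Dict

# Hypothetical factory values; a real device would have its own.
FACTORY_DEFAULTS: Dict[str, int] = {"volume": 5, "patch": 1}


class DeviceSettings:
    """Remember settings during a session; reset to a default on power-up."""

    def __init__(self) -> None:
        self.user_default: Dict[str, int] = dict(FACTORY_DEFAULTS)
        self.presets: Dict[str, Dict[str, int]] = {}
        self.current: Dict[str, int] = dict(self.user_default)

    def set(self, key: str, value: int) -> None:
        # Remembered for as long as the device stays powered on.
        self.current[key] = value

    def save_preset(self, name: str) -> None:
        # Explicit save, so the user can come back days later.
        self.presets[name] = dict(self.current)

    def restore_preset(self, name: str) -> None:
        self.current = dict(self.presets[name])

    def set_user_default(self) -> None:
        # Let the user, not the manufacturer, define the start-up state.
        self.user_default = dict(self.current)

    def power_cycle(self) -> None:
        # Turning the device off and on discards the session state.
        self.current = dict(self.user_default)


device = DeviceSettings()
device.set("patch", 42)
device.save_preset("gig")
device.power_cycle()
print(device.current)          # back to the default
device.restore_preset("gig")
print(device.current)          # explicitly restored
```

Nothing is restored behind the user's back, so after a week away the device always starts from a known state.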

[computer interface with speech in a live environment]

What can you suggest as an alternative? Is the problem
the quality of the voice? Or the masking effects of
all that music in the background?


Yep, the music, and I'm supposed to be giving my ears to the
audio. Also, you can't amplify speech in an earbud loud
enough often unless you're doing bad things to the ear
canal.


Understood. You want a different communication channel
to interact with the device instead of having to share
the audio channel that you are devoting to the task at
hand.

I've used the same rationale to argue in favor of using
non-visual channels for visual tasks! I.e., those cases
where your eyes are busily engaged in some activity and
shouldn't have to be pulled away just so you could see
which virtual button you were pressing on your iPhone!
#364
Posted to rec.audio.pro
Don Pearce[_3_]
Posts: 2,417

On Fri, 25 May 2012 15:38:54 -0700, "William Sommerwerck"
wrote:

As the wings go forwards through the air, they
twist the air downwards behind them. They pull
the air above them down as they go by. As they
try to pull the air down, the air tries to push the
wings up, and that's what holds the plane up in
the air.


That is very far removed from the common explanation. Bernoulli must be
spinning in his grave. Like a helicopter blade.


There are two explanations in common use. Neither is right or wrong,
both are just a way of looking at things. One considers pressure and
resulting force, the other moving air mass and Newtonian reaction. The
maths of both works out fine.

If you are doing wing design, the Navier-Stokes equations (which use
the first model) are tried and tested.

d
#365
Posted to rec.audio.pro
Richard Webb[_3_]
Posts: 533

On Fri 2012-May-25 19:46, Don Y writes:

Note National Federation of the Blind vs. American Federation FOR
the Blind! Get the wrong preposition in there and
you'll never hear the end of it!

Yep, I've been one of those to quibble over that one. Big
difference.


And a very emotional one! The whole role of each organization can
be summed up in those two prepositions!


Yep, and not just emotional, but that difference makes all
the difference in the world. "For" is somebody doing
something "for" somebody. "Of", that small two letter word,
means these are the people being represented.

I used to spend a good deal of time with a guy ("Michael") who I
recall being tied to NFB somehow (I would stumble across
him in various places around the country instead of in
one single locality).

Yeah, met him a few times. Michael Hingson, IIRC.


Ha! Excellent! I don't know if I ever knew his last name.
But, a quick google for images turned up lots of photos
of him that I could believe to be "what he looks like,
35 years later!"


Yep. Mike was sort of a local hero for folks who worked in
the WTC on 9/11, helping lead a bunch of them out.

snip
I believe he now has a cell-phone sized device that provides similar
functionality?

Yep, and it's marketed in partnership with NFB. Mike is
involved in that one. If I ever get some business things
caught up where I'd like them to be, one's in my future. Amazing
device. The ability to just sit down in a restaurant and
read the menu when they don't have one in braille ... a
liberating experience <g>.


That was the feeling I would get whenever setting up a
new machine at a new site. You'd get things running
properly. Some representative from the client agency
would sit down to use the machine -- invariably visually
impaired -- and you'd watch them cringe as they tried to
make sense of that god-awful voice! Literally *squinting*
as if that would somehow improve their hearing skills!


Yeah, I know, when Ray eventually went with Digital
Equipment's DECtalk it improved on the original voice quite
a bit. The only way to get more natural sounding
synthesized speech than DECtalk is the way AT&T does it,
capturing the phonemes of an actual speaker, easy to
do when the vocabulary required is rather limited, such as
numbers and a few words. Most of what the public
encounters with telephone systems is this latter type. As I
noted, NOAA for a long time when they first went digital
with their VHF weather broadcasts was using the DECtalk
voices. I'm using a DoubleTalk card here, not quite as
natural to the uninitiated, but still good enough, and, at
the time, half the price <g>. I also liked the serial port
DoubleTalk, small package, powered from a 9 volt cell.


But, you could tell the instant they understood the
dialect. Their eyes would literally go wide -- "Wow!
I can finally read my own personal mail without having
to rely on my secretary -- and having her aware of
things that are none of her business!"


Liberating is a good term.


Indeed it is, when you've never had the experience of being
able to read your own mail, or even identify whether the
customer truly did give you a $20 bill.

Thing is, you get so many gadgets you almost need to carry a
pelican case with foam inserts to keep all your gadgets safe <g>.


Exactly. Each does *one* thing. And, often, not well!


Yeah, I know. Been there, done that. Back years ago when I
was doing briefcase live sound I'd have my vibrating VU
meters, the DoubleTalk external card, a talking calculator,
talking VOM, etc.

snip
I've said this about a lot of things over the years. Not
just use it, but use it in the manner that the end user is
likely to use it. Sometimes if you do that you'll end up
going back to the drawing board.


There is also the hazard of making a device that defines
how it must be used. Even if that is the way that 99%
of the user base is likely to use it, it forces 100% of
users to follow that prescription -- even if it isn't
a necessary condition for the device's operation!


Yeah there's that. Liked your discussion of choices. To me
color is one of the last things I might want to think about
were I buying a new vehicle. First off, I'd probably want
to talk to the dealer about the trailer towing package,
which will of course dictate the type of engine/drivetrain
available. Then we get into the amenities, bells and
whistles, etc. But the function of the thing is
going to be what I want to get nailed down first.

You know I was reading on a similar subject to your reading on
choices recently, an economics professor from MIT on how our
choices impact the economic decisions we make, touching on
ethics, all sorts of stuff like that. Called Predictably
Irrational. Can't recall author's name right now, but it,
and his companion piece "The Upside of Irrationality" are
both interesting reads on the subject.

When I was operating a fixed location studio and I'd have a
songwriter coming in for demos or a group I'd always ask my
first question which was "In your mind's ear, when you hear
your song fully arranged and produced, what does it sound
like? Bring me an example of production already recorded
that fits what your mind's ear hears." This way, I could
choose the right capture techniques, such as how I'd place
instruments, how I'd mic drums, etc.

"Why do I have to specify this parameter before that
parameter? They are independent yet you are forcing
me to pick a certain one before the other. How did you
decide that this is the only way it should be done?"


Another reason I like configuring software with text files
if I can get it. I can look through the configuration file,
set options I'm sure of, and do some more poring over the
docs to understand further what needs to be defined.

snip
Wanna bet that most vendors just throw choices at the user
in whatever order is convenient for the vendor??! I.e., if
we know what file format he wants, then we can refine the
sample rates and data formats to those that are supported
*in* that file format. "Piece of cake!"


Yep, that's my whole way of looking at this sort of thing.
What do I want? What will support what I want, i.e. sample
rate I wish, interplatform portability, etc.

Why can't you let the user decide what is important to him
and *then* refine your offerings?! The technical problem
in implementing this is exactly the same! But, the
attitude conveyed to the user is entirely different!
*He* drives the device instead of the device driving *him*!


Yeup, my point exactly. See below.

Right, which is another reason I'm leery of a lot of the
digital mixing console offerings for my remote truck right
now. Were I working with the same act doing the same show,
or pretty close to it, I could save my preferred working setup
on whatever storage media it uses, and if it crumps during
the gig, all I've got to remember is the keystrokes to get
it to load it back up for me. But, a remote truck might be
working a variety of things, and every time it goes out is
different.


OK, I think I follow your reasoning -- though have no
firsthand experience in that application domain (so I
can't comment on how I would react when faced with
the same issues)


Yep, this one might be a sporting event, the next might be a
festival with all sorts of acts coming on and off stage, the
next event, capture of a gospel revival type event for
broadcast.

Then there's the old question of what I do if I've got two of
us working the console, one of us is flying in an effects cue
with an aux send, and the other one is working with the faders
for the percussion section. How do we decide who gets what menu
up? If they solve the ergonomics to my liking though I'd
sure rather run Cat5 from venue to truck, or even better,
fiber. Yes, part of that stumbling block is blindness
related (see other post) but it's a combination of factors,
the blindness, as well as the fluid working environment.


That, and I like my reliability. I'm so familiar with
analog consoles of various types that when the "oh ****"
moment hits, I fall back on what I've learned, and don't
have the anxiety of wondering if this thing's going to crump
in a way that I can't get it back to getting usable work
done when it's for the money.

Can't multiple menus be displayed concurrently?
For example, there are many desktop GUIs that will
let you "pin" (think: thumbtack) a menu or a dialog
to the desktop so that it is "persistent". When you
want to remove the object, you remove the "pin"
and the object goes away.


There's the rub. I've seen two approaches with a lot of
these.

One approach gives you banks of channel strips, say 1-16,
17-32, etc. Possibly even in 8 channel banks, to make for a
smaller footprint. So, if I'm wanting to do a line check on
channel 24 let's say, and we've got 8 channel banks, I've
got a choice, disrupt the work of the mixer mixing the show
while I do that line check, or not.

The other approach, limited actual controls, and your menu
selects whether those controls are faders, pan controls,
aux sends, etc.

There's my main stumbling block. With my old analog iron
all the aux sends are there, bus assignments, vca groups,
all are right there. Yes it means sometimes the mixer is
working at full extension of his body to reach that control,
but that control is there, and I can manipulate it, or
somebody else can while I'm doing something else. One of
the biggest praises you'll hear sung of a lot of the new
digital consoles is the smaller footprint, no more working
at full extension to reach that control. But, for me, that
smaller footprint is in exchange for reliability, and the
familiar interface. After all, I've been interacting with
analog consoles now for decades.

But again, some parts of
that might be as simple as my EE friend when I was asking
him about digital audio metering when he basically gave me
the "free your mind instead" comment. Btw, this was a blind
electrical engineer. He reminded me that the work flow, and
the development of a "house standard" was probably more
important for me to keep to on every project. I settled on
the usual 0 VU = -18 dBFS because it seems to be acceptable
most places digital audio might go. I just got in the habit
of calibrating the system to that, and printing some 1 kHz
tone at 0 VU (-18 dBFS) on anything that was going away, either
for mastering or for broadcast.
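
As a rough illustration of the line-up tone habit described above, here is a minimal Python sketch that renders a 1 kHz sine at -18 dBFS into a WAV file. It assumes the common convention that a full-scale sine reads 0 dBFS, so the peak amplitude is 10^(-18/20) of full scale. The function names, the 48 kHz rate, and the output file name are all made up for the example; it is not anyone's actual calibration procedure.

```python
import math
import struct
import wave


def tone_samples(freq_hz=1000.0, level_dbfs=-18.0, sample_rate=48000, seconds=1.0):
    """Return float samples (-1.0..1.0) of a sine whose peak sits at level_dbfs."""
    # Assumption: a full-scale sine is 0 dBFS, so dBFS maps to peak amplitude.
    amplitude = 10 ** (level_dbfs / 20.0)
    count = int(sample_rate * seconds)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(count)
    ]


def write_wav(path, samples, sample_rate=48000):
    """Write mono 16-bit PCM using only the standard library."""
    frames = b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


if __name__ == "__main__":
    # Hypothetical file name; 30 seconds of tone at the head of a reel.
    write_wav("lineup_tone.wav", tone_samples(seconds=30.0))
```

Whether the analog meter then reads 0 VU depends entirely on how the converters and the console are aligned, which is the "house standard" part.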

What I'd really want to do before plunking down the dollars
for a digital console was actually work with it for a couple
of days first, and get my mind around some of the concepts,
then decide if that one fits in my working environment.

You might also look into what are called "pie menus".
With these, you open the menu and find yourself in
the center of a circular "pie". From there, you pick
a direction to select a specific item from the menu.
Think of the menu as slices of a pie and you are just
deciding which slice you want -- always from the known
reference point in the center of the pie!


Interesting concept. Don't know if they offer that sort of
thing though. This is why I'd really need to sit down in
front of one for a day or two, at least with multitracks of
prerecorded material up to really put the thing through its
paces, and why often I'm reluctant to buy a new device from
the in-store demo. It's taken me a long time to decide on
one of those little recorders like the zoom, etc. But,
thanks to reviews in this group I think the Tascam is in my
very near future. I asked the reviewer specifically to use
the thing thinking about how easy it was to interact with
sans looking at the device.

Of course, menus have to be designed to keep the number
of choices small. Much easier to pick from 6 or 8
"slices" than 16 or 18!


rotfl Then there's that. You can always offer me related
choices on a submenu.

Regards,
Richard
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.


  #366   Report Post  
Posted to rec.audio.pro
Mxsmanic Mxsmanic is offline
external usenet poster
 
Posts: 805
Default FLAC or other uncompressed formats, which is best?

William Sommerwerck writes:

As the wings go forwards through the air, they
twist the air downwards behind them. They pull
the air above them down as they go by. As they
try to pull the air down, the air tries to push the
wings up, and that's what holds the plane up in
the air.


That is very far removed from the common explanation.


The "common" explanation is incorrect. This explanation is correct.
  #367   Report Post  
Posted to rec.audio.pro
Neil Gould Neil Gould is offline
external usenet poster
 
Posts: 872
Default FLAC or other uncompressed formats, which is best?

Don Y wrote:
Hi Neil,

On 5/25/2012 1:04 PM, Neil Gould wrote:
Don Y wrote:

The difference is between knowing the "facts" that describe the
thing in question vs. being able to "internalize" your understanding
of it. "Grok", if you understand the reference. Often, the latter
may be an incredibly dumbed down "feel" for what's going on
vs. a highly technical rationalization for it.

E.g., a wing provides lift because the THICKER air under
it pushes it up through the THINNER air flowing over it!
:-/

??!!??

Aside from that being completely wrong, I hope it's not an example of
"Grokking" the topic! ;-)


It's not a "technical explanation" but, rather, a way of internalizing
what is happening.

Why would one want to "internalize" a completely incorrect notion of how
something works? What is the value of that? Would it not be better to
"internalize" a valid explanation?

How would *you* explain lift to a 5 year old?

There are things in life that a 5 year old can't understand. Still, as one
who built flying model planes from earlier than that age, it is possible to
help a 5 year old work with the principles without their having to
understand the technical details. It is a pointless and possibly harmful
setback to the child to give explanations that are completely wrong.

--
best regards,

Neil


  #368   Report Post  
Posted to rec.audio.pro
Neil Gould Neil Gould is offline
external usenet poster
 
Posts: 872
Default FLAC or other uncompressed formats, which is best?

Don Pearce wrote:
On Fri, 25 May 2012 15:38:54 -0700, "William Sommerwerck"
wrote:

As the wings go forwards through the air, they
twist the air downwards behind them. They pull
the air above them down as they go by. As they
try to pull the air down, the air tries to push the
wings up, and that's what holds the plane up in
the air.


That is very far removed from the common explanation. Bernoulli must
be spinning in his grave. Like a helicopter blade.


There are two explanations in common use. Neither is right or wrong,
both are just a way of looking at things. One considers pressure and
resulting force, the other moving air mass and Newtonian reaction. The
maths of both works out fine.

If you are doing wing design, the Navier-Stokes equations (which use
the first model) are tried and tested.

Thanks for saving me a bit of time with your excellent summation! ;-)

--
best regards,

Neil


d



  #369   Report Post  
Posted to rec.audio.pro
Don Y Don Y is offline
external usenet poster
 
Posts: 137
Default FLAC or other uncompressed formats, which is best?

Hi Richard,

Ha! Excellent! I don't know if I ever knew his last name.
But, a quick google for images turned up lots of photos
of him that I could believe to be "what he looks like,
35 years later!"


Yep. Mike was sort of a local hero for folks who worked in
the WTC on 9/11, helping lead a bunch of them out.


Obviously long after my experiences with him. Must have been
a doubly terrifying experience for him.

That was the feeling I would get whenever setting up a
new machine at a new site. You'd get things running
properly. Some representative from the client agency
would sit down to use the machine -- invariably visually
impaired -- and you'd watch them cringe as they tried to
make sense of that god-awful voice! Literally *squinting*
as if that would somehow improve their hearing skills!


Yeah I know, when Ray eventually went with Digital
Equipment's DECtalk it improved on the original voice quite
a bit. The only way to get more natural sounding
synthesized speech than DECtalk is the way AT&T does it,
by capturing the phonemes of an actual speaker, easy to
do when the vocabulary required is rather limited, such as
numbers and a few words.


Exactly. But of extremely limited use!

Most of what the public
encounters with telephone systems is this latter type. As I
noted, NOAA for a long time when they first went digital
with their VHF weather broadcasts was using the DECtalk


I suspect you could rework the weather broadcasts to use
a limited vocabulary and, thus, better speech quality.
On the other hand, it means that you have to be able to
anticipate EVERYTHING that you might need to say over
that medium. For example, you might not be prepared to
use it to announce an alien invasion! grin

I have a trimmed down synthesizer that I fall back on if
the primary synthesizer is unavailable in one of the products
I'm designing, currently. It needed to be robust -- so that
I could count on it working regardless of what might have
broken in the system. I could have opted for a better
quality limited vocabulary design -- but, didn't want to have
to set that vocabulary in stone and discover, later, that
I needed to be able to say something that I couldn't.

voices. I'm using a Doubletalk card here, not quite as
natural to the uninitiated, but still good enough, and, at
the time, half the price g. I also liked the serial port
doubletalk, small package, powered from a 9 volt cell.


The DECtalk express suffers from the sin of requiring a
special rechargeable battery. Another fault, in my opinion,
for an assistive technology device (where do you buy that
replacement battery -- today??)

There is also the hazard of making a device that defines
how it must be used. Even if that is the way that 99%
of the user base is likely to use it, it forces 100% of
users to follow that prescription -- even if it isn't
a necessary condition for the device's operation!


Yeah there's that. Liked your discussion of choices. To me
color is one of the last things I might want to think about
were I buying a new vehicle. First off, I'd probably want
to talk to the dealer about the trailer towing package,
which will of course dictate the type of engine/drivetrain
available. Then we get into the amenities, bells and
whistles, etc. But first, the function of the thing is
going to be what I want to get nailed down first.


Exactly. So, for a web site to ask you to pick a color,
first, isn't helpful. Especially if, later, you realize that
your choice has ruled out something else that you really
want most!

"I'm sorry but the automatic transmission option that you
selected disqualifies the choice of 7 liter diesel. Would
you like to start over?"

Imagine if it isn't even smart enough to tell you of
that constraint! You get to the point where you expect
to select an engine and find the engine not listed!

You know I was reading on a similar subject to your reading on
choices recently, an economics professor from MIT on how our
choices impact the economic decisions we make, touching on
ethics, all sorts of stuff like that. Called Predictably
Irrational. Can't recall author's name right now, but it,
and his companion piece "The Upside of Irrationality" are
both interesting reads on the subject.


Dan Ariely. We were "required" to take eight courses in
The Humanities to graduate. I guess they didn't want a
bunch of engineers with no appreciation of other aspects
of life and education let loose on the unsuspecting masses.

grin

I recall selecting American History as one of my courses
thinking I had already had two years of that in High School
so it would be a recent memory, for me! The professor was
an economist. So, I relearned all that history with an
entirely different spin than the noble presentation to
which I'd previously been subjected. Fascinating!

So, I've enjoyed reading books by economists that touch
on these sorts of subjects. _The Price of Everything_
discusses all of our actions -- social and otherwise -- in
terms of economic transactions. E.g., a woman selling
uterine services in a marriage transaction.

_The Art of Choosing_ describes how we "value" choice
in different societies and how it impacts our decisions.
For example, how much we will "spend" to keep choices
available even if they aren't choices of which we would want
to avail ourselves.

_How We Decide_ and _Predictably Irrational_ looked at how
easily we are manipulated and con ourselves in our behavioral
choices, etc. How we can actually think a $10 pill is
better than an identical $0.50 pill, etc. How we *don't*
have a "Market" in which consumers and producers compromise
on price but, rather, how Producers manipulate our expectations
of price to a point that they are happy with, etc.

By far, the experiments that have been concocted and presented
in the texts are the most fascinating. And, they make you
laugh at the snobbery that you often see around you -- the folks
who couldn't differentiate an $80 bottle of wine from a $2
bottle of wine -- yet, when confronted with the $80 price tag
ON THE $2 BOTTLE, would *swear* it tastes a LOT better than
the $80 bottle that has been mislabeled as $2!

When I was operating a fixed location studio and I'd have a
songwriter coming in for demos or a group I'd always ask my
first question which was "In your mind's ear, when you hear
your song fully arranged and produced, what does it sound
like? Bring me an example of production already recorded
that fits what your mind's ear hears." This way, I could
choose the right capture techniques, such as how I'd place
instruments, how I'd mic drums, etc.


Good point! I would never buy consumer kit from specs.
Rather, how it sounded to me when reproducing the sorts of
program material I was listening to at that point in my life.

"Why do I have to specify this parameter before that
parameter? They are independent yet you are forcing
me to pick a certain one before the other. How did you
decide that this is the only way it should be done?"


Another reason I like configuring software with text files
if I can get it. I can look through the configuration file,
set options I'm sure of, and do some more poring over the
docs to understand further what needs to be defined.


The problem with that approach comes when two option
choices are interdependent. There is nothing preventing
you from asking for a set of incompatible options -- until
some program examines your choices and complains.
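
To make that concrete, here is a minimal Python sketch of the "some program examines your choices" step: a text config is read in whatever order the user wrote it, and only afterwards checked for incompatible combinations. The option names (`file_format`, `sample_rate`) and the table of allowed rates are invented for illustration and say nothing about what any real format or product supports.

```python
# Hypothetical constraint table, purely for illustration.
SUPPORTED_RATES = {
    "format_a": {44100, 48000, 96000},
    "format_b": {44100, 48000},
}


def parse_config(text):
    """Parse 'key = value' lines; '#' starts a comment. Order does not matter."""
    options = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


def find_conflicts(options):
    """Return human-readable problems; an empty list means no conflict found."""
    problems = []
    fmt = options.get("file_format")
    rate = options.get("sample_rate")
    if fmt is not None and fmt not in SUPPORTED_RATES:
        problems.append(f"unknown file_format: {fmt}")
    elif fmt is not None and rate is not None:
        if not rate.isdigit() or int(rate) not in SUPPORTED_RATES[fmt]:
            allowed = ", ".join(str(r) for r in sorted(SUPPORTED_RATES[fmt]))
            problems.append(
                f"sample_rate {rate} is not available with {fmt} (allowed: {allowed})"
            )
    return problems


if __name__ == "__main__":
    text = "sample_rate = 96000   # set first, format decided later\nfile_format = format_b\n"
    for problem in find_conflicts(parse_config(text)):
        print(problem)
```

The user can set either option first, or only one of them. The complaint names both options and the allowed values, rather than silently dropping a choice from a list.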

Can't multiple menus be displayed concurrently?
For example, there are many desktop GUIs that will
let you "pin" (think: thumbtack) a menu or a dialog
to the desktop so that it is "persistent". When you
want to remove the object, you remove the "pin"
and the object goes away.


There's the rub. I've seen two approaches with a lot of
these.

One approach gives you banks of channel strips, say 1-16,
17-32, etc. Possibly even in 8 channel banks, to make for a
smaller footprint. So, if I'm wanting to do a line check on
channel 24 let's say, and we've got 8 channel banks, I've
got a choice, disrupt the work of the mixer mixing the show
while I do that line check, or not.

The other approach, limited actual controls, and your menu
selects whether those controls are faders, pan controls, aux sends, etc.

There's my main stumbling block. With my old analog iron
all the aux sends are there, bus assignments, vca groups,
all are right there. Yes it means sometimes the mixer is
working at full extension of his body to reach that control,
but that control is there, and I can manipulate it, or
somebody else can while I'm doing something else.


Understood. The same sort of thing is true with theatrical
lighting panels, video switchers (the video equivalent of
an audio mixer), etc. To make everything visible and accessible
or just some selectable subset of it. I have the same sort
of problem when authoring multimedia presentations (though
those aren't done in real time)

That's why I thought the push-pin approach might be a
good compromise -- let you decide which parts of the interface
you want to have access to. But you are still constrained by the
physical size of the display.

What I'd really want to do before plunking down the dollars
for a digital console was actually work with it for a couple
of days first, and get my mind around some of the concepts,
then decide if that one fits in my working environment.


I would imagine that might give you an 80% idea of what
the change would be like. But, someday, you'd find
yourself facing a problem that had to be solved NOW and
scurrying to sort out how to get to the solution you
want in that environment.

Sort of like a surgeon doing a laparoscopic procedure and
suddenly everything going to ****. Drop the tools, grab
a big knife and cut the patient open. You're not going
to fix the problem through that tiny incision -- unless
you are incredibly skilled and fluent with the technology!

You might also look into what are called "pie menus".
With these, you open the menu and find yourself in
the center of a circular "pie". From there, you pick
a direction to select a specific item from the menu.
Think of the menu as slices of a pie and you are just
deciding which slice you want -- always from the known
reference point in the center of the pie!


Interesting concept. Don't know if they offer that sort of
thing though.


grin Because they haven't had to think about the full
range of users that might sit behind their kit!

As I said, I've put a lot of thought into what you really need
to interact with a given device and how to minimize the
cognitive loading on the user. You don't want to require 100%
of his attention. Especially if it is to perform some low
grade task!

Imagine if you had to type the word "ANSWER" on a keyboard to
answer your phone. Ridiculous, right? It would require too
much focused attention for a task that should be trivial.

This is why I'd really need to sit down in
front of one for a day or two, at least with multitracks of
prerecorded material up to really put the thing through its
paces, and why often I'm reluctant to buy a new device from
the in-store demo. It's taken me a long time to decide on
one of those little recorders like the zoom, etc. But,
thanks to reviews in this group I think the Tascam is in my
very near future. I asked the reviewer specifically to use
the thing thinking about how easy it was to interact with
sans looking at the device.

Of course, menus have to be designed to keep the number
of choices small. Much easier to pick from 6 or 8
"slices" than 16 or 18!


rotfl Then there's that. You can always offer me related
choices on a submenu.


Exactly. With use, you develop a sort of muscle memory
as your hands are accustomed to making certain motions to
do certain things. If, instead, you have to coordinate your
eyes and hands to *pick* a particular option from a linear
list, you have to rely on that visual feedback to ensure
you are at the right point in that list before making your
selection.
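
The pie-menu selection described earlier comes down to turning a direction of motion into a slice number, which is why it needs no visual feedback. A minimal Python sketch, assuming slice 0 is centered on "east", slices are numbered counterclockwise, and y increases upward; the function name and the dead-zone radius are made up for the example:

```python
import math


def pie_slice(dx, dy, slices=8, dead_zone=5.0):
    """Map a movement (dx, dy) away from the menu center to a slice index.

    Returns None while the pointer is still inside the dead zone, so a
    small wobble at the center selects nothing.
    """
    if math.hypot(dx, dy) < dead_zone:
        return None
    width = 2 * math.pi / slices
    angle = math.atan2(dy, dx) % (2 * math.pi)
    # Shift by half a slice so each slice is centered on its direction.
    return int((angle + width / 2) // width) % slices


if __name__ == "__main__":
    print(pie_slice(40, 0))    # straight right
    print(pie_slice(0, 40))    # straight up
    print(pie_slice(-30, -30)) # down and to the left
```

Only the direction matters, not the distance or the exact path, so the same hand motion always lands on the same item. With 6 or 8 slices each one covers 45 degrees or more, which is forgiving. With 16 or 18 the targets shrink to about 20 degrees.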

E.g., one of the devices that I am developing uses speech for
its sole output medium and a touchpad for its sole input
medium. You issue "gestures" on the touchpad to initiate
commands and selections. And, hear the results of those
commands.

So, for example, you might drag your fingertip across the
touchpad from left to right to cause the mechanism to move
"to the right" -- while you are watching it! Then, tap
the touchpad to cause it to stop. "Draw" a circular motion
counterclockwise to cause the grabber to open. Drag your
fingertip from top to bottom to cause it to be lowered.
Tap, again, to stop. Draw a clockwise circle to command the
grabber to close. etc.

All of these are trivial actions that you could easily memorize.
None of which requires any precision on your part. They can
all be performed while your eyes are busy with another task.
And, none of them DISTRACT you from that task.
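
For a sense of how little precision such gestures need, here is a rough Python sketch that sorts a stroke into tap, four swipe directions, or a clockwise/counterclockwise circle. It is only an illustration under stated assumptions, not the recognizer in the device described: the function name and both thresholds are invented, y is assumed to increase upward, and anything that is neither short nor fairly straight is simply treated as a circle.

```python
import math


def classify_stroke(points, tap_length=10.0, straightness=0.8):
    """Classify a list of (x, y) touch points as a coarse gesture name."""
    points = list(points)
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path < tap_length:
        return "tap"

    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    # A swipe ends far from where it started, relative to distance traveled.
    if math.hypot(dx, dy) / path >= straightness:
        if abs(dx) >= abs(dy):
            return "swipe_right" if dx > 0 else "swipe_left"
        return "swipe_up" if dy > 0 else "swipe_down"

    # Shoelace formula: the sign of the enclosed area gives the direction.
    closed = points + [points[0]]
    twice_area = sum(
        ax * by - bx * ay for (ax, ay), (bx, by) in zip(closed, closed[1:])
    )
    return "circle_ccw" if twice_area > 0 else "circle_cw"


if __name__ == "__main__":
    ring = [
        (100 * math.cos(2 * math.pi * k / 16), 100 * math.sin(2 * math.pi * k / 16))
        for k in range(17)
    ]
    print(classify_stroke(ring))
    print(classify_stroke(ring[::-1]))
    print(classify_stroke([(0, 0), (50, 2), (100, 0)]))
```

Most touch hardware reports y increasing downward, in which case the up/down and clockwise/counterclockwise labels swap.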

Repeat the example with some task that does NOT require
vision to see the significance of this approach.
  #370   Report Post  
Posted to rec.audio.pro
Richard Webb[_3_] Richard Webb[_3_] is offline
external usenet poster
 
Posts: 533
Default FLAC or other uncompressed formats, which is best?

On Fri 2012-May-25 20:46, Don Y writes:
The asterisks spoke, because I've configured the screen
reader to speak them, as people use them to set text apart,
or the dreaded footnotes of course.


OK, so my use of them for emphasis detracts from your comprehension
instead of adding to it!



But, the all caps didn't, at least when just reading.

Now, were I editing, capital letters are spoken with a bit
of a raised inflection, but, when read as words they're not.

When just in the reader, not editing, for example, reading
usenet articles, a book that's text or similar I have most
punctuation disabled so sentences sound normal.


Understood. But, would you have any cues that there might
be some punctuation that you might want to see which has
been silenced? For example, the cartoonish way of showing
a pejorative as a jumbled sequence of ad hoc punctuation
marks like $(&*^@#$!


I note I saw those in my reader, when reading, not editing
this reply. So obviously I had at some point made them
exceptions, probably because of the common use of many of
those symbols elsewhere, such as @ in email addresses, # as
pound symbol, etc. etc. I probably have much more enabled
in a usenet/mail/bbs reader application than I would in just
a straight text reader that I'd use to read a novel g.

On a different note, I assume you are also victimized by
spelling errors? For example, I tend to end up transposing
pairs of letters simply because one finger finds its way
to a key before the other -- which should have preceded it.
Like teh instead of the.


I'm victimized more by making them g.

Cutting to the chase on some of this, colors aren't spoken
at all. I can have the screen reader, and most screen
readers, monitor a portion of the screen, say a status line,
for either a change in text displayed there, or a change in
attributes.


Does it simply speak each change encountered? What if the
time to speak the information exceeds the time between
changes? For example, a timer counting down seconds remaining until
a task's completion. But, "one minute forty nine seconds" takes
longer to speak than the time for the display to change to "one
minute forty eight seconds". Likewise, is the
screen reader preoccupied with this task or can it also
let you wander around to other parts of the screen while
it is monitoring that section?
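
One conceivable answer to the "speech takes longer than the change" problem is to drop stale updates: while the synthesizer is busy, remember only the newest text and speak that when it is free. The Python sketch below simulates that policy with timestamps. It is purely hypothetical and does not describe how any particular screen reader behaves; the function name and the fixed 2.5 second speaking time are invented.

```python
def coalesce_announcements(updates, speak_seconds):
    """Simulate a 'newest wins' speech queue.

    updates: list of (timestamp, text) sorted by timestamp.
    speak_seconds: function giving how long a text takes to speak.
    Returns the list of (start_time, text) that actually get spoken.
    """
    spoken = []
    free_at = 0.0  # time at which the synthesizer finishes its current phrase
    i = 0
    while i < len(updates):
        start = max(updates[i][0], free_at)
        # Skip every update that was superseded before speech could start.
        while i + 1 < len(updates) and updates[i + 1][0] <= start:
            i += 1
        text = updates[i][1]
        spoken.append((start, text))
        free_at = start + speak_seconds(text)
        i += 1
    return spoken


if __name__ == "__main__":
    # A countdown that changes every second, but takes 2.5 s to speak.
    countdown = [(float(t), f"{5 - t} seconds") for t in range(6)]
    for start, text in coalesce_announcements(countdown, lambda _text: 2.5):
        print(start, text)
```

The listener hears fewer announcements, but each one is current when it starts.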


Usually I'll tell the screen reader app to remain silent,
and park speech cursor over that display. If that display
is going to be changing a lot I tell the screen reader to
ignore that line entirely, and only force it to go there
when I want to look at it. Usually I'll use monitoring of a
status line to tell it when to change configurations. E.g.
when status line changes to x from y load a configuration
which tracks a light bar as your focus of where you are,
etc. etc. Rather complex, and probably extremely boring to
the folks here. We should probably take this line to email g.

snip
big snip
one reason braille will always be
superior, the ability to skim.


This is directly analogous to reading printed text.
You can cherry pick through large amounts of information
with relatively little effort.


Indeed, and, believe it or not, I retain what I read better,
as well as read fast using it. In most cases, synthesized
speech is the most cost effective, and in other ways the most
effective, compromise that can be achieved. Braille
displays are clunky, hard to maintain and don't achieve good
reading speed, or efficient work flow, unless you're a
customer service rep dealing with both the computer and
customers on the phone. After all, they take your hands
away from the keyboard, they can display a very limited
amount of text at one shot, all them little solenoid springs
and mechanical parts ... aaargh

Yup. This is the dark side of illegal copying. It forces
authors to waste effort protecting their works. And, screws
legitimate users out of the ability to use the product
"fairly".


For example, if I have three computers but only use
one at a time, the morally correct thing is to have one
license. But, how does the author ensure that I really
*am* using just one at a time? How does the author
ensure that the "second computer" isn't a friend's
computer?


Exactly what I ran into with it. My mom didn't want a
screenreader, but was glad to have my help maintaining her
system, often without her having to stand over my shoulder
and play screenreader. The two machines at the studio, I'd
only be using one of them at a time, and the owner of said
studio didn't want a screen reader either. I didn't use the
product at all at home, I was beta testing a competitor's
screen reader for the GUI environment in fact. I just
didn't think it was fair to my employer to use a beta at
work.

The Mercator project (now defunct) tried to layer a speech
interface *under* (not ON TOP OF!) the GUI in UNIX. I.e.,
it replaced the standard GUI libraries with speech-enabled
ones with which the "screen reader" could interact.

snip
Yeah there were a couple like that, had heard of that one,
or the "Speaqualizer" project. Both were failures in the
marketplace. Mercator may have never made it to market, but
they tried with Speaqualizer for a while.


Mercator was an academic project. Yet another example of
people thinking that there would be a "simple" way to
address this problem.


Yep, and the Speaqualizer tried to do it as an integral part
of hardware, you get speech as soon as the machine boots,
giving you access to BIOS, etc.

The only simple way to address the problem is NOT to provide a
visual interface! So, applications ALL have to rely on
the same non-visual interface to interact with their users!


If you provide a nonvisual OPTION, then applications will
only give token support -- if any -- to it. On the other
hand, if the only way to get information out of a device is
via that option, then they don't have a choice! This is
the approach I have been taking, lately. Pick an output
modality that addresses everyone in the target audience
and force everything to use that single mechanism!


Indeed, and this is what we're finding with a lot of web
portals that do things that are only usable with vision.
Anyone who's doing web development should look at a series
of articles discussing just this issue in this month's
Braille Monitor, available in text, no doubt from
www.nfb.org.

Festival -- a free package probably available under Linux -- has a
lot of "context modules" that try to alter the rules for
pronunciation based on context. I.e., so email addresses
are pronounced as "Richard dot Webb dot my dot foot at ..."

snip
Yep, which was I think why so many complained when the
National Weather Service went with DECtalk speech synths
for their VHF radio forecasts.


The backup speech synthesizer in one of my products has similar
quality issues as Klatt's DECtalk (he wrote it while a student and
DEC commercialized it). Its biggest advantage is that
it is pretty lean when it comes to resources -- which translates
directly to implementation costs and reliability.


Did you ever check out that kid a few years ago that made a
DECtalk sing? He spent some serious time coding that, iirc
the kid was only 16 years old or so when he did this one. I
used to have a URL for it, but it disappeared in Katrina. I
can't even recall his name it's been so long.


I have a DECtalk DTC01 and a DECtalk Express. Plus a few
of the Artic/Votrax-based synthesizers. All have the same
basic advantage -- and the same robotic speech quality!


Yep, as does my doubletalk. I had, before Katrina, two
doubletalk internal cards, a doubletalk lite external, and
an audaptor.

On the other hand, Festival has a huge footprint. And, is
considerably easier to crash than DECtalk. Where DECtalk
and the other "simple" synthesizers will take a stab at
pronouncing damn near anything you throw at them, Festival
will chew on it for a fair bit of time before committing
to a pronunciation -- which can be just as wrong as the
other products!


This is why a lot of the screenreader developers, and speech
synth developers sort of "shared the load" you might say.
Common pronunciation, at the phoneme level is often handled
by rom within the synthesizer itself, exceptions and the
like are handled by the software on your hard disk.

OK. Any particular reason why you're married to that machine?

I'd like to have the raid array for server, and yes, once
we've relocated net connected server is part of battle plan.
Raid would be nice. I've got another box which is going to
be dedicated to firewall/router duties, but would like to
keep that one as server, which was what it did in its former life.


Does the chassis force you to use a certain type of
disk drives? E.g., because of disk carriers? Many
older RAID offerings require SCSI disks. Would you
be happy with RAID in some other form?


I doubt the chassis does, I'd have to look inside the box.
Most I did with it when it was given to me was boot it up
once. If we could get raid in some other form, that would
be cool too. Would lose those two big scsi hard drives then
and have to do something else with them, but ... Just trying
to use what's existing in the box with minimal $$$ outlay,
if possible. Still it might be worth doing that to get
totally away from windows as server app. I'll have to weigh
pros and cons of that one when we get to it. Right now that
machine is sitting in storage unit with some of those little
desiccant packages inside the case <g>.

DY [sightless V U meter]

Ah, OK. Clever.


Yep, for some plans and simple designs, go to ski.org and
download sktf.zip, it's about a 2 MB zip file, multiple
directories, but text files on lots of things, home brewing
adaptive vu solutions, soldering jigs, all sorts of stuff.


OK.


OK. So, this is "yet another DEVICE" that you have.
Like a tactile wristwatch, braille slate, talking
calculator, etc. I.e., it is designed for ONE PURPOSE.


snip
Yeah, I had given a lot of thought to how you provide
a means for letting folks review their calculations
without being able to view a "tape"


Believe it or not, I've done a lot of this in batch scripts
<g>.

Re menus in devices ...
Maybe, but some devices, such as Roland's sound modules like
to remember where you were last time, and heck, it might be
a week before I want to delve into its menus again, and I
might not remember where I was last time.


Ah, OK. No, I think a device should remember what
you did "last time" -- but, only while you are actively
and continuously using it. If you want to be able
to return to a certain set of options some days later,
you should be able to save those options and explicitly
restore them. If you turn the device off and start
over tomorrow, then everything should revert to some
default -- perhaps even one that YOU have defined
instead of that which the manufacturer has defined.
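
The policy described above can be sketched in a few lines of Python. This is only an illustration under assumed names: MenuMemory, the ten-minute session timeout and the position strings are invented for the example and are not taken from any real device.

```python
import time


class MenuMemory:
    """Remember the menu position within a session; otherwise start at a default."""

    def __init__(self, factory_default="top", session_timeout=600.0,
                 clock=time.monotonic):
        self.factory_default = factory_default
        self.user_default = None          # optional default chosen by the user
        self.session_timeout = session_timeout
        self.clock = clock                # injectable so the policy can be tested
        self.presets = {}                 # explicitly saved positions
        self._position = None
        self._last_used = None

    def default(self):
        """The user's default if one was defined, else the manufacturer's."""
        if self.user_default is not None:
            return self.user_default
        return self.factory_default

    def current(self):
        """Where the menu opens right now."""
        now = self.clock()
        expired = (self._last_used is None
                   or now - self._last_used > self.session_timeout)
        if expired:
            self._position = self.default()
        self._last_used = now
        return self._position

    def move_to(self, position):
        self._position = position
        self._last_used = self.clock()

    def save(self, name):
        """Explicitly save the current position under a name."""
        self.presets[name] = self.current()

    def restore(self, name):
        """Explicitly return to a saved position, even days later."""
        self.move_to(self.presets[name])

    def power_cycle(self):
        """Turning the device off forgets the session but keeps the presets."""
        self._position = None
        self._last_used = None
```

The point of the split is that "where was I" only survives continuous use, while anything worth keeping for next week has to be saved on purpose.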


Yeah sounds like a good compromise. If I go back in there
during the same "session" it remembers where I last was.
Otherwise, it goes to the start.

DY [computer interface with speech in a live environment]

What can you suggest as an alternative? Is the problem
the quality of the voice? Or the masking effects of
all that music in the background?


Yep, the music, and I'm supposed to be giving my ears to the
audio. Also, you often can't amplify speech in an earbud loud
enough unless you're doing bad things to the ear
canal.


Understood. You want a different communication channel
to interact with the device instead of having to share
the audio channel that you are devoting to the task at
hand.


That's it exactly. I'm accustomed, as I said in another
post this thread, to not having to do anything but directly
communicate with the device. If I want that channel strip
assigned to a certain bus or certain vca group, push the
button. If I want audio from that channel on aux bus 3, I
adjust that aux send. I don't have to play "where the f*$@
am I?" It's automatic like buttoning your coat, zipping your
pants or tying your shoes.

I've used the same rationale to argue in favor of using
non-visual channels for visual tasks! I.e., those cases
where your eyes are busily engaged in some activity and
shouldn't have to be pulled away just so you could see
which virtual button you were pressing on your iPhone!


Uh huh! Just my point with some of these complex devices
that are made for people to operate while driving, etc.
Give them an auditory channel for the info, keep the eyes on the road, and the hands upon the wheel please.


Regards,
Richard
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.


  #371
Posted to rec.audio.pro
Don Y
Posts: 137
FLAC or other uncompressed formats, which is best?

Hi Richard,

[status line and dynamic information displays]

Usually I'll tell the screen reader app to remain silent,
and park speech cursor over that display. If that display
is going to be changing a lot I tell the screen reader to
ignore that line entirely, and only force it to go there
when I want to look at it. Usually I'll use monitoring of a
status line to tell it when to change configurations. E.g.
when status line changes to x from y load a configuration
which tracks a light bar as your focus of where you are,
etc. etc. Rather complex, and probably extremely boring to
the folks here. We should probably take this line to email <g>.
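
For the folks who aren't bored, the "when status line changes to x from y load a configuration" idea might look roughly like this in Python. It is a sketch only: no real screen reader's scripting interface is being shown, and StatusLineWatcher, poll and the configuration names are hypothetical.

```python
class StatusLineWatcher:
    """Watch one status line and pick a configuration when it changes."""

    def __init__(self, rules):
        # rules maps (old_text, new_text) -> name of the configuration to load
        self.rules = dict(rules)
        self.last = None
        self.active_config = None

    def poll(self, status_text):
        """Feed in the current status line; return a config name if a rule fired."""
        previous, self.last = self.last, status_text
        if previous is None or previous == status_text:
            return None  # first reading, or nothing changed: stay silent
        config = self.rules.get((previous, status_text))
        if config is not None:
            self.active_config = config
        return config
```

Nothing is spoken while the line merely repeats; only a transition that matches a rule switches the active configuration.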


OK. I'll pick a suitable subject line so my mail is
recognizable -- though probably not today as I am busy
getting my other half ready for a trip.

This is directly analogous to reading printed text.
You can cherry pick through large amounts of information
with relatively little effort.


Indeed, and, believe it or not, I retain what I read better, as
well as read fast using it. In most cases, synthesized
speech is the most cost-effective, and in other ways the most
effective, compromise that can be achieved. Braille
displays are clunky, hard to maintain and don't achieve good
reading speed, or efficient work flow, unless you're a
customer service rep dealing with both the computer and
customers on the phone. After all, they take your hands
away from the keyboard, they can display a very limited
amount of text at one shot, all them little solenoid springs
and mechanical parts ...aaargh


Not to mention the expense! I have a Braille N Speak here
(I think that is the name). There is an Italian firm that
makes a Braille display that is piezoelectric,
which should be easier to keep running. But, I think they
want $700/cell -- or something equally outrageous!

Maybe if the eurozone crumbles, you can pick these up
for a song! <grin>

[copy protection]

Exactly what I ran into with it. My mom didn't want a
screenreader, but was glad to have my help maintaining her
system, often without her having to stand over my shoulder
and play screenreader. The two machines at the studio, I'd
only be using one of them at a time, and the owner of said
studio didn't want a screen reader either. I didn't use the
product at all at home, I was beta testing a competitor's
screen reader for the gui environment in fact. I just
didn't think it was fair to my employer to use a beta at
work.


Borland used to have a "like a book" license. I.e., you
could move the license around. But, still not as flexible
as it could be.

The problem is, people think there's no "cost", there, and
let that confuse their idea of "value"! Just because you
didn't pay for something, doesn't make it valueless! I.e.,
if you think there is no value to that second copy of the
software, then live without it -- you should not experience
any *costs* if it had no *value*!

If you provide a nonvisual OPTION, then applications will
only give token support -- if any -- to it. On the other
hand, if the only way to get information out of a device is
via that option, then they don't have a choice! This is
the approach I have been taking, lately. Pick an output
modality that addresses everyone in the target audience
and force everything to use that single mechanism!


Indeed, and this is what we're finding with a lot of web
portals that do things that are only usable with vision.
Anyone who's doing web development should look at a series
of articles discussing just this issue in this month's
Braille Monitor, available in text, no doubt from
www.nfb.org.


I've not looked at their site in a long time. To be honest,
I'm a bit set off by the "evangelism". Perhaps it is necessary
to bring the "message" to the "unwashed masses". But, in my
case, it feels like preaching to the choir.

Sort of like telling a smoker he should quit. I'm sure he
already knows that.

The backup speech synthesizer in one of my products has similar
quality issues as Klatt's DECtalk (he wrote it while a student and
DEC commercialized it). Its biggest advantage is that
it is pretty lean when it comes to resources -- which translates
directly to implementation costs and reliability.


Did you ever check out that kid a few years ago that made a
DECtalk sing? He spent some serious time coding that, iirc
the kid was only 16 years old or so when he did this one. I
used to have a url for it, but it disappeared in Katrina. I
can't even recall his name it's been so long.


The Votrax VS6.3 was capable of singing (poorly). As well as
multilingual speech. I recall hearing one speak German. But,
the extra capabilities don't really translate into better
"regular speech". So, why pay for them?

At Kurzweil, we had frequent failures in the Votrax subsystem.
All of the boards were potted -- to discourage copying -- so
when one of the four boards died, it was irreparable. Yet
another case of someone going out of their way to protect their
market share. Funny, I don't hear the name Votrax bandied
about anymore so I guess they wasted their efforts clinging
to the past instead of embracing the future!

I have a DECtalk DTC01 and a DECtalk Express. Plus a few
of the Artic/Votrax-based synthesizers. All have the same
basic advantage -- and the same robotic speech quality!


Yep, as does my doubletalk. I had, before Katrina, two
doubletalk internal cards, a doubletalk lite external, and
an audaptor.


While it is unfortunate (the **** poor quality of the speech),
it is still surprisingly easy to get used to the oddities
of these "dialects". Especially when the alternative may
be to be deprived of some interactions!

On the other hand, Festival has a huge footprint. And, is
considerably easier to crash than DECtalk. Where DECtalk
and the other "simple" synthesizers will take a stab at
pronouncing damn near anything you throw at them, Festival
will chew on it for a fair bit of time before committing
to a pronunciation -- which can be just as wrong as the
other products!


This is why a lot of the screenreader developers, and speech
synth developers sort of "shared the load" you might say.
Common pronunciation, at the phoneme level is often handled
by rom within the synthesizer itself, exceptions and the
like are handled by the software on your hard disk.
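
A minimal Python sketch of that division of labor, with the caveat that it is an illustration and not how any particular screen reader does it. The EXCEPTIONS entries and the synthesizer_rules stand-in are invented; in a real setup the second half would live in the synthesizer's ROM.

```python
# Hypothetical exception list kept "on disk" by the screen reader.
EXCEPTIONS = {
    "iirc": "if I recall correctly",
    "scsi": "scuzzy",
    "dectalk": "deck talk",
}


def synthesizer_rules(word):
    """Stand-in for the synthesizer's built-in rules: pass the word through."""
    return word


def pronounce(text, exceptions=EXCEPTIONS, rules=synthesizer_rules):
    """Apply software exceptions first, then hand the rest to the synthesizer."""
    out = []
    for word in text.split():
        core = word.strip(".,;:!?")  # punctuation is dropped in this sketch
        replacement = exceptions.get(core.lower())
        out.append(replacement if replacement is not None else rules(core))
    return " ".join(out)
```

Because the exception list is just data on disk, it can be edited by the user without touching the synthesizer at all.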


Festival approaches everything from the top down.
It tries to understand the context of the material.
From that, it derives the appropriate pronunciation rules. And,
finally, it actually synthesizes the speech waveforms.

But, the implementation never focused on performance
issues. Rather, they wrote it so that it would be easier
to write and maintain. So, it takes a fair bit of
resources just to say "Hello". Those resources translate into
dollars in anything other than a PC environment (i.e., your
talking calculator would speak nicer but cost more!)

Does the chassis force you to use a certain type of
disk drives? E.g., because of disk carriers? Many
older RAID offerings require SCSI disks. Would you
be happy with RAID in some other form?


I doubt the chassis does, I'd have to look inside the box.


I know many of the Dell machines that I've used over the years
had special drive carriers for the RAID drives. Often so you
could hot-swap them, etc. Others just hid the array in the
bowels of the machine.

Note that RAID can be a huge bellyache! When a drive fails,
you usually don't have many options other than replacing it
as is in the working set. To do this, many machines have
special BIOS extensions that you have to work with *in*
the BIOS (which may be an issue accessing, for you) to
add the drive to the working set, format it, rebuild its
contents, etc.

I.e., you may be better served by a regular disk with a
second one that you keep "off-line" and copy the stuff you
want to preserve.

I've torn down all my RAID arrays or reconfigured them as
JBOD's (Just a Bunch of Disks) because it makes it so much
easier for me to handle them. E.g., I can remove a disk and
put it in another machine. Doing this with a RAID array
often has the disk recognized as "foreign" and the controller
immediately wants to reformat it. "No!!! I just want you
to let me access the files on it!!"

Most I did with it when it was given to me was boot it up
once. If we could get raid in some other form, that would
be cool too. Would lose those two big scsi hard drives then
and have to do something else with them, but ... Just trying
to use what's existing in the box with minimal $$$ outlay,


Understood.

if possible. Still it might be worth doing that to get
totally away from windows as server app. I'll have to weigh
pros and cons of that one when we get to it. Right now that
machine is sitting in storage unit with some of those little
desiccant packages inside the case <g>.


I wonder how much spiders like desiccants?? <grin>

[defaults vs. defaults vs. defaults]

Yeah sounds like a good compromise. If I go back in there
during the same "session" it remembers where I last was.
Otherwise, it goes to the start.


Exactly. I spent a fair bit of time on this subject, recently.
It's especially significant when it isn't easy to *review*
the settings in place -- e.g., when you can't just glance up
at a screen and reassure yourself that everything seems "about
right".

Understood. You want a different communication channel
to interact with the device instead of having to share
the audio channel that you are devoting to the task at
hand.


That's it exactly. I'm accustomed, as i said in another
post this thread, to not having to do anything but directly
communicate with the device. If I want that channel strip
assigned to a certain bus or certain vca group, push the
button.


BECAUSE YOU RELY ON MEMORY! IMO, the single biggest asset
a visually impaired user has is memory. Remembering where
you last put something. Remembering how you set a parameter.
etc. Blind man with Alzheimer's has got to be sheer terror!

If I want audio from that channel on aux bus 3, I
adjust that aux send. I don't have to play "where the f*$@
am I?" It's automatic like buttoning your coat, zipping your
pants or tying your shoes.


Great analogies! They are low skill tasks so why require
lots of attention to perform them?

I've used the same rationale to argue in favor of using
non-visual channels for visual tasks! I.e., those cases
where your eyes are busily engaged in some activity and
shouldn't have to be pulled away just so you could see
which virtual button you were pressing on your iPhone!


Uh huh! Just my point with some of these complex devices
that are made for people to operate while driving, etc.
Give them an auditory channel for the info, keep the eyes on
the road, and the hands upon the wheel please.


Exactly. But there are lots of other cases that aren't
as dramatic. Why do I need to look at my iPod to use
it? Or, my telephone? Why can't I read my appointment
calendar while I'm taking a walk around the neighborhood
(I have to stop moving in order to read all the details
on that silly little display as it "bounces around" too
much when I am walking)?

Why should a soldier have to take his eyes off The Enemy just
to consult some fancy piece of high-tech kit? Or, an
ophthalmologist have to remove his eyes from peering into yours
just to see what some device is trying to convey to him?
  #372
Posted to rec.audio.pro
Richard Webb[_3_]
Posts: 533
FLAC or other uncompressed formats, which is best?

On Sat 2012-May-26 14:15, Don Y writes:
Yep. Mike was sort of a local hero for folks who worked in
the wtc on 9/11 helping lead a bunch of them out.


Obviously long after my experiences with him. Must have been a
doubly terrifying experience for him.


Yes, and what amazes me is that his dog guide didn't totally
freak out. I've seen dog guides act rather unpredictably in
crowds of people who are either panicked, or big crowds of
blind folks who aren't as careful about don't step on doggie etc.
It takes a rather disciplined human/canine team to
keep the dog calm and on mission at such times. I've
trained my own a couple of times, but never did the dog go
to the bandstand or the gig with me. I don't believe a dog
should be subjected to the constant environment of loud
music, etc. Dogs are, after all, color blind, their eyes
aren't that great to begin with, and it's their noses and
ears that tell them about their environment. Subjecting a
dog to that sort of environment on a regular basis is
cruelty imho. Ymmv of course.

snip

Yeah I know, when Ray went eventually with Digital
Equipment's DECtalk it improved on the original voice quite
a bit. The only way to get more natural sounding
synthesized speech than DECtalk is the way AT&T does it,

snip

Exactly. But of extremely limited use!


Yes, the vocabulary of the device is quite limited, but, if
it can work for your application it's much easier for
average folks to understand.

snip

I suspect you could rework the weather broadcasts to use
a limited vocabulary and, thus, better speech quality.
On the other hand, it means that you have to be able to
anticipate EVERYTHING that you might need to say over
that medium. For example, you might not be prepared to
use it to announce an alien invasion! <grin>


Right, and with the same alerting system it might have to
announce all sorts of things, which is why I think NOAA went
originally with DECtalks. I think they've changed some of
that now, but there for awhile I heard DECtalk's "Huge
Harry" voice quite a bit. I think I commented, some not
quite as familiar asserted NOAA was using their "Perfect
Paul" but I beg to differ. Sounded more like the former to
me.

I have a trimmed down synthesizer that I fall back on if
the primary synthesizer is unavailable in one of the products I'm
designing, currently. It needed to be robust -- so that I could
count on it working regardless of what might have
broken in the system. I could have opted for a better
quality limited vocabulary design -- but, didn't want to have to set
that vocabulary in stone and discover, later, that
I needed to be able to say something that I couldn't.
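
The fallback arrangement described above boils down to something like the following Python sketch. It is an assumption-laden illustration, not the actual product's design: SynthesizerUnavailable and speak are invented names, and the two synthesizers are simply callables that take text.

```python
class SynthesizerUnavailable(Exception):
    """Raised by a (hypothetical) synthesizer driver that cannot speak right now."""


def speak(text, primary, backup):
    """Try the primary synthesizer; fall back to the lean backup on failure.

    Returns the name of the synthesizer that handled the text.
    """
    try:
        primary(text)
        return "primary"
    except SynthesizerUnavailable:
        # The backup takes arbitrary text, so nothing has to be anticipated
        # in advance the way a canned vocabulary would require.
        backup(text)
        return "backup"
```

The design choice is that the backup trades voice quality for being able to say anything at all when the rest of the system is broken.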


That's the weakness of the captured phoneme model. Memory
intensive, and limited in utility.

I'm using a Doubletalk card here, not quite as
natural to the uninitiated, but still good enough, and, at
the time, half the price <g>. I also liked the serial port
doubletalk, small package, powered from a 9 volt cell.


The DECtalk express suffers from the sin of requiring a
special rechargeable battery. Another fault, in my opinion, for an
assistive technology device (where do you buy that
replacement battery -- today??)


Yeah I know, and that rechargeable is going to eventually
fail to recharge, and it might decide to do so just when I
really want it, and no, I don't want to wait to mail order
another one, and then be told that the "other one" I ordered doesn't function in the way I've become accustomed to, or
won't work in my environment at all.

There is also the hazard of making a device that defines
how it must be used.

snip
Yeah there's that. Liked your discussion of choices. To me
color is one of the last things, were I buying a new vehicle

snip

Exactly. So, for a web site to ask you to pick a color,
first, isn't helpful. Especially if, later, you realize that your
choice has ruled out something else that you really
want most!


Yes, and I'm probably more willing to compromise color,
bells and whistles or something similar to get what I really
want. Function though trumps almost everything else in my
world, if it won't do the job I won't pay any price for it.

snip
You know I was reading a similar subject to your reading on
choices recently, an economics professor from MIT on how our
choices impact the economic decisions we make, touching on
ethics, all sorts of stuff like that. Called Predictably
Irrational. Can't recall author's name right now, but it,
and his companion piece "The Upside of Irrationality" are
both interesting reads on the subject.


Dan Ariely. We were "required" to take eight courses in
The Humanities to graduate. I guess they didn't want a
bunch of engineers with no appreciation of other aspects
of life and education let loose on the unsuspecting masses.


Yep, that's the man. Dan's an interesting read on the
subject. I was thinking I'd place money that you'd read him
or were familiar.

I recall selecting American History as one of my courses
thinking I had already had two years of that in High School
so it would be a recent memory, for me! The professor was
an economist. So, I relearned all that history with an
entirely different spin than the noble presentation to
which I'd previously been subjected. Fascinating!


Gives you a different perspective doesn't it?
I'd suggest Dan to audio engineers, marketing people, pretty much any profession. Some real thought provoking stuff
there about the way we operate in everyday life, how we
interact with our customers/clients, etc. For the
uninitiated that aren't bored to tears and are still
following along, it's not all dry tome, there are some great moments in those two books that will tickle your funny bone.

So, I've enjoyed reading books by economists that touch
on these sorts of subjects. _The Price of Everything_
discusses all of our actions -- social and otherwise -- in
terms of economic transactions. E.g., a woman selling
uterine services in a marriage transaction.


Good analogy!!!

_The Art of Choosing_ describes how we "value" choice
in different societies and how it impacts our decisions.
For example, how much we will "spend" to keep choices
available even if they aren't choices of which we would want to
avail ourselves.


Interesting. I'll have to see if it's in braille or bug
the library system to make it so.

_How We Decide_ and _Predictably Irrational_ looked at how
easily we are manipulated and con ourselves in our behavioral
choices, etc. How we can actually think a $10 pill is
better than an identical $0.50 pill, etc. How we *don't*
have a "Market" in which consumers and producers compromise
on price but, rather, how Producers manipulate our expectations of
price to a point that they are happy with, etc.


Indeed, we hoodwink ourselves often without really thinking
about it.

By far, the experiments that have been concocted and presented in
the texts are the most fascinating. And, they make you
laugh at the snobbery that you often see around you -- the folks who
couldn't differentiate an $80 bottle of wine from a $2
bottle of wine -- yet, when confronted with the $80 price tag ON THE
$2 BOTTLE, would *swear* it tastes a LOT better than
the $80 bottle that has been mislabeled as $2!


Indeed. I liked Dan's description of "the trust game" in
Predictably Irrational, and what happens when trust is
violated and the opportunity for revenge is offered. That
had me slapping my leg for awhile.

When I was operating a fixed location studio and I'd have a
songwriter coming in for demos or a group I'd always ask my
first question which was "In your mind's ear, when you hear
your song fully arranged and produced, what does it sound
like? Bring me an example of production already recorded
that fits what your mind's ear hears." This way, I could
choose the right capture techniques, such as

snip

Good point! I would never buy consumer kit from specs.
Rather, how it sounded to me when reproducing the sorts of
program material I was listening to at that point in my life.


I never do, I want to hear it, listen to it. Now if I were
wearing the producer's hat I'd argue for my vision of the
finished product if it didn't fit what you heard, but I at
least need a starting point if I'm the engineer. What you
want to hear at the other end is going to govern how I
approach it from the beginning, just as the green color
scheme might not be available with the trailer towing
package.

"Why do I have to specify this parameter before that
parameter? They are independent yet you are forcing
me to pick a certain one before the other. How did you

snip
Another reason I like configuring software with text files
if I can get it. I can look through the configuration file,
set options I'm sure of, and do some more poring over the
docs to understand further what needs to be defined.


The problem with that approach comes when two option
choices are interdependent. There is nothing preventing
you from asking for a set of incompatible options -- until
some program examines your choices and complains.


Indeed it does, and many who offer software configurable in
this way will tell you that these two options are mutually
exclusive, you cannot have both.
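
Here is roughly what "some program examines your choices and complains" could look like, as a Python sketch. The key = value file format and the option names (green_paint, trailer_towing, tow_mirrors) are made up for illustration, borrowing the vehicle example from earlier in the thread.

```python
# Hypothetical constraints between options in a plain-text configuration file.
MUTUALLY_EXCLUSIVE = [{"green_paint", "trailer_towing"}]
REQUIRES = {"tow_mirrors": "trailer_towing"}


def parse_config(text):
    """Parse 'key = value' lines; '#' starts a comment. Values become booleans."""
    options = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip().lower() in ("1", "yes", "true", "on")
    return options


def check(options):
    """Return a list of human-readable complaints; empty means the set is consistent."""
    enabled = {name for name, on in options.items() if on}
    problems = []
    for group in MUTUALLY_EXCLUSIVE:
        clash = sorted(group & enabled)
        if len(clash) > 1:
            problems.append("mutually exclusive: " + ", ".join(clash))
    for option, needed in sorted(REQUIRES.items()):
        if option in enabled and needed not in enabled:
            problems.append(f"{option} requires {needed}")
    return problems
```

The file itself still lets you set options in any order; the check simply reports every conflict at once instead of forcing one choice before another.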

Can't multiple menus be displayed concurrently?
For example, there are many desktop GUIs that will
let you "pin" (think: thumbtack) a menu or a dialog

snip
There's the rub. I've seen two approaches with a lot of
these.
One approach gives you banks of channel strips, say 1-16,
17-32, etc. Possibly even in 8 channel banks, to make for a
smaller footprint. So, if I'm wanting to do a line check on
channel 24 let's say, and we've got 8 channel banks, I've

snip
There's my main stumbling block. With my old analog iron
all the aux sends are there, bus assignments, vca groups,
all are right there. Yes it means sometimes the mixer is
working at full extension of his body to reach that control,
but that control is there, and I can manipulate it, or
somebody else can while I'm doing something else.


Understood. The same sort of thing is true with theatrical
lighting panels, video switchers (the video equivalent of
an audio mixer), etc. To make everything visible and accessible or
just some selectable subset of it. I have the same sort
of problem when authoring multimedia presentations (though
those aren't done in real time)


There ya go. If it's there I don't have to think about
where am I, how I get there.

That's why I thought the push-pin approach might be a
good compromise -- let you decide which parts of the interface you
want to have access to. But you are still constrained by the
physical size of the display.


Might, so long as that channel to my brain doesn't force me
to take my attention from the primary task, which is paying
attention to the audio.

What I'd really want to do before plunking down the dollars
for a digital console was actually work with it for a couple
of days first, and get my mind around some of the concepts,

snip
I would imagine that might give you an 80% idea of what
the change would be like. But, someday, you'd find
yourself facing a problem that had to be solved NOW and
scurrying to sort out how to get to the solution you
want in that environment.


Yes, and I need that 80% first. Because within that 80% are
going to be everyday situations.

Sort of like a surgeon doing a laparoscopic procedure and
suddenly everything going to ****. Drop the tools, grab
a big knife and cut the patient open. You're not going
to fix the problem through that tiny incision -- unless
you are incredibly skilled and fluent with the technology!


Again, a very good analogy.

As I said, I've put a lot of thought into what you really need to
interact with a given device and how to minimize the
cognitive loading on the user. You don't want to require 100% of
his attention. Especially if it is to perform some low
grade task!


Indeed, something that I believe is becoming lost to many of our product designers.

Imagine if you had to type the word "ANSWER" on a keyboard to answer
your phone. Ridiculous, right? It would require too much focused
attention for a task that should be trivial.


Right, and they forget that no matter the channel, there's
only so much bandwidth it can support, and that has more
relevance than just your internet connection.

This is why I'd really need to sit down in
front of one for a day or two, at least with multitracks of
prerecorded material up to really put the thing through its
paces, and why often I'm reluctant to buy a new device from
the in-store demo. It's taken me a long time to decide on
one of those little recorders like the zoom, etc. But,

snip
Exactly. With use, you develop a sort of muscle memory
as your hands are accustomed to making certain motions to
do certain things. If, instead, you have to coordinate your eyes
and hands to *pick* a particular option from a linear
list, you have to rely on that visual feedback to ensure
you are at the right point in that list before making your
selection.


Indeed, and whether it be visual or auditory, forcing you to
give your attention to the device to choose might distract
you from more important work. How many people look at the
dtmf pad on their touchtone phone when dialing a number?
Once you drive a car for awhile, when it starts to rain you
automatically know where to find the controls for the
windshield wipers. Get rid of that car, buy another, and
for awhile muscle memory is going to fail you, until you get
used to the new one.
Interesting discussion of your touch screen controlled
device. Thanks!


Regards,
Richard
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.
  #373
Posted to rec.audio.pro
Don Y
Posts: 137
FLAC or other uncompressed formats, which is best?

Hi Richard,

[World Trade Center]

Yes, and what amazes me is that his dog guide didn't totally
freak out. I've seen dog guides act rather unpredictably in
crowds of people who are either panicked, or big crowds of
blind folks who aren't as careful about don't step on doggie etc.
It takes a rather disciplined human/canine team to
keep the dog calm and on mission at such times.


I got a lecture from Michael, one time, about the nature
of that relationship. I, of course, see dogs as pets or
companions. Michael stressed that this was not the case
with service dogs. They *can't* make mistakes. If my
dog runs out into the road to chase a car, worst case,
I may lose a dog. A service dog doing the same thing
could result in losing your life!

I've
trained my own a couple of times, but never did the dog go
to the bandstand or the gig with me. I don't believe a dog
should be subjected to the constant environment of loud
music, etc. Dogs are, after all, color blind, their eyes
aren't that great to begin with, and it's their noses and
ears that tell them about their environment. Subjecting a
dog to that sort of environment on a regular basis is
cruelty imho. Ymmv of course.


I recall visiting the "Guide Dogs for the Blind" facility
in Palo Alto (?). Amazing to see the effort that goes
into raising and training a service dog! It's truly an
"investment", not a "pet".

I suspect you could rework the weather broadcasts to use
a limited vocabulary and, thus, better speech quality.
On the other hand, it means that you have to be able to
anticipate EVERYTHING that you might need to say over
that medium. For example, you might not be prepared to
use it to announce an alien invasion! grin


Right, and with the same alerting system it might have to
announce all sorts of things, which is why I think NOAA went
originally with DECtalks. I think they've changed some of
that now, but there for a while I heard DECtalk's "Huge
Harry" voice quite a bit. I think I commented on it once; some
folks not quite as familiar asserted NOAA was using "Perfect
Paul", but I beg to differ. It sounded more like the former to
me.


All of the voices are just tweaks to a core set of
parameters in the waveform generator. E.g., the
"backup" synthesizer that I designed is conceptually
modeled largely on the Klatt synthesizer -- which was
the basis of DECtalk.

But, no matter how much you tweak the parameters, it still
sounds like the same voice. I.e., as if they all shared
the same genes!

I have a trimmed down synthesizer that I fall back on if
the primary synthesizer is unavailable in one of the products I'm
designing, currently. It needed to be robust -- so that I could
count on it working regardless of what might have
broken in the system. I could have opted for a better
quality limited vocabulary design -- but, didn't want to have to set
that vocabulary in stone and discover, later, that
I needed to be able to say something that I couldn't.


That's the weakness of the captured phoneme model. Memory
intensive, and limited in utility.


There are different approaches with different resource and capability
tradeoffs.

At one end of the spectrum, you can prerecord canned utterances
and just play them back to the user. Or, assemble them from
smaller phrases that have been carefully recorded with inflection
that seems to fit together, seamlessly.

At the other end, something that builds waveforms from mathematical
models -- like Klattalk. Or, even articulatory synthesizers that
try to model the entire vocal tract!

In the middle, you can have things like diphone synthesizers
where you take speech samples (from real people) and carefully
cut them into "units" which you then assemble dynamically to
make whole phonemes and, eventually, words and utterances.

Since patching phonemes together results in artificial transitions
between adjacent phonemes (i.e., transitioning from an "ah" sound to
a "th" sound like in "father"), diphone synthesis deals with just
the transitions and pieces transitions together!

For example, a phoneme based synthesizer would have an "f" phoneme,
an "ah" phoneme, a "th" phoneme, etc. that it would glue together
(smoothing out the bumps between them!) to speak "father".

A diphone synthesizer would have recordings of the silence-to-f
transition, the f-to-ah transition, the ah-to-th transition, etc.
So, it would piece together these transitions and have them
*meet* in the middle of real phonemes! I.e., the silence-to-f
glues to the f-to-ah in the *middle* of the "f" sound. It's
easier to smooth the start of an "f" with the end of an "f"
than it is to smooth an "f" to an "ah".
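
A minimal sketch of that bookkeeping, in Python. The phoneme labels and the "sil" marker for silence are just illustrative names, not taken from any real synthesizer. It only lists which transition units a word needs, padded with silence at both ends; it does no audio work:

```python
def to_diphones(phonemes, silence="sil"):
    """Return the phoneme-to-phoneme transitions needed to speak a word.

    The utterance is padded with silence on both ends, so a word of
    n phonemes needs n + 1 transition units.
    """
    padded = [silence] + list(phonemes) + [silence]
    # Pair each phoneme with its successor: (sil, f), (f, ah), ...
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


# Hypothetical phoneme labels for "father", following the example above.
father = ["f", "ah", "th", "er"]
diphones = to_diphones(father)
print(diphones)
```

Each unit starts and ends in the middle of a phoneme, which is why the joins land on steady sounds rather than on the transitions themselves.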

But, there are a boatload more transitions than there are
phonemes! For example, if you have just 4 phonemes, you might
have 12 or 16 transitions:
1-2 1-3 1-4 2-1 2-3 2-4 3-1 3-2 3-4 4-1 4-2 4-3
plus, perhaps, 1-1 2-2 3-3 and 4-4.

Imagine if you have 40 or 60 phonemes!
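
A quick sketch of that arithmetic, in Python (the function name is made up for illustration, and transitions to and from silence are ignored): with N phonemes there are N*(N-1) transitions between different phonemes, or N*N if you also count each phoneme followed by itself.

```python
def transition_counts(n_phonemes):
    """Return (distinct-pair transitions, transitions including repeats)."""
    distinct = n_phonemes * (n_phonemes - 1)   # 1-2, 1-3, ... but not 1-1
    with_repeats = n_phonemes * n_phonemes     # also 1-1, 2-2, ...
    return distinct, with_repeats


for n in (4, 40, 60):
    distinct, with_repeats = transition_counts(n)
    print(f"{n} phonemes: {distinct} or {with_repeats} transitions")
```

So 40 phonemes already means on the order of 1,600 recorded units, and 60 phonemes about 3,600.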

The DECtalk Express suffers from the sin of requiring a
special rechargeable battery. Another fault, in my opinion, for an
assistive technology device (where do you buy that
replacement battery -- today??)


Yeah I know, and that rechargeable is going to eventually
fail to recharge, and it might decide to do so just when I
really want it, and no, I don't want to wait to mail order
another one, and then be told that the "other one" I ordered
doesn't function in the way I've become accustomed to, or
won't work in my environment at all.


Or, "Sorry, we no longer sell those parts. But, could we
interest you in our new model? It's only $295..."

Dan Ariely. We were "required" to take eight courses in
The Humanities to graduate. I guess they didn't want a
bunch of engineers with no appreciation of other aspects
of life and education let loose on the unsuspecting masses.


Yep, that's the man. Dan's an interesting read on the
subject. I was thinking I'd place money that you'd read
him or were familiar.


Many of the texts I mentioned led to each other. I.e., if
the subject matter is interesting, you tend to look for
similar texts to wade through.

The reference librarian at the local library branch tends
to know what sort of titles might interest me so she feeds
them to me as she comes across them.

Another interesting read was _The Compass of Pleasure_
which deals with how we perceive pleasure! I.e., the
similarities between the rush you get from an epiphany,
sex, drugs, food, etc. Unfortunately, it relied too
heavily on neurochemistry and neurophysiology to explain
what was going on in the brain in each of these scenarios.
So, it was a harder read than it needed to be.

But, it explains how you can become desensitized to
certain experiences, etc. While reading it, I had a
flashback to a video game design from a friend. I
asked him why there was so much variation in the
quality/impact of some of the special effects. I.e.,
why not ALWAYS put the most spectacular versions out
there? His reply, "then they cease to be spectacular!"

I've learned to adopt a similar approach in many of the
things that I do routinely. For example, I bake a fair
bit (cookies, pastry, desserts, etc.). I used to strive
for consistency in each batch. All the cookies the
same size, texture, etc.

Now, I intentionally vary parts of the batch in different
ways. Make some cookies larger or smaller. Bake some
a bit less, others a bit more. *Burn* some, etc. And,
people nibbling on them tend to notice them, more -- less
likely to get into a mindless eating mode where you just
shovel them down without noticing what -- or how many -- you
are eating!

I recall selecting American History as one of my courses
thinking I had already had two years of that in High School
so it would be a recent memory, for me! The professor was
an economist. So, I relearned all that history with an
entirely different spin than the noble presentation to
which I'd previously been subjected. Fascinating!


Gives you a different perspective doesn't it?


Disturbing. Also makes you feel like a real *sap* for
buying into much of that naive patriotism. frown

I'd suggest Dan to audio engineers, marketing people, pretty
much any profession. Some real thought provoking stuff
there about the way we operate in everyday life, how we
interact with our customers/clients, etc. For the
uninitiated that aren't bored to tears and are still
following along, it's not all dry tome, there are some
great moments in those two books that will tickle your funny bone.


Agreed. My current read (_The Art of Choosing_ -- actually
on its way back to the library, tonight) also exposed me
to many differences in cultures that I probably never would
have been aware of -- even if I had visited some of these
cultures!

For example, in parts of Europe, doctors make care decisions
and just tell patients and family what those will be. Very
different from how things are done in the U S A. And, there
are psychological consequences to these differences for the
folks in those situations!

So, I've enjoyed reading books by economists that touch
on these sorts of subjects. _The Price of Everything_
discusses all of our actions -- social and otherwise -- in
terms of economic transactions. E.g., a woman selling
uterine services in a marriage transaction.


Good analogy!!!


Disturbing analogy! Especially that someone would think
about this sort of "activity" in those terms!

_How We Decide_ and _Predictably Irrational_ looked at how
easily we are manipulated and con ourselves in our behavioral
choices, etc. How we can actually think a $10 pill is
better than an identical $0.50 pill, etc. How we *don't*
have a "Market" in which consumers and producers compromise
on price but, rather, how Producers manipulate our expectations of
price to a point that they are happy with, etc.


Indeed, we hoodwink ourselves often without really thinking
about it.


Or, how marketers exploit these behaviors to get you to
increase what you were willing to pay, originally.

frown

There's my main stumbling block. With my old analog iron
all the aux sends are there, bus assignments, vca groups,
all are right there. Yes it means sometimes the mixer is
working at full extension of his body to reach that control,
but that control is there, and I can manipulate it, or
somebody else can while I'm doing something else.


Understood. The same sort of thing is true with theatrical
lighting panels, video switchers (the video equivalent of
an audio mixer), etc. To make everything visible and accessible or
just some selectable subset of it. I have the same sort
of problem when authoring multimedia presentations (though
those aren't done in real time)


There ya go. If it's there I don't have to think about
where am I, how I get there.


For a better example, imagine being lost in phone menu hell!
"Where am I? How do I get to where I want to be? Should
I just hang up and start over?"

Exactly. With use, you develop a sort of muscle memory
as your hands are accustomed to making certain motions to
do certain things. If, instead, you have to coordinate your eyes
and hands to *pick* a particular option from a linear
list, you have to rely on that visual feedback to ensure
you are at the right point in that list before making your
selection.


Indeed, and whether it be visual or auditory, forcing you to
give your attention to the device to choose might distract
you from more important work. How many people look at the
DTMF pad on their touchtone phone when dialing a number?


Exactly. How many iPhone users can dial with their hand
*in* a purse, etc.?

Once you drive a car for a while, when it starts to rain you
automatically know where to find the controls for the
windshield wipers. Get rid of that car, buy another, and
for a while muscle memory is going to fail you, until you get
used to the new one.


Yup. The same is true of a different keyboard layout.
For example, my UNIX machines tend to have different keys
in different places -- and different functionality -- than
my PC's. So swiveling my desk chair to type on one
keyboard or another means I have to mentally switch my
typing patterns to a different layout -- then, back again
when I swivel the chair to its original orientation.

Interesting discussion of your touch screen controlled
device. Thanks!


That and the multimedia / home-automation system are my
most ambitious undertakings to rely on alternative
display and control technologies.

The touchpad is particularly different because it requires
the user to think in terms of shapes and how they differ.
But, without *looking* at them! It has to be a more
internalized, intuitive "feel". And, I have to use lots
of completely bogus terms to describe characteristics of
those shapes.

For example, an 'S' is "curvier" than a 'C' while a 'Z'
is "jagged-er" than an 'S', etc. Note that not all
shapes can be easily correlated with letters. So, you
need a lexicon that users can relate to so that they
know how a particular shape is likely to be defined.
E.g., imagine a W standing on its side. Or, a 'C'
lying face down. Or a "box open on bottom" (imagine a
square 'U' upside down).

Then, consider geometric operators applied to those
shapes. For example, one shape suggesting "do this"
while its mirror image says "do the opposite".

To be successful, you can't have what appear to be
arbitrary symbols with arbitrary meanings. It has
to be something that a user can readily relate to
WITHOUT THINKING -- like zipping their fly grin
  #374   Report Post  
Posted to rec.audio.pro
Richard Webb[_3_] Richard Webb[_3_] is offline
external usenet poster
 
Posts: 533
Default FLAC or other uncompressed formats, which is best?

On Sat 2012-May-26 15:12, Don Y writes:
snip
Usually I'll tell the screen reader app to remain silent,
and park speech cursor over that display. If that display
is going to be changing a lot I tell the screen reader to
ignore that line entirely, and only force it to go there
when I want to look at it. Usually I'll use monitoring of a
status line to tell it when to change configurations. E.g.
when status line changes to x from y load a configuration
which tracks a light bar as your focus of where you are,
etc. etc. Rather complex, and probably extremely boring to
the folks here. We should probably take this line to email g.


OK. I'll pick a suitable subject line so my mail is
recognizable -- though probably not today as I am busy
getting my other half ready for a trip.


Probably good for general access technology discussions g.
Be sure to despamproof the address g.

This is directly analogous to reading printed text.
You can cherry pick through large amounts of information
with relatively little effort.


Indeed, and, believe it or not, I retain what I read better, as
well as read fast using it. In most cases, synthesized
speech is the most cost effective and most effective in
other ways, compromise that can be achieved. Braille
displays are clunky, hard to maintain and don't achieve good
reading speed, or efficient work flow, unless you're a
customer service rep dealing with both the computer and
customers on the phone. After all, they take your hands
away from the keyboard, they can display a very limited
amount of text at one shot, all them little solenoid springs
and mechanical parts ...aaargh


Not to mention the expense! I have a Braille N Speak here
(I think that is the name). There is an Italian firm that
makes a Braille display that is pzieoelectric (spelling?)
which should be easier to keep running. But, I think they
want $700/cell -- or something equally outrageous!


Yep, and that's expensive. I've heard over the years
piezoelectric might be the wave of the future in those
things, but right now they're electromechanical clunkety
clackety clickety bang bang devices.

Maybe if the eurozone crumbles, you can pick these up
for a song! grin


rotfl I want the one that gives me the whole 25 line 40
cell page g. yeah dream on.

[copy protection]


Exactly what I ran into with it. My mom didn't want a
screenreader, but was glad to have my help maintaining her
system, often without her having to stand over my shoulder
and play screenreader.

snip

Borland used to have a "like a book" license. I.e., you
could move the license around. But, still not as flexible
as it could be.


Yeah and that depends on the honor of those paying for it.

The problem is, people think there's no "cost", there, and
let that confuse their idea of "value"! Just because you
didn't pay for something, doesn't make it valueless! I.e.,
if you think there is no value to that second copy of the
software, then live without it -- you should not experience
any *costs* if it had no *value*!


Yep, that's the whole point. I know a friend of mine
offered to get me a cracked copy of JAWS a couple years ago,
and I said thanks but no thanks. The more ethically honest
approach is to not buy it in the first place if you don't
like it. State rehab bought my copy years ago but would not
spend any more dough for another key disk, and I refused to
spend any more for another key because they told me they
didn't trust me, and I wasn't welcome to service other
peoples' machines even on a volunteer basis. This means
that a person wanting to go into repairing computers for
others and maintaining them can't buy the software with
confidence either, so hence I don't own a copy.

If you provide a nonvisual OPTION, then applications will
only give token support -- if any -- to it. On the other
hand, if the only way to get information out of a device is
via that option, then they don't have a choice! This is
the approach I have been taking, lately. Pick an output
modality that addresses everyone in the target audience
and force everything to use that single mechanism!


Indeed, and this is what we're finding with a lot of web
portals that do things that are only usable with vision.
Anyone who's doing web development should look at a series
of articles discussing just this issue in this month's
Braille Monitor, available in text, no doubt from
www.nfb.org.


I've not looked at their site in a long time. To be honest, I'm a
bit set off by the "evangelism". Perhaps it is necessary to bring
the "message" to the "unwashed masses". But, in my case, it feels
like preaching to the choir.


Yeah I know, I get a little of that too, but I remind
myself they're always trying to expose newcomers, so a good bit of
that is necessary. But, I mention it in this group because
there are a lot of folks here developing web content, often
for others.

Sort of like telling a smoker he should quit. I'm sure he
already knows that


Yeah this is true, but you're also having to grab that
newcomer and overcome conditioning that has been with him
all his life, so a bit of total immersion is necessary,
though us old hands get rather tired of the proselytizing
g.

The backup speech synthesizer in one of my products has similar
quality issues as Klatt's DECtalk (he wrote it while a student and
DEC commercialized it). Its biggest advantage is that
it is pretty lean when it comes to resources -- which translates
directly to implementation costs and reliability.


Did you ever check out that kid a few years ago that made a
DECtalk sing? He spent some serious time coding that, iirc
the kid was only 16 years old or so when he did this one.

snip

The Votrax VS6.3 was capable of singing (poorly). As well as
multilingual speech. I recall hearing one speak German. But, the
extra capabilities don't really translate into better
"regular speech". So, why pay for them?


Yeah I'd heard it could. I used the Votrax while working at
a large vending site, it was coupled with a coin sorter via
a serial cable.

At Kurzweil, we had frequent failures in the Votrax subsystem. All
of the boards were potted -- to discourage copying -- so when one of
the four boards died, it was irreparable. Yet
another case of someone going out of their way to protect their
market share. Funny, I don't hear the name Votrax bandied
about anymore so I guess they wasted their efforts clinging
to the past instead of embracing the future!


Yeah I know, other than that coin sorter I've never seen a
Votrax box anywhere else.

I have a DECtalk DTC01 and a DECtalk Express. Plus a few
of the Artic/Votrax-based synthesizers. All have the same
basic advantage -- and the same robotic speech quality!


Yep, as does my doubletalk. I had, before Katrina, two
doubletalk internal cards, a doubletalk lite external, and
an audaptor.


While it is unfortunate (the **** poor quality of the speech), it is
still surprisingly easy to get used to the oddities
of these "dialects". Especially when the alternative may
be to be deprived of some interactions!


Yes, and you can grow accustomed to it. My lady is starting
to understand this one fairly well, but then she should
after living with me for over a decade g.

snip
Festival approaches everything from the top down.
It tries to understand the context of the material.
From that, the appropriate pronunciation rules. And,
finally, actually synthesizing the speech waveforms.


I'd heard that one elsewhere. I think an electrical
engineering friend was telling me about that one a few years ago.

But, the implementation never focused on performance
issues. Rather, they wrote it so that it would be easier
to write and maintain. So, it takes a fair bit of
resources just to say "Hello". Those resources translate into
dollars in anything other than a PC environment (i.e., your
talking calculator would speak nicer but cost more!)


Indeed, and there's that principle of no free lunch again.
You need fairly quick response to be interactive, so you
gotta sacrifice something to get it g.

Does the chassis force you to use a certain type of
disk drives? E.g., because of disk carriers? Many
older RAID offerings require SCSI disks. Would you
be happy with RAID in some other form?

snip
I know many of the Dell machines that I've used over the years had
special drive carriers for the RAID drives. Often so you could
hot-swap them, etc. Others just hid the array in the
bowels of the machine.


Have seen that in those too. This is of course a tower,
might be Dell, might be another.

Note that RAID can be a huge bellyache! When a drive fails, you
usually don't have many options other than replacing it
as is in the working set. To do this, many machines have
special BIOS extensions that you have to work with *in*
the BIOS (which may be an issue accessing, for you) to
add the drive to the working set, format it, rebuild its
contents, etc.


This is true also, and I'm debating on that issue too,
whether RAID is overkill, or I can really get by with a
script that might run during periods of low activity to copy
files to a drive that's offline. So, it's not written in
stone, but I mentally like the idea of RAID, but, when
relocated and able to get reasonably priced broadband that
allows me to run my own servers and doesn't force me to
subsidize Rupert and Disney I'll be at the point of decision g.
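
A rough sketch of the script idea, in Python. The function name and the paths in the usage note are hypothetical, and it assumes the backup drive has already been brought online and mounted. It copies only files that are missing from the backup or newer than the copy already there:

```python
import shutil
from pathlib import Path


def copy_changed_files(source_dir, backup_dir):
    """Copy files that are missing from, or newer than, the backup copy.

    Returns the list of relative paths that were copied.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue  # backup copy is already up to date
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also carries over the modification time
        copied.append(str(rel))
    return copied
```

Something like `copy_changed_files("/home/data", "/mnt/backup")` could then be run from a scheduler during quiet hours. Unlike RAID, this protects nothing written since the last run, which is the tradeoff being weighed here.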

snip
machine is sitting in storage unit with some of those little
desiccant packages inside the case g.


I wonder how much spiders like desiccants?? grin


rotflmao!!!

snip
Understood. You want a different communication channel
to interact with the device instead of having to share
the audio channel that you are devoting to the task at
hand.


That's it exactly. I'm accustomed, as I said in another
post this thread, to not having to do anything but directly
communicate with the device. If I want that channel strip
assigned to a certain bus or certain vca group, push the
button.


BECAUSE YOU RELY ON MEMORY! IMO, the single biggest asset
a visually impaired user has is memory. Remembering where
you last put something. Remembering how you set a parameter. etc.
Blind man with Alzheimer's has got to be sheer terror!


Indeed, I've told my kids if that happens bring the gun!

If I want audio from that channel on aux bus 3, I
adjust that aux send. I don't have to play where the f*$@
am I? It's automatic like buttoning your coat, zipping your
pants or tying your shoes.


Great analogies! They are low skill tasks so why require
lots of attention to perform them?


Beyond that, see comments in my previous post on bandwidth.
The primary mission is the audio, that's what I'm paid for.
There is already the intercom situation trying to occupy
some of that bandwidth, and we all know that every channel,
in every aspect of life, is bandwidth limited. The AM
station that carries Rush has less available bandwidth than
does the TV channel carrying the Simpsons. The only way to
get more bandwidth on a channel is to increase the size of
the channel. To get more water flow to your house increase
the size of the main. Selecting a bus assignment, etc. are
normally low bandwidth things, depress the switch, then
either your finger tells you the switch is depressed, or the
idiot light tells you, etc. Trying to listen to synthesized
speech while trying to give my attention to selecting which
bank of channels I'm on, or whether those controls are aux
send 3 or aux send 4 while trying to listen to the audio I'm
paid to listen for, and the guy running the spotlight
talking about the chick with the big knockers in the front
row is just ... a bit overwhelming. Somethin's got to give
somewhere.

I've used the same rationale to argue in favor of using
non-visual channels for visual tasks! I.e., those cases
where your eyes are busily engaged in some activity and
shouldn't have to be pulled away just so you could see
which virtual button you were pressing on your iPhone!


Uh huh! Just my point with some of these complex devices
that are made for people to operate while driving, etc.
Give them an auditory channel for the info, keep the eyes on
the road, and the hands upon the wheel please.


Exactly. But there are lots of other cases that aren't
as dramatic. Why do I need to look at my iPod to use
it? Or, my telephone? Why can't I read my appointment
calendar while I'm taking a walk around the neighborhood
(I have to stop moving in order to read all the details
on that silly little display as it "bounces around" too
much when I am walking)?


True, although the developers might wonder why you need to
look at your calendar when you're walking around your
neighborhood in the first place. But, it's well within the
realm of possibility that, for whatever reason, you might
wish to do so, maybe because you can't quite recall whether
you've an appointment this morning, and if you can refresh
your memory you'll know whether you have the time to stop
and chat with the old boy walking his dog you see all the
time, and buy a cup of coffee to sit and chat awhile.


Why should a soldier have to take his eyes off The Enemy just to
consult some fancy piece of high-tech kit? Or, an
ophthalmologist have to remove his eyes from peering into yours just
to see what some device is trying to convey to him?


Good points as well, and we don't think. I've been glad to
see that the GPS folks such as TomTom realize this, and give
the driver instructions with a voice.


Regards,
Richard
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.
  #375   Report Post  
Posted to rec.audio.pro
Don Y Don Y is offline
external usenet poster
 
Posts: 137
Default FLAC or other uncompressed formats, which is best?

Hi Richard,

On 5/26/2012 7:20 PM, Richard Webb wrote:
On Sat 2012-May-26 15:12, Don Y writes:


Rather complex, and probably extremely boring to
the folks here. We should probably take this line to email


OK. I'll pick a suitable subject line so my mail is
recognizable -- though probably not today as I am busy
getting my other half ready for a trip.


Probably good for general access technology discussions g.
Be sure to despamproof the address g.


OK. I'll reply to this and any further USENET posts
by you to your email address.


  #376   Report Post  
Posted to rec.audio.pro
John Williamson John Williamson is offline
external usenet poster
 
Posts: 1,753
Default FLAC or other uncompressed formats, which is best?

Richard Webb wrote:


Uh huh, which I hope will suddenly occur to all these
producers and others who think that the audio we consume
*must* be squeezed to get maximum level, hence destroying
the music. When you listen to a symphony, or a good jazz
piece, the dynamics vary. Those variations are part of what
makes it interesting and causes you to think of the music as
something more than background noise. Variations in
texture, size, etc. cause people to actually think about
what they're "consuming" and actually derive enjoyment from
it.


I recorded a school orchestra playing a while ago, taking care to set
the peaks as close to zero dB as I could, but maintaining all the
dynamic range. Listening to the pseudo soundfield recording, it feels as
if you're there on headphones, and sounds excellent on speaker.

The first comment of the conductor on hearing it was "It's too quiet..."

I just sent them the master copies and left them to ask one of the
students to compress it to taste. No pressure on me, it was a freebie
anyway.
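
For anyone curious, setting the peaks near zero while keeping the dynamics is just one fixed gain applied to the whole recording. A minimal sketch in Python, assuming floating point samples where 1.0 is full scale; the function name, the sample values, and the -0.3 dBFS target are illustrative assumptions, not what I actually used:

```python
def normalize_peak(samples, target_dbfs=-0.3):
    """Scale samples so the loudest one sits at target_dbfs.

    Samples are floats where 1.0 is full scale (0 dBFS). One gain is
    applied to everything, so the ratio between quiet and loud passages
    (the dynamic range) is unchanged.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    target = 10 ** (target_dbfs / 20)  # convert dBFS to linear amplitude
    gain = target / peak
    return [s * gain for s in samples]


quiet_take = [0.01, -0.05, 0.25, -0.10]  # made-up sample values
louder = normalize_peak(quiet_take)
```

A compressor, by contrast, changes the gain from moment to moment, which is what raises the quiet passages and flattens the dynamics.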

--
Tciao for Now!

John.
  #377   Report Post  
Posted to rec.audio.pro
Doug McDonald[_6_] Doug McDonald[_6_] is offline
external usenet poster
 
Posts: 57
Default FLAC or other uncompressed formats, which is best?

On 5/27/2012 9:42 AM, John Williamson wrote:
Richard Webb wrote:


Uh huh, which I hope will suddenly occur to all these
producers and others who think that the audio we consume
*must* be squeezed to get maximum level, hence destroying
the music. When you listen to a symphony, or a good jazz
piece, the dynamics vary. Those variations are part of what
makes it interesting and causes you to think of the music as
something more than background noise. Variations in
texture, size, etc. cause people to actually think about
what they're "consuming" and actually derive enjoyment from
it.


I recorded a school orchestra playing a while ago, taking care to set the peaks as close to zero dB
as I could, but maintaining all the dynamic range. Listening to the pseudo soundfield recording, it
feels as if you're there on headphones, and sounds excellent on speaker.

The first comment of the conductor on hearing it was "It's too quiet..."

I just sent them the master copies and left them to ask one of the students to compress it to taste.
No pressure on me, it was a freebie anyway.


The question is ... what was the piece of music?

Doug McDonald
  #378   Report Post  
Posted to rec.audio.pro
Richard Webb[_3_] Richard Webb[_3_] is offline
external usenet poster
 
Posts: 533
Default FLAC or other uncompressed formats, which is best?

On Sat 2012-May-26 22:03, Don Y writes:
Yes, and what amazes me is that his dog guide didn't totally
freak out. I've seen dog guides act rather unpredictably in
crowds of people who are either panicked, or big crowds of
blind folks who aren't as careful about don't step on doggie etc.

snip
iT takes a rather disciplined human/canine team to
keep the dog calm and on mission at such times.


I got a lecture from Michael, one time, about the nature
of that relationship. I, of course, see dogs as pets or
companions. Michael stressed that this was not the case
with service dogs. They *can't* make mistakes. If my
dog runs out into the road to chase a car, worst case,
I may lose a dog. A service dog doing the same thing
could result in losing your life!


Yep, which is why I've lectured blind musicians about
bringing their partners into such environments. I'd be just
as critical of the blind guy who insisted that his dog
coexist with him in the noisy machine shop, etc. That
relationship should be nurtured, and respect shown the
animal part of said team. After all, we ask the public to
respect it and are always admonishing them "when doggie is
in harness don't pet doggie."

re speech synthesis:
All of the voices are just tweaks to a core set of
parameters in the waveform generator. E.g., the
"backup" synthesizer that I designed is conceptually
modeled largely on the Klatt synthesizer -- which was
the basis of DECtalk.


Indeed, very similar voices, some differences in pitch, but
timbres are very similar, for the musicians among us g.

But, no matter how much you tweak the parameters, it still
sounds like the same voice. I.e., as if they all shared
the same genes!


Good analogy! For some reason though DECtalk is one of the
most easily understood by the neophyte to that world. I
remember helping a blind vendor computerize his bookkeeping
and inventory a few years ago (where I ran into the Votrax)
and trying various synthesizers with him, including Artic's
offerings, RC Systems' doubletalk series that I'm fond of,
etc. We ended up going with DECtalk because he could
understand it best.


I have a trimmed down synthesizer that I fall back on if
the primary synthesizer is unavailable in one of the products I'm
designing, currently. It needed to be robust -- so that I could
count on it working regardless of what might have
broken in the system.

snip
That's the weakness of the captured phoneme model. Memory
intensive, and limited in utility.


There are different approaches with different resource and
capability tradeoffs.


Indeed. Your discussion of constructing sounds from
phonemes and various building blocks was probably very
instructive to some here. Those with a DAW can illustrate
this for themselves quite easily: zoom in closely on the
waveforms. Imagine constructing a piece of music from
various takes, being sure that the tempo is the same, or
close enough, etc. etc.
Many of us know about the battle with getting that smooth
seamless punch in. This then is an exercise in the smooth
seamless punch attempt from hell. Analog guys, try this one
with your razor blade g.
snip

But, there are a boatload more transitions than there are
phonemes! For example, if you have just 4 phonemes, you might have
12 or 16 transitions:
1-2 1-3 1-4 2-1 2-3 2-4 3-1 3-2 3-4 4-1 4-2 4-3
plus, perhaps, 1-1 2-2 3-3 and 4-4.
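
For anyone who wants to check the count, here is a minimal Python sketch of the enumeration above. The function name and the 1-4 labels are illustrative only, not anything from a real synthesizer; it just lists every ordered pair, with or without the "same to same" cases.

```python
from itertools import permutations, product

def transitions(phonemes, include_self=False):
    """Return every ordered phoneme-to-phoneme transition as (a, b) pairs."""
    if include_self:
        # Ordered pairs including 1-1, 2-2, ... : n * n of them
        return list(product(phonemes, repeat=2))
    # Ordered pairs of two different phonemes: n * (n - 1) of them
    return list(permutations(phonemes, 2))

phonemes = [1, 2, 3, 4]  # hypothetical phoneme labels
without_self = transitions(phonemes)
with_self = transitions(phonemes, include_self=True)

print(len(without_self), len(with_self))
print(" ".join(f"{a}-{b}" for a, b in without_self))
```

The count grows as the square of the number of phonemes, which is why the transition inventory gets large so quickly.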


Yep, and the more varied you want the vocabulary to be, the
more the mathematical possibilities expand. This is why
you'll never get fully natural sounding speech no matter
what type of synthesis you use. You could apply a
supercomputer to the task and still not get it right
reliably. You could get closer, but ...

The DECtalk express suffers from the sin of requiring a
special rechargeable battery. Another fault, in my opinion, for an
assistive technology device (where do you buy that
replacement battery -- today??)

Yeah I know, and that rechargeable is going to eventually
fail to recharge, and it might decide to do so just when I
really want it, and no, I don't want to wait to mail order
another one, and then be told that the "other one" I ordered
doesn't function in the way I've become accustomed to, or
won't work in my environment at all.


Or, "Sorry, we no longer sell those parts. But, could we
interest you in our new model? It's only $295..."

Yep, and that's the one that doesn't function in my
environment at all. This is why these days I often look for
gear that uses off the shelf rechargeable batteries.
Obsolete this!

Dan Ariely. We were "required" to take eight courses in
The Humanities to graduate. I guess they didn't want a
bunch of engineers with no appreciation of other aspects
of life and education let loose on the unsuspecting masses.

Yep, that's the man. Dan's an interesting read on the
subject. I was thinking I'd place money that you'd read
him or were familiar.


Many of the texts I mentioned led to each other. I.e., if
the subject matter is interesting, you tend to look for
similar texts to wade through.


I'm sure they did. Having just been exposed recently I
found it fascinating. Makes one think about a lot of
things, how we are manipulated, and manipulate ourselves
when it comes to our expectations. Here again, look at the
example of the vibrating VU indicator I put in my pocket on
gigs. Before the vibrating silent pager made those little
vibrator motors ubiquitous I never expected that I'd be able
to have a silent VU indicator. If I got one, it would be
impractical to use because it would force me to devote one
hand to monitoring it all the time, leaving me with one hand
to operate the mixing console. But they were some of the
first things I arranged to recreate after Katrina because
they're inexpensive to build, and had I not the facilities
to gin them up with off the shelf parts and some project
boxes I would have paid good money to acquire them again.

The reference librarian at the local library branch tends
to know what sort of titles might interest me so she feeds
them to me as she comes across them.


Always handy. Had one at library for the blind in Iowa who
was good about that, but these days my braille library
service comes from a multistate center, and although they
might be trained in library science they're more akin to a
warehouse filling orders.

Another interesting read was _The Compass of Pleasure_
which deals with how we perceive pleasure! I.e., the
similarities between the rush you get from an epiphany,
sex, drugs, food, etc. Unfortunately, it relied too
heavily on neurochemistry and neurophysiology to explain
what was going on in the brain in each of these scenarios.
So, it was a harder read than it needed to be.


Indeed. I have heard of that one but never seen it. About
a year ago I did "This Is Your Brain on Music" though. I
still wonder, though, how much the strange environment of the
MRI and all that messes with the results.
After all, we don't have sex, or listen to music or other
such activities when crammed into a metal tube in a noisy
sterile environment.
If you could capture those brain images of a guy listening
to Duke Ellington or the Beatles while he's kicked back in
his favorite chair without the apparatus you might get far
different results.

But, it explains how you can become desensitized to
certain experiences, etc. While reading it, I had a
flashback to a video game design from a friend. I
asked him why there was so much variation in the
quality/impact of some of the special effects. I.e.,
why not ALWAYS put the most spectacular versions out
there? His reply, "then they cease to be spectacular!"


I've heard this in a lot of endeavors, or a version of it.
After a while too much "hey wow" desensitizes the mind to it.
We see it in everyday life. Drifting off topic for the
group again, sorry guys. I'm sure that the customer service
rep is glad to have that braille display to do his job
interacting with customers on the phone instead of being
forced to rely on speech, but as for me, I know about
Moore's law and how wonderful the technology has become just
in my time on this earth: Ray's reading machine, information
at my fingertips, etc. etc. For the money I'd just as soon
wait for that next generation of braille displays that uses
maybe a combination of piezoelectricity and electrochemical
reactions to give me a page of refreshable braille grin.

But, back to topic. Why do we go to so many live concerts
and come away disappointed with the sound? Because we, and
the folks putting on the show, have been conditioned to think
that's the way it always is. We got used to live sound
being the way it was back in the days of those awful
sounding Voice of the Theatre cabinets, etc. That was what
live amplified music was supposed to sound like in our
heads, and even though we've got better signal processing,
better speaker arrays, etc., we still go for that sound
because it's the way live music requiring amplification
sounded to us.

I've learned to adopt a similar approach in many of the
things that I do routinely. For example, I bake a fair
bit (cookies, pastry, desserts, etc.). I used to strive
for consistency in each batch. All the cookies the
same size, texture, etc.


Now, I intentionally vary parts of the batch in different
ways. Make some cookies larger or smaller. Bake some
a bit less, others a bit more. *Burn* some, etc. And,
people nibbling on them tend to notice them, more -- less
likely to get into a mindless eating mode where you just
shovel them down without noticing what -- or how many -- you are
eating!


Uh huh, which I hope will suddenly occur to all these
producers and others who think that the audio we consume
*must* be squeezed to get maximum level, hence destroying
the music. When you listen to a symphony, or a good jazz
piece, the dynamics vary. Those variations are part of what
makes it interesting and causes you to think of the music
as something more than background noise. Variations in
texture, size, etc. cause people to actually think about
what they're "consuming" and actually derive enjoyment from
it.

I recall selecting American History as one of my courses
thinking I had already had two years of that in High School
so it would be a recent memory, for me! The professor was
an economist. So, I relearned all that history with an
entirely different spin than the noble presentation to
which I'd previously been subjected. Fascinating!


Gives you a different perspective doesn't it?


Disturbing. Also makes you feel like a real *sap* for
buying into much of that naive patriotism. frown


Indeed it does, and reinforces some lessons you've already
learned in life intuitively, but makes you think about them
a bit. This old hippie learned to think about economists in
a different light thanks to Dan. Before, I'd read Paul
Krugman and these other guys, and put less stock in what
they said than I did in the guy that spun up the weather
forecast I heard this morning.

I'd suggest Dan to audio engineers, marketing people, pretty
much any profession. Some real thought provoking stuff
there about the way we operate in everyday life, how we
interact with our customers/clients, etc. For the
uninitiated that aren't bored to tears and are still
following along, it's not all dry tome, there are some
great moments in those two books that will tickle your
funny bone.


Agreed. My current read (_The Art of Choosing_ -- actually
on its way back to the library, tonight) also exposed me
to many differences in cultures that I probably never would
have been aware of -- even if I had visited some of these
cultures!


For example, in parts of Europe, doctors make care decisions and
just tell patients and family what those will be. Very
different from how things are done in the USA. And, there are
psychological consequences to these differences for the
folks in those situations!


Indeed, and I don't think I'd find that acceptable at all.
I've had to do battle with the medical professionals
responsible for my lady's care, and a year ago switched her
primary care physician over an issue where he endangered her
life just for an unnecessary test.

So, I've enjoyed reading books by economists that touch
on these sorts of subjects. _The Price of Everything_
discusses all of our actions -- social and otherwise -- in
terms of economic transactions. E.g., a woman selling
uterine services in a marriage transaction.


Good analogy!!!


Disturbing analogy! Especially that someone would think
about this sort of "activity" in those terms!


Indeed it is, but true nonetheless. As I commented above,
I'm sort of newly converted to that one.


Re mixers ...
There ya go. If it's there I don't have to think about
where I am or how I get there.


For a better example, imagine being lost in phone menu hell! "Where
am I? How do I get to where I want to be? Should
I just hang up and start over?"


I've done that, just backed out and started over. But on a
gig that's not a viable option.

Exactly. With use, you develop a sort of muscle memory
as your hands are accustomed to making certain motions to
do certain things. If, instead, you have to coordinate your eyes
and hands to *pick* a particular option from a linear
list, you have to rely on that visual feedback to ensure
you are at the right point in that list before making your
selection.


Indeed, and whether it be visual or auditory, forcing you to
give your attention to the device to choose might distract
you from more important work. How many people look at the
DTMF pad on their touchtone phone when dialing a number?


Exactly. How many iPhone users can dial with their hand
*in* a purse, etc.?


Few, I'd bet. But muscle memory manifests itself in lots of
interesting ways. Years ago one of my side jobs was
throwing about 300 newspapers every morning. My daughter
followed along one morning, and noted as we got to one place
that I was looking at her while walking, carrying on a
conversation. My hand dipped into the bag at my side and
launched the newspaper, which landed on the stoop right by
the customer's front door. My kid says "Wow, Dad, you
weren't even looking when you threw that!"
I didn't need to. When I got to that spot my hand just
automatically threw the paper, while gauging the weight and
adjusting the throw so as to put it right where I wanted it.
Had it been a lighter paper, such as Monday morning's, I'd
want to toss more gently. For the larger Wednesday edition,
put some more oomph behind it.

Once you drive a car for awhile, when it starts to rain you
automatically know where to find the controls for the
windshield wipers. Get rid of that car, buy another, and
for awhile muscle memory is going to fail you, until you get
used to the new one.


Yup. The same is true of a different keyboard layout.
For example, my UNIX machines tend to have different keys
in different places -- and different functionality -- than
my PC's. So swiveling my desk chair to type on one
keyboard or another means I have to mentally switch my
typing patterns to a different layout -- then, back again
when I swivel the chair to its original orientation.


Can relate to that one.

Thanks for the interesting discussions. We probably should
take general blindness and screen access stuff to email,
just so as not to offend was my only point elsewhere. Great
to have another intelligent voice in this forum!

Regards,
Richard
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.
  #379   Report Post  
Posted to rec.audio.pro
John Williamson John Williamson is offline
external usenet poster
 
Posts: 1,753
Default FLAC or other uncompressed formats, which is best?

Doug McDonald wrote:
On 5/27/2012 9:42 AM, John Williamson wrote:
Richard Webb wrote:


Uh huh, which I hope will suddenly occur to all these
producers and others who think that the audio we consume
*must* be squeezed to get maximum level, hence destroying
the music. When you listen to a symphony, or a good jazz
piece, the dynamics vary. Those variations are part of what
makes it interesting and causes you to think of the music
as something more than background noise. Variations in
texture, size, etc. cause people to actually think about
what they're "consuming" and actually derive enjoyment from
it.


I recorded a school orchestra playing a while ago, taking care to set
the peaks as close to zero dB as I could, but maintaining all the
dynamic range. Listening to the pseudo soundfield recording, it feels
as if you're there on headphones, and sounds excellent on speakers.

The first comment of the conductor on hearing it was "It's too quiet..."

I just sent them the master copies and left them to ask one of the
students to compress it to taste.
No pressure on me, it was a freebie anyway.
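
A rough Python sketch of why a recording with peaks near zero dB can still strike a listener as "too quiet": perceived loudness tracks the average (RMS) level far more than the peak. Everything here is made up for illustration (the two test signals, the helper names, the amplitudes); it is not a measurement of the recording described above.

```python
import math

def to_db(value):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(value) if value > 0 else float("-inf")

def peak_and_rms_db(samples):
    """Return (peak, RMS) of a block of samples, both in dB re full scale."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak), to_db(rms)

def tone(amplitude, count, cycles):
    """A sine burst of `count` samples spanning `cycles` whole cycles."""
    return [amplitude * math.sin(2 * math.pi * cycles * n / count)
            for n in range(count)]

# Hypothetical "orchestra": a long quiet passage, then one full-scale climax.
dynamic = tone(0.05, 9000, 90) + tone(1.0, 1000, 10)
# Hypothetical "squeezed" master: full scale all the way through.
squeezed = tone(1.0, 10000, 100)

for name, samples in (("dynamic", dynamic), ("squeezed", squeezed)):
    peak, rms = peak_and_rms_db(samples)
    print(f"{name:9s} peak {peak:6.1f} dB   rms {rms:6.1f} dB")
```

Both signals peak at essentially the same level, but the dynamic one averages roughly 10 dB lower, which is the gap a compressor or limiter is used to close at the cost of the dynamics.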


The question is ... what was the piece of music?

The programme included a Scaramouche with solo saxophone, a percussion
piece, and Saint-Saëns's Symphony No. 3 with organ, using the Grande Orgue
at Rouen cathedral. This last was "interesting" for me to balance and
for the orchestra to time, as there was a thirty or forty yard gap
between the organ and orchestra. I don't have the full list on this
computer, but I'll post it and links to .wav files of my finished
product later if anyone's interested. I don't know what the orchestra
did with it after I handed them their copy. Constructive criticism is
welcomed. All recorded using the internal microphones on a Zoom H2 in
surround mode.

--
Tciao for Now!

John.
  #380   Report Post  
Posted to rec.audio.pro
Don Y Don Y is offline
external usenet poster
 
Posts: 137
Default FLAC or other uncompressed formats, which is best?

Hi Richard,

[eliding much to try to get back within the charter of the newsgroup]

I have a trimmed down synthesizer that I fall back on if
the primary synthesizer is unavailable in one of the products I'm
designing, currently. It needed to be robust -- so that I could
count on it working regardless of what might have
broken in the system.

snip
That's the weakness of the captured phoneme model. Memory
intensive, and limited in utility.


There are different approaches with different resource and
capability tradeoffs.


Indeed. Your discussion of constructing sounds from
phonemes and various building blocks was probably very
instructive to some here. Those with a DAW can illustrate
this for themselves quite easily: zoom in closely on the
waveforms. Imagine constructing a piece of music from
various takes, being sure that the tempo is the same, or
close enough, etc. etc.


The advantage of diphone synthesis is that the point at
which you are pasting things together has a greater
chance (theoretically) of being the same "sound".
I.e., the silence-to-f and f-to-ah diphones should
be meeting in the middle of an "f" sound. The amount
of processing required to glue the front end of an "f"
(from the silence-to-f diphone) to the back end of an
"f" sound (from the f-to-ah diphone) is less than
the effort required to transition from an "f" to an
"ah" -- or an "f" to an "oo" or an "oh", etc.

Many of us know about the battle with getting that smooth
seamless punch in. This then is an exercise in the smooth
seamless punch attempt from hell. Analog guys try this one
with your razor blade g.


Ha!

But, there are a boatload more transitions than there are
phonemes! For example, if you have just 4 phonemes, you might have
12 or 16 transitions:
1-2 1-3 1-4 2-1 2-3 2-4 3-1 3-2 3-4 4-1 4-2 4-3
plus, perhaps, 1-1 2-2 3-3 and 4-4.


Yep, and the more varied you want the vocabulary to be, the
more the mathematical possibilities expand. This is why
you'll never get fully natural sounding speech no matter
what type of synthesis you use. You could apply a
supercomputer to the task and still not get it right
reliably. You could get closer, but ...


Imagine trying to piece together sound samples of a trombone
as it slides from one note to another -- and making it natural
sounding. If you start with samples of the actual notes,
you have to do a bit of crunching to "compute" the waveform
as it transitions to the next note (i.e., as the slide is
moved in or out).

On the other hand, if you have samples of ALL the various
transitions -- C to A, C to D, etc. -- you can more readily
paste two consecutive transitions together!

Also, the fact that you are working with samples of actual
speech when doing diphone synthesis means you can truly get
different "voices" from the synthesizer. They need not share
the same "genes" as the voices in DECtalk, for example.
E.g., you could make a synthesizer that sounds like a particular
speaker -- by design!

Another interesting read was _The Compass of Pleasure_
which deals with how we perceive pleasure! I.e., the
similarities between the rush you get from an epiphany,
sex, drugs, food, etc. Unfortunately, it relied too
heavily on neurochemistry and neurophysiology to explain
what was going on in the brain in each of these scenarios.
So, it was a harder read than it needed to be.


Indeed. I have heard of that one but never seen it. About
a year ago I did "This Is Your Brain on Music" though. I
still wonder, though, how much the strange environment of the
MRI and all that messes with the results.
After all, we don't have sex, or listen to music or other
such activities when crammed into a metal tube in a noisy
sterile environment.
If you could capture those brain images of a guy listening
to Duke Ellington or the Beatles while he's kicked back in
his favorite chair without the apparatus you might get far
different results.


Some of the experiments cited in these texts were disturbing in
the extent to which the subjects were tested.

E.g., the "pleasure" book cited an experiment where a
gay man was "pleasured" by a female prostitute while
sitting in an fMRI. The "goal" being to understand
his pleasure response and "cure" him of his homosexuality.

Similar tests imaged the brains of pedophiles as they
viewed pictures of children.

And, even the suggestion of using this sort of device as
an ultimate lie detector!

But, back to topic. Why do we go to so many live concerts
and come away disappointed with the sound? Because we, and
the folks putting on the show, have been conditioned to think
that's the way it always is. We got used to live sound
being the way it was back in the days of those awful
sounding Voice of the Theatre cabinets, etc. That was what
live amplified music was supposed to sound like in our
heads, and even though we've got better signal processing,
better speaker arrays, etc., we still go for that sound
because it's the way live music requiring amplification
sounded to us.


"Unprocessed" live music often disappoints because it seems
like it's "not enough (sound)". As if the aural equivalent
of "non-fat milk" -- it just doesn't seem like it is really
milk!

Now, I intentionally vary parts of the batch in different
ways. Make some cookies larger or smaller. Bake some
a bit less, others a bit more. *Burn* some, etc. And,
people nibbling on them tend to notice them, more -- less
likely to get into a mindless eating mode where you just
shovel them down without noticing what -- or how many -- you are
eating!


Uh huh, which I hope will suddenly occur to all these
producers and others who think that the audio we consume
*must* be squeezed to get maximum level, hence destroying
the music. When you listen to a symphony, or a good jazz
piece, the dynamics vary. Those variations are part of what
makes it interesting and causes you to think of the music
as something more than background noise. Variations in
texture, size, etc. cause people to actually think about
what they're "consuming" and actually derive enjoyment from
it.


I think most audio is consumed from canned reproductions.
People hear the same piece performed the same way each
time they listen to it. Anything that doesn't sound
exactly that same way seems (to many) to be "wrong"
or "broken" in some way.

Thanks for the interesting discussions. We probably should
take general blindness and screen access stuff to email,
just so as not to offend was my only point elsewhere. Great
to have another intelligent voice in this forum!


I've already started composing some "deeper" queries
for "private consumption". Be a while before I can get
them mailed, though -- problems with my mail server,
currently.



  #381   Report Post  
Posted to rec.audio.pro
Don Y Don Y is offline
external usenet poster
 
Posts: 137
Default FLAC or other uncompressed formats, which is best?

Hi Richard,

On 5/27/2012 6:59 PM, Richard Webb wrote:

The advantage of diphone synthesis is that the point at
which you are pasting things together has a greater
chance (theoretically) of being the same "sound".
I.e., the silence-to-f and f-to-ah diphones should
be meeting in the middle of an "f" sound. The amount
of processing required to glue the front end of an "f"
(from the silence-to-f diphone) to the back end of an
"f" sound (from the f-to-ah diphone) is less than
the effort required to transition from an "f" to an
"ah" -- or an "f" to an "oo" or an "oh", etc.


Yep, lots of combinations, lots of possibilities.


A good collection of historical synthesizers is here:
http://www.cs.indiana.edu/rhythmsp/ASA/Contents.html
including several of the DECtalk voices.

Unfortunately, it wasn't created from a common set of
text applied to the different synthesizers so you can't
readily compare one technology to another.

For example, you'd need to be pretty familiar with phoneme
based synthetic speech to note the differences with the
diphone sample presented there.

On the other hand, if you have samples of ALL the various
transitions -- C to A, C to D, etc. -- you can more readily
paste two consecutive transitions together!


Yep, but you've got to store all of them, and select them, and
that takes, there again, storage and raw processing power.
The U.S. Coast Guard still seems to be using the DECtalk or
very similar for their synthesis for high seas weather
broadcasts on HF single sideband, but last I heard WLO radio
out of Mobile, Alabama, they were using a different synthesis
technique, with a female voice iirc. As for me, when I read
them for a ham radio network I usually braille them first,
or did before my embosser crumped enough to need a trip to
the embosser doctor in Florida eventually. Now I just grab
the warnings by listening to the synth and taking shorthand
notes with a slate g.


Dots 1-3-5, 1-2-5
Dots 2-3-4, 1-2-5, 2-4, 2-3-4-5, 2-3-5

Also, the fact that you are working with samples of actual
speech when doing diphone synthesis means you can truly get
different "voices" from the synthesizer. They need not share the
same "genes" as the voices in DECtalk, for example.
E.g., you could make a synthesizer that sounds like a particular
speaker -- by design!


Indeed, if you've got the computing power for it you could
mimic just about any voice.


But it doesn't really take much to get 80% of the personality
of a voice. For example, if you have ~40 unique phonemes,
then, conceivably, you have ~1600 possible diphones. In
reality, often 15% less than that. And, with those ~1400
diphones, you can "say anything" (unlimited vocabulary).

In practice, however, you usually have to include several
different copies of vowel sounds to convey different sorts
of stress. So, you start with maybe ~55-60 phonemes which
would suggest ~3600 diphones (in practice, you only use
~2200 of those).
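
The arithmetic above can be sanity-checked with a few lines of Python. This is only a sketch: the function name is made up, and the "usable fraction" is simply the rough figure quoted in the text (about 15% unused in the first case, about 2200 of 3600 in the second), not a property of any particular language or database.

```python
def diphone_inventory(phoneme_count, usable_fraction=1.0):
    """Return (possible, usable) diphone counts for a phoneme set.

    possible = every ordered pair of phonemes, including same-to-same.
    usable   = possible scaled by an assumed fraction that actually occurs.
    """
    possible = phoneme_count ** 2
    return possible, round(possible * usable_fraction)

# ~40 phonemes, with roughly 15% of the pairs never occurring
print(diphone_inventory(40, 0.85))

# ~55-60 phonemes once stressed vowel variants are added
print(diphone_inventory(55))
print(diphone_inventory(60, 2200 / 3600))
```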

Still, it's a very large unit database! And, you still have
to splice the diphones together (not trivial).

[OTOH, much less computationally expensive than synthesizing
the actual waveform from a mathematical model!]
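
To give a feel for what "splicing" means at the sample level, here is a minimal Python sketch that joins two recorded units with a short linear crossfade. It is an illustration under stated assumptions, not how any particular synthesizer does it: the function name, the test tones and the overlap length are all hypothetical, and practical diphone systems often do something more elaborate (pitch-synchronous overlap-add, for instance) so that pitch and phase line up at the join.

```python
import math

def crossfade_splice(left, right, overlap):
    """Join two sample lists, blending the last `overlap` samples of `left`
    with the first `overlap` samples of `right` using a linear crossfade."""
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter segment length")
    head = left[:-overlap]          # part of `left` before the blend region
    tail = right[overlap:]          # part of `right` after the blend region
    start = len(left) - overlap
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight ramps from near 0 to near 1
        blended.append((1.0 - w) * left[start + i] + w * right[i])
    return head + blended + tail

# Two hypothetical units holding the same steady 200 Hz sound at the join,
# recorded at slightly different levels. At 8000 samples/s one cycle is 40
# samples, so the end of `left` and the start of `right` are in phase.
rate = 8000
left = [0.8 * math.sin(2 * math.pi * 200 * n / rate) for n in range(400)]
right = [0.6 * math.sin(2 * math.pi * 200 * n / rate) for n in range(400)]

joined = crossfade_splice(left, right, overlap=80)
print(len(joined))
```

Because both units are in the middle of the same sound at the splice point, a simple blend like this has a fighting chance; joining an "f" directly to an "ah" would need far more than a crossfade.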

I think most audio is consumed from canned reproductions.
People hear the same piece performed the same way each
time they listen to it. Anything that doesn't sound
exactly that same way seems (to many) to be "wrong"
or "broken" in some way.


True, and most of those reproductions have been dynamically
processed before being delivered to them, by the broadcast
air chain if not elsewhere in the production chain. They
never feel that great crescendo from very soft to in your
face, in your ears, can't get away. It's an experience that
is lost on most without formal music training who are under
about age 50. I don't think my mother has listened to music
that wasn't brought to her ears via processing transducers
and amplifiers since I was playing trumpet in high school.
I'd be willing to wager that I can't find a dozen people
within a mile of me who have gone to a totally unamplified
musical performance in the last two years.


I think, for some artists who rely heavily on "processing"
their work (e.g., "one man bands") that the option for a
live performance isn't remotely possible!

Thanks for the interesting discussions. We probably should
take general blindness and screen access stuff to email,
just so as not to offend was my only point elsewhere. Great
to have another intelligent voice in this forum!


I've already started composing some "deeper" queries
for "private consumption". Be a while before I can get
them mailed, though -- problems with my mail server,
currently.


Can relate. That's why I use more than one these days. Fun
with email g. I'm one of the loudest bitchers about folks
straying from the charter, especially on religion and
politics, so figure what's good for the goose ...


Agreed. I'd have taken this offlist sooner but for the
request/interest that was expressed in "eavesdropping".
  #382   Report Post  
Posted to rec.audio.pro
Richard Webb[_3_] Richard Webb[_3_] is offline
external usenet poster
 
Posts: 533
Default FLAC or other uncompressed formats, which is best?

On Sun 2012-May-27 16:21, Don Y writes:
re speech synthesis ...
There are different approaches with different resource and
capability tradeoffs.


Indeed. Your discussion of constructing sounds from
phonemes and various building blocks was probably very
instructive to some here. Those with a DAW can illustrate
this for themselves quite easily: zoom in closely on the
waveforms. Imagine constructing a piece of music from
various takes, being sure that the tempo is the same, or
close enough, etc. etc.


The advantage of diphone synthesis is that the point at
which you are pasting things together has a greater
chance (theoretically) of being the same "sound".
I.e., the silence-to-f and f-to-ah diphones should
be meeting in the middle of an "f" sound. The amount
of processing required to glue the front end of an "f"
(from the silence-to-f diphone) to the back end of an
"f" sound (from the f-to-ah diphone) is less than
the effort required to transition from an "f" to an
"ah" -- or an "f" to an "oo" or an "oh", etc.


Yep, lots of combinations, lots of possibilities.

Many of us know about the battle with getting that smooth
seamless punch in. This then is an exercise in the smooth
seamless punch attempt from hell. Analog guys try this one
with your razor blade g.


Ha!


I got pretty good with an editing block back in the day but
doubt I'd ever be that good g.

But, there are a boatload more transitions than there are
phonemes! For example, if you have just 4 phonemes, you might have
12 or 16 transitions:
1-2 1-3 1-4 2-1 2-3 2-4 3-1 3-2 3-4 4-1 4-2 4-3
plus, perhaps, 1-1 2-2 3-3 and 4-4.

Yep, and the more varied you want the vocabulary to be, the
more the mathematical possibilities expand. This is why
you'll never get fully natural sounding speech no matter
what type of synthesis you use. You could apply a
supercomputer to the task and still not get it right
reliably. You could get closer, but ...


Imagine trying to piece together sound samples of a trombone as it
slides from one note to another -- and making it natural sounding.
If you start with samples of the actual notes,
you have to do a bit of crunching to "compute" the waveform
as it transitions to the next note (i.e., as the slide is
moved in or out).


Yep, exactly, you either need lots of memory to store those
samples, lots of horsepower to process them, or both. Look
out for Moore's law though, it's probably on the horizon g.

On the other hand, if you have samples of ALL the various
transitions -- C to A, C to D, etc. -- you can more readily
paste two consecutive transitions together!


Yep, but you've got to store all of them, and select them, and
that takes, there again, storage and raw processing power.
The U.S. Coast Guard still seems to be using the DECtalk or
very similar for their synthesis for high seas weather
broadcasts on HF single sideband, but last I heard WLO radio
out of Mobile, Alabama, they were using a different synthesis
technique, with a female voice iirc. As for me, when I read
them for a ham radio network I usually braille them first,
or did before my embosser crumped enough to need a trip to
the embosser doctor in Florida eventually. Now I just grab
the warnings by listening to the synth and taking shorthand
notes with a slate g.

Also, the fact that you are working with samples of actual
speech when doing diphone synthesis means you can truly get
different "voices" from the synthesizer. They need not share the
same "genes" as the voices in DECtalk, for example.
E.g., you could make a synthesizer that sounds like a particular
speaker -- by design!


Indeed, if you've got the computing power for it you could
mimic just about any voice.

Another interesting read was _The Compass of Pleasure_
which deals with how we perceive pleasure! I.e., the
similarities between the rush you get from an epiphany,
sex, drugs, food, etc. Unfortunately, it relied too
heavily on neurochemistry and neurophysiology to explain
what was going on in the brain in each of these scenarios.
So, it was a harder read than it needed to be.


Indeed. I have heard of that one but never seen it. About
a year ago I did "This Is Your Brain on Music" though. I
still wonder, though, how much the strange environment of the
MRI and all that messes with the results.
After all, we don't have sex, or listen to music or other
such activities when crammed into a metal tube in a noisy
sterile environment.
If you could capture those brain images of a guy listening
to Duke Ellington or the Beatles while he's kicked back in
his favorite chair without the apparatus you might get far
different results.


Some of the experiments cited in these texts were disturbing in the
extent to which the subjects were tested.


Yes, indeed, and so have some of the ones I've read about. I've
read of the pedophile test elsewhere, and of some of the other
testing of sexual arousal etc., I believe in Playboy. Still,
the environment of the fMRI or whatever they're using has to
skew things a bit imho.

snip
But, back to topic. Why do we go to so many live concerts
and come away disappointed with the sound? Because we, and
the folks putting on the show, have been conditioned to think
that's the way it always is. We got used to live sound
being the way it was back in the days of those awful
sounding Voice of the Theatre cabinets, etc. That was what
live amplified music was supposed to sound like in our
heads, and even though we've got better signal processing,
better speaker arrays, etc., we still go for that sound
because it's the way live music requiring amplification
sounded to us.


"Unprocessed" live music often disappoints because it seems
like it's "not enough (sound)". As if the aural equivalent
of "non-fat milk" -- it just doesn't seem like it is really
milk!


Indeed, and I'm glad I had the opportunity to "cut my teeth"
as it were on unprocessed live music. See the related thread of
a couple of weeks back in this newsgroup discussing a bad
sounding mastering job. Iirc I asserted in that one that
too many young folks haven't had the opportunity to
experience live music that wasn't brought to their ears via
at least one transducer.

Now, I intentionally vary parts of the batch in different
ways. Make some cookies larger or smaller. Bake some
a bit less, others a bit more. *Burn* some, etc. And,
people nibbling on them tend to notice them, more -- less
likely to get into a mindless eating mode where you just
shovel them down without noticing what -- or how many -- you are
eating!


Uh huh, which I hope will suddenly occur to all these
producers and others who think that the audio we consume
*must* be squeezed to get maximum level, hence destroying
the music. When you listen to a symphony, or a good jazz
piece, the dynamics vary. Those variations are part of what
makes it interesting and causes you to think of the music
as something more than background noise. Variations in
texture, size, etc. cause people to actually think about
what they're "consuming" and actually derive enjoyment from
it.


I think most audio is consumed from canned reproductions.
People hear the same piece performed the same way each
time they listen to it. Anything that doesn't sound
exactly that same way seems (to many) to be "wrong"
or "broken" in some way.


True, and most of those reproductions have been dynamically
processed before being delivered to them, by the broadcast
air chain if not elsewhere in the production chain. They
never feel that great crescendo from very soft to in your
face, in your ears, can't get away. It's an experience that is lost on most without formal music training who are under
about age 50. I don't think my mother has listened to music that wasn't brought to her ears via processing transducers
and amplifiers since I was playing trumpet in high school.
I'd be willing to wager that I can't find a dozen people
within a mile of me who have gone to a totally unamplified
musical performance in the last two years.

Thanks for the interesting discussions. We probably should
take general blindness and screen access stuff to email,
just so as not to offend; that was my only point elsewhere.
Great to have another intelligent voice in this forum!


I've already started composing some "deeper" queries
for "private consumption". Be a while before I can get
them mailed, though -- problems with my mail server,
currently.


Can relate. That's why I use more than one these days. Fun
with email <g>. I'm one of the loudest bitchers about folks
straying from the charter, especially on religion and
politics, so figure what's good for the goose ...




Regards,
Richard
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.
  #383   Report Post  
Posted to rec.audio.pro
John Williamson John Williamson is offline
external usenet poster
 
Posts: 1,753
Default FLAC or other uncompressed formats, which is best?

John Williamson wrote:
Doug McDonald wrote:
On 5/27/2012 9:42 AM, John Williamson wrote:
Richard Webb wrote:


Uh huh, which I hope will suddenly occur to all these
producers and others who think that the audio we consume
*must* be squeezed to get maximum level, hence destroying
the music. When you listen to a symphony, or a good jazz
piece, the dynamics vary. Those variations are part of what
makes it interesting and causes you to think of the music
as something more than background noise. Variations in
texture, size, etc. cause people to actually think about
what they're "consuming" and actually derive enjoyment from
it.

I recorded a school orchestra playing a while ago, taking care to set
the peaks as close to zero dB as I could, but maintaining all the
dynamic range. Listening to the pseudo soundfield recording, it
feels as if you're there on headphones, and sounds excellent on
speakers.

The first comment of the conductor on hearing it was "It's too quiet..."

I just sent them the master copies and left them to ask one of the
students to compress it to taste. No pressure on me, it was a freebie
anyway.


The question is ... what was the piece of music?

The programme included a scaramouche with solo saxophone, a percussion
piece, and Saint-Saëns's Symphony No. 3 with organ using the Grande Orgue
at Rouen cathedral. This last was "interesting" for me to balance and
for the orchestra to time, as there was a thirty or forty yard gap
between the organ and orchestra. I don't have the full list on this
computer, but I'll post it and links to .wav files of my finished
product later if anyone's interested. I don't know what the orchestra
did with it after I handed them their copy. Constructive criticism is
welcomed. All recorded using the internal microphones on a Zoom H2 in
surround mode.

Links now here:

www.oysterbroadcast.co.uk/click_2.html

There are both .wav files and mp3 files linked to.

--
Tciao for Now!

John.
  #384   Report Post  
Posted to rec.audio.pro
Richard Webb[_3_] Richard Webb[_3_] is offline
external usenet poster
 
Posts: 533
Default FLAC or other uncompressed formats, which is best?

On Sun 2012-May-27 21:56, Don Y writes:
snip
I.e., the silence-to-f and f-to-ah diphones should
be meeting in the middle of an "f" sound. The amount
of processing required to glue the front end of an "f"
(from the silence-to-f diphone) to the back end of an
"f" sound (from the f-to-ah diphone) is less than
the effort required to transition from an "f" to an
"ah" -- or an "f" to an "oo" or an "oh", etc.


Yep, lots of combinations, lots of possibilities.
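
To make the diphone idea quoted above concrete, here is a minimal Python sketch. It shows only the bookkeeping: turning a phoneme sequence into the list of diphone units a synthesizer would look up and splice. The phoneme spellings, the "_" silence marker and the name `to_diphones` are illustrative assumptions, not taken from DECtalk or any other real synthesizer.

```python
def to_diphones(phonemes, silence="_"):
    """Return the diphone units needed to speak a phoneme sequence.

    Each diphone runs from the middle of one sound to the middle of the
    next, so the utterance is padded with silence at both ends.
    """
    padded = [silence] + list(phonemes) + [silence]
    # Pair every sound with its successor: (_, f), (f, ah), (ah, r), ...
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


if __name__ == "__main__":
    # Hypothetical phoneme spelling of a word that starts "f" + "ah".
    print(to_diphones(["f", "ah", "r"]))
```

Neighbouring units always share the sound at their joint (the "f" in `_-f` and `f-ah`), which is why the splice falls in the middle of a single sound rather than at a transition.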


A good collection of historical synthesizers, here:
http://www.cs.indiana.edu/rhythmsp/ASA/Contents.html
including several of the DECtalk voices.


Unfortunately, it wasn't created from a common set of
text applied to the different synthesizers so you can't
readily compare one technology to another.


That would have been cool had they done their comparison
with the same text. I think I helped the assistive tech
folks back in Iowa do a tape recording of common speech
synths of the day, and we used the same passage for each,
iirc. The passage in Gone with the Wind where whatshishead
tells Scarlett that, frankly, he doesn't give a damn. I think we
had an Accent SA, a DECtalk, an Artic and a DoubleTalk in
that one. I don't know if it survives or not; it was recorded
to cassette iirc.


snip
On the other hand, if you have samples of ALL the various
transitions -- C to A, C to D, etc. -- you can more readily
paste two consecutive transitions together!


Yep, but you've got to store all of them, and select them, and
that takes, there again, storage and raw processing power.
The U.S. Coast Guard still seems to be using the DECtalk or
something very similar for their synthesis for high seas weather
broadcasts on HF single sideband, but last I heard WLO radio
out of Mobile, Alabama, they were using a different synthesis
technique, with a female voice iirc. As for me, when I read
them for a ham radio network I usually braille them first,
or did before my embosser crumped enough to need a trip to
the embosser doctor in Florida eventually. Now I just grab
the warnings by listening to the synth and taking shorthand
notes with a slate <g>.


Dots 1-3-5, 1-2-5
Dots 2-3-4, 1-2-5, 2-4, 2-3-4-5, 2-3-5


Yeah, that's what I thought when my embosser quit reproducing
dot 1. For the uninitiated, he said "oh ****."
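
For anyone who wants to check dot numbers like the ones quoted above, here is a small Python sketch, assuming plain uncontracted English braille letters only. The function names are made up for illustration; punctuation cells (such as dots 2-3-5) are not in the table and come back as "?".

```python
# Uncontracted English braille letters, keyed by their raised dots.
LETTERS = {
    "1": "a", "12": "b", "14": "c", "145": "d", "15": "e",
    "124": "f", "1245": "g", "125": "h", "24": "i", "245": "j",
    "13": "k", "123": "l", "134": "m", "1345": "n", "135": "o",
    "1234": "p", "12345": "q", "1235": "r", "234": "s", "2345": "t",
    "136": "u", "1236": "v", "2456": "w", "1346": "x", "13456": "y",
    "1356": "z",
}


def decode_cell(dots):
    """Decode one cell written like '1-3-5'; '?' if not a plain letter."""
    key = "".join(sorted(dots.replace("-", "").strip()))
    return LETTERS.get(key, "?")


def decode_line(line):
    """Decode a comma-separated run of cells such as '1-3-5, 1-2-5'."""
    return "".join(decode_cell(cell) for cell in line.split(","))


def to_unicode(dots):
    """Render one cell as a Unicode braille pattern character."""
    # Dot n sets bit n-1 above the U+2800 base code point.
    mask = sum(1 << (int(d) - 1) for d in dots.replace("-", "").strip())
    return chr(0x2800 + mask)


if __name__ == "__main__":
    print(decode_line("1-3-5, 1-2-5"))
```

Fed the first quoted line, it spells "oh"; the second line is left to the reader.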

Also, the fact that you are working with samples of actual
speech when doing diphone synthesis means you can truly get
different "voices" from the synthesizer. They need not share the
same "genes" as the voices in DECtalk, for example.
E.g., you could make a synthesizer that sounds like a particular
speaker -- by design!


Indeed, if you've got the computing power for it you could
mimic just about any voice.


But it doesn't really take much to get 80% of the personality of a
voice. For example, if you have ~40 unique phonemes,
then, conceivably, you have ~1600 possible diphones. In
reality, often 15% less than that. And, with those ~1400
diphones, you can "say anything" (unlimited vocabulary).


Again true! Hadn't really thought of that, but makes sense.

In practice, however, you usually have to include several
different copies of vowel sounds to convey different sorts
of stress. So, you start with maybe ~55-60 phonemes which
would suggest ~3600 diphones (in practice, you only use
~2200 of those).
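
The counts quoted above are easy to check with a short Python sketch. It assumes only that a diphone is an ordered pair of phonemes, so N phonemes give at most N * N candidates; the function name is made up for illustration, and the "in practice" figures are simply the ones quoted in this thread.

```python
def diphone_upper_bound(n_phonemes):
    """Every ordered pair of phonemes is a candidate diphone."""
    return n_phonemes * n_phonemes


if __name__ == "__main__":
    for n in (40, 55, 60):
        print(n, "phonemes ->", diphone_upper_bound(n), "possible diphones")

    # Figures quoted in the thread for the inventories actually used.
    print("15% fewer than 1600:", round(1600 * 0.85))
    print("fraction of 3600 used:", round(2200 / 3600, 2))
```

So 40 phonemes give 1600 pairs and 60 give 3600, matching the round numbers quoted; 55 would give 3025. Taking 15% off 1600 leaves 1360, which is the "~1400" figure to the nearest hundred.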


And there's the rub! Expression is a big part of speech,
and that usually happens with the diphones that are our
vowel sounds. Screenreaders try a bit to change the
inflection if the sentence terminates with a question mark,
for example.

Still, it's a very large unit database! And, you still have to
splice the diphones together (not trivial).


Yep, and that's where the storage- and memory-intensive part comes in.

[OTOH, much less computationally expensive than synthesizing the
actual waveform from a mathematical model!]


Right. But, as any of us know who've tried to synthesize
musical instruments, getting all those combinations is the
fun part.

I think most audio is consumed from canned reproductions.
People hear the same piece performed the same way each
time they listen to it. Anything that doesn't sound
exactly that same way seems (to many) to be "wrong"
or "broken" in some way.


True, and most of those reproductions have been dynamically
processed before being delivered to them, by the broadcast
air chain if not elsewhere in the production chain. They

snip
I'd be willing to wager that I can't find a dozen people
within a mile of me who have gone to a totally unamplified
musical performance in the last two years.


I think, for some artists who rely heavily on "processing"
their work (e.g., "one man bands") that the option for a
live performance isn't remotely possible!


True enough. Back a few years ago I used to do music on
hold and some music beds with MIDI sound modules. People
commented frequently about how my instruments felt more
real than a lot of synthesized music they heard. This was
because I play a variety of instruments, and I never tried
to arrange the music where instruments were doing things
they don't naturally do.

snip
I'm one of the loudest bitchers about folks
straying from the charter, especially on religion and
politics, so figure what's good for the goose ...


Agreed. I'd have taken this offlist sooner but for the
request/interest that was expressed in "eavesdropping".


True, including my colleague Frank. Some of the minutiae of
access technology, though, probably crowded the edge <grin>.



Regards,
Richard
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.
  #385   Report Post  
Posted to rec.audio.pro
Anahata Anahata is offline
external usenet poster
 
Posts: 378
Default FLAC or other uncompressed formats, which is best?

On Thu, 24 May 2012 18:00:29 -0700, Don Y wrote:

"Those who can, do.
Those who CAN'T, troll!"


Thank you. That's made my day!

--
Anahata
--/-- http://www.treewind.co.uk
+44 (0)1638 720444


Powered by: vBulletin
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 AudioBanter.com.
The comments are property of their posters.
 
