#361
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On Thu 2012-May-24 21:00, Don Y writes:
Right, and, while you were affiliated with Kurzweil you picked up on one thing Ray was very good at: he actually reached out to those he was designing for; the end users were an integral part of product development. That's why, when Ray first decided to work on OCR for the blind market, he sought out guidance from the National Federation of the Blind.

> Note National Federation of the Blind vs. American Federation FOR the
> Blind! Get the wrong preposition in there and you'll never hear the end
> of it!

Yep, I've been one of those to quibble over that one. Big difference.

> I used to spend a good deal of time with a guy ("Michael") who I recall
> being tied to NFB somehow (I would stumble across him in various places
> around the country instead of in one single locality).

<snip>

Yeah, met him a few times. Michael Hingson iirc.

> He just paused a bit and said, "That's a good question. You know, Don,
> if you can tell me what it is like to SEE, I'll tell you what it's like
> NOT to see!" I've still not figured out how I could redo that
> conversation and have either of us get any MORE information out of it.

ROTFL. Can relate.

> I never envied Ray his job. He had to move away from engineering (why
> go to an engineering school if you want to be a businessman?) to keep a
> business running. At that time, we were very small. I'm willing to bet
> about 25 people, total. It was his job to make sure the money kept
> coming in to keep us paid.

Yeah, I bet. Hard to take off the hat you've prepared your life for and put on the other. Interesting story about the crunch push when the money men wanted to see a working product.

NFB helped bring OCR mainstream. Other good access technology developers do much the same. Were I working for somebody else and had to interact with their systems, with their choice of operating systems, I could work with JAWS ... I find it cumbersome to use, but I could get work done with it. But, that isn't the developer's fault so much as the environment he had to work with, i.e.
giving blind folks access to mainstream office applications.

> I believe he now has a cell-phone sized device that provides similar
> functionality? I have a Personal Reader here but it is much newer than
> the minicomputer-based generation that I worked on. It also supports a
> "hand scanner" (in addition to the flatbed scanner) which was an option
> that wasn't available on the machine when I was familiar with it.

Yep, and it's marketed in partnership with NFB. Mike is involved in that one. If I ever get some business things caught up to where I'd like them to be, one's in my future. Amazing device. The ability to just sit down in a restaurant and read the menu when they don't have one in braille ... a liberating experience <g>.

Thing is, you get so many gadgets you almost need to carry a Pelican case with foam inserts to keep all your gadgets safe <g>.

<snip>

Yes, and many of us see, in just that market, products designed by people like him who don't think that we, the end users, know WTF, and who figure they have all the answers. The arrogance of ignorance, I always call it. Good intentions, but we all know about good intentions.

> Surprisingly common. Regrettably. Regardless of the market and user
> base targeted. These folks should be made to USE some of the products
> they've developed!

I've said this about a lot of things over the years. Not just use it, but use it in the manner that the end user is likely to use it. Sometimes if you do that you'll end up going back to the drawing board.

I've done a couple of simple station logging apps for ham radio people doing comms for public service events such as bike-a-thons, etc., just because I was net manager for the operation. I also have an older DOS laptop I bring along, because I can run it off 13.8 VDC. So, once I've rolled the thing, I set it in front of my lady and say, here, play with it. If she doesn't understand the menus or how to get what she needs then I go back to the drawing board.
Sometimes what's intuitive to me isn't to her, and when it's not I know that I need to rethink the idea. If you've read some of the writings of Malcolm Chisholm, he asserts rather strongly that many mixing console manufacturers and developers never sat behind one and tried to do a session <grin>. It wasn't that they went into the project intending to design a mixing console that was ergonomically unfriendly, it was just that they hadn't really grabbed any working audio engineer types and said "here, use this, tell us what you think."

> Exactly. Hence the reason I pick the brains of everyone I can. There's
> only so much you can "imagine" -- regardless of how good your
> imagination might be!

Right, which is another reason I'm leery of a lot of the digital mixing console offerings for my remote truck right now. Were I working with the same act doing the same show, or pretty close to it, I could save my preferred working setup on whatever storage media it uses, and if it crumps during the gig, all I've got to remember is the keystrokes to get it to load it back up for me. But, a remote truck might be working a variety of things, and every time it goes out is different.

Then there's the old "what do I do if I've got two of us working the console": one of us is flying in an effects cue with an aux send, and the other one working the faders for the percussion section. How do we decide who gets what menu up? If they solve the ergonomics to my liking, though, I'd sure rather run cat5 from venue to truck, or even better, fiber. Yes, part of that stumbling block is blindness related (see other post) but it's a combination of factors, the blindness as well as the fluid working environment.

> I recall buying a microwave oven for my mother in law many years ago.
> New fangled gizmo! Being an engineer, I liked the pushbutton keypad --
> nothing mechanical that would be likely to break, wipe clean finish,
> etc.

I went through that wrestling match a year ago.
I like my knob: point it at what I want, no guessing. Those are hard to find these days.

> But, my wife insisted that we buy the model with the rotary knob! Ick!
> But, she knew her mother and what her mother would more readily relate
> to -- so I deferred to her judgement. Of course, she was right.

Yep, I looked all over the Memphis area, finally got the last one they had at Kmart last summer. I've an oven on our kitchen range right now I can't use because of the damned flat panel controls. I don't like cooking on electric anyway; I've always preferred gas, believe it or not. When I cut the fire off under that skillet or pan I want that fire off now!!! Also, I can feel whether I've got high or low flame. When I adjust the electric, if I misjudge, it's going to be awhile before my adjustment manifests itself, and by then it might be too far gone to salvage.

And now, to go put some pieces of beef on this charcoal, or at least get the charcoal happening <g>.

Regards,
Richard

... Love is being owned by a rottweiler!
--
| Remove .my.foot for email |
| via Waldo's Place USA Fidonet-Internet Gateway Site |
| Standard disclaimer: The views of this user are strictly his own. |
#362
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Hi Richard,
On 5/25/2012 4:42 PM, Richard Webb wrote:

>> Note National Federation of the Blind vs. American Federation FOR the
>> Blind! Get the wrong preposition in there and you'll never hear the
>> end of it!

> Yep, I've been one of those to quibble over that one. Big difference.

And a very emotional one! The whole role of each organization can be summed up in those two prepositions!

>> I used to spend a good deal of time with a guy ("Michael") who I
>> recall being tied to NFB somehow (I would stumble across him in
>> various places around the country instead of in one single locality).

<snip>

> Yeah, met him a few times. Michael Hingson iirc.

Ha! Excellent! I don't know if I ever knew his last name. But, a quick google for images turned up lots of photos of him that I could believe to be "what he looks like, 35 years later!"

>> He just paused a bit and said, "That's a good question. You know,
>> Don, if you can tell me what it is like to SEE, I'll tell you what
>> it's like NOT to see!" I've still not figured out how I could redo
>> that conversation and have either of us get any MORE information out
>> of it.

> ROTFL. Can relate.

But I couldn't! It had never occurred to me just how silly the question was -- until afterwards. It's sort of like the "tastes like milk" commercial (What's milk taste like? <shrug> What does this taste like? Milk!)

>> I never envied Ray his job. He had to move away from engineering (why
>> go to an engineering school if you want to be a businessman?) to keep
>> a business running. At that time, we were very small. I'm willing to
>> bet about 25 people, total. It was his job to make sure the money
>> kept coming in to keep us paid.

> Yeah, I bet. Hard to take off the hat you've prepared your life for
> and put on the other. Interesting story about the crunch push when the
> money men wanted to see a working product.

There are a couple of amusing stories that went along with this -- but probably not appropriate for discussion in a public forum!
>> I believe he now has a cell-phone sized device that provides similar
>> functionality?

> Yep, and it's marketed in partnership with NFB. Mike is involved in
> that one. If I ever get some business things caught up to where I'd
> like them to be, one's in my future. Amazing device. The ability to
> just sit down in a restaurant and read the menu when they don't have
> one in braille ... a liberating experience <g>.

That was the feeling I would get whenever setting up a new machine at a new site. You'd get things running properly. Some representative from the client agency would sit down to use the machine -- invariably visually impaired -- and you'd watch them cringe as they tried to make sense of that god-awful voice! Literally *squinting* as if that would somehow improve their hearing skills! But, you could tell the instant they understood the dialect. Their eyes would literally go wide -- "Wow! I can finally read my own personal mail without having to rely on my secretary -- and having her aware of things that are none of her business!" Liberating is a good term.

> Thing is, you get so many gadgets you almost need to carry a pelican
> case with foam inserts to keep all your gadgets safe <g>.

Exactly. Each does *one* thing. And, often, not well!

>> Surprisingly common. Regrettably. Regardless of the market and user
>> base targeted. These folks should be made to USE some of the products
>> they've developed!

> I've said this about a lot of things over the years. Not just use it,
> but use it in the manner that the end user is likely to use it.
> Sometimes if you do that you'll end up going back to the drawing
> board.

There is also the hazard of making a device that defines how it must be used. Even if that is the way that 99% of the user base is likely to use it, it forces 100% of users to follow that prescription -- even if it isn't a necessary condition for the device's operation! "Why do I have to specify this parameter before that parameter?
They are independent, yet you are forcing me to pick a certain one before the other. How did you decide that this is the only way it should be done?"

I'm reading _The Art of Choosing_, currently. It addresses how people deal with choices -- among other things. Things like how the number of choices can affect our satisfaction with our eventual choice. Too many can be worse than too few, for example. On the other hand, how choices are presented to you can greatly affect how well you can make a set of choices and how happy you can be with the result.

In one example, they demonstrated (through experiment) how the order in which choices are forced upon a user can lead to increased or decreased satisfaction. Think about web interfaces where you are forced to make certain choices before you are presented with the next set of choices -- even if the first set has no bearing on the second set! In this case, they allowed real customers to specify the options they wanted in the automobile they would be ordering. For one set of customers, they presented the choices in order of "most choices" to "least choices". E.g., there were more choices for body paint color than engine size, so body color was "selected" first and, eventually, engine size. For another set of customers, the order in which the choices were presented was the exact opposite -- pick the engine, transmission, choice of sound system, etc. and, finally, the COLOR of the vehicle.

The result of that experiment -- which might not be generalizable to choice, in general -- was that people found it easier to proceed from those options with FEW choices to those with MORE choices than the other way around. I.e., once the user had specified the engine, accessories, body style, etc., they had a better image in their mind for how to specify the remaining options -- like body color.

Wanna bet that most vendors just throw choices at the user in whatever order is convenient for the vendor??!
I.e., if we know what file format he wants, then we can refine the sample rates and data formats to those that are supported *in* that file format. "Piece of cake!" Why can't you let the user decide what is important to him and *then* refine your offerings?! The technical problem in implementing this is exactly the same! But, the attitude conveyed to the user is entirely different! *He* drives the device instead of the device driving *him*!

>> Exactly. Hence the reason I pick the brains of everyone I can.
>> There's only so much you can "imagine" -- regardless of how good your
>> imagination might be!

> Right, which is another reason I'm leery of a lot of the digital
> mixing console offerings for my remote truck right now. Were I working
> with the same act doing the same show, or pretty close to it, I could
> save my preferred working setup on whatever storage media it uses, and
> if it crumps during the gig, all I've got to remember is the
> keystrokes to get it to load it back up for me. But, a remote truck
> might be working a variety of things, and every time it goes out is
> different.

OK, I think I follow your reasoning -- though I have no firsthand experience in that application domain (so I can't comment on how I would react when faced with the same issues).

> Then there's the old "what do I do if I've got two of us working the
> console": one of us is flying in an effects cue with an aux send, and
> the other one working the faders for the percussion section. How do we
> decide who gets what menu up? If they solve the ergonomics to my
> liking, though, I'd sure rather run cat5 from venue to truck, or even
> better, fiber. Yes, part of that stumbling block is blindness related
> (see other post) but it's a combination of factors, the blindness as
> well as the fluid working environment.

Can't multiple menus be displayed concurrently? For example, there are many desktop GUIs that will let you "pin" (think: thumbtack) a menu or a dialog to the desktop so that it is "persistent".
When you want to remove the object, you remove the "pin" and the object goes away. So, you could open each thing that you wanted to access, move them to appropriate parts of the desktop, then pin them in place so they stay accessible/active.

You might also look into what are called "pie menus". With these, you open the menu and find yourself in the center of a circular "pie". From there, you pick a direction to select a specific item from the menu. Think of the menu as slices of a pie and you are just deciding which slice you want -- always from the known reference point in the center of the pie! Of course, menus have to be designed to keep the number of choices small. Much easier to pick from 6 or 8 "slices" than 16 or 18!
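The slice-picking geometry behind a pie menu is simple enough to sketch. This is a generic illustration, not taken from any particular toolkit; the function name and the "slice 0 is north, indices go clockwise" convention are my own assumptions:

```python
import math

def pie_slice(dx: float, dy: float, n_slices: int) -> int:
    """Map a pointer offset (dx, dy) from the pie's center to a slice
    index 0..n_slices-1.  dy is positive 'up'.  Slice 0 is centered on
    north and indices increase clockwise -- one common convention."""
    # atan2(dx, dy) measures the angle clockwise from north, in radians.
    angle = math.degrees(math.atan2(dx, dy)) % 360.0
    width = 360.0 / n_slices
    # Shift by half a slice so slice 0 straddles the north direction.
    return int(((angle + width / 2.0) % 360.0) // width)
```

With eight slices, a gesture straight up selects slice 0, straight right slice 2, straight down slice 4 -- the user always works from the same known center point, which is what makes the technique attractive for eyes-free or low-vision use.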
#363
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Hi Richard,
> The asterisks spoke, because I've configured the screen reader to
> speak them, as people use them to set text apart, or the dreaded
> footnotes of course.

OK, so my use of them for emphasis detracts from your comprehension instead of adding to it!

> But, the all caps didn't, at least when just reading. Now, were I
> editing, capital letters are spoken with a bit of a raised inflection,
> but, when read as words, they're not. When just in the reader, not
> editing -- for example, reading usenet articles, a book that's text,
> or similar -- I have most punctuation disabled so sentences sound
> normal.

Understood. But, would you have any cues that there might be some punctuation that you might want to see which has been silenced? For example, the cartoonish way of showing a pejorative as a jumbled sequence of ad hoc punctuation marks like $(&*^@#$!

On a different note, I assume you are also victimized by spelling errors? For example, I tend to end up transposing pairs of letters simply because one finger finds its way to a key before the other -- which should have preceded it. Like teh instead of the.

> Cutting to the chase on some of this, colors aren't spoken at all. I
> can have the screen reader -- and most screen readers -- monitor a
> portion of the screen, say a status line, for either a change in text
> displayed there, or a change in attributes.

Does it simply speak each change encountered? What if the time to speak the information exceeds the time between changes? For example, a timer counting down seconds remaining until a task's completion. But, "one minute forty nine seconds" takes longer to speak than the time for the display to change to "one minute forty eight seconds". Likewise, is the screen reader preoccupied with this task or can it also let you wander around to other parts of the screen while it is monitoring that section?

> That's another reason I like ASAP.
> If, for example, I want a bit of a different configuration which is
> more compatible with a drop down menu in an app, we watch for the
> status line to change. If it changes to x, load y configuration, etc.

Ah, OK.

> You're right, it takes a bit of extra work to make screen access
> technology play with what a person might just download or use off the
> shelf. That's why I like things with textual configuration scripting
> or guidance, and the ability to make the program operate as I want it
> to operate <g>.

Understood.

Oh, OK. Some software will note the "email context" and try to actually keep track of this for you -- using different voices for each party, etc. Though not their ACTUAL voices!

> No different voices, I just have the greater-than symbol in my
> punctuation exceptions for anything that's a mail or usenet reader
> app, so it says "greater" then the line of text.

OK. Obviously even the different-voices approach has a small upper limit. Keeping track of three different parties in quoted text would probably leave you distracted by those voices instead of aided. So, folks who just quote entire posts and bottom post their replies are just as bad as folks who top post. Each is equally hard for you to put back into context (you have to REMEMBER what was said and remember the reply and then thread them together in your mind).

> Yep, you read through all that, or turn off the quote filtering and
> see three or four screens of quoted material for a two-liner reply
> <g>. One reason braille will always be superior: the ability to skim.

This is directly analogous to reading printed text. You can cherry pick through large amounts of information with relatively little effort.

[JAWS copy protection]

> Yep, see my other post. My stumbling block came with it mainly when I
> wanted to install a copy on mom's machine so I could help her maintain
> it, a copy on the studio control room machine, and one in the office.
> I'd *never* be using all three simultaneously.
> Then when I had a system crash on one system, my install key floppy
> didn't play. That copy protection has been hacked, but I refuse to
> play that game for obvious ethical reasons. Others may, but I respect
> intellectual property rights. Ted Henter worked a long time to develop
> it, and though I may not like his protection scheme, that doesn't give
> me the right. You know the drill.

Yup. This is the dark side of illegal copying. It forces authors to waste effort protecting their works. And, screws legitimate users out of the ability to use the product "fairly". For example, if I have three computers but only use one at a time, the morally correct thing is to have one license. But, how does the author ensure that I really *am* using just one at a time? How does the author ensure that the "second computer" isn't a friend's computer?

The Mercator project (now defunct) tried to layer a speech interface *under* (not ON TOP OF!) the GUI in UNIX. I.e., it replaced the standard GUI libraries with speech-enabled ones with which the "screen reader" could interact. So, it knew that "these buttons are part of a group of RADIO BUTTONS governing this particular option choice", and "this text box expects a numeric value that specifies the age of the person", etc.

> Yeah, there were a couple like that; had heard of that one, or the
> "Speaqualizer" project. Both were failures in the marketplace.
> Mercator may have never made it to market, but they tried with
> Speaqualizer for awhile.

Mercator was an academic project. Yet another example of people thinking that there would be a "simple" way to address this problem. The only simple way to address the problem is NOT to provide a visual interface! So, applications ALL have to rely on the same non-visual interface to interact with their users! If you provide a nonvisual OPTION, then applications will only give token support -- if any -- to it.
On the other hand, if the only way to get information out of a device is via that option, then they don't have a choice! This is the approach I have been taking, lately. Pick an output modality that addresses everyone in the target audience and force everything to use that single mechanism!

Festival -- a free package probably available under Linux -- has a lot of "context modules" that try to alter the rules for pronunciation based on context. I.e., so email addresses are pronounced as "Richard dot Webb dot my dot foot at ..." instead of some unpronounceable jumble of letters and symbols. For example, the CC header would be pronounced as "carbon copy", etc.

> Yep, which was I think why so many complained when the National
> Weather Service went with DECtalk speech synths for their VHF radio
> forecasts.

The backup speech synthesizer in one of my products has similar quality issues as Klatt's DECtalk (he wrote it while a student and DEC commercialized it). Its biggest advantage is that it is pretty lean when it comes to resources -- which translates directly to implementation costs and reliability. I have a DECtalk DTC01 and a DECtalk Express. Plus a few of the Artic/Votrax-based synthesizers. All have the same basic advantage -- and the same robotic speech quality!

On the other hand, Festival has a huge footprint. And, is considerably easier to crash than DECtalk. Where DECtalk and the other "simple" synthesizers will take a stab at pronouncing damn near anything you throw at them, Festival will chew on it for a fair bit of time before committing to a pronunciation -- which can be just as wrong as the other products!

OK. Any particular reason why you're married to that machine?

> I'd like to have the RAID array for a server, and yes, once we've
> relocated, a net-connected server is part of the battle plan. RAID
> would be nice.
> I've got another box which is going to be dedicated to firewall/router
> duties, but would like to keep that one as the server, which was what
> it did in its former life.

Does the chassis force you to use a certain type of disk drive? E.g., because of disk carriers? Many older RAID offerings require SCSI disks. Would you be happy with RAID in some other form?

> Interesting that you learned braille. I'd be lost without it.

When I worked for Kurzweil, I was dealing with visually impaired customers AT BEST! Seems disrespectful not to learn to communicate in the form that THEY require. I.e., doesn't do me much good to leave a handwritten note telling them "I'll be back after lunch"!

[sightless VU meter]

Ah, OK. Clever.

> Yep, for some plans and simple designs, go to ski.org and download
> sktf.zip. It's about a 2 MB zip file, multiple directories, but text
> files on lots of things: home brewing adaptive VU solutions, soldering
> jigs, all sorts of stuff.

OK.

OK. So, this is "yet another DEVICE" that you have. Like a tactile wristwatch, braille slate, talking calculator, etc. I.e., it is designed for ONE PURPOSE.

> Yep, used to have the talking calculator, but now just use a little
> command line calculator I found on the net some years ago. Or do a lot
> of math in my head when out and about.

Yeah, I had given a lot of thought to how you provide a means for letting folks review their calculations without being able to view a "tape".

Understood. But, this can be done with different approaches! For example, one approach is to always reset things to "the beginning" -- or some other known state. Another approach is to leave things where you last left them on the assumption that you will want to do the same sort of thing, again.

> Maybe, but some devices, such as Roland's sound modules, like to
> remember where you were last time, and heck, it might be a week before
> I want to delve into its menus again, and I might not remember where I
> was last time.

Ah, OK.
No, I think a device should remember what you did "last time" -- but, only while you are actively and continuously using it. If you want to be able to return to a certain set of options some days later, you should be able to save those options and explicitly restore them. If you turn the device off and start over tomorrow, then everything should revert to some default -- perhaps even one that YOU have defined instead of that which the manufacturer has defined.

[computer interface with speech in a live environment]

What can you suggest as an alternative? Is the problem the quality of the voice? Or the masking effects of all that music in the background?

> Yep, the music, and I'm supposed to be giving my ears to the audio.
> Also, you often can't amplify speech in an earbud loud enough unless
> you're doing bad things to the ear canal.

Understood. You want a different communication channel to interact with the device instead of having to share the audio channel that you are devoting to the task at hand. I've used the same rationale to argue in favor of using non-visual channels for visual tasks! I.e., those cases where your eyes are busily engaged in some activity and shouldn't have to be pulled away just so you could see which virtual button you were pressing on your iPhone!
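Coming back to the countdown-timer question earlier in this post (what happens when the status line changes faster than the synthesizer can talk): one common answer is to coalesce updates, dropping stale values and only ever speaking the most recent one. A rough sketch of that idea -- the class and its speak() callback are my own invention, not any particular screen reader's API:

```python
class StatusMonitor:
    """Coalescing status-line monitor: if the monitored region changes
    faster than the synthesizer can speak, stale values are dropped and
    only the most recent one is announced."""

    def __init__(self, speak):
        self._speak = speak      # callback: speak(text) starts speech
        self._pending = None     # most recent unspoken value, if any
        self._busy = False       # is the synthesizer currently talking?

    def status_changed(self, text: str) -> None:
        self._pending = text     # overwrite: any older value is stale
        if not self._busy:
            self._flush()

    def speech_finished(self) -> None:
        # Called when the synthesizer finishes an utterance.
        self._busy = False
        if self._pending is not None:
            self._flush()

    def _flush(self) -> None:
        text, self._pending = self._pending, None
        self._busy = True
        self._speak(text)
```

If the display counts down "1:49", "1:48", "1:47" while "1:49" is still being spoken, the monitor skips "1:48" entirely and speaks "1:47" next -- the user always hears a current value instead of falling further and further behind. It also leaves the synthesizer free between announcements, which speaks to the "can you wander elsewhere on the screen meanwhile" question.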
#364
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On Fri, 25 May 2012 15:38:54 -0700, "William Sommerwerck"
wrote:

>> As the wings go forwards through the air, they twist the air
>> downwards behind them. They pull the air above them down as they go
>> by. As they try to pull the air down, the air tries to push the wings
>> up, and that's what holds the plane up in the air.

> That is very far removed from the common explanation. Bernoulli must
> be spinning in his grave.

Like a helicopter blade. There are two explanations in common use. Neither is right or wrong; both are just ways of looking at things. One considers pressure and the resulting force, the other moving air mass and Newtonian reaction. The maths of both works out fine. If you are doing wing design, the Navier-Stokes equations (which use the first model) are tried and tested.

d
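For what it's worth, both pictures can be put in rough equation form. This is only a back-of-envelope sketch; the symbols and the stream-tube simplification are mine, not from the post:

```latex
% Newtonian (momentum-flux) picture: lift equals the rate at which
% downward momentum is imparted to the air the wing processes.
% With air density \rho, airspeed V, effective stream-tube area A,
% and downwash velocity \Delta v:
%
%   \dot{m} = \rho V A            % air mass processed per second
%   L = \dot{m}\,\Delta v = \rho V A\,\Delta v
%
% Pressure (Bernoulli) picture: integrate the pressure difference
% between lower and upper surfaces over the wing surface S:
%
%   L = \int_S \bigl( p_{\text{lower}} - p_{\text{upper}} \bigr)\, dA
```

Both bookkeepings describe the same flow, which is why neither is "the" explanation: the downwash and the pressure field come as a package.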
#365
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On Fri 2012-May-25 19:46, Don Y writes:
>> Note National Federation of the Blind vs. American Federation FOR
>> the Blind! Get the wrong preposition in there and you'll never hear
>> the end of it!

>> Yep, I've been one of those to quibble over that one. Big difference.

> And a very emotional one! The whole role of each organization can be
> summed up in those two prepositions!

Yep, and not just emotional, but that difference makes all the difference in the world. "For" is somebody doing something "for" somebody; "of" -- this small two-letter word -- means these are the people being represented.

>> I used to spend a good deal of time with a guy ("Michael") who I
>> recall being tied to NFB somehow (I would stumble across him in
>> various places around the country instead of in one single locality).

>> Yeah, met him a few times. Michael Hingson iirc.

> Ha! Excellent! I don't know if I ever knew his last name. But, a quick
> google for images turned up lots of photos of him that I could believe
> to be "what he looks like, 35 years later!"

Yep. Mike was sort of a local hero for folks who worked in the WTC on 9/11, helping lead a bunch of them out.

<snip>

>> I believe he now has a cell-phone sized device that provides similar
>> functionality?

>> Yep, and it's marketed in partnership with NFB. Mike is involved in
>> that one. If I ever get some business things caught up to where I'd
>> like them to be, one's in my future. Amazing device. The ability to
>> just sit down in a restaurant and read the menu when they don't have
>> one in braille ... a liberating experience <g>.

> That was the feeling I would get whenever setting up a new machine at
> a new site. You'd get things running properly. Some representative
> from the client agency would sit down to use the machine -- invariably
> visually impaired -- and you'd watch them cringe as they tried to make
> sense of that god-awful voice! Literally *squinting* as if that would
> somehow improve their hearing skills!

Yeah, I know. When Ray eventually went with Digital Equipment's DECtalk it improved on the original voice quite a bit.
The only way to get more natural sounding synthesized speech than DECtalk is the way AT&T does it, capturing the phonemes of an actual speaker -- easy to do when the vocabulary required is rather limited, such as numbers and a few words. Most of what the public encounters with telephone systems is this latter type. As I noted, NOAA for a long time, when they first went digital with their VHF weather broadcasts, was using the DECtalk voices. I'm using a DoubleTalk card here, not quite as natural to the uninitiated, but still good enough, and, at the time, half the price <g>. I also liked the serial-port DoubleTalk: small package, powered from a 9 volt cell.

> But, you could tell the instant they understood the dialect. Their
> eyes would literally go wide -- "Wow! I can finally read my own
> personal mail without having to rely on my secretary -- and having her
> aware of things that are none of her business!" Liberating is a good
> term.

Indeed it is, when you've never had the experience of being able to read your own mail, or even identify whether the customer truly did give you a $20 bill.

>> Thing is, you get so many gadgets you almost need to carry a pelican
>> case with foam inserts to keep all your gadgets safe <g>.

> Exactly. Each does *one* thing. And, often, not well!

Yeah, I know. Been there, done that. Back years ago when I was doing briefcase live sound I'd have my vibrating VU meters, the DoubleTalk external card, a talking calculator, talking VOM, etc.

<snip>

>> I've said this about a lot of things over the years. Not just use it,
>> but use it in the manner that the end user is likely to use it.
>> Sometimes if you do that you'll end up going back to the drawing
>> board.

> There is also the hazard of making a device that defines how it must
> be used. Even if that is the way that 99% of the user base is likely
> to use it, it forces 100% of users to follow that prescription -- even
> if it isn't a necessary condition for the device's operation!

Yeah, there's that.
Liked your discussion of choices. To me, color is one of the last things I might want to think about were I buying a new vehicle. First off, I'd probably want to talk to the dealer about the trailer towing package, which will of course dictate the type of engine/drivetrain available. Then we get into the amenities, bells and whistles, etc. The function of the thing is what I want to get nailed down first.

You know, I was reading on a similar subject to your reading on choices recently: an economics professor from MIT on how our choices impact the economic decisions we make, touching on ethics, all sorts of stuff like that. Called _Predictably Irrational_. Can't recall the author's name right now, but it, and his companion piece _The Upside of Irrationality_, are both interesting reads on the subject.

When I was operating a fixed location studio and I'd have a songwriter coming in for demos, or a group, I'd always ask my first question, which was "In your mind's ear, when you hear your song fully arranged and produced, what does it sound like? Bring me an example of production already recorded that fits what your mind's ear hears." This way, I could choose the right capture techniques, such as how I'd place instruments, how I'd mic drums, etc.

> "Why do I have to specify this parameter before that parameter? They
> are independent yet you are forcing me to pick a certain one before
> the other. How did you decide that this is the only way it should be
> done?"

Another reason I like configuring software with text files if I can get it. I can look through the configuration file, set options I'm sure of, and do some more poring over the docs to understand further what needs to be defined.

<snip>

> Wanna bet that most vendors just throw choices at the user in whatever
> order is convenient for the vendor??! I.e., if we know what file
> format he wants, then we can refine the sample rates and data formats
> to those that are supported *in* that file format. "Piece of cake!"
Yep, that's my whole way of looking at this sort of thing. What do I want? What will support what I want, i.e. the sample rate I wish, interplatform portability, etc. Why can't you let the user decide what is important to him and *then* refine your offerings?! The technical problem in implementing this is exactly the same! But, the attitude conveyed to the user is entirely different! *He* drives the device instead of the device driving *him*! Yep, my point exactly. See below. Right, which is another reason I'm leery of a lot of the digital mixing console offerings for my remote truck right now. Were I working with the same act doing the same show, or pretty close to it, I could save my preferred working setup on whatever storage media it uses, and if it crumps during the gig, all I've got to remember is the keystrokes to get it to load it back up for me. But, a remote truck might be working a variety of things, and every time it goes out is different. OK, I think I follow your reasoning -- though I have no firsthand experience in that application domain (so I can't comment on how I would react when faced with the same issues). Yep, this one might be a sporting event, the next might be a festival with all sorts of acts coming on and off stage, the next event, capture of a gospel revival type event for broadcast. Then there's the old what do I do if I've got two of us working the console, one of us is flying in an effects cue with an aux send, and the other one working with the faders for the percussion section. How do we decide who gets what menu up? If they solve the ergonomics to my liking, though, I'd sure rather run Cat5 from venue to truck, or even better, fiber. Yes, part of that stumbling block is blindness related (see other post) but it's a combination of factors, the blindness, as well as the fluid working environment. That, and I like my reliability. 
I'm so familiar with analog consoles of various types that when the "oh ****" moment hits, I fall back on what I've learned, and don't have the anxiety of wondering if this thing's going to crump in a way that I can't get back to getting usable work done when it's for the money. Can't multiple menus be displayed concurrently? For example, there are many desktop GUIs that will let you "pin" (think: thumbtack) a menu or a dialog to the desktop so that it is "persistent". When you want to remove the object, you remove the "pin" and the object goes away. There's the rub. I've seen two approaches with a lot of these. One approach gives you banks of channel strips, say 1-16, 17-32, etc. Possibly even in 8 channel banks, to make for a smaller footprint. So, if I'm wanting to do a line check on channel 24, let's say, and we've got 8 channel banks, I've got a choice: disrupt the work of the mixer mixing the show while I do that line check, or not. The other approach: limited actual controls, and your menu selects whether those controls are faders, pan controls, aux sends, etc. There's my main stumbling block. With my old analog iron all the aux sends are there, bus assignments, VCA groups, all are right there. Yes, it means sometimes the mixer is working at full extension of his body to reach that control, but that control is there, and I can manipulate it, or somebody else can while I'm doing something else. One of the biggest praises you'll hear sung of a lot of the new digital consoles is the smaller footprint, no more working at full extension to reach that control. But, for me, that smaller footprint is in exchange for reliability, and the familiar interface. After all, I've been interacting with analog consoles now for decades. But again, some parts of that might be as simple as my EE friend when I was asking him about digital audio metering, when he basically gave me the "free your mind instead" comment. Btw, this was a blind electrical engineer. 
He reminded me that the work flow, and the development of a "house standard," was probably more important for me to keep to on every project. I settled on the usual 0 VU = -18 dBFS because it seems to be acceptable most places digital audio might go. I just got in the habit of calibrating the system to that, and printing some 1 kHz tone at 0 VU (-18 dBFS) on anything that was going away, either for mastering or for broadcast. What I'd really want to do before plunking down the dollars for a digital console was actually work with it for a couple of days first, and get my mind around some of the concepts, then decide if that one fits in my working environment. You might also look into what are called "pie menus". With these, you open the menu and find yourself in the center of a circular "pie". From there, you pick a direction to select a specific item from the menu. Think of the menu as slices of a pie and you are just deciding which slice you want -- always from the known reference point in the center of the pie! Interesting concept. Don't know if they offer that sort of thing though. This is why I'd really need to sit down in front of one for a day or two, at least with multitracks of prerecorded material up, to really put the thing through its paces, and why often I'm reluctant to buy a new device from the in-store demo. It's taken me a long time to decide on one of those little recorders like the Zoom, etc. But, thanks to reviews in this group I think the Tascam is in my very near future. I asked the reviewer specifically to use the thing thinking about how easy it was to interact with sans looking at the device. Of course, menus have to be designed to keep the number of choices small. Much easier to pick from 6 or 8 "slices" than 16 or 18! rotfl Then there's that. You can always offer me related choices on a submenu. 
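The pie menu idea above -- a direction from a known center point, rather than a position in a list -- comes down to one bit of trigonometry: map the stroke direction onto a slice index. A minimal sketch (the coordinate convention here, dy positive meaning "up", is just an assumption for illustration):

```python
import math

def pie_slice(dx, dy, n_slices):
    """Map a stroke direction (dx, dy measured from the menu's center)
    to a slice index 0..n_slices-1, with slice 0 centered on 'up' and
    indices increasing clockwise. dy > 0 is taken to mean 'up'."""
    angle = math.degrees(math.atan2(dx, dy))     # 0 deg = straight up, clockwise positive
    width = 360.0 / n_slices
    return int(((angle + width / 2) % 360) // width)

# With 4 slices: up = 0, right = 1, down = 2, left = 3.
```

Because selection depends only on direction, not distance, no precision is required of the hand -- which is exactly why the "known reference point" property matters for eyes-free use.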
Regards, Richard -- | Remove .my.foot for email | via Waldo's Place USA Fidonet-Internet Gateway Site | Standard disclaimer: The views of this user are strictly his own. |
#366
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
William Sommerwerck writes:
As the wings go forwards through the air, they twist the air downwards behind them. They pull the air above them down as they go by. As they try to pull the air down, the air tries to push the wings up, and that's what holds the plane up in the air. That is very far removed from the common explanation. The "common" explanation is incorrect. This explanation is correct. |
#367
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Don Y wrote:
Hi Neil, On 5/25/2012 1:04 PM, Neil Gould wrote: Don Y wrote: The difference is between knowing the "facts" that describe the thing in question vs. being able to "internalize" your understanding of it. "Grok", if you understand the reference. Often, the latter may be an incredibly dumbed down "feel" for what's going on vs. a highly technical rationalization for it. E.g., a wing provides lift because the THICKER air under it pushes it up through the THINNER air flowing over it! :-/ ??!!?? Aside from that being completely wrong, I hope it's not an example of "Grokking" the topic! ;-) It's not a "technical explanation" but, rather, a way of internalizing what is happening. Why would one want to "internalize" a completely incorrect notion of how something works? What is the value of that? Would it not be better to "internalize" a valid explanation? How would *you* explain lift to a 5 year old? There are things in life that a 5 year old can't understand. Still, as one who built flying model planes from earlier than that age, it is possible to help a 5 year old work with the principles without their having to understand the technical details. It is a pointless and possibly harmful setback to the child to give explanations that are completely wrong. -- best regards, Neil |
#368
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Don Pearce wrote:
On Fri, 25 May 2012 15:38:54 -0700, "William Sommerwerck" wrote: As the wings go forwards through the air, they twist the air downwards behind them. They pull the air above them down as they go by. As they try to pull the air down, the air tries to push the wings up, and that's what holds the plane up in the air. That is very far removed from the common explanation. Bernoulli must be spinning in his grave. Like a helicopter blade. There are two explanations in common use. Neither is right or wrong, both are just a way of looking at things. One considers pressure and resulting force, the other moving air mass and Newtonian reaction. The maths of both works out fine. If you are doing wing design, the Navier-Stokes equations (which use the first model) are tried and tested. Thanks for saving me a bit of time with your excellent summation! ;-) -- best regards, Neil d |
#369
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Hi Richard,
Ha! Excellent! I don't know if I ever knew his last name. But, a quick Google for images turned up lots of photos of him that I could believe to be "what he looks like, 35 years later!" Yep. Mike was sort of a local hero for folks who worked in the WTC on 9/11, helping lead a bunch of them out. Obviously long after my experiences with him. Must have been a doubly terrifying experience for him. That was the feeling I would get whenever setting up a new machine at a new site. You'd get things running properly. Some representative from the client agency would sit down to use the machine -- invariably visually impaired -- and you'd watch them cringe as they tried to make sense of that god-awful voice! Literally *squinting* as if that would somehow improve their hearing skills! Yeah, I know, when Ray eventually went with Digital Equipment's DECtalk it improved on the original voice quite a bit. The only way to get more natural sounding synthesized speech than DECtalk is the way AT&T does it, by capturing the phonemes of an actual speaker, easy to do when the vocabulary required is rather limited, such as numbers and a few words. Exactly. But of extremely limited use! Most of what the public encounters with telephone systems is this latter type. As I noted, NOAA for a long time when they first went digital with their VHF weather broadcasts was using the DECtalk voices. I suspect you could rework the weather broadcasts to use a limited vocabulary and, thus, better speech quality. On the other hand, it means that you have to be able to anticipate EVERYTHING that you might need to say over that medium. For example, you might not be prepared to use it to announce an alien invasion! grin I have a trimmed down synthesizer that I fall back on if the primary synthesizer is unavailable in one of the products I'm designing, currently. It needed to be robust -- so that I could count on it working regardless of what might have broken in the system. 
I could have opted for a better quality limited vocabulary design -- but, didn't want to have to set that vocabulary in stone and discover, later, that I needed to be able to say something that I couldn't. I'm using a DoubleTalk card here, not quite as natural to the uninitiated, but still good enough, and, at the time, half the price (grin). I also liked the serial port DoubleTalk, small package, powered from a 9 volt cell. The DECtalk Express suffers from the sin of requiring a special rechargeable battery. Another fault, in my opinion, for an assistive technology device (where do you buy that replacement battery -- today??). There is also the hazard of making a device that defines how it must be used. Even if that is the way that 99% of the user base is likely to use it, it forces 100% of users to follow that prescription -- even if it isn't a necessary condition for the device's operation! Yeah, there's that. Liked your discussion of choices. To me color is one of the last things, were I buying a new vehicle, I might want to think about. First off, I'd probably want to talk to the dealer about the trailer towing package, which will of course dictate the type of engine/drivetrain available. Then we get into the amenities, bells and whistles, etc. But the function of the thing is what I want to get nailed down first. Exactly. So, for a web site to ask you to pick a color, first, isn't helpful. Especially if, later, you realize that your choice has ruled out something else that you really want most! "I'm sorry but the automatic transmission option that you selected disqualifies the choice of 7 liter diesel. Would you like to start over?" Imagine if it isn't even smart enough to tell you of that constraint! You get to the point where you expect to select an engine and find the engine not listed! 
You know, I was reading a similar subject to your reading on choices recently, an economics professor from MIT on how our choices impact the economic decisions we make, touching on ethics, all sorts of stuff like that. Called Predictably Irrational. Can't recall the author's name right now, but it, and his companion piece "The Upside of Irrationality," are both interesting reads on the subject. Dan Ariely. We were "required" to take eight courses in The Humanities to graduate. I guess they didn't want a bunch of engineers with no appreciation of other aspects of life and education let loose on the unsuspecting masses. grin I recall selecting American History as one of my courses, thinking I had already had two years of that in High School so it would be a recent memory for me! The professor was an economist. So, I relearned all that history with an entirely different spin than the noble presentation to which I'd previously been subjected. Fascinating! So, I've enjoyed reading books by economists that touch on these sorts of subjects. _The Price of Everything_ discusses all of our actions -- social and otherwise -- in terms of economic transactions. E.g., a woman selling uterine services in a marriage transaction. _The Art of Choosing_ describes how we "value" choice in different societies and how it impacts our decisions. For example, how much we will "spend" to keep choices available even if they aren't choices of which we would want to avail ourselves. _How We Decide_ and _Predictably Irrational_ looked at how easily we are manipulated and con ourselves in our behavioral choices, etc. How we can actually think a $10 pill is better than an identical $0.50 pill, etc. How we *don't* have a "market" in which consumers and producers compromise on price but, rather, how producers manipulate our expectations of price to a point that they are happy with, etc. By far, the experiments that have been concocted and presented in the texts are the most fascinating. 
And, they make you laugh at the snobbery that you often see around you -- the folks who couldn't differentiate an $80 bottle of wine from a $2 bottle of wine -- yet, when confronted with the $80 price tag ON THE $2 BOTTLE, would *swear* it tastes a LOT better than the $80 bottle that has been mislabeled as $2! When I was operating a fixed location studio and I'd have a songwriter coming in for demos, or a group, I'd always ask my first question, which was "In your mind's ear, when you hear your song fully arranged and produced, what does it sound like? Bring me an example of production already recorded that fits what your mind's ear hears." This way, I could choose the right capture techniques, such as how I'd place instruments, how I'd mic drums, etc. Good point! I would never buy consumer kit from specs. Rather, how it sounded to me when reproducing the sorts of program material I was listening to at that point in my life. "Why do I have to specify this parameter before that parameter? They are independent, yet you are forcing me to pick a certain one before the other. How did you decide that this is the only way it should be done?" Another reason I like configuring software with text files if I can get it. I can look through the configuration file, set options I'm sure of, and do some more poring over the docs to understand further what needs to be defined. The problem with that approach comes when two option choices are interdependent. There is nothing preventing you from asking for a set of incompatible options -- until some program examines your choices and complains. Can't multiple menus be displayed concurrently? For example, there are many desktop GUIs that will let you "pin" (think: thumbtack) a menu or a dialog to the desktop so that it is "persistent". When you want to remove the object, you remove the "pin" and the object goes away. There's the rub. I've seen two approaches with a lot of these. 
One approach gives you banks of channel strips, say 1-16, 17-32, etc. Possibly even in 8 channel banks, to make for a smaller footprint. So, if I'm wanting to do a line check on channel 24, let's say, and we've got 8 channel banks, I've got a choice: disrupt the work of the mixer mixing the show while I do that line check, or not. The other approach: limited actual controls, and your menu selects whether those controls are faders, pan controls, aux sends, etc. There's my main stumbling block. With my old analog iron all the aux sends are there, bus assignments, VCA groups, all are right there. Yes, it means sometimes the mixer is working at full extension of his body to reach that control, but that control is there, and I can manipulate it, or somebody else can while I'm doing something else. Understood. The same sort of thing is true with theatrical lighting panels, video switchers (the video equivalent of an audio mixer), etc. To make everything visible and accessible, or just some selectable subset of it. I have the same sort of problem when authoring multimedia presentations (though those aren't done in real time). That's why I thought the push-pin approach might be a good compromise -- let you decide which parts of the interface you want to have access to. But you are still constrained by the physical size of the display. What I'd really want to do before plunking down the dollars for a digital console was actually work with it for a couple of days first, and get my mind around some of the concepts, then decide if that one fits in my working environment. I would imagine that might give you an 80% idea of what the change would be like. But, someday, you'd find yourself facing a problem that had to be solved NOW and scurrying to sort out how to get to the solution you want in that environment. Sort of like a surgeon doing a laparoscopic procedure and suddenly everything going to ****. Drop the tools, grab a big knife and cut the patient open. 
You're not going to fix the problem through that tiny incision -- unless you are incredibly skilled and fluent with the technology! You might also look into what are called "pie menus". With these, you open the menu and find yourself in the center of a circular "pie". From there, you pick a direction to select a specific item from the menu. Think of the menu as slices of a pie and you are just deciding which slice you want -- always from the known reference point in the center of the pie! Interesting concept. Don't know if they offer that sort of thing though. grin Because they haven't had to think about the full range of users that might sit behind their kit! As I said, I've put a lot of thought into what you really need to interact with a given device and how to minimize the cognitive loading on the user. You don't want to require 100% of his attention. Especially if it is to perform some low grade task! Imagine if you had to type the word "ANSWER" on a keyboard to answer your phone. Ridiculous, right? It would require too much focused attention for a task that should be trivial. This is why I'd really need to sit down in front of one for a day or two, at least with multitracks of prerecorded material up, to really put the thing through its paces, and why often I'm reluctant to buy a new device from the in-store demo. It's taken me a long time to decide on one of those little recorders like the Zoom, etc. But, thanks to reviews in this group I think the Tascam is in my very near future. I asked the reviewer specifically to use the thing thinking about how easy it was to interact with sans looking at the device. Of course, menus have to be designed to keep the number of choices small. Much easier to pick from 6 or 8 "slices" than 16 or 18! rotfl Then there's that. You can always offer me related choices on a submenu. Exactly. With use, you develop a sort of muscle memory as your hands are accustomed to making certain motions to do certain things. 
If, instead, you have to coordinate your eyes and hands to *pick* a particular option from a linear list, you have to rely on that visual feedback to ensure you are at the right point in that list before making your selection. E.g., one of the devices that I am developing uses speech for its sole output medium and a touchpad for its sole input medium. You issue "gestures" on the touchpad to initiate commands and selections. And, hear the results of those commands. So, for example, you might drag your fingertip across the touchpad from left to right to cause the mechanism to move "to the right" -- while you are watching it! Then, tap the touchpad to cause it to stop. "Draw" a circular motion counterclockwise to cause the grabber to open. Drag your fingertip from top to bottom to cause it to be lowered. Tap, again, to stop. Draw a clockwise circle to command the grabber to close. etc. All of these are trivial actions that you could easily memorize. None of which requires any precision on your part. They can all be performed while your eyes are busy with another task. And, none of them DISTRACT you from that task. Repeat the example with some task that does NOT require vision to see the significance of this approach. |
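A gesture vocabulary like the one described above (taps, drags in a direction, circles) can be told apart with very simple geometry on the touch trace -- none of it requiring precision from the user. A rough sketch; the thresholds and coordinate convention are made-up illustration values, not any real product's:

```python
import math

def classify_gesture(points):
    """Crude classifier for a touchpad trace. Points are (x, y) pairs in
    touchpad units, x growing right and y growing down. Thresholds are
    arbitrary illustration values."""
    if len(points) < 3:
        return "tap"
    (x0, y0), (xn, yn) = points[0], points[-1]
    net = math.hypot(xn - x0, yn - y0)            # start-to-end distance
    path = sum(math.hypot(bx - ax, by - ay)       # total distance traveled
               for (ax, ay), (bx, by) in zip(points, points[1:]))
    if path < 0.05:                  # finger barely moved: a tap
        return "tap"
    if net < 0.2 * path:             # long path ending near the start: a circle
        return "circle"
    dx, dy = xn - x0, yn - y0        # otherwise a drag; pick the dominant axis
    if abs(dx) >= abs(dy):
        return "drag-right" if dx > 0 else "drag-left"
    return "drag-down" if dy > 0 else "drag-up"
```

Because the classes are coarse and mutually distinct, the gestures can be issued eyes-free, which is the whole point of the interface described above.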
#370
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On Fri 2012-May-25 20:46, Don Y writes:
The asterisks spoke, because I've configured the screen reader to speak them, as people use them to set text apart, or the dreaded footnotes of course. OK, so my use of them for emphasis detracts from your comprehension instead of adding to it! But, the all caps didn't, at least when just reading. Now, were I editing, capital letters are spoken with a bit of a raised inflection, but, when read as words they're not. When just in the reader, not editing, for example, reading Usenet articles, a book that's text or similar, I have most punctuation disabled so sentences sound normal. Understood. But, would you have any cues that there might be some punctuation that you might want to see which has been silenced? For example, the cartoonish way of showing a pejorative as a jumbled sequence of ad hoc punctuation marks like $(&*^@#$! I note I saw those in my reader, when reading, not editing this reply. So obviously I had at some point made them exceptions, probably because of the common use of many of those symbols elsewhere, such as @ in email addresses, # as pound symbol, etc. etc. I probably have much more enabled in a usenet/mail/bbs reader application than I would in just a straight text reader that I'd use to read a novel (grin). On a different note, I assume you are also victimized by spelling errors? For example, I tend to end up transposing pairs of letters simply because one finger finds its way to a key before the other -- which should have preceded it. Like teh instead of the. I'm victimized more by makign them (grin). Cutting to the chase on some of this, colors aren't spoken at all. I can have the screen reader, and most screen readers, monitor a portion of the screen, say a status line, for either a change in the text displayed there, or a change in attributes. Does it simply speak each change encountered? What if the time to speak the information exceeds the time between changes? For example, a timer counting down seconds remaining until a task's completion. 
But, "one minute forty nine seconds" takes longer to speak than the time for the display to change to "one minute forty eight seconds". Likewise, is the screen reader preoccupied with this task or can it also let you wander around to other parts of the screen while it is monitoring that section? Usually I'll tell the screen reader app to remain silent, and park the speech cursor over that display. If that display is going to be changing a lot I tell the screen reader to ignore that line entirely, and only force it to go there when I want to look at it. Usually I'll use monitoring of a status line to tell it when to change configurations. E.g. when the status line changes to x from y, load a configuration which tracks a light bar as your focus of where you are, etc. etc. Rather complex, and probably extremely boring to the folks here. We should probably take this line to email (grin). snip big snip One reason braille will always be superior: the ability to skim. This is directly analogous to reading printed text. You can cherry pick through large amounts of information with relatively little effort. Indeed, and, believe it or not, I retain what I read better, as well as read fast, using it. In most cases, synthesized speech is the most cost effective, and most effective in other ways, compromise that can be achieved. Braille displays are clunky, hard to maintain, and don't achieve good reading speed, or efficient work flow, unless you're a customer service rep dealing with both the computer and customers on the phone. After all, they take your hands away from the keyboard, they can display a very limited amount of text at one shot, all them little solenoid springs and mechanical parts ... aaargh. Yup. This is the dark side of illegal copying. It forces authors to waste effort protecting their works. And, screws legitimate users out of the ability to use the product "fairly". 
For example, if I have three computers but only use one at a time, the morally correct thing is to have one license. But, how does the author ensure that I really *am* using just one at a time? How does the author ensure that the "second computer" isn't a friend's computer? Exactly what I ran into with it. My mom didn't want a screen reader, but was glad to have my help maintaining her system, often without her having to stand over my shoulder and play screen reader. The two machines at the studio, I'd only be using one of them at a time, and the owner of said studio didn't want a screen reader either. I didn't use the product at all at home; I was beta testing a competitor's screen reader for the GUI environment, in fact. I just didn't think it was fair to my employer to use a beta at work. The Mercator project (now defunct) tried to layer a speech interface *under* (not ON TOP OF!) the GUI in UNIX. I.e., it replaced the standard GUI libraries with speech-enabled ones with which the "screen reader" could interact. snip Yeah, there were a couple like that, had heard of that one, or the Speaqualizer project. Both were failures in the marketplace. Mercator may have never made it to market, but they tried with Speaqualizer for awhile. Mercator was an academic project. Yet another example of people thinking that there would be a "simple" way to address this problem. Yep, and the Speaqualizer tried to do it as an integral part of hardware: you get speech as soon as the machine boots, giving you access to the BIOS, etc. The only simple way to address the problem is NOT to provide a visual interface! So, applications ALL have to rely on the same non-visual interface to interact with their users! If you provide a nonvisual OPTION, then applications will only give token support -- if any -- to it. On the other hand, if the only way to get information out of a device is via that option, then they don't have a choice! This is the approach I have been taking, lately. 
Pick an output modality that addresses everyone in the target audience and force everything to use that single mechanism! Indeed, and this is what we're finding with a lot of web portals that do things that are only usable with vision. Anyone who's doing web development should look at a series of articles discussing just this issue in this month's Braille Monitor, available in text, no doubt, from www.nfb.org. Festival -- a free package probably available under Linux -- has a lot of "context modules" that try to alter the rules for pronunciation based on context. I.e., so email addresses are pronounced as "Richard dot Webb dot my dot foot at ..." snip Yep, which was I think why so many complained when the National Weather Service went with DECtalk speech synths for their VHF radio forecasts. The backup speech synthesizer in one of my products has similar quality issues as Klatt's DECtalk (he wrote it while a student and DEC commercialized it). Its biggest advantage is that it is pretty lean when it comes to resources -- which translates directly to implementation costs and reliability. Did you ever check out that kid a few years ago that made a DECtalk sing? He spent some serious time coding that; iirc the kid was only 16 years old or so when he did this one. I used to have a URL for it, but it disappeared in Katrina. I can't even recall his name, it's been so long. I have a DECtalk DTC01 and a DECtalk Express. Plus a few of the Artic/Votrax-based synthesizers. All have the same basic advantage -- and the same robotic speech quality! Yep, as does my DoubleTalk. I had, before Katrina, two DoubleTalk internal cards, a DoubleTalk Lite external, and an Audaptor. On the other hand, Festival has a huge footprint. And, is considerably easier to crash than DECtalk. 
Where DECtalk and the other "simple" synthesizers will take a stab at pronouncing damn near anything you throw at them, Festival will chew on it for a fair bit of time before committing to a pronunciation -- which can be just as wrong as the other products! This is why a lot of the screen reader developers, and speech synth developers, sort of "shared the load" you might say. Common pronunciation, at the phoneme level, is often handled by ROM within the synthesizer itself; exceptions and the like are handled by the software on your hard disk. OK. Any particular reason why you're married to that machine? I'd like to have the RAID array for the server, and yes, once we've relocated, a net connected server is part of the battle plan. RAID would be nice. I've got another box which is going to be dedicated to firewall/router duties, but would like to keep that one as the server, which was what it did in its former life. Does the chassis force you to use a certain type of disk drives? E.g., because of disk carriers? Many older RAID offerings require SCSI disks. Would you be happy with RAID in some other form? I doubt the chassis does; I'd have to look inside the box. Most I did with it when it was given to me was boot it up once. If we could get RAID in some other form, that would be cool too. Would lose those two big SCSI hard drives then and have to do something else with them, but ... Just trying to use what's existing in the box with minimal $$$ outlay, if possible. Still, it might be worth doing that to get totally away from Windows as the server app. I'll have to weigh the pros and cons of that one when we get to it. Right now that machine is sitting in a storage unit with some of those little desiccant packages inside the case (grin). DY [sightless VU meter] Ah, OK. Clever. 
Yep, for some plans and simple designs, go to ski.org and download sktf.zip; it's about a 2 MB zip file, multiple directories, but text files on lots of things: home brewing adaptive VU solutions, soldering jigs, all sorts of stuff. OK. OK. So, this is "yet another DEVICE" that you have. Like a tactile wristwatch, braille slate, talking calculator, etc. I.e., it is designed for ONE PURPOSE. snip Yeah, I had given a lot of thought to how you provide a means for letting folks review their calculations without being able to view a "tape". Believe it or not, I've done a lot of this in batch scripts (grin). Re menus in devices ... Maybe, but some devices, such as Roland's sound modules, like to remember where you were last time, and heck, it might be a week before I want to delve into its menus again, and I might not remember where I was last time. Ah, OK. No, I think a device should remember what you did "last time" -- but, only while you are actively and continuously using it. If you want to be able to return to a certain set of options some days later, you should be able to save those options and explicitly restore them. If you turn the device off and start over tomorrow, then everything should resort to some default -- perhaps even one that YOU have defined instead of that which the manufacturer has defined. Yeah, sounds like a good compromise. If I go back in there during the same "session" it remembers where I last was. Otherwise, it goes to the start. DY [computer interface with speech in a live environment] What can you suggest as an alternative? Is the problem the quality of the voice? Or the masking effects of all that music in the background? Yep, the music, and I'm supposed to be giving my ears to the audio. Also, you can't amplify speech in an earbud loud enough often unless you're doing bad things to the ear canal. Understood. 
You want a different communication channel to interact with the device instead of having to share the audio channel that you are devoting to the task at hand.

That's it exactly. I'm accustomed, as I said in another post this thread, to not having to do anything but directly communicate with the device. If I want that channel strip assigned to a certain bus or certain VCA group, push the button. If I want audio from that channel on aux bus 3, I adjust that aux send. I don't have to play "where the f*$@ am I?" It's automatic, like buttoning your coat, zipping your pants or tying your shoes.

I've used the same rationale to argue in favor of using non-visual channels for visual tasks! I.e., those cases where your eyes are busily engaged in some activity and shouldn't have to be pulled away just so you could see which virtual button you were pressing on your iPhone!

Uh huh! Just my point with some of these complex devices that are made for people to operate while driving, etc. Give them an auditory channel for the info, keep the eyes on the road, and the hands upon the wheel please.

Regards, Richard

-- | Remove .my.foot for email | via Waldo's Place USA Fidonet-Internet Gateway Site | Standard disclaimer: The views of this user are strictly his own. |
#371
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Hi Richard,
[status line and dynamic information displays] Usually I'll tell the screen reader app to remain silent and park the speech cursor over that display. If that display is going to be changing a lot, I tell the screen reader to ignore that line entirely, and only force it to go there when I want to look at it. Usually I'll use monitoring of a status line to tell it when to change configurations. E.g., when the status line changes from y to x, load a configuration which tracks a light bar as your focus of where you are, etc. etc. Rather complex, and probably extremely boring to the folks here. We should probably take this line to email <grin>.

OK. I'll pick a suitable subject line so my mail is recognizable -- though probably not today as I am busy getting my other half ready for a trip.

This is directly analogous to reading printed text. You can cherry-pick through large amounts of information with relatively little effort.

Indeed, and, believe it or not, I retain what I read better, as well as read faster, using it. In most cases, synthesized speech is the most cost-effective, and most effective in other ways, compromise that can be achieved. Braille displays are clunky, hard to maintain and don't achieve good reading speed or efficient work flow, unless you're a customer service rep dealing with both the computer and customers on the phone. After all, they take your hands away from the keyboard, they can display a very limited amount of text at one shot, all them little solenoid springs and mechanical parts ... aaargh

Not to mention the expense! I have a Braille 'n Speak here (I think that is the name). There is an Italian firm that makes a Braille display that is piezoelectric, which should be easier to keep running. But, I think they want $700/cell -- or something equally outrageous! Maybe if the eurozone crumbles, you can pick these up for a song! <grin>

[copy protection] Exactly what I ran into with it.
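The status-line trick described above -- watch one line of the screen and swap screen reader configurations when its text changes -- amounts to a small polling loop. Here is a hedged sketch; `read_line`, `load_config` and `config_for` are caller-supplied hooks invented for illustration, not any real screen reader's API:

```python
import time

def watch_status_line(read_line, load_config, config_for, interval=0.5, polls=None):
    """Poll a one-line status display; whenever its text changes, load the
    configuration mapped to the new text (if any is mapped)."""
    last = None
    count = 0
    while polls is None or count < polls:
        text = read_line()
        if text != last:               # status line changed: maybe reconfigure
            cfg = config_for(text)     # e.g. a "track the light bar" profile
            if cfg is not None:
                load_config(cfg)
            last = text
        count += 1
        if polls is None or count < polls:
            time.sleep(interval)       # don't hammer the display
```

The `polls` parameter just makes the loop finite for demonstration; a real monitor would run until told to stop.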
My mom didn't want a screen reader, but was glad to have my help maintaining her system, often without her having to stand over my shoulder and play screen reader. The two machines at the studio, I'd only be using one of them at a time, and the owner of said studio didn't want a screen reader either. I didn't use the product at all at home; I was beta testing a competitor's screen reader for the GUI environment, in fact. I just didn't think it was fair to my employer to use a beta at work.

Borland used to have a "like a book" license. I.e., you could move the license around. But, still not as flexible as it could be. The problem is, people think there's no "cost" there, and let that confuse their idea of "value"! Just because you didn't pay for something doesn't make it valueless! I.e., if you think there is no value to that second copy of the software, then live without it -- you should not experience any *costs* if it had no *value*!

If you provide a nonvisual OPTION, then applications will only give token support -- if any -- to it. On the other hand, if the only way to get information out of a device is via that option, then they don't have a choice! This is the approach I have been taking, lately. Pick an output modality that addresses everyone in the target audience and force everything to use that single mechanism!

Indeed, and this is what we're finding with a lot of web portals that do things that are only usable with vision. Anyone who's doing web development should look at a series of articles discussing just this issue in this month's Braille Monitor, available in text, no doubt, from www.nfb.org.

I've not looked at their site in a long time. To be honest, I'm a bit set off by the "evangelism". Perhaps it is necessary to bring the "message" to the "unwashed masses". But, in my case, it feels like preaching to the choir. Sort of like telling a smoker he should quit.
I'm sure he already knows that.

The backup speech synthesizer in one of my products has similar quality issues as Klatt's DECtalk (he wrote it while a student and DEC commercialized it). Its biggest advantage is that it is pretty lean when it comes to resources -- which translates directly to implementation costs and reliability.

Did you ever check out that kid a few years ago that made a DECtalk sing? He spent some serious time coding that; IIRC the kid was only 16 years old or so when he did this one. I used to have a URL for it, but it disappeared in Katrina. I can't even recall his name, it's been so long.

The Votrax VS6.3 was capable of singing (poorly). As well as multilingual speech. I recall hearing one speak German. But, the extra capabilities don't really translate into better "regular speech". So, why pay for them? At Kurzweil, we had frequent failures in the Votrax subsystem. All of the boards were potted -- to discourage copying -- so when one of the four boards died, it was irreparable. Yet another case of someone going out of their way to protect their market share. Funny, I don't hear the name Votrax bandied about anymore, so I guess they wasted their efforts clinging to the past instead of embracing the future!

I have a DECtalk DTC01 and a DECtalk Express. Plus a few of the Artic/Votrax-based synthesizers. All have the same basic advantage -- and the same robotic speech quality!

Yep, as does my DoubleTalk. I had, before Katrina, two DoubleTalk internal cards, a DoubleTalk Lite external, and an Audaptor.

While it is unfortunate (the **** poor quality of the speech), it is still surprisingly easy to get used to the oddities of these "dialects". Especially when the alternative may be to be deprived of some interactions! On the other hand, Festival has a huge footprint. And, is considerably easier to crash than DECtalk.
Where DECtalk and the other "simple" synthesizers will take a stab at pronouncing damn near anything you throw at them, Festival will chew on it for a fair bit of time before committing to a pronunciation -- which can be just as wrong as the other products!

This is why a lot of the screen reader developers and speech synth developers sort of "shared the load," you might say. Common pronunciation, at the phoneme level, is often handled by ROM within the synthesizer itself; exceptions and the like are handled by the software on your hard disk.

Festival approaches everything from the top down. It tries to understand the context of the material. From that, the appropriate pronunciation rules. And, finally, actually synthesizing the speech waveforms. But, the implementation never focused on performance issues. Rather, they wrote it so that it would be easier to write and maintain. So, it takes a fair bit of resources just to say "Hello". Those resources translate into dollars in anything other than a PC environment (i.e., your talking calculator would speak nicer but cost more!)

Does the chassis force you to use a certain type of disk drive? E.g., because of disk carriers? Many older RAID offerings require SCSI disks. Would you be happy with RAID in some other form?

I doubt the chassis does; I'd have to look inside the box.

I know many of the Dell machines that I've used over the years had special drive carriers for the RAID drives. Often so you could hot-swap them, etc. Others just hid the array in the bowels of the machine. Note that RAID can be a huge bellyache! When a drive fails, you usually don't have many options other than replacing it as-is in the working set. To do this, many machines have special BIOS extensions that you have to work with *in* the BIOS (which may be an issue accessing, for you) to add the drive to the working set, format it, rebuild its contents, etc.
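The "shared load" arrangement described above -- common letter-to-sound handling baked into the synthesizer's ROM, with an exception list kept on the host's disk -- boils down to a lookup with a fallback. A minimal sketch, with invented example entries and a deliberately naive stand-in for the ROM rules:

```python
# Exception list the host software keeps on disk (illustrative entries only,
# ARPAbet-style phoneme strings).
EXCEPTIONS = {
    "colonel": "K ER N AH L",
    "cache":   "K AE SH",
}

def default_rules(word):
    """Stand-in for the synthesizer's built-in letter-to-sound ROM.
    Here it just spells the word letter by letter; a real synthesizer
    applies proper letter-to-sound rules."""
    return " ".join(word.upper())

def pronounce(word):
    """Check the disk-side exception list first; otherwise fall back to
    whatever the synthesizer's ROM would do with the raw text."""
    word = word.lower()
    return EXCEPTIONS.get(word) or default_rules(word)
```

The point of the split is that the cheap, fast path lives in hardware, while the host can grow the exception list indefinitely without touching the synthesizer.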
I.e., you may be better served by a regular disk with a second one that you keep "off-line" and copy the stuff you want to preserve to. I've torn down all my RAID arrays or reconfigured them as JBODs (Just a Bunch Of Disks) because it makes it so much easier for me to handle them. E.g., I can remove a disk and put it in another machine. Doing this with a RAID array often has the disk recognized as "foreign" and the controller immediately wants to reformat it. "No!!! I just want you to let me access the files on it!!"

Most I did with it when it was given to me was boot it up once. If we could get RAID in some other form, that would be cool too. We'd lose those two big SCSI hard drives then and have to do something else with them, but ... Just trying to use what's existing in the box with minimal $$$ outlay, if possible.

Understood.

Still, it might be worth doing that to get totally away from Windows as a server app. I'll have to weigh pros and cons of that one when we get to it. Right now that machine is sitting in a storage unit with some of those little desiccant packages inside the case <grin>.

I wonder how much spiders like desiccants?? <grin>

[defaults vs. defaults vs. defaults] Yeah, sounds like a good compromise. If I go back in there during the same "session" it remembers where I last was. Otherwise, it goes to the start.

Exactly. I spent a fair bit of time on this subject, recently. It's especially significant when it isn't easy to *review* the settings in place -- e.g., when you can't just glance up at a screen and reassure yourself that everything seems "about right".

Understood.

You want a different communication channel to interact with the device instead of having to share the audio channel that you are devoting to the task at hand.

That's it exactly. I'm accustomed, as I said in another post this thread, to not having to do anything but directly communicate with the device. If I want that channel strip assigned to a certain bus or certain VCA group, push the button.
BECAUSE YOU RELY ON MEMORY! IMO, the single biggest asset a visually impaired user has is memory. Remembering where you last put something. Remembering how you set a parameter. Etc. A blind man with Alzheimer's has got to be sheer terror!

If I want audio from that channel on aux bus 3, I adjust that aux send. I don't have to play "where the f*$@ am I?" It's automatic, like buttoning your coat, zipping your pants or tying your shoes.

Great analogies! They are low-skill tasks, so why require lots of attention to perform them?

I've used the same rationale to argue in favor of using non-visual channels for visual tasks! I.e., those cases where your eyes are busily engaged in some activity and shouldn't have to be pulled away just so you could see which virtual button you were pressing on your iPhone!

Uh huh! Just my point with some of these complex devices that are made for people to operate while driving, etc. Give them an auditory channel for the info, keep the eyes on the road, and the hands upon the wheel please.

Exactly. But there are lots of other cases that aren't as dramatic. Why do I need to look at my iPod to use it? Or, my telephone? Why can't I read my appointment calendar while I'm taking a walk around the neighborhood (I have to stop moving in order to read all the details on that silly little display as it "bounces around" too much when I am walking)? Why should a soldier have to take his eyes off The Enemy just to consult some fancy piece of high-tech kit? Or, an ophthalmologist have to remove his eyes from peering into yours just to see what some device is trying to convey to him?
#372
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On Sat 2012-May-26 14:15, Don Y writes:
Yep. Mike was sort of a local hero for folks who worked in the WTC on 9/11, helping lead a bunch of them out. Obviously long after my experiences with him.

Must have been a doubly terrifying experience for him.

Yes, and what amazes me is that his dog guide didn't totally freak out. I've seen dog guides act rather unpredictably in crowds of people who are either panicked, or big crowds of blind folks who aren't as careful about "don't step on doggie" etc. It takes a rather disciplined human/canine team to keep the dog calm and on mission at such times. I've trained my own a couple of times, but never did the dog go to the bandstand or the gig with me. I don't believe a dog should be subjected to the constant environment of loud music, etc. Dogs are, after all, color blind, their eyes aren't that great to begin with, and it's their noses and ears that tell them about their environment. Subjecting a dog to that sort of environment on a regular basis is cruelty, IMHO. YMMV of course. snip

Yeah, I know. When Ray eventually went with Digital Equipment's DECtalk it improved on the original voice quite a bit. The only way to get more natural-sounding synthesized speech than DECtalk is the way AT&T does it. snip

Exactly. But of extremely limited use!

Yes, the vocabulary of the device is quite limited, but if it can work for your application it's much easier for average folks to understand. snip

I suspect you could rework the weather broadcasts to use a limited vocabulary and, thus, better speech quality. On the other hand, it means that you have to be able to anticipate EVERYTHING that you might need to say over that medium. For example, you might not be prepared to use it to announce an alien invasion! <grin>

Right, and with the same alerting system it might have to announce all sorts of things, which is why I think NOAA went originally with DECtalks. I think they've changed some of that now, but there for a while I heard DECtalk's "Huge Harry" voice quite a bit.
I think I commented, some not quite as familiar asserted NOAA was using their "Perfect Paul", but I beg to differ; it sounded more like the former to me.

I have a trimmed-down synthesizer that I fall back on if the primary synthesizer is unavailable in one of the products I'm designing, currently. It needed to be robust -- so that I could count on it working regardless of what might have broken in the system. I could have opted for a better quality limited-vocabulary design -- but didn't want to have to set that vocabulary in stone and discover, later, that I needed to be able to say something that I couldn't. That's the weakness of the captured phoneme model. Memory intensive, and limited in utility.

I'm using a DoubleTalk card here, not quite as natural to the uninitiated, but still good enough, and, at the time, half the price <grin>. I also liked the serial port DoubleTalk: small package, powered from a 9 volt cell.

The DECtalk Express suffers from the sin of requiring a special rechargeable battery. Another fault, in my opinion, for an assistive technology device (where do you buy that replacement battery -- today??)

Yeah, I know, and that rechargeable is going to eventually fail to recharge, and it might decide to do so just when I really want it, and no, I don't want to wait to mail order another one, and then be told that the "other one" I ordered doesn't function in the way I've become accustomed to, or won't work in my environment at all.

There is also the hazard of making a device that defines how it must be used. snip

Yeah, there's that. Liked your discussion of choices. To me color is one of the last things, were I buying a new vehicle. snip

Exactly. So, for a web site to ask you to pick a color, first, isn't helpful. Especially if, later, you realize that your choice has ruled out something else that you really want most!

Yes, and I'm probably more willing to compromise color, bells and whistles or something similar to get what I really want.
Function, though, trumps almost everything else in my world; if it won't do the job I won't pay any price for it. snip

You know, I was reading a similar subject to your reading on choices recently, an economics professor from MIT on how our choices impact the economic decisions we make, touching on ethics, all sorts of stuff like that. Called Predictably Irrational. Can't recall the author's name right now, but it, and his companion piece "The Upside of Irrationality", are both interesting reads on the subject.

Dan Ariely. We were "required" to take eight courses in The Humanities to graduate. I guess they didn't want a bunch of engineers with no appreciation of other aspects of life and education let loose on the unsuspecting masses.

Yep, that's the man. Dan's an interesting read on the subject. I was thinking I'd place money that you'd read him or was familiar.

I recall selecting American History as one of my courses thinking I had already had two years of that in High School so it would be a recent memory, for me! The professor was an economist. So, I relearned all that history with an entirely different spin than the noble presentation to which I'd previously been subjected. Fascinating!

Gives you a different perspective, doesn't it? I'd suggest Dan to audio engineers, marketing people, pretty much any profession. Some real thought-provoking stuff there about the way we operate in everyday life, how we interact with our customers/clients, etc. For the uninitiated that aren't bored to tears and are still following along, it's not all dry tome; there are some great moments in those two books that will tickle your funny bone.

So, I've enjoyed reading books by economists that touch on these sorts of subjects. _The Price of Everything_ discusses all of our actions -- social and otherwise -- in terms of economic transactions. E.g., a woman selling uterine services in a marriage transaction.

Good analogy!!!
_The Art of Choosing_ describes how we "value" choice in different societies and how it impacts our decisions. For example, how much we will "spend" to keep choices available even if they aren't choices of which we would want to avail ourselves.

Interesting. I'll have to see if it's in braille or bug the library system to make it so.

_How We Decide_ and _Predictably Irrational_ looked at how easily we are manipulated and con ourselves in our behavioral choices, etc. How we can actually think a $10 pill is better than an identical $0.50 pill, etc. How we *don't* have a "Market" in which consumers and producers compromise on price but, rather, how Producers manipulate our expectations of price to a point that they are happy with, etc.

Indeed, we hoodwink ourselves often without really thinking about it.

By far, the experiments that have been concocted and presented in the texts are the most fascinating. And, they make you laugh at the snobbery that you often see around you -- the folks who couldn't differentiate an $80 bottle of wine from a $2 bottle of wine -- yet, when confronted with the $80 price tag ON THE $2 BOTTLE, would *swear* it tastes a LOT better than the $80 bottle that has been mislabeled as $2!

Indeed. I liked Dan's description of "the trust game" in Predictably Irrational, and what happens when trust is violated and the opportunity for revenge is offered. That had me slapping my leg for a while.

When I was operating a fixed-location studio and I'd have a songwriter coming in for demos, or a group, I'd always ask my first question, which was "In your mind's ear, when you hear your song fully arranged and produced, what does it sound like? Bring me an example of production already recorded that fits what your mind's ear hears." This way, I could choose the right capture techniques, such as snip

Good point! I would never buy consumer kit from specs.
Rather, how it sounded to me when reproducing the sorts of program material I was listening to at that point in my life.

I never do; I want to hear it, listen to it. Now, if I were wearing the producer's hat I'd argue for my vision of the finished product if it didn't fit what you heard, but I at least need a starting point if I'm the engineer. What you want to hear at the other end is going to govern how I approach it from the beginning, just as the green color scheme might not be available with the trailer towing package.

"Why do I have to specify this parameter before that parameter? They are independent, yet you are forcing me to pick a certain one before the other. How did you snip

Another reason I like configuring software with text files if I can get it. I can look through the configuration file, set options I'm sure of, and do some more poring over the docs to understand further what needs to be defined.

The problem with that approach comes when two option choices are interdependent. There is nothing preventing you from asking for a set of incompatible options -- until some program examines your choices and complains.

Indeed it does, and many who offer software configurable in this way will tell you that these two options are mutually exclusive; you can not have both.

Can't multiple menus be displayed concurrently? For example, there are many desktop GUIs that will let you "pin" (think: thumbtack) a menu or a dialog snip

There's the rub. I've seen two approaches with a lot of these. One approach gives you banks of channel strips, say 1-16, 17-32, etc. Possibly even in 8-channel banks, to make for a smaller footprint. So, if I'm wanting to do a line check on channel 24, let's say, and we've got 8-channel banks, I've snip

There's my main stumbling block. With my old analog iron all the aux sends are there, bus assignments, VCA groups, all are right there.
Yes, it means sometimes the mixer is working at full extension of his body to reach that control, but that control is there, and I can manipulate it, or somebody else can while I'm doing something else.

Understood. The same sort of thing is true with theatrical lighting panels, video switchers (the video equivalent of an audio mixer), etc. To make everything visible and accessible, or just some selectable subset of it. I have the same sort of problem when authoring multimedia presentations (though those aren't done in real time).

There ya go. If it's there I don't have to think about where am I, how I get there.

That's why I thought the push-pin approach might be a good compromise -- let you decide which parts of the interface you want to have access to. But you are still constrained by the physical size of the display.

Might, so long as that channel to my brain doesn't force me to take my attention from the primary task, which is paying attention to the audio. What I'd really want to do before plunking down the dollars for a digital console was actually work with it for a couple of days first, and get my mind around some of the concepts. snip

I would imagine that might give you an 80% idea of what the change would be like. But, someday, you'd find yourself facing a problem that had to be solved NOW and scurrying to sort out how to get to the solution you want in that environment.

Yes, and I need that 80% first. Because within that 80% are going to be everyday situations.

Sort of like a surgeon doing a laparoscopic procedure and suddenly everything going to ****. Drop the tools, grab a big knife and cut the patient open. You're not going to fix the problem through that tiny incision -- unless you are incredibly skilled and fluent with the technology!

Again, a very good analogy.

As I said, I've put a lot of thought into what you really need to interact with a given device and how to minimize the cognitive loading on the user.
You don't want to require 100% of his attention. Especially if it is to perform some low-grade task!

Indeed, something that I believe is becoming lost to many of our product designers.

Imagine if you had to type the word "ANSWER" on a keyboard to answer your phone. Ridiculous, right? It would require too much focused attention for a task that should be trivial.

Right, and they forget that no matter the channel, there's only so much bandwidth it can support, and that has more relevance than just your internet connection. This is why I'd really need to sit down in front of one for a day or two, at least with multitracks of prerecorded material up, to really put the thing through its paces, and why often I'm reluctant to buy a new device from the in-store demo. It's taken me a long time to decide on one of those little recorders like the Zoom, etc. But, snip

Exactly. With use, you develop a sort of muscle memory as your hands are accustomed to making certain motions to do certain things. If, instead, you have to coordinate your eyes and hands to *pick* a particular option from a linear list, you have to rely on that visual feedback to ensure you are at the right point in that list before making your selection.

Indeed, and whether it be visual or auditory, forcing you to give your attention to the device to choose might distract you from more important work. How many people look at the DTMF pad on their touchtone phone when dialing a number? Once you drive a car for a while, when it starts to rain you automatically know where to find the controls for the windshield wipers. Get rid of that car, buy another, and for a while muscle memory is going to fail you, until you get used to the new one.

Interesting discussion of your touch screen controlled device. Thanks!

Regards, Richard
#373
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Hi Richard,
[World Trade Center] Yes, and what amazes me is that his dog guide didn't totally freak out. I've seen dog guides act rather unpredictably in crowds of people who are either panicked, or big crowds of blind folks who aren't as careful about "don't step on doggie" etc. It takes a rather disciplined human/canine team to keep the dog calm and on mission at such times.

I got a lecture from Michael, one time, about the nature of that relationship. I, of course, see dogs as pets or companions. Michael stressed that this was not the case with service dogs. They *can't* make mistakes. If my dog runs out into the road to chase a car, worst case, I may lose a dog. A service dog doing the same thing could result in losing your life!

I've trained my own a couple of times, but never did the dog go to the bandstand or the gig with me. I don't believe a dog should be subjected to the constant environment of loud music, etc. Dogs are, after all, color blind, their eyes aren't that great to begin with, and it's their noses and ears that tell them about their environment. Subjecting a dog to that sort of environment on a regular basis is cruelty, IMHO. YMMV of course.

I recall visiting the "Guide Dogs for the Blind" facility in Palo Alto (?). Amazing to see the effort that goes into raising and training a service dog! It's truly an "investment", not a "pet".

I suspect you could rework the weather broadcasts to use a limited vocabulary and, thus, better speech quality. On the other hand, it means that you have to be able to anticipate EVERYTHING that you might need to say over that medium. For example, you might not be prepared to use it to announce an alien invasion! <grin>

Right, and with the same alerting system it might have to announce all sorts of things, which is why I think NOAA went originally with DECtalks. I think they've changed some of that now, but there for a while I heard DECtalk's "Huge Harry" voice quite a bit.
I think I commented, some not quite as familiar asserted NOAA was using their "Perfect Paul", but I beg to differ; it sounded more like the former to me.

All of the voices are just tweaks to a core set of parameters in the waveform generator. E.g., the "backup" synthesizer that I designed is conceptually modeled largely on the Klatt synthesizer -- which was the basis of DECtalk. But, no matter how much you tweak the parameters, it still sounds like the same voice. I.e., as if they all shared the same genes!

I have a trimmed-down synthesizer that I fall back on if the primary synthesizer is unavailable in one of the products I'm designing, currently. It needed to be robust -- so that I could count on it working regardless of what might have broken in the system. I could have opted for a better quality limited-vocabulary design -- but didn't want to have to set that vocabulary in stone and discover, later, that I needed to be able to say something that I couldn't. That's the weakness of the captured phoneme model. Memory intensive, and limited in utility.

There are different approaches with different resource and capability tradeoffs. At one end of the spectrum, you can prerecord canned utterances and just play them back to the user. Or, assemble them from smaller phrases that have been carefully recorded with inflection that seems to fit together seamlessly. At the other end, something that builds waveforms from mathematical models -- like KlattTalk. Or, even articulatory synthesizers that try to model the entire vocal tract! In the middle, you can have things like diphone synthesizers where you take speech samples (from real people) and carefully cut them into "units" which you then assemble dynamically to make whole phonemes and, eventually, words and utterances.
Since patching phonemes together results in artificial transitions between adjacent phonemes (i.e., transitioning from an "ah" sound to a "th" sound like in "father"), diphone synthesis deals with just the transitions and pieces transitions together! For example, a phoneme-based synthesizer would have an "f" phoneme, an "ah" phoneme, a "th" phoneme, etc. that it would glue together (smoothing out the bumps between them!) to speak "father". A diphone synthesizer would have recordings of the silence-to-f transition, the f-to-ah transition, the ah-to-th transition, etc. So, it would piece together these transitions and have them *meet* in the middle of real phonemes! I.e., the silence-to-f glues to the f-to-ah in the *middle* of the "f" sound. It's easier to smooth the start of an "f" with the end of an "f" than it is to smooth an "f" to an "ah". But, there are a boatload more transitions than there are phonemes! For example, if you have just 4 phonemes, you might have 12 or 16 transitions: 1-2, 1-3, 1-4, 2-1, 2-3, 2-4, 3-1, 3-2, 3-4, 4-1, 4-2, 4-3, plus, perhaps, 1-1, 2-2, 3-3 and 4-4. Imagine if you have 40 or 60 phonemes!

The DECtalk Express suffers from the sin of requiring a special rechargeable battery. Another fault, in my opinion, for an assistive technology device (where do you buy that replacement battery -- today??)

Yeah, I know, and that rechargeable is going to eventually fail to recharge, and it might decide to do so just when I really want it, and no, I don't want to wait to mail order another one, and then be told that the "other one" I ordered doesn't function in the way I've become accustomed to, or won't work in my environment at all.

Or, "Sorry, we no longer sell those parts. But, could we interest you in our new model? It's only $295..."

Dan Ariely. We were "required" to take eight courses in The Humanities to graduate. I guess they didn't want a bunch of engineers with no appreciation of other aspects of life and education let loose on the unsuspecting masses.
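The transition arithmetic above generalizes simply: an n-phoneme inventory needs n*(n-1) ordered transitions between distinct phonemes, or n*n if the self-transitions (1-1, 2-2, ...) are recorded too. A quick sketch to make the scaling concrete:

```python
def diphone_counts(n):
    """Diphone transitions for an n-phoneme inventory:
    ordered pairs of distinct phonemes, and the total including
    self-transitions (1-1, 2-2, ...)."""
    distinct = n * (n - 1)   # e.g. 4 phonemes -> 12 transitions
    with_self = n * n        # e.g. 4 phonemes -> 16 with self-transitions
    return distinct, with_self
```

So the 40-phoneme case mentioned in the text needs 1560 distinct transitions, or 1600 with self-transitions: quadratic growth is why diphone inventories are so much larger than phoneme inventories.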
Yep, that's the man. Dan's an interesting read on the subject. I was thinking I'd place money that you'd read him or were at least familiar. Many of the texts I mentioned led to each other. I.e., if the subject matter is interesting, you tend to look for similar texts to wade through. The reference librarian at the local library branch tends to know what sort of titles might interest me, so she feeds them to me as she comes across them. Another interesting read was _The Compass of Pleasure_, which deals with how we perceive pleasure! I.e., the similarities between the rush you get from an epiphany, sex, drugs, food, etc. Unfortunately, it relied too heavily on neurochemistry and neurophysiology to explain what was going on in the brain in each of these scenarios. So, it was a harder read than it needed to be. But, it explains how you can become desensitized to certain experiences, etc. While reading it, I had a flashback to a video game design from a friend. I asked him why there was so much variation in the quality/impact of some of the special effects. I.e., why not ALWAYS put the most spectacular versions out there? His reply: "then they cease to be spectacular!" I've learned to adopt a similar approach in many of the things that I do routinely. For example, I bake a fair bit (cookies, pastry, desserts, etc.). I used to strive for consistency in each batch. All the cookies the same size, texture, etc. Now, I intentionally vary parts of the batch in different ways. Make some cookies larger or smaller. Bake some a bit less, others a bit more. *Burn* some, etc. And, people nibbling on them tend to notice them more -- less likely to get into a mindless eating mode where you just shovel them down without noticing what -- or how many -- you are eating! I recall selecting American History as one of my courses thinking I had already had two years of that in High School so it would be a recent memory, for me! The professor was an economist.
So, I relearned all that history with an entirely different spin than the noble presentation to which I'd previously been subjected. Fascinating! Gives you a different perspective, doesn't it? Disturbing. Also makes you feel like a real *sap* for buying into much of that naive patriotism. frown I'd suggest Dan to audio engineers, marketing people, pretty much any profession. Some real thought-provoking stuff there about the way we operate in everyday life, how we interact with our customers/clients, etc. For the uninitiated that aren't bored to tears and are still following along: it's not all dry tome; there are some great moments in those two books that will tickle your funny bone. Agreed. My current read (_The Art of Choosing_ -- actually on its way back to the library, tonight) also exposed me to many differences in cultures that I probably never would have been aware of -- even if I had visited some of these cultures! For example, in parts of Europe, doctors make care decisions and just tell patients and family what those will be. Very different from how things are done in the U.S.A. And, there are psychological consequences to these differences for the folks in those situations! So, I've enjoyed reading books by economists that touch on these sorts of subjects. _The Price of Everything_ discusses all of our actions -- social and otherwise -- in terms of economic transactions. E.g., a woman selling uterine services in a marriage transaction. Good analogy!!! Disturbing analogy! Especially that someone would think about this sort of "activity" in those terms! _How We Decide_ and _Predictably Irrational_ looked at how easily we are manipulated and con ourselves in our behavioral choices, etc. How we can actually think a $10 pill is better than an identical $0.50 pill, etc. How we *don't* have a "market" in which consumers and producers compromise on price but, rather, how producers manipulate our expectations of price to a point that they are happy with, etc.
Indeed, we hoodwink ourselves often without really thinking about it. Or, how marketers exploit these behaviors to get you to increase what you were willing to pay, originally. frown There's my main stumbling block. With my old analog iron all the aux sends are there, bus assignments, VCA groups, all are right there. Yes, it means sometimes the mixer is working at full extension of his body to reach that control, but that control is there, and I can manipulate it, or somebody else can while I'm doing something else. Understood. The same sort of thing is true with theatrical lighting panels, video switchers (the video equivalent of an audio mixer), etc. To make everything visible and accessible, or just some selectable subset of it. I have the same sort of problem when authoring multimedia presentations (though those aren't done in real time). There ya go. If it's there I don't have to think about where I am, or how I get there. For a better example, imagine being lost in phone menu hell! "Where am I? How do I get to where I want to be? Should I just hang up and start over?" Exactly. With use, you develop a sort of muscle memory as your hands are accustomed to making certain motions to do certain things. If, instead, you have to coordinate your eyes and hands to *pick* a particular option from a linear list, you have to rely on that visual feedback to ensure you are at the right point in that list before making your selection. Indeed, and whether it be visual or auditory, forcing you to give your attention to the device to choose might distract you from more important work. How many people look at the DTMF pad on their touch-tone phone when dialing a number? Exactly. How many iPhone users can dial with their hand *in* a purse, etc.? Once you drive a car for a while, when it starts to rain you automatically know where to find the controls for the windshield wipers.
Get rid of that car, buy another, and for a while muscle memory is going to fail you, until you get used to the new one. Yup. The same is true of a different keyboard layout. For example, my UNIX machines tend to have different keys in different places -- and different functionality -- than my PCs. So swiveling my desk chair to type on one keyboard or another means I have to mentally switch my typing patterns to a different layout -- then, back again when I swivel the chair to its original orientation. Interesting discussion of your touch screen controlled device. Thanks! That and the multimedia / home-automation system are my most ambitious undertakings to rely on alternative display and control technologies. The touchpad is particularly different because it requires the user to think in terms of shapes and how they differ. But, without *looking* at them! It has to be a more internalized, intuitive "feel". And, I have to use lots of completely bogus terms to describe characteristics of those shapes. For example, an 'S' is "curvier" than a 'C' while a 'Z' is "jagged-er" than an 'S', etc. Note that not all shapes can be easily correlated with letters. So, you need a lexicon that users can relate to so that they know how a particular shape is likely to be defined. E.g., imagine a W standing on its side. Or, a 'C' lying face down. Or a "box open on bottom" (imagine a square 'U' upside down). Then, consider geometric operators applied to those shapes. For example, one shape suggesting "do this" while its mirror image says "do the opposite". To be successful, you can't have what appear to be arbitrary symbols with arbitrary meanings. It has to be something that a user can readily relate to WITHOUT THINKING -- like zipping their fly grin
#374
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On Sat 2012-May-26 15:12, Don Y writes:
snip Usually I'll tell the screen reader app to remain silent, and park the speech cursor over that display. If that display is going to be changing a lot I tell the screen reader to ignore that line entirely, and only force it to go there when I want to look at it. Usually I'll use monitoring of a status line to tell it when to change configurations. E.g., when the status line changes to x from y, load a configuration which tracks a light bar as your focus of where you are, etc. etc. Rather complex, and probably extremely boring to the folks here. We should probably take this line to email <g>. OK. I'll pick a suitable subject line so my mail is recognizable -- though probably not today, as I am busy getting my other half ready for a trip. Probably good for general access technology discussions <g>. Be sure to despamproof the address <g>. This is directly analogous to reading printed text. You can cherry-pick through large amounts of information with relatively little effort. Indeed, and, believe it or not, I retain what I read better, as well as read faster, using it. In most cases, synthesized speech is the most cost-effective -- and, in other ways, most effective -- compromise that can be achieved. Braille displays are clunky, hard to maintain, and don't achieve good reading speed or efficient work flow, unless you're a customer service rep dealing with both the computer and customers on the phone. After all, they take your hands away from the keyboard, they can display a very limited amount of text at one shot, all them little solenoid springs and mechanical parts ...aaargh Not to mention the expense! I have a Braille 'n Speak here (I think that is the name). There is an Italian firm that makes a Braille display that is piezoelectric, which should be easier to keep running. But, I think they want $700/cell -- or something equally outrageous! Yep, and that's expensive.
I've heard over the years piezoelectric might be the wave of the future in those things, but right now they're electromechanical clunkety clackety clickety bang bang devices. Maybe if the eurozone crumbles, you can pick these up for a song! grin rotfl I want the one that gives me the whole 25-line, 40-cell page <g>. yeah dream on. [copy protection] Exactly what I ran into with it. My mom didn't want a screen reader, but was glad to have my help maintaining her system, often without her having to stand over my shoulder and play screen reader. snip Borland used to have a "like a book" license. I.e., you could move the license around. But, still not as flexible as it could be. Yeah, and that depends on the honor of those paying for it. The problem is, people think there's no "cost" there, and let that confuse their idea of "value"! Just because you didn't pay for something doesn't make it valueless! I.e., if you think there is no value to that second copy of the software, then live without it -- you should not experience any *costs* if it had no *value*! Yep, that's the whole point. I know a friend of mine offered to get me a cracked copy of JAWS a couple years ago, and I said thanks but no thanks. The more ethically honest approach is to not buy it in the first place if you don't like it. State rehab bought my copy years ago but would not spend any more dough for another key disk, and I refused to spend any more for another key because they told me they didn't trust me, and I wasn't welcome to service other peoples' machines even on a volunteer basis. This means that a person wanting to go into repairing computers for others and maintaining them can't buy the software with confidence either, so hence I don't own a copy. If you provide a nonvisual OPTION, then applications will only give token support -- if any -- to it. On the other hand, if the only way to get information out of a device is via that option, then they don't have a choice!
This is the approach I have been taking, lately. Pick an output modality that addresses everyone in the target audience and force everything to use that single mechanism! Indeed, and this is what we're finding with a lot of web portals that do things that are only usable with vision. Anyone who's doing web development should look at a series of articles discussing just this issue in this month's Braille Monitor, available in text, no doubt, from www.nfb.org. I've not looked at their site in a long time. To be honest, I'm a bit set off by the "evangelism". Perhaps it is necessary to bring the "message" to the "unwashed masses". But, in my case, it feels like preaching to the choir. Yeah I know, I get a little of that too, but I remind myself they're always trying to expose newcomers, so a good bit of that is necessary. But, I mention it in this group because there are a lot of folks here developing web content, often for others. Sort of like telling a smoker he should quit. I'm sure he already knows that. Yeah, this is true, but you're also having to grab that newcomer and overcome conditioning that has been with him all his life, so a bit of total immersion is necessary, though us old hands get rather tired of the proselytizing <g>. The backup speech synthesizer in one of my products has similar quality issues as Klatt's DECtalk (he wrote it while a student and DEC commercialized it). Its biggest advantage is that it is pretty lean when it comes to resources -- which translates directly to implementation costs and reliability. Did you ever check out that kid a few years ago that made a DECtalk sing? He spent some serious time coding that; IIRC the kid was only 16 years old or so when he did this one. snip The Votrax VS6.3 was capable of singing (poorly). As well as multilingual speech. I recall hearing one speak German. But, the extra capabilities don't really translate into better "regular speech". So, why pay for them? Yeah, I'd heard it could.
I used the Votrax while working at a large vending site; it was coupled with a coin sorter via a serial cable. At Kurzweil, we had frequent failures in the Votrax subsystem. All of the boards were potted -- to discourage copying -- so when one of the four boards died, it was irreparable. Yet another case of someone going out of their way to protect their market share. Funny, I don't hear the name Votrax bandied about anymore, so I guess they wasted their efforts clinging to the past instead of embracing the future! Yeah I know; other than that coin sorter I've never seen a Votrax box anywhere else. I have a DECtalk DTC01 and a DECtalk Express. Plus a few of the Artic/Votrax-based synthesizers. All have the same basic advantage -- and the same robotic speech quality! Yep, as does my DoubleTalk. I had, before Katrina, two DoubleTalk internal cards, a DoubleTalk Lite external, and an audaptor. While it is unfortunate (the **** poor quality of the speech), it is still surprisingly easy to get used to the oddities of these "dialects". Especially when the alternative may be to be deprived of some interactions! Yes, and you can grow accustomed to it. My lady is starting to understand this one fairly well, but then she should, after living with me for over a decade <g>. snip Festival approaches everything from the top down. It tries to understand the context of the material. From that, the appropriate pronunciation rules. And, finally, actually synthesizing the speech waveforms. I'd heard that one elsewhere. I think an electrical engineering friend was telling me about that one a few years ago. But, the implementation never focused on performance issues. Rather, they wrote it so that it would be easier to write and maintain. So, it takes a fair bit of resources just to say "Hello". Those resources translate into dollars in anything other than a PC environment (i.e., your talking calculator would speak nicer but cost more!)
Indeed, and there's that principle of no free lunch again. You need fairly quick response to be interactive, so you gotta sacrifice something to get it <g>. Does the chassis force you to use a certain type of disk drives? E.g., because of disk carriers? Many older RAID offerings require SCSI disks. Would you be happy with RAID in some other form? snip I know many of the Dell machines that I've used over the years had special drive carriers for the RAID drives. Often so you could hot-swap them, etc. Others just hid the array in the bowels of the machine. Have seen that in those too. This is of course a tower; might be Dell, might be another. Note that RAID can be a huge bellyache! When a drive fails, you usually don't have many options other than replacing it as is in the working set. To do this, many machines have special BIOS extensions that you have to work with *in* the BIOS (which may be an issue accessing, for you) to add the drive to the working set, format it, rebuild its contents, etc. This is true also, and I'm debating on that issue too: whether RAID is overkill, or I can really get by with a script that might run during periods of low activity to copy files to a drive that's offline. So, it's not written in stone, but I mentally like the idea of RAID. When relocated and able to get reasonably priced broadband that allows me to run my own servers and doesn't force me to subsidize Rupert and Disney, I'll be at the point of decision <g>. snip Machine is sitting in a storage unit with some of those little desiccant packages inside the case <g>. I wonder how much spiders like desiccants?? grin rotflmao!!! snip Understood. You want a different communication channel to interact with the device instead of having to share the audio channel that you are devoting to the task at hand. That's it exactly. I'm accustomed, as I said in another post in this thread, to not having to do anything but directly communicate with the device.
If I want that channel strip assigned to a certain bus or certain VCA group, push the button. BECAUSE YOU RELY ON MEMORY! IMO, the single biggest asset a visually impaired user has is memory. Remembering where you last put something. Remembering how you set a parameter. Etc. A blind man with Alzheimer's has got to be sheer terror! Indeed; I've told my kids if that happens, bring the gun! If I want audio from that channel on aux bus 3, I adjust that aux send. I don't have to play where the f*$@ am I? It's automatic, like buttoning your coat, zipping your pants or tying your shoes. Great analogies! They are low skill tasks, so why require lots of attention to perform them? Beyond that, see comments in my previous post on bandwidth. The primary mission is the audio; that's what I'm paid for. There is already the intercom situation trying to occupy some of that bandwidth, and we all know that every channel, in every aspect of life, is bandwidth limited. The AM station that carries Rush has less available bandwidth than does the TV channel carrying the Simpsons. The only way to get more bandwidth on a channel is to increase the size of the channel. To get more water flow to your house, increase the size of the main. Selecting a bus assignment, etc. are normally low bandwidth things: depress the switch, then either your finger tells you the switch is depressed, or the idiot light tells you, etc. Trying to listen to synthesized speech while trying to give my attention to selecting which bank of channels I'm on, or whether those controls are aux send 3 or aux send 4, while trying to listen to the audio I'm paid to listen for, and the guy running the spotlight talking about the chick with the big knockers in the front row is just ... a bit overwhelming. Somethin's got to give somewhere. I've used the same rationale to argue in favor of using non-visual channels for visual tasks!
I.e., those cases where your eyes are busily engaged in some activity and shouldn't have to be pulled away just so you could see which virtual button you were pressing on your iPhone! Uh huh! Just my point with some of these complex devices that are made for people to operate while driving, etc. Give them an auditory channel for the info; keep the eyes on the road, and the hands upon the wheel, please. Exactly. But there are lots of other cases that aren't as dramatic. Why do I need to look at my iPod to use it? Or, my telephone? Why can't I read my appointment calendar while I'm taking a walk around the neighborhood (I have to stop moving in order to read all the details on that silly little display as it "bounces around" too much when I am walking)? True, although the developers might wonder why you need to look at your calendar when you're walking around your neighborhood in the first place. But, it's well within the realm of possibility that, for whatever reason, you might wish to do so, maybe because you can't quite recall whether you've an appointment this morning, and if you can refresh your memory you'll know whether you have the time to stop and chat with the old boy walking his dog you see all the time, and buy a cup of coffee to sit and chat awhile. Why should a soldier have to take his eyes off The Enemy just to consult some fancy piece of high-tech kit? Or, an ophthalmologist have to remove his eyes from peering into yours just to see what some device is trying to convey to him? Good points as well, and we don't think. I've been glad to see that the GPS folks such as TomTom realize this, and give the driver instructions with a voice. Regards, Richard -- | Remove .my.foot for email | via Waldo's Place USA Fidonet-Internet Gateway Site | Standard disclaimer: The views of this user are strictly his own. |
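Richard's status-line trick from earlier in this post -- watch one row of the screen, and when it flips to a known state, swap screen-reader configurations -- can be sketched in a few lines. Everything here (the state names, the config labels, the `pick_config` helper) is invented purely for illustration; real readers like JAWS script this sort of thing in their own macro facilities:

```python
# A toy model of monitoring a status line to drive screen-reader
# configuration changes. The watched row is checked against a table of
# known states; an unrecognized line leaves the current config alone.

READER_CONFIGS = {
    "PLAYING": "silent",        # busy display: ignore that line entirely
    "STOPPED": "track-focus",   # follow the light bar again
}

def pick_config(status_line, current_config, configs=READER_CONFIGS):
    """Return the configuration to load when the watched row changes.

    Only switches when the line actually names a known state; otherwise
    the reader stays in whatever mode it was in.
    """
    for state, config in configs.items():
        if state in status_line:
            return config
    return current_config

print(pick_config("Transport: PLAYING 00:12", "track-focus"))  # silent
print(pick_config("Transport: STOPPED", "silent"))             # track-focus
print(pick_config("some unrelated line", "silent"))            # silent
```

A real implementation would poll the screen buffer and call this on every change; the point is just that the decision itself is a tiny table lookup.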
#375
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Hi Richard,
On 5/26/2012 7:20 PM, Richard Webb wrote: On Sat 2012-May-26 15:12, Don Y writes: Rather complex, and probably extremely boring to the folks here. We should probably take this line to email OK. I'll pick a suitable subject line so my mail is recognizable -- though probably not today as I am busy getting my other half ready for a trip. Probably good for general access technology discussions <g>. Be sure to despamproof the address <g>. OK. I'll reply to this and any further USENET posts by you to your email address.
#376
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Richard Webb wrote:
Uh huh, which I hope will suddenly occur to all these producers and others who think that the audio we consume *must* be squeezed to get maximum level, hence destroying the music. When you listen to a symphony, or a good jazz piece, the dynamics vary. Those variations are part of what makes it interesting and causes you to think of the music as something more than background noise. Variations in texture, size, etc. cause people to actually think about what they're "consuming" and actually derive enjoyment from it. I recorded a school orchestra playing a while ago, taking care to set the peaks as close to zero dB as I could, but maintaining all the dynamic range. Listening to the pseudo-soundfield recording, it feels as if you're there on headphones, and it sounds excellent on speakers. The first comment of the conductor on hearing it was "It's too quiet..." I just sent them the master copies and left them to ask one of the students to compress it to taste. No pressure on me; it was a freebie anyway. -- Tciao for Now! John.
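John's approach -- push the peaks near zero dB while leaving the dynamics alone -- amounts to applying one fixed gain to the entire recording, which DAWs usually call peak normalization. A small sketch of the arithmetic (function names are mine, and the sample values are a made-up test tone):

```python
# Peak level and clean-gain arithmetic for peak normalization.
# Applying the same gain to every sample moves the peak without
# touching the dynamic range -- no compression involved.

import math

def peak_dbfs(samples):
    """Peak level of a block of samples (floats in -1.0..1.0), in dBFS."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")

def normalize_gain_db(samples, target_dbfs=-0.3):
    """Gain (dB) that moves the peak to target_dbfs."""
    return target_dbfs - peak_dbfs(samples)

# A 440 Hz tone at half scale, sampled at 48 kHz.
tone = [0.5 * math.sin(2 * math.pi * 440 * t / 48000) for t in range(4800)]
print(round(peak_dbfs(tone), 1))          # about -6.0 dBFS
print(round(normalize_gain_db(tone), 1))  # about 5.7 dB of clean gain available
```

The conductor's "it's too quiet" complaint is about average loudness, not peaks -- raising that any further would have meant compressing, which is exactly what John left to someone else's taste.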
#377
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On 5/27/2012 9:42 AM, John Williamson wrote:
Richard Webb wrote: Uh huh, which i hope will suddenly occur to all these producers and others who think that the audio we consume *must* be squeezed to get maximum level, hence destroying the music. When you listen to a symphony, or a good jazz piece, the dynamics vary. Those variations are part of what makes it interesting and causes you to think of the music as something more than background noise. Variations in texture, size, etc. cause people to actually think about what they're "consuming' and actually derive enjoyment from it. I recorded a school orchestra playing a while ago, taking care to set the peaks as close to zero dB as I could, but maintaining all the dynamic range. Listening to the pseudo soundfield recording, it feels as if you're there on headphones, and sounds excellent on speaker. The first comment of the conductor on hearing it was "It's too quiet..." I just sent them the master copies and left them to ask one of the students to compress it to taste. No pressure on me, it was a freebie anyway. The question is ... what was the piece of music? Doug McDonald |
#378
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On Sat 2012-May-26 22:03, Don Y writes:
Yes, and what amazes me is that his dog guide didn't totally freak out. I've seen dog guides act rather unpredictably in crowds of people who are either panicked, or big crowds of blind folks who aren't as careful about don't step on doggie, etc. snip It takes a rather disciplined human/canine team to keep the dog calm and on mission at such times. I got a lecture from Michael, one time, about the nature of that relationship. I, of course, see dogs as pets or companions. Michael stressed that this was not the case with service dogs. They *can't* make mistakes. If my dog runs out into the road to chase a car, worst case, I may lose a dog. A service dog doing the same thing could result in losing your life! Yep, which is why I've lectured blind musicians about bringing their partners into such environments. I'd be just as critical of the blind guy who insisted that his dog coexist with him in the noisy machine shop, etc. That relationship should be nurtured, and respect shown the animal part of said team. After all, we ask the public to respect it and are always admonishing them "when doggie is in harness don't pet doggie." re speech synthesis: All of the voices are just tweaks to a core set of parameters in the waveform generator. E.g., the "backup" synthesizer that I designed is conceptually modeled largely on the Klatt synthesizer -- which was the basis of DECtalk. Indeed, very similar voices, some differences in pitch, but timbres are very similar, for the musicians among us <g>. But, no matter how much you tweak the parameters, it still sounds like the same voice. I.e., as if they all shared the same genes! Good analogy! For some reason, though, DECtalk is one of the most easily understood by the neophyte to that world. I remember helping a blind vendor computerize his bookkeeping and inventory a few years ago (where I ran into the Votrax) and trying various synthesizers with him, including Artic's offerings, RC Systems' DoubleTalk series that I'm fond of, etc.
We ended up going with DECtalk because he could understand it best. I have a trimmed-down synthesizer that I fall back on if the primary synthesizer is unavailable in one of the products I'm designing, currently. It needed to be robust -- so that I could count on it working regardless of what might have broken in the system. snip That's the weakness of the captured phoneme model. Memory intensive, and limited in utility. There are different approaches with different resource and capability tradeoffs. Indeed. Your discussion of constructing sounds from phonemes and various building blocks was probably very instructive to some here. For those with a DAW, you can illustrate this to yourself quite easily: zoom in on the waveforms quite closely with your DAW. Imagine your world of constructing the piece of music from various takes, being sure that the tempo is the same, or close enough, etc. etc. Many of us know about the battle with getting that smooth seamless punch-in. This, then, is an exercise in the smooth seamless punch attempt from hell. Analog guys, try this one with your razor blade <g>. snip But, there are a boatload more transitions than there are phonemes! For example, if you have just 4 phonemes, you might have 12 or 16 transitions: 1-2 1-3 1-4 2-1 2-3 2-4 3-1 3-2 3-4 4-1 4-2 4-3 plus, perhaps, 1-1 2-2 3-3 and 4-4. Yep, and the more varied you want the vocabulary to be, the more the mathematical possibilities expand. This is why you'll never get fully natural sounding speech no matter what type of synthesis you use. You could apply a supercomputer to the task and still not get it right reliably. You could get closer, but ... The DECtalk Express suffers from the sin of requiring a special rechargeable battery. Another fault, in my opinion, for an assistive technology device (where do you buy that replacement battery -- today??)
Yeah I know, and that rechargeable is going to eventually fail to recharge, and it might decide to do so just when I really want it, and no, I don't want to wait to mail-order another one, and then be told that the "other one" I ordered doesn't function in the way I've become accustomed to, or won't work in my environment at all. Or, "Sorry, we no longer sell those parts. But, could we interest you in our new model? It's only $295..." Yep, and that's the one that doesn't function in my environment at all. This is why, these days, I often look for gear that uses off-the-shelf rechargeable batteries. obsolete this! Dan Ariely. We were "required" to take eight courses in The Humanities to graduate. I guess they didn't want a bunch of engineers with no appreciation of other aspects of life and education let loose on the unsuspecting masses. Yep, that's the man. Dan's an interesting read on the subject. I was thinking I'd place money that you'd read him or were familiar. Many of the texts I mentioned led to each other. I.e., if the subject matter is interesting, you tend to look for similar texts to wade through. I'm sure they did. Having just been exposed recently, I found it fascinating. Makes one think about a lot of things: how we are manipulated, and manipulate ourselves when it comes to our expectations. Here again, look at the example of the vibrating VU indicator I put in my pocket on gigs. Before the vibrating silent pager made those little vibrator motors ubiquitous, I never expected that I'd be able to have a silent VU indicator. If I got one, it would be impractical to use because it would force me to devote one hand to monitoring it all the time, leaving me with one hand to operate the mixing console. But, they were some of the first things I arranged to recreate after Katrina because they're inexpensive to build, and had I not the facilities to gin them up with off-the-shelf parts and some project boxes, I would have paid good money to acquire them again.
The reference librarian at the local library branch tends to know what sort of titles might interest me so she feeds them to me as she comes across them. Always handy. Had one at the library for the blind in Iowa who was good about that, but these days my braille library service comes from a multistate center, and although they might be trained in library science, they're more akin to a warehouse filling orders. Another interesting read was _The Compass of Pleasure_ which deals with how we perceive pleasure! I.e., the similarities between the rush you get from an epiphany, sex, drugs, food, etc. Unfortunately, it relied too heavily on neurochemistry and neurophysiology to explain what was going on in the brain in each of these scenarios. So, it was a harder read than it needed to be. Indeed. I have heard of that one but never seen it. About a year ago I did _This Is Your Brain on Music_, though. I still wonder, though, how much the strange environment of the MRI and all that doesn't mess with the results. After all, we don't have sex, or listen to music, or other such activities when crammed into a metal tube in a noisy, sterile environment. If you could capture those brain images of a guy listening to Duke Ellington or the Beatles while he's kicked back in his favorite chair, without the apparatus, you might get far different results. But, it explains how you can become desensitized to certain experiences, etc. While reading it, I had a flashback to a video game design from a friend. I asked him why there was so much variation in the quality/impact of some of the special effects. I.e., why not ALWAYS put the most spectacular versions out there? His reply, "then they cease to be spectacular!" I've heard this in a lot of endeavors, or a version of it. After a while, too much "hey wow" desensitizes the mind to it. We see it in everyday life. Drifting off topic for the group again; sorry guys.
I'm sure that the customer service rep is glad to have that braille display to do his job interacting with customers on the phone instead of being forced to rely on speech, but as for me, I know about Moore's law and how wonderful the technology has become just in my time on this earth: Ray's reading machine, information at my fingertips, etc. etc. For the money I'd just as soon wait for that next generation of braille displays that uses maybe a combination of piezoelectricity and electrochemical reactions to give me a page of refreshable braille <grin>.

But, back to topic. Why do we go to so many live concerts and come away disappointed with the sound? Because we, and the folks putting on the show, have been conditioned to think that's the way it always is. We got used to live sound being the way it was back in the days of those awful sounding Voice of the Theater cabinets, etc. That was what live amplified music was supposed to sound like in our heads, and even though we've got better signal processing, better speaker arrays, etc., we still go for that sound because it's the way live music requiring amplification sounded to us.

I've learned to adopt a similar approach in many of the things that I do routinely. For example, I bake a fair bit (cookies, pastry, desserts, etc.). I used to strive for consistency in each batch. All the cookies the same size, texture, etc. Now, I intentionally vary parts of the batch in different ways. Make some cookies larger or smaller. Bake some a bit less, others a bit more. *Burn* some, etc. And, people nibbling on them tend to notice them more -- less likely to get into a mindless eating mode where you just shovel them down without noticing what -- or how many -- you are eating!

Uh huh, which I hope will suddenly occur to all these producers and others who think that the audio we consume *must* be squeezed to get maximum level, hence destroying the music. When you listen to a symphony, or a good jazz piece, the dynamics vary.
Those variations are part of what makes it interesting and causes you to think of the music as something more than background noise. Variations in texture, size, etc. cause people to actually think about what they're "consuming" and actually derive enjoyment from it.

I recall selecting American History as one of my courses, thinking I had already had two years of that in high school so it would be a recent memory for me! The professor was an economist. So, I relearned all that history with an entirely different spin than the noble presentation to which I'd previously been subjected. Fascinating!

Gives you a different perspective, doesn't it?

Disturbing. Also makes you feel like a real *sap* for buying into much of that naive patriotism. (frown)

Indeed it does, and reinforces some lessons you've already learned in life intuitively, but makes you think about them a bit. This old hippie learned to think about economists in a different light thanks to Dan. Before, I'd read Paul Krugman and these other guys, and put less stock in what they said than I did in the guy that spun up the weather forecast I heard this morning. I'd suggest Dan to audio engineers, marketing people, pretty much any profession. Some real thought-provoking stuff there about the way we operate in everyday life, how we interact with our customers/clients, etc. For the uninitiated that aren't bored to tears and are still following along: it's not all dry tome; there are some great moments in those two books that will tickle your funny bone.

Agreed. My current read (_The Art of Choosing_ -- actually on its way back to the library tonight) also exposed me to many differences in cultures that I probably never would have been aware of -- even if I had visited some of these cultures! For example, in parts of Europe, doctors make care decisions and just tell patients and family what those will be. Very different from how things are done in the U.S.A.
And, there are psychological consequences to these differences for the folks in those situations!

Indeed, and I don't think I'd find that acceptable at all. I've had to do battle with the medical professionals responsible for my lady's care, and, a year ago, switched her primary care physician over an issue where he endangered her life just for an unnecessary test.

So, I've enjoyed reading books by economists that touch on these sorts of subjects. _The Price of Everything_ discusses all of our actions -- social and otherwise -- in terms of economic transactions. E.g., a woman selling uterine services in a marriage transaction.

Good analogy!!!

Disturbing analogy! Especially that someone would think about this sort of "activity" in those terms!

Indeed it is, but true nonetheless. As I commented above, I'm sort of newly converted to that one.

Re mixers ... There ya go. If it's there I don't have to think about where am I, how do I get there.

For a better example, imagine being lost in phone menu hell! "Where am I? How do I get to where I want to be? Should I just hang up and start over?"

I've done that, just backed out and started over. But, on a gig that's not a viable option.

Exactly. With use, you develop a sort of muscle memory as your hands are accustomed to making certain motions to do certain things. If, instead, you have to coordinate your eyes and hands to *pick* a particular option from a linear list, you have to rely on that visual feedback to ensure you are at the right point in that list before making your selection.

Indeed, and whether it be visual or auditory, forcing you to give your attention to the device to choose might distract you from more important work. How many people look at the DTMF pad on their touch-tone phone when dialing a number?

Exactly. How many iPhone users can dial with their hand *in* a purse, etc.?

Few, I'd bet. But muscle memory manifests itself in lots of interesting ways.
Years ago one of my side jobs was throwing about 300 newspapers every morning. My daughter followed along one morning, and noted, as we got to one place, I was looking at her while walking, carrying on a conversation. My hand dipped into the bag at my side and launched the newspaper, which landed on the stoop right by the customer's front door. My kid says, "Wow Dad, you weren't even looking when you threw that!" I didn't need to; when I got to that spot my hand just automatically threw the paper, while gauging the weight and adjusting the throw so as to put it right where I wanted it. Had it been a lighter paper, such as a Monday morning, I'd want to toss more gently. For the larger Wednesday edition, put some more oomph behind it. Once you drive a car for a while, when it starts to rain you automatically know where to find the controls for the windshield wipers. Get rid of that car, buy another, and for a while muscle memory is going to fail you, until you get used to the new one.

Yup. The same is true of a different keyboard layout. For example, my UNIX machines tend to have different keys in different places -- and different functionality -- than my PCs. So swiveling my desk chair to type on one keyboard or another means I have to mentally switch my typing patterns to a different layout -- then back again when I swivel the chair to its original orientation.

Can relate to that one. Thanks for the interesting discussions. We probably should take general blindness and screen access stuff to email, just so as not to offend, was my only point elsewhere. Great to have another intelligent voice in this forum!

Regards, Richard

-- | Remove .my.foot for email | via Waldo's Place USA Fidonet-Internet Gateway Site | Standard disclaimer: The views of this user are strictly his own.
#379
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Doug McDonald wrote:
On 5/27/2012 9:42 AM, John Williamson wrote:

Richard Webb wrote: Uh huh, which I hope will suddenly occur to all these producers and others who think that the audio we consume *must* be squeezed to get maximum level, hence destroying the music. When you listen to a symphony, or a good jazz piece, the dynamics vary. Those variations are part of what makes it interesting and causes you to think of the music as something more than background noise. Variations in texture, size, etc. cause people to actually think about what they're "consuming" and actually derive enjoyment from it.

I recorded a school orchestra playing a while ago, taking care to set the peaks as close to zero dB as I could, but maintaining all the dynamic range. Listening to the pseudo-soundfield recording, it feels as if you're there on headphones, and it sounds excellent on speakers. The first comment of the conductor on hearing it was "It's too quiet..." I just sent them the master copies and left them to ask one of the students to compress it to taste. No pressure on me, it was a freebie anyway.

The question is ... what was the piece of music?

The programme included a Scaramouche with solo saxophone, a percussion piece, and Saint-Saëns' Symphony No. 3 with organ, using the Grande Orgue at Rouen cathedral. This last was "interesting" for me to balance and for the orchestra to time, as there was a thirty or forty yard gap between the organ and orchestra. I don't have the full list on this computer, but I'll post it and links to .wav files of my finished product later if anyone's interested. I don't know what the orchestra did with it after I handed them their copy. Constructive criticism is welcomed. All recorded using the internal microphones on a Zoom H2 in surround mode.

-- Tciao for Now! John.
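[Editor's aside: the single-gain approach John describes -- peaks pushed toward 0 dBFS, dynamics untouched -- amounts to peak normalization. A minimal Python sketch; the function name and the -0.3 dBFS headroom figure are illustrative choices, not anything from his actual workflow:]

```python
def peak_normalize(samples, target_dbfs=-0.3):
    """Scale a block of float samples so the loudest peak sits just
    below 0 dBFS. One linear gain applies to the whole piece, so
    relative levels (the dynamics) are preserved -- unlike compression,
    which changes loud and soft passages by different amounts."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = 10 ** (target_dbfs / 20) / peak
    return [s * gain for s in samples]

# A quiet sample (0.1) and a loud one (0.4) keep their 4:1 ratio
# after normalization; only the absolute level changes.
audio = [0.1, -0.2, 0.4]
print(peak_normalize(audio))
```

This is why the conductor heard it as "too quiet": nothing was squeezed, so the average level stays far below the peaks.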
#380
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Hi Richard,
[eliding much to try to get back within the charter of the newsgroup]

I have a trimmed down synthesizer that I fall back on if the primary synthesizer is unavailable in one of the products I'm designing currently. It needed to be robust -- so that I could count on it working regardless of what might have broken in the system. [snip] That's the weakness of the captured phoneme model. Memory intensive, and limited in utility. There are different approaches with different resource and capability tradeoffs.

Indeed. Your discussion of constructing sounds from phonemes and various building blocks was probably very instructive to some here. For those with a DAW you can illustrate this to yourself quite easily: zoom in on the waveforms quite closely with your DAW. Imagine your world of constructing the piece of music from various takes, being sure that the tempo is the same, or close enough, etc. etc.

The advantage of diphone synthesis is that the point at which you are pasting things together has a greater chance (theoretically) of being the same "sound". I.e., the silence-to-f and f-to-ah diphones should be meeting in the middle of an "f" sound. The amount of processing required to glue the front end of an "f" (from the silence-to-f diphone) to the back end of an "f" sound (from the f-to-ah diphone) is less than the effort required to transition from an "f" to an "ah" -- or an "f" to an "oo" or an "oh", etc.

Many of us know about the battle with getting that smooth seamless punch-in. This then is an exercise in the smooth seamless punch attempt from hell. Analog guys, try this one with your razor blade <g>.

Ha! But, there are a boatload more transitions than there are phonemes! For example, if you have just 4 phonemes, you might have 12 or 16 transitions: 1-2 1-3 1-4 2-1 2-3 2-4 3-1 3-2 3-4 4-1 4-2 4-3, plus, perhaps, 1-1 2-2 3-3 and 4-4.

Yep, and the more varied you want the vocabulary to be, the more the mathematical possibilities expand.
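[Editor's aside: the 12-or-16 figure above is just the count of ordered pairs of phonemes. A quick sketch, with a hypothetical helper name, makes the combinatorics concrete:]

```python
from itertools import product

def transitions(phonemes, include_self=False):
    """Enumerate all ordered phoneme-to-phoneme transitions (diphones).
    n phonemes yield n*(n-1) transitions, or n*n when same-phoneme
    pairs like 1-1 and 2-2 are counted as well."""
    return [f"{a}-{b}" for a, b in product(phonemes, repeat=2)
            if include_self or a != b]

p = ["1", "2", "3", "4"]
print(len(transitions(p)))        # 12 -- the list enumerated above
print(len(transitions(p, True)))  # 16 -- with 1-1, 2-2, 3-3, 4-4
```

The quadratic growth is the point: doubling the phoneme set roughly quadruples the number of transitions to record and store.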
This is why you'll never get fully natural-sounding speech no matter what type of synthesis you use. You could apply a supercomputer to the task and still not get it right reliably. You could get closer, but ...

Imagine trying to piece together sound samples of a trombone as it slides from one note to another -- and making it natural sounding. If you start with samples of the actual notes, you have to do a bit of crunching to "compute" the waveform as it transitions to the next note (i.e., as the slide is moved in or out). On the other hand, if you have samples of ALL the various transitions -- C to A, C to D, etc. -- you can more readily paste two consecutive transitions together!

Also, the fact that you are working with samples of actual speech when doing diphone synthesis means you can truly get different "voices" from the synthesizer. They need not share the same "genes" as the voices in DECtalk, for example. E.g., you could make a synthesizer that sounds like a particular speaker -- by design!

Another interesting read was _The Compass of Pleasure_ which deals with how we perceive pleasure! I.e., the similarities between the rush you get from an epiphany, sex, drugs, food, etc. Unfortunately, it relied too heavily on neurochemistry and neurophysiology to explain what was going on in the brain in each of these scenarios. So, it was a harder read than it needed to be.

Indeed. I have heard of that one but never seen it. About a year ago I did _This Is Your Brain on Music_ though. I still wonder, though, how much the strange environment of the MRI and all that doesn't mess with the results. After all, we don't have sex, or listen to music, or do other such activities when crammed into a metal tube in a noisy sterile environment. If you could capture those brain images of a guy listening to Duke Ellington or the Beatles while he's kicked back in his favorite chair without the apparatus you might get far different results.
Some of the experiments cited in these texts were disturbing in the extent to which the subjects were tested. E.g., the "pleasure" book cited an experiment where a gay man was "pleasured" by a female prostitute while sitting in an fMRI. The "goal" being to understand his pleasure response and "cure" him of his homosexuality. Similar tests imaged the brains of pedophiles as they viewed pictures of children. And, even the suggestion of using this sort of device as an ultimate lie detector!

But, back to topic. Why do we go to so many live concerts and come away disappointed with the sound? Because we, and the folks putting on the show, have been conditioned to think that's the way it always is. We got used to live sound being the way it was back in the days of those awful sounding Voice of the Theater cabinets, etc. That was what live amplified music was supposed to sound like in our heads, and even though we've got better signal processing, better speaker arrays, etc., we still go for that sound because it's the way live music requiring amplification sounded to us.

"Unprocessed" live music often disappoints because it seems like it's "not enough (sound)". As if the aural equivalent of "non-fat milk" -- it just doesn't seem like it is really milk!

Now, I intentionally vary parts of the batch in different ways. Make some cookies larger or smaller. Bake some a bit less, others a bit more. *Burn* some, etc. And, people nibbling on them tend to notice them more -- less likely to get into a mindless eating mode where you just shovel them down without noticing what -- or how many -- you are eating!

Uh huh, which I hope will suddenly occur to all these producers and others who think that the audio we consume *must* be squeezed to get maximum level, hence destroying the music. When you listen to a symphony, or a good jazz piece, the dynamics vary.
Those variations are part of what makes it interesting and causes you to think of the music as something more than background noise. Variations in texture, size, etc. cause people to actually think about what they're "consuming" and actually derive enjoyment from it.

I think most audio is consumed from canned reproductions. People hear the same piece performed the same way each time they listen to it. Anything that doesn't sound exactly that same way seems (to many) to be "wrong" or "broken" in some way.

Thanks for the interesting discussions. We probably should take general blindness and screen access stuff to email, just so as not to offend, was my only point elsewhere. Great to have another intelligent voice in this forum!

I've already started composing some "deeper" queries for "private consumption". Be a while before I can get them mailed, though -- problems with my mail server, currently.
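[Editor's aside: the level-squeezing being complained about can be caricatured with a toy static compressor. The threshold and ratio values here are arbitrary illustrative choices; a real broadcast chain adds attack/release smoothing and much more:]

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Toy static compressor: any sample magnitude over the threshold
    is scaled down by the ratio, shrinking the gap between the soft
    and loud passages -- exactly the "squeeze" being criticized."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

# A pianissimo (0.1) versus a fortissimo (0.9): the 9:1 spread
# collapses to 6:1 after compression -- the crescendo is blunted.
print([round(x, 3) for x in compress([0.1, 0.9])])  # [0.1, 0.6]
```

Raise the result back up with makeup gain and the average level climbs, but the dynamic contrast the symphony depended on is gone.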
#381
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
Hi Richard,
On 5/27/2012 6:59 PM, Richard Webb wrote: The advantage of diphone synthesis is that the point at which you are pasting things together has a greater chance (theoretically) of being the same "sound". I.e., the silence-to-f and f-to-ah diphones should be meeting in the middle of an "f" sound. The amount of processing required to glue the front end of an "f" (from the silence-to-f diphone) to the back end of an "f" sound (from the f-to-ah diphone) is less than the effort required to transition from an "f" to an "ah" -- or an "f" to an "oo" or an "oh", etc.

Yep, lots of combinations, lots of possibilities.

A good collection of historical synthesizers, here: http://www.cs.indiana.edu/rhythmsp/ASA/Contents.html including several of the DECtalk voices. Unfortunately, it wasn't created from a common set of text applied to the different synthesizers, so you can't readily compare one technology to another. For example, you'd need to be pretty familiar with phoneme-based synthetic speech to note the differences with the diphone sample presented there.

On the other hand, if you have samples of ALL the various transitions -- C to A, C to D, etc. -- you can more readily paste two consecutive transitions together!

Yep, but you've got to store them all, and select them, and that takes, there again, storage and raw processing power. The U.S. Coast Guard still seems to be using the DECtalk or very similar for their synthesis for high seas weather broadcasts on HF single sideband, but last I heard WLO radio out of Mobile, Alabama, they were using a different synthesis technique, with a female voice IIRC. As for me, when I read them for a ham radio network I usually braille them first, or did before my embosser crumped enough to need a trip to the embosser doctor in Florida eventually. Now I just grab the warnings by listening to the synth and taking shorthand notes with a slate <g>.
Dots 1-3-5, 1-2-5. Dots 2-3-4, 1-2-5, 2-4, 2-3-4-5, 2-3-5.

Also, the fact that you are working with samples of actual speech when doing diphone synthesis means you can truly get different "voices" from the synthesizer. They need not share the same "genes" as the voices in DECtalk, for example. E.g., you could make a synthesizer that sounds like a particular speaker -- by design!

Indeed, if you've got the computing power for it you could mimic just about any voice.

But it doesn't really take much to get 80% of the personality of a voice. For example, if you have ~40 unique phonemes, then, conceivably, you have ~1600 possible diphones. In reality, often 15% less than that. And, with those ~1400 diphones, you can "say anything" (unlimited vocabulary). In practice, however, you usually have to include several different copies of vowel sounds to convey different sorts of stress. So, you start with maybe ~55-60 phonemes, which would suggest ~3600 diphones (in practice, you only use ~2200 of those). Still, it's a very large unit database! And, you still have to splice the diphones together (not trivial). [OTOH, much less computationally expensive than synthesizing the actual waveform from a mathematical model!]

I think most audio is consumed from canned reproductions. People hear the same piece performed the same way each time they listen to it. Anything that doesn't sound exactly that same way seems (to many) to be "wrong" or "broken" in some way.

True, and most of those reproductions have been dynamically processed before being delivered to them, by the broadcast air chain if not elsewhere in the production chain. They never feel that great crescendo from very soft to in-your-face, in-your-ears, can't-get-away. It's an experience that is lost on most without formal music training who are under about age 50. I don't think my mother has listened to music that wasn't brought to her ears via processing, transducers, and amplifiers since I was playing trumpet in high school.
I'd be willing to wager that I can't find a dozen people within a mile of me who have gone to a totally unamplified musical performance in the last two years.

I think, for some artists who rely heavily on "processing" their work (e.g., "one man bands"), the option for a live performance isn't remotely possible!

Thanks for the interesting discussions. We probably should take general blindness and screen access stuff to email, just so as not to offend, was my only point elsewhere. Great to have another intelligent voice in this forum!

I've already started composing some "deeper" queries for "private consumption". Be a while before I can get them mailed, though -- problems with my mail server, currently.

Can relate. That's why I use more than one these days. Fun with email <g>. I'm one of the loudest bitchers about folks straying from the charter, especially on religion and politics, so figure what's good for the goose ...

Agreed. I'd have taken this offlist sooner but for the request/interest that was expressed in "eavesdropping".
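[Editor's aside: the unit-count arithmetic above (40 phonemes → ~1600 possible diphones, ~1400 used; 55-60 phonemes → ~3600 possible, ~2200 used) can be written out explicitly. The coverage fractions below are simply back-computed from the figures quoted in the post, not from any real diphone database:]

```python
def diphone_inventory(n_phonemes, coverage=1.0):
    """Possible diphones for n phonemes (n squared ordered pairs),
    plus the subset assumed to actually occur in the language,
    given a coverage fraction between 0 and 1."""
    possible = n_phonemes ** 2
    return possible, round(possible * coverage)

print(diphone_inventory(40))          # (1600, 1600) possible pairs
print(diphone_inventory(40, 0.875))   # (1600, 1400): "~1400 diphones"
print(diphone_inventory(60, 0.61))    # (3600, 2196): "~2200 of those"
```

The takeaway matches the post: adding stress-variant vowels grows the phoneme set modestly but grows the unit database quadratically.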
#382
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
On Sun 2012-May-27 16:21, Don Y writes:
Re speech synthesis ... There are different approaches with different resource and capability tradeoffs.

Indeed. Your discussion of constructing sounds from phonemes and various building blocks was probably very instructive to some here. For those with a DAW you can illustrate this to yourself quite easily: zoom in on the waveforms quite closely with your DAW. Imagine your world of constructing the piece of music from various takes, being sure that the tempo is the same, or close enough, etc. etc.

The advantage of diphone synthesis is that the point at which you are pasting things together has a greater chance (theoretically) of being the same "sound". I.e., the silence-to-f and f-to-ah diphones should be meeting in the middle of an "f" sound. The amount of processing required to glue the front end of an "f" (from the silence-to-f diphone) to the back end of an "f" sound (from the f-to-ah diphone) is less than the effort required to transition from an "f" to an "ah" -- or an "f" to an "oo" or an "oh", etc.

Yep, lots of combinations, lots of possibilities.

Many of us know about the battle with getting that smooth seamless punch-in. This then is an exercise in the smooth seamless punch attempt from hell. Analog guys, try this one with your razor blade <g>.

Ha! I got pretty good with an editing block back in the day, but doubt I'd ever be that good <g>.

But, there are a boatload more transitions than there are phonemes! For example, if you have just 4 phonemes, you might have 12 or 16 transitions: 1-2 1-3 1-4 2-1 2-3 2-4 3-1 3-2 3-4 4-1 4-2 4-3, plus, perhaps, 1-1 2-2 3-3 and 4-4.

Yep, and the more varied you want the vocabulary to be, the more the mathematical possibilities expand. This is why you'll never get fully natural-sounding speech no matter what type of synthesis you use. You could apply a supercomputer to the task and still not get it right reliably. You could get closer, but ...
Imagine trying to piece together sound samples of a trombone as it slides from one note to another -- and making it natural sounding. If you start with samples of the actual notes, you have to do a bit of crunching to "compute" the waveform as it transitions to the next note (i.e., as the slide is moved in or out).

Yep, exactly, you either need lots of memory to store those samples, lots of horsepower to process them, or both. Look out for Moore's law though, it's probably on the horizon <g>.

On the other hand, if you have samples of ALL the various transitions -- C to A, C to D, etc. -- you can more readily paste two consecutive transitions together!

Yep, but you've got to store them all, and select them, and that takes, there again, storage and raw processing power. The U.S. Coast Guard still seems to be using the DECtalk or very similar for their synthesis for high seas weather broadcasts on HF single sideband, but last I heard WLO radio out of Mobile, Alabama, they were using a different synthesis technique, with a female voice IIRC. As for me, when I read them for a ham radio network I usually braille them first, or did before my embosser crumped enough to need a trip to the embosser doctor in Florida eventually. Now I just grab the warnings by listening to the synth and taking shorthand notes with a slate <g>.

Also, the fact that you are working with samples of actual speech when doing diphone synthesis means you can truly get different "voices" from the synthesizer. They need not share the same "genes" as the voices in DECtalk, for example. E.g., you could make a synthesizer that sounds like a particular speaker -- by design!

Indeed, if you've got the computing power for it you could mimic just about any voice.

Another interesting read was _The Compass of Pleasure_ which deals with how we perceive pleasure! I.e., the similarities between the rush you get from an epiphany, sex, drugs, food, etc.
Unfortunately, it relied too heavily on neurochemistry and neurophysiology to explain what was going on in the brain in each of these scenarios. So, it was a harder read than it needed to be.

Indeed. I have heard of that one but never seen it. About a year ago I did _This Is Your Brain on Music_ though. I still wonder, though, how much the strange environment of the MRI and all that doesn't mess with the results. After all, we don't have sex, or listen to music, or do other such activities when crammed into a metal tube in a noisy sterile environment. If you could capture those brain images of a guy listening to Duke Ellington or the Beatles while he's kicked back in his favorite chair without the apparatus you might get far different results.

Some of the experiments cited in these texts were disturbing in the extent to which the subjects were tested.

Yes, indeed, so have been some of them I've read about. I've read of the pedophile test elsewhere, and some of the other testing of sexual arousal etc., I believe in Playboy. Still, the environment of the fMRI or whatever they're using has to skew things a bit IMHO.

[snip]

But, back to topic. Why do we go to so many live concerts and come away disappointed with the sound? Because we, and the folks putting on the show, have been conditioned to think that's the way it always is. We got used to live sound being the way it was back in the days of those awful sounding Voice of the Theater cabinets, etc. That was what live amplified music was supposed to sound like in our heads, and even though we've got better signal processing, better speaker arrays, etc., we still go for that sound because it's the way live music requiring amplification sounded to us.

"Unprocessed" live music often disappoints because it seems like it's "not enough (sound)". As if the aural equivalent of "non-fat milk" -- it just doesn't seem like it is really milk!
Indeed, and I'm glad I had the opportunity to "cut my teeth," as it were, on unprocessed live music. See the related thread of a couple weeks back in this newsgroup discussing a bad-sounding mastering job. IIRC I asserted in that one that too many young folks haven't had the opportunity to experience live music that wasn't brought to their ears via at least one transducer.

Now, I intentionally vary parts of the batch in different ways. Make some cookies larger or smaller. Bake some a bit less, others a bit more. *Burn* some, etc. And, people nibbling on them tend to notice them more -- less likely to get into a mindless eating mode where you just shovel them down without noticing what -- or how many -- you are eating!

Uh huh, which I hope will suddenly occur to all these producers and others who think that the audio we consume *must* be squeezed to get maximum level, hence destroying the music. When you listen to a symphony, or a good jazz piece, the dynamics vary. Those variations are part of what makes it interesting and causes you to think of the music as something more than background noise. Variations in texture, size, etc. cause people to actually think about what they're "consuming" and actually derive enjoyment from it.

I think most audio is consumed from canned reproductions. People hear the same piece performed the same way each time they listen to it. Anything that doesn't sound exactly that same way seems (to many) to be "wrong" or "broken" in some way.

True, and most of those reproductions have been dynamically processed before being delivered to them, by the broadcast air chain if not elsewhere in the production chain. They never feel that great crescendo from very soft to in-your-face, in-your-ears, can't-get-away. It's an experience that is lost on most without formal music training who are under about age 50.
I don't think my mother has listened to music that wasn't brought to her ears via processing, transducers, and amplifiers since I was playing trumpet in high school. I'd be willing to wager that I can't find a dozen people within a mile of me who have gone to a totally unamplified musical performance in the last two years.

Thanks for the interesting discussions. We probably should take general blindness and screen access stuff to email, just so as not to offend, was my only point elsewhere. Great to have another intelligent voice in this forum!

I've already started composing some "deeper" queries for "private consumption". Be a while before I can get them mailed, though -- problems with my mail server, currently.

Can relate. That's why I use more than one these days. Fun with email <g>. I'm one of the loudest bitchers about folks straying from the charter, especially on religion and politics, so figure what's good for the goose ...

Regards, Richard

-- | Remove .my.foot for email | via Waldo's Place USA Fidonet-Internet Gateway Site | Standard disclaimer: The views of this user are strictly his own.
#383
Posted to rec.audio.pro
FLAC or other uncompressed formats, which is best?
John Williamson wrote:
Doug McDonald wrote: On 5/27/2012 9:42 AM, John Williamson wrote: Richard Webb wrote: Uh huh, which I hope will suddenly occur to all these producers and others who think that the audio we consume *must* be squeezed to get maximum level, hence destroying the music. When you listen to a symphony, or a good jazz piece, the dynamics vary. Those variations are part of what makes it interesting and causes you to think of the music as something more than background noise. Variations in texture, size, etc. cause people to actually think about what they're "consuming" and actually derive enjoyment from it.

I recorded a school orchestra playing a while ago, taking care to set the peaks as close to zero dB as I could, but maintaining all the dynamic range. Listening to the pseudo-soundfield recording, it feels as if you're there on headphones, and it sounds excellent on speakers. The first comment of the conductor on hearing it was "It's too quiet..." I just sent them the master copies and left them to ask one of the students to compress it to taste. No pressure on me, it was a freebie anyway.

The question is ... what was the piece of music?

The programme included a Scaramouche with solo saxophone, a percussion piece, and Saint-Saëns' Symphony No. 3 with organ, using the Grande Orgue at Rouen cathedral. This last was "interesting" for me to balance and for the orchestra to time, as there was a thirty or forty yard gap between the organ and orchestra. I don't have the full list on this computer, but I'll post it and links to .wav files of my finished product later if anyone's interested. I don't know what the orchestra did with it after I handed them their copy. Constructive criticism is welcomed. All recorded using the internal microphones on a Zoom H2 in surround mode.

Links now here: www.oysterbroadcast.co.uk/click_2.html There are both .wav files and mp3 files linked to.

-- Tciao for Now! John.
#384
Posted to rec.audio.pro
On Sun 2012-May-27 21:56, Don Y writes:
snip

I.e., the silence-to-f and f-to-ah diphones should be meeting in the middle of an "f" sound. The amount of processing required to glue the front end of an "f" (from the silence-to-f diphone) to the back end of an "f" sound (from the f-to-ah diphone) is less than the effort required to transition from an "f" to an "ah" -- or an "f" to an "oo" or an "oh", etc.

Yep, lots of combinations, lots of possibilities.

A good collection of historical synthesizers, here:

http://www.cs.indiana.edu/rhythmsp/ASA/Contents.html

including several of the DECtalk voices. Unfortunately, it wasn't created from a common set of text applied to the different synthesizers, so you can't readily compare one technology to another.

That would have been cool had they done their comparison with the same text. I think I helped the assistive tech folks back in Iowa do a tape recording of common speech synths of the day, and we used the same passage for each, iirc: the passage in Gone with the Wind where whatshisname tells Scarlett that frankly he doesn't give a damn. I think we had an Accent SA, a DECtalk, an Artic and a DoubleTalk in that one. I don't know if it survives or not; it was recorded to cassette, iirc.

snip

On the other hand, if you have samples of ALL the various transitions -- C to A, C to D, etc. -- you can more readily paste two consecutive transitions together!

Yep, but you've got to store all of them, and select them, and that takes, there again, storage and raw processing power. The U.S. Coast Guard still seems to be using the DECtalk or something very similar for its high seas weather broadcasts on HF single sideband, but last I heard, WLO radio out of Mobile, Alabama was using a different synthesis technique, with a female voice iirc. As for me, when I read them for a ham radio network I usually braille them first, or did before my embosser crumped badly enough to need a trip to the embosser doctor in Florida. Now I just grab the warnings by listening to the synth and taking shorthand notes with a slate.

Dots 1-3-5, 1-2-5. Dots 2-3-4, 1-2-5, 2-4, 2-3-4-5, 2-3-5. Yeah, that's what I thought when my embosser quit reproducing dot 1. For the uninitiated, he said "oh ****."

Also, the fact that you are working with samples of actual speech when doing diphone synthesis means you can truly get different "voices" from the synthesizer. They need not share the same "genes" as the voices in DECtalk, for example. E.g., you could make a synthesizer that sounds like a particular speaker -- by design!

Indeed, if you've got the computing power for it you could mimic just about any voice.

But it doesn't really take much to get 80% of the personality of a voice. For example, if you have ~40 unique phonemes, then, conceivably, you have ~1600 possible diphones. In reality, often 15% less than that. And, with those ~1400 diphones, you can "say anything" (unlimited vocabulary).

Again true! Hadn't really thought of that, but it makes sense.

In practice, however, you usually have to include several different copies of vowel sounds to convey different sorts of stress. So, you start with maybe ~55-60 phonemes, which would suggest ~3600 diphones (in practice, you only use ~2200 of those).

And there's the rub! Expression is a big part of speech, and that usually happens with the diphones that are our vowel sounds. Screenreaders try a bit to change the inflection if the sentence terminates with a question mark, for example.

Still, it's a very large unit database! And, you still have to splice the diphones together (not trivial).

Yep, and that's where the storage- and memory-intensive part comes in.

[OTOH, much less computationally expensive than synthesizing the actual waveform from a mathematical model!]

Right. But, as any of us know who've tried to synthesize musical instruments, getting all those combinations is the fun part.

I think most audio is consumed from canned reproductions.
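The inventory arithmetic in the exchange above is just N squared — 40 phonemes give 40 × 40 = 1600 potential diphones, and 55-60 stress variants push that toward 3600 — and the "glue" step is typically a short crossfade at the splice point, which by design lands mid-phoneme where the two units match. A minimal sketch; the linear crossfade here is a generic illustration, not any particular synthesizer's method:

```python
# Rough sketch of the diphone bookkeeping discussed above.
# The phoneme counts mirror the thread's figures; the splice is a
# generic linear crossfade joining two diphone waveforms.

def diphone_slots(n_phonemes):
    # every ordered pair of phonemes is a potential diphone
    return n_phonemes * n_phonemes

def splice(a, b, overlap):
    """Join diphone waveforms a and b with a linear crossfade of
    `overlap` samples, so the seam falls in the middle of the
    shared phoneme rather than at a phoneme transition."""
    assert overlap <= len(a) and overlap <= len(b)
    fade = []
    for i in range(overlap):
        w = i / overlap                       # ramps 0.0 -> 1.0 across the seam
        fade.append(a[len(a) - overlap + i] * (1 - w) + b[i] * w)
    return a[:len(a) - overlap] + fade + b[overlap:]

print(diphone_slots(40))   # 1600, matching the ~40-phoneme estimate
print(diphone_slots(60))   # 3600, with stress variants of the vowels
```

The storage/processing trade-off Richard mentions falls out directly: a couple of thousand stored waveforms plus cheap crossfades, versus a formant synthesizer like DECtalk that stores almost nothing but computes every sample from a model.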
People hear the same piece performed the same way each time they listen to it. Anything that doesn't sound exactly that same way seems (to many) to be "wrong" or "broken" in some way.

True, and most of those reproductions have been dynamically processed before being delivered to them, by the broadcast air chain if not elsewhere in the production chain.

snip

I'd be willing to wager that I can't find a dozen people within a mile of me who have gone to a totally unamplified musical performance in the last two years. I think, for some artists who rely heavily on "processing" their work (e.g., "one man bands"), the option for a live performance isn't remotely possible!

True enough. Back a few years ago I used to do music on hold and some music beds with MIDI sound modules. People commented frequently about how my instruments felt more real than a lot of synthesized music they heard. This was because I play a variety of instruments, and I never tried to arrange the music where instruments were doing things they don't naturally do.

snip

I'm one of the loudest bitchers about folks straying from the charter, especially on religion and politics, so figure what's good for the goose ...

Agreed. I'd have taken this offlist sooner but for the request/interest that was expressed in "eavesdropping".

True, including my colleague Frank. Some of the minutiae of access technology, though, probably crowded the edge grin.

Regards,
Richard

--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet-Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.
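For the uninitiated, Richard's dot notation earlier in the thread is standard six-dot braille: each cell is named by which of dots 1-6 are raised, with 1-2-3 running down the left column and 4-5-6 down the right. A small decoder as a sketch; the table below is a subset of Grade 1 braille (letters plus the dots 2-3-5 exclamation point), and the function names are mine:

```python
# Decode six-dot braille cells written as dot-number strings like "1-3-5".
# Dots 1-2-3 run down the left column of the cell, 4-5-6 down the right.
# Table covers the Grade 1 letters a-z plus the exclamation point.

BRAILLE = {
    "1": "a", "1-2": "b", "1-4": "c", "1-4-5": "d", "1-5": "e",
    "1-2-4": "f", "1-2-4-5": "g", "1-2-5": "h", "2-4": "i", "2-4-5": "j",
    "1-3": "k", "1-2-3": "l", "1-3-4": "m", "1-3-4-5": "n", "1-3-5": "o",
    "1-2-3-4": "p", "1-2-3-4-5": "q", "1-2-3-5": "r", "2-3-4": "s",
    "2-3-4-5": "t", "1-3-6": "u", "1-2-3-6": "v", "2-4-5-6": "w",
    "1-3-4-6": "x", "1-3-4-5-6": "y", "1-3-5-6": "z", "2-3-5": "!",
}

def decode(cells):
    """Translate a list of dot-number strings into text."""
    return "".join(BRAILLE[c] for c in cells)

print(decode(["1-3-5", "1-2-5"]))    # "oh"
# Richard's second group -- 2-3-4, 1-2-5, 2-4, 2-3-4-5, 2-3-5 -- spells
# the word he censored, with the final 2-3-5 cell as the exclamation point.
```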
#385
Posted to rec.audio.pro
On Thu, 24 May 2012 18:00:29 -0700, Don Y wrote:
"Those who can, do. Those who CAN'T, troll!"

Thank you. That's made my day!

--
Anahata
http://www.treewind.co.uk
+44 (0)1638 720444