I'm using FMOD to develop an application that needs to play audio more slowly than normal so that the user can hear it more clearly. In my code, I call Channel::setFrequency like this:
float normal_frequency;
channel->getFrequency(&normal_frequency);
channel->setFrequency(normal_frequency * speedSelected);
If the value of speedSelected is lower than 1, for example 0.8, the audio will indeed be played more slowly than normal, but the voice sounds really odd. Playing slowly doesn't enable me to hear audio more clearly at all.
By contrast, Microsoft's Windows Media Player works perfectly when it plays audio more slowly than normal.
Is there a way to solve this problem?
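If by "sounds really odd" you mean the pitch has been altered, then this is the expected outcome: setFrequency changes the rate at which the samples are played back, so the speed and the pitch change together. If you want to correct the pitch while adjusting the speed, you will need to use the pitch shifter DSP.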
Recently, I discovered that my tutorial videos can be watched at 1.5x playback speed without loss of quality (they are actually better that way, as I normally speak slowly). My problem is that if I change the speed of the video in a video editor such as Kdenlive, the audio becomes distorted and turns into a mess (higher pitch, I believe).
How could I obtain the same quality as VLC "playback fast" and Youtube "playback speed 1.5" for the audio track? I'm a layman in audio/video editing, so I'm also satisfied with partial answers, like the identification of which terms I should search for in this case.
It might be better to take your audio track and use something like Sound Forge to automatically remove silence. Just be sure to add a pad to that (built into Sound Forge), otherwise the speech will sound way too choppy and fast.
Aside from that, you could also use Vegas to (then) chop the video to keep pace with your new speech rate. Vegas is a video editing program that is best for this kind of down and dirty editing.
Ok. So either there is something I don't grasp about audio signals in general, or there is something up there.
Problem: Every time the same sound is played, the spectrum displayed (in the demo) ends up looking different.
Try1:
drum snare
Try2:
same drumsnare
Shouldn't they be identical? Are these considered the "same" spectrum within the margin of error? Or am I missing something?
I am working on an idea that needs thorough sound analysis, including an FFT, so I need data that is as precise as I can get. Any insights?
I'm trying to make a video tutorial, so I decided to record the speech using an online TTS service.
I used Audacity to capture the sound, and the sound was clear!
After dinner, I wanted to finish the last pieces of speech, but the sound wasn't the same anymore. There is a disturbing background noise (parasite). I removed it with Audacity, but despite this, the voice isn't the same ...
You can see here the difference between the soundtrack of the same speech before and after the occurrence of the problem.
The codec used by the stereo mix peripheral is "IDT High Definition Codec".
Thank you.
Perhaps some cable or plug got loose? Do check for this!
If you are using really cheap gear (built-in soundcard and the like) it might very well also be a problem of electrical interference. Anything from ...
Switching on some device emitting an electromagnetic field (e.g. another monitor close by)
Repositioning electrical devices on your desk
Changes in CPU load on your computer (yes, I'm serious!)
... could very well cause some kinds of noises with low-fi sound hardware.
Generally, if you need help with audio that sounds wrong, make sure that you provide a way to LISTEN to the files, not just a visual representation.
Also, in your posted waveform graphics I can see that the latter signal is more compressed, which may point to some kind of automated levelling going on somewhere in the audio chain.
I'm working on a simple music visualization. Probably not relevant, but I am doing the sound processing using the new WebKit Audio Data API and the dsp.js library.
I want to make a text vibrate (grow/shrink) to the rhythm of the music. What is the best way to do this?
What I've done so far is run the signal through an FFT. I look at the bottom 10% of frequencies (bass notes?) and when the amplitude surpasses a certain threshold, I animate the text.
Does this sound right? Or am I completely off?
You say you've done it, and then you ask if you are way off? Well, you tell us: does it work for your application?
One potential problem is that the FFT is slow, both in that there may be a lag between your input and output and there will be a lot of CPU used. I don't expect this will matter for your application, but, in general, you are better off using a low-pass filter. When the output of the low-pass goes above some level, you can use that to trigger something for some short amount of time.
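A minimal sketch of that low-pass approach, assuming a simple one-pole filter followed by a rectified, smoothed level and a rising-edge threshold test. The function name detectBassHits and all parameter names are made up for illustration; the cutoff, smoothing rate, and threshold would have to be tuned by ear for real music.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the sample indices at which the low-passed
// level of the signal rises above `threshold`.
std::vector<std::size_t> detectBassHits(const std::vector<float>& samples,
                                        float sampleRate,
                                        float cutoffHz,    // low-pass cutoff
                                        float envelopeHz,  // level smoothing rate
                                        float threshold) {
    const float twoPi = 6.2831853f;
    // One-pole smoothing coefficients, used as y += k * (x - y).
    const float a = 1.0f - std::exp(-twoPi * cutoffHz / sampleRate);
    const float b = 1.0f - std::exp(-twoPi * envelopeHz / sampleRate);

    float low = 0.0f;    // low-passed signal
    float level = 0.0f;  // smoothed absolute value of `low`
    bool above = false;  // whether the previous sample was over the threshold
    std::vector<std::size_t> hits;

    for (std::size_t i = 0; i < samples.size(); ++i) {
        low += a * (samples[i] - low);
        level += b * (std::fabs(low) - level);
        const bool nowAbove = level > threshold;
        if (nowAbove && !above) {
            hits.push_back(i);  // rising edge: trigger the animation here
        }
        above = nowAbove;
    }
    return hits;
}
```

Each returned index marks a point where the animation could be triggered for a short, fixed amount of time. Because the filter runs sample by sample, it avoids the block-sized delay of an FFT.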
Another issue is simply that this is only a very basic beat detection algorithm. It might work for bass-heavy "four on the floor" music, but you'll need to figure out where the threshold goes and how to keep it moving when the bass stops or something. You may want to research beat detection algorithms. The open source aubio has some.
http://aubio.org/
I did some of my own research and found out that SID chips had only a few hardware-supported synthesis features: three audio oscillators with four possible waveforms (sawtooth, triangle, pulse, noise), ADSR envelopes, oscillator sync, and ring modulation. I also read there was a way to play a single PCM sound as well.
It is all so little, but still I heard lots of different sounds from my TV sets. How were they combined to produce all that variety of audio?
To give some specifics, I'd like to know how to combine those components to produce guitar-, piano-, or drum-like audio. Other interesting things would be the different buzzes and sounds specific to the C64.
I used to write music on the C64 for games, demos and even services (I wrote the official QuantumLink theme, even). As for your question, the four different waveforms were typically overlaid with the sync and ring mods (less often ring, because it was unpredictable on different versions of the SID chip), and sometimes used cleanly.
For example, a typical 'snare' sound would be composed of a noise waveform with a very fast attack and sustain, and depending on whether you wanted a drumstick or brush sound, either a very fast decay and moderately short release, or a short decay and slower release.
Getting the right sound was typically trial and error, and the limitations were pretty heavy. You never really got to the point of a piano or guitar sound, because the simple waveforms could not be overlaid with harmonic waveforms. About the best you could get was things that sounded beepy, things that sounded marimba-y, and things that sounded like a snare drum.
One of the tricks used most often to extend sound was done with fast machine code playback routines that could change the played notes on voices so quickly as to give the impression of a fuller, harmonic tone. We just called it arpeggiation, although at 10 to 12 note changes a second it sounded more like a buzzy chord.
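The arpeggiation trick can be sketched as a single voice stepping through the notes of a chord at a fixed rate. This is an illustration in C++, not C64 playback code; the function name arpeggioFrequency and its parameters are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: frequency that one voice should play `seconds` after
// the chord starts, when it cycles through `chordHz` at `changesPerSecond`.
double arpeggioFrequency(const std::vector<double>& chordHz,
                         double changesPerSecond,
                         double seconds) {
    if (chordHz.empty() || changesPerSecond <= 0.0 || seconds < 0.0) {
        return 0.0;  // nothing to play
    }
    // Number of note changes that have happened so far.
    const std::size_t step =
        static_cast<std::size_t>(std::floor(seconds * changesPerSecond));
    // Wrap around so the chord repeats for as long as the note is held.
    return chordHz[step % chordHz.size()];
}
```

For a C major triad at 12 changes per second, the voice plays C, E, G, C, E, G and so on, each note lasting one twelfth of a second, which the ear tends to blur into a buzzy chord.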
As for the sampled waveforms, they were only available as single bit and later 4 bit samples. These sounded terrible despite our best attempts, because basically the method of playback for a sample on the 64 was to play a white noise waveform and rapidly alter the volume on the SID chip to produce a rising and falling wave. Do it fast enough and it sort of sounds like the original sound, poorly tuned in on a staticky radio.
I suggest you grab hold of a C64 emulator for the PC (CCS64 is a good one) and a 64 BASIC programming guide and just play around.... the SID chip is entirely manipulatable from BASIC.
To sum up, how did we get all of those piano and guitar sounds on a C64? We didn't, really.
Take a look at some of these docs related to producing music on the C64:
http://sid.kubarth.com/articles.html
The type of music you are describing falls into the category of "chiptunes". I'd recommend checking out some modern trackers like MilkyTracker, which are used to create music in this style. There are libraries like libmodplug that allow you to play tracker modules in your software.
Check out some of the C64 emulators out there. I've read that some of them are 100% accurate in their sound reproduction, true to the original.