I have thousands of GSM wav files, generated by a phone recording system. I need to run these through a speech-to-text engine (Nuance) and this appears to work only with PCM files.
I know nothing about these formats, but would need a programmatic (scripting) way to convert GSM to PCM.
Any ideas?
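SoX can do it. A raw .gsm file contains no header information, so you need to tell SoX its sample rate and channel count. Format options apply to the file name that follows them, so they go before the input file (-e signed-integer -b 16 are the current spellings of the older -s -w). Something like this:
sox -t gsm -r 8000 -c 1 input.gsm -e signed-integer -b 16 output.wav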
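With thousands of files, the same command can be wrapped in a loop. This is a minimal sketch, assuming headerless .gsm files at 8 kHz mono in a single directory and a SoX build with GSM support; wav_name and convert_all are made-up helper names, not part of SoX:

```shell
#!/bin/sh
# wav_name FILE: hypothetical helper that swaps the extension for .wav.
wav_name() {
    printf '%s.wav\n' "${1%.*}"
}

# convert_all DIR: hypothetical helper that converts every .gsm file in DIR
# to 16-bit signed PCM WAV, written next to the original.
convert_all() {
    for f in "$1"/*.gsm; do
        [ -e "$f" ] || continue   # the glob matched nothing
        sox -t gsm -r 8000 -c 1 "$f" -e signed-integer -b 16 "$(wav_name "$f")"
    done
}
```

Call it as convert_all /path/to/recordings. If the recordings are really WAV containers with a GSM payload, SoX should be able to read the parameters from the header, in which case the -t gsm -r 8000 -c 1 part would be dropped.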
Related
I am trying to do some audio debugging on my Linux system.
I learned how to record the sound of the currently playing media, but how can I get the PCM data without going through the DAC/ADC?
I mean, just like wireshark or tcpdump tool, is there some sort of alsadump that I can make use of?
I want to do bit-exact comparison of the output PCM data to make sure the audio processing algorithm (which is an executable binary) worked correctly.
Thanks a lot.
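For the comparison step itself, once both PCM streams have been captured to files, a byte-wise compare is enough. This is a minimal sketch, assuming two headerless dumps in the same sample format; same_pcm and first_diff are hypothetical helper names:

```shell
#!/bin/sh
# same_pcm A B: succeed only if the two dumps are byte-for-byte identical.
same_pcm() {
    cmp -s "$1" "$2"
}

# first_diff A B: print the 1-based byte offset of the first differing byte,
# or nothing if no byte differs within the common length.
first_diff() {
    cmp -l "$1" "$2" 2>/dev/null | head -n 1 | awk '{print $1}'
}
```

For 16-bit mono data, the sample index of a mismatch is (offset - 1) / 2.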
Say I have a bunch of mp3 files. How would I use a command-line audio tool to mute one side of the audio file (the right) completely while leaving the left side untouched? I would then like to save the result to a new mp3 file. This needs to be done entirely on the command line.
As another approach: is it possible to use a command-line audio tool to convert a stereo mp3 file to mono, then merge this mono file with a "silent" track of the same length, creating a left-headphone track with sound and a right-headphone track with silence?
In this SO question, there seem to be a number of approaches to a rather eccentric end goal. In the first possible solution, I just want to decrease the volume of the right side. In the second possible solution, I want to combine a few more common steps to achieve the same end result.
The problems here are that:
I can't find a good command-line tool for modifying audio files, even to do the second approach which should be a more common request.
I'm expecting that I'll first need to convert the mp3 file to wav, using the same tool or a second one.
This query is eccentric so there aren't many links about it on the web.
Thanks for any help. Audacity would be my go-to normally, but it appears to be GUI only.
SoX lets you do this very easily.
The first case, muted right channel:
sox test.mp3 test-rmuted.mp3 remix 1 0
The second case, summed mono on left channel:
sox test.mp3 test-lmono.mp3 remix 1,2 0
To batch process you could just do a simple for loop.
Muted right channel:
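for f in *.mp3
do
    # Strip the extension so the output name can be rebuilt below.
    basename="${f%.*}"
    echo "$basename"
    # Keep channel 1 on the left, silence on the right; LAME does the encoding.
    sox "$f" -t wav - remix 1 0 | \
        lame --preset standard - "00-${basename}-rmute".mp3
done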
Summed mono on left channel only:
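for f in *.mp3
do
    # Strip the extension so the output name can be rebuilt below.
    basename="${f%.*}"
    echo "$basename"
    # Mix channels 1 and 2 onto the left, silence on the right; LAME does the encoding.
    sox "$f" -t wav - remix 1,2 0 | \
        lame --preset standard - "00-${basename}-lmono".mp3
done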
You can forgo LAME and do the encoding with SoX as in the first two examples, but I find this method simpler and more flexible.
As suggested in a comment, you should be able to use FFmpeg to process your audio files. Dropping one channel completely produces a different result from converting to mono first. However, I think either could be achieved with the pan filter in FFmpeg.
https://trac.ffmpeg.org/wiki/AudioChannelManipulation
https://ffmpeg.org/ffmpeg-filters.html#pan
Attenuation of one channel
Decode mp3 file to wav
Create a new stereo wav file using the pan filter 100% to one channel
Encode the resulting wav file to mp3
Mixing both channels evenly in one channel, then attenuating the other channel
Decode mp3 file to wav
Create a new wav file using the pan filter with one channel 50% from left and 50% right, and the other channel with 0 gain
Encode the resulting wav file to mp3
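With a single ffmpeg call, the decode and encode steps collapse into one command. This is a rough sketch of both variants, following the pan examples on the wiki page linked above; it assumes a stereo input and an ffmpeg build that can encode MP3, and pan_file is a made-up helper name:

```shell
#!/bin/sh
# c0 is the left output channel, c1 the right. An output channel that is
# not listed in the pan expression is left silent.
MUTE_RIGHT='pan=stereo|c0=c0'
MONO_ON_LEFT='pan=stereo|c0=0.5*c0+0.5*c1'

# pan_file FILTER IN OUT: hypothetical helper that applies one pan expression.
pan_file() {
    ffmpeg -nostdin -i "$2" -af "$1" "$3"
}
```

For example, pan_file "$MUTE_RIGHT" test.mp3 test-rmuted.mp3 covers the first case and pan_file "$MONO_ON_LEFT" test.mp3 test-lmono.mp3 the second.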
I would like to do the following four things (separately) and need a bit of help understanding how to approach them:
Dump audio data (from a serial-over-USB port), encoded as PCM, 16-bit, 8kHz, little-endian, into a file (plain binary data dump, not into any container format). Can this approach be used:
$ cat /dev/ttyUSB0 > somefile.dat
Can I press ^C to stop writing the file while the dump is in progress with the above command?
Stream audio data (of the kind described above) directly into ffmpeg for playback? Like this:
$ cat /dev/ttyUSB0 | ffmpeg
or do I have to specify the device port as a "-source"? If so, I couldn't figure out the format.
Note that I've tried this:
$ cat /dev/urandom | aplay
which works as expected, by playing out white-noise..., but trying the following doesn't help:
$ cat /dev/ttyUSB1 | aplay -f S16_LE
Even so, when I open /dev/ttyUSB1 with picocom at 115200 bps, 8-bit, no parity, I do see gibberish, indicating the presence of audio data exactly when I expect it.
Use the audio data dumped into the file as a source in ffmpeg? If so, how? So far I get the impression that ffmpeg can only read files in standard containers.
Use pre-recorded audio in any format (perhaps .mp3 or .wav) as input for ffmpeg to stream into the /dev/ttyUSB0 device. Should I specify the device as a "-sink" parameter, pipe into it, or redirect into it? Also, is it possible to use ffmpeg in two terminal windows to capture from and transmit to the same device, /dev/ttyUSB0, simultaneously?
My knowledge of digital audio recording/processing formats and codecs is somewhat limited, so I'm not sure whether what I am trying to do qualifies as working with 'raw' audio.
If ffmpeg is unable to do what I am hoping to achieve, could gstreamer be the solution?
PS> If anyone thinks that the answer could be improved, please feel free to suggest specific points. Would be happy to add any detail requested, provided I have the information.
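For the file-based items in the list above, headerless PCM is something both aplay and ffmpeg can handle as long as the sample format, rate and channel count are spelled out on the command line, since nothing in the file says what they are. This is a rough sketch under the stated 16-bit, 8 kHz, little-endian format, taking the stream to be mono (the question does not say); every function name here is made up:

```shell
#!/bin/sh
# pcm_seconds FILE: whole seconds of audio in a headerless dump,
# assuming 8000 samples/s, 2 bytes per sample, 1 channel.
pcm_seconds() {
    bytes=$(wc -c < "$1")
    echo $(( bytes / (8000 * 2) ))
}

# play_dump FILE: play the dump through ALSA as raw signed 16-bit LE.
play_dump() {
    aplay -t raw -f S16_LE -r 8000 -c 1 "$1"
}

# dump_to_wav RAW OUT: let ffmpeg read the dump by naming the raw format
# (-f s16le) and its parameters before -i.
dump_to_wav() {
    ffmpeg -nostdin -f s16le -ar 8000 -ac 1 -i "$1" "$2"
}

# send_to_port AUDIO DEVICE: decode any input ffmpeg understands to the
# same raw format and redirect it into the device; -re paces the input
# at its native rate instead of writing as fast as possible.
send_to_port() {
    ffmpeg -nostdin -re -i "$1" -f s16le -ar 8000 -ac 1 - > "$2"
}
```

A quick sanity check on a capture is pcm_seconds somefile.dat: if the result is far from the time spent recording, the assumed format is probably wrong.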
I have a Google Hangouts app and I am trying to let the user play a sound that I provide.
Google has this covered, with its Audio Resource, but it only accepts specifically encoded sound files, PCM 16 wav files.
I have been trying to encode my files using ffmpeg, but it does not seem to be working.
Any idea as to what I am doing wrong?
Here is my ffmpeg command line :
ffmpeg -i sound.mp3 -map_metadata -1 -flags bitexact sound.wav
Thanks for your help
I just wrote a Hangouts app that used audio, and I noticed I had to use a 44.1 kHz sample rate on my 16-bit PCM WAV files or it wouldn't work. See if you can add an option to change the sample rate to that.
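Applied to the command from the question, that means adding -ar 44100 and, to be explicit about the sample format, -c:a pcm_s16le. This is a minimal sketch; the function names are made up, and the header check assumes a canonical WAV file whose fmt chunk comes first, so that the sample rate sits in bytes 24-27:

```shell
#!/bin/sh
# to_pcm16_44k IN OUT: hypothetical helper that re-encodes IN as
# 16-bit PCM WAV at 44.1 kHz, keeping the flags from the question.
to_pcm16_44k() {
    ffmpeg -nostdin -i "$1" -map_metadata -1 -flags bitexact \
        -c:a pcm_s16le -ar 44100 "$2"
}

# wav_sample_rate FILE: read the little-endian sample rate from bytes 24-27
# of the header (assumes the canonical layout described above).
wav_sample_rate() {
    set -- $(od -An -t u1 -j 24 -N 4 "$1")
    echo $(( $1 + $2 * 256 + $3 * 65536 + $4 * 16777216 ))
}
```

After to_pcm16_44k sound.mp3 sound.wav, running wav_sample_rate sound.wav is a quick way to confirm what was actually written.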
I have a telephony modem (SIM5320EVB) which delivers voice data on ttyUSB0 as PCM, 1600 bytes every 100 ms. I am able to see the data in minicom. How can I capture the PCM data on Linux (I use Ubuntu) and hear it live on the fly, or at least save and play the data? Is there any application or API available? If at least an approach is suggested, I will try developing one.
cat /dev/ttyUSB0 > my_cap_file
# make some noise for 5s for example, then hit ^C
Then get Audacity and try to open your file with it (File > Import > Raw Data), trying different input formats. You should be able to hear the sound you produced if you guess the right format.
Install sox for the play command and use: play -r 8000 -c 1 -t raw -e signed-integer -b 16 /dev/ttyUSB0. That is: sample rate 8 kHz, 1 channel (mono), raw data (headerless PCM), signed-integer samples 16 bits wide, with the data read from ttyUSB0.
That requires sox to be able to play audio on your system; I've had success with pulseaudio for the underlying sound system.
You may need to modify the buffer size for play. By default, it buffers data which creates a small but very noticeable delay.
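The figures in the question are consistent with that format: 8000 samples/s at 2 bytes per sample on 1 channel is 16000 bytes/s, or 1600 bytes per 100 ms. This is a minimal sketch that works the number out and passes it to play as the buffer size through SoX's --buffer option. It assumes GNU stty, puts the port in raw mode as a precaution against the tty layer altering bytes, and uses a made-up function name:

```shell
#!/bin/sh
RATE=8000            # samples per second
BYTES_PER_SAMPLE=2   # 16-bit samples
CHANNELS=1           # mono

# Bytes that arrive in one 100 ms chunk at this format.
CHUNK=$(( RATE * BYTES_PER_SAMPLE * CHANNELS / 10 ))

# play_modem DEVICE: hypothetical helper that plays the port live
# with a buffer of one chunk.
play_modem() {
    stty -F "$1" raw
    play --buffer "$CHUNK" -t raw -r "$RATE" -c "$CHANNELS" \
        -e signed-integer -b 16 "$1"
}
```

Usage would be play_modem /dev/ttyUSB0. Whether one chunk is the best buffer size is a guess; it is simply the smallest unit the modem is said to deliver.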