Converting 8kHz mulaw to PCM 16kHz - node.js

I'm trying to receive a conversation streamed from Twilio as 8 kHz mu-law, and I want to convert it to 16 kHz PCM for some processing (which doesn't support the 8 kHz mu-law format). I tried this method, but without success:
- decode the base64 string payload into a buffer.
- convert the buffer to a Uint8Array with this package: buffer-to-uint8array.
- convert the Uint8Array to an Int16Array with this package: alawmulaw.
- then use the wav library to write the results.
I am still unable to get a valid audio file following this process. Can someone tell me what I am doing wrong, or guide me to achieve this?

I've had good luck using the WaveFile lib (https://www.npmjs.com/package/wavefile)
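const { WaveFile } = require('wavefile');

const wav = new WaveFile();
// 1 channel, 8000 Hz, '8m' = 8-bit mu-law; payload is the base64 string from Twilio
wav.fromScratch(1, 8000, '8m', Buffer.from(payload, "base64"));
// Decode the mu-law samples to 16-bit PCM
wav.fromMuLaw();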
// You can resample.
wav.toSampleRate(16000);
// You can write this straight to a file (will have the headers)
const results = wav.toBuffer();
// Or you can access the samples without the WAV header
const samples = wav.data.samples;
Hope that helps!
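If you would rather see what the conversion involves without a library, here is a minimal sketch in plain Node.js. It decodes G.711 mu-law bytes to 16-bit PCM and then doubles the sample rate by linear interpolation. The function names are made up for illustration, and the interpolation is a naive stand-in with no low-pass filtering, so it is not what WaveFile's toSampleRate does internally.

```javascript
// Decode G.711 mu-law bytes (Buffer or Uint8Array) to 16-bit signed PCM.
function muLawToPcm16(muLawBytes) {
  const pcm = new Int16Array(muLawBytes.length);
  for (let i = 0; i < muLawBytes.length; i++) {
    const u = ~muLawBytes[i] & 0xff;      // mu-law bytes are stored inverted
    const sign = u & 0x80;
    const exponent = (u >> 4) & 0x07;
    const mantissa = u & 0x0f;
    const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    pcm[i] = sign ? -magnitude : magnitude;
  }
  return pcm;
}

// Naive 2x upsampling (e.g. 8 kHz -> 16 kHz): keep every input sample and
// insert the average of each neighbouring pair. Assumption: good enough for
// a sketch, a real resampler would also apply a low-pass filter.
function upsample2xLinear(samples) {
  const out = new Int16Array(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    const current = samples[i];
    const next = i + 1 < samples.length ? samples[i + 1] : current;
    out[2 * i] = current;
    out[2 * i + 1] = Math.round((current + next) / 2);
  }
  return out;
}

// Hypothetical usage with the Twilio payload from the question:
// const pcm16k = upsample2xLinear(muLawToPcm16(Buffer.from(payload, 'base64')));
```

The result is raw 16-bit PCM samples with no WAV header, so it still has to be wrapped in a container before an audio player will open it.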

Related

fluent-ffmpeg: Convert wav to alac .m4a with streams

On my Node.js server I want to convert a WAV file to Apple Lossless (.m4a).
Using fluent-ffmpeg, I got this so far:
const fs = require('fs');
const ffmpeg = require('fluent-ffmpeg');

const transcoder = ffmpeg(fs.createReadStream(`${__dirname}/convertTest.wav`));
transcoder
  .withAudioCodec('alac')
  .addOutput(fs.createWriteStream(`${__dirname}/test2.m4a`))
  .run();
But it throws me the following error:
Error: ffmpeg exited with code 1: Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:0 --
Conversion failed!
I read that the MP4 container needs a seekable file and therefore doesn't work with streams. So this actually works:
const transcoder = ffmpeg(`${__dirname}/convertTest.wav`);
transcoder
  .withAudioCodec('alac')
  // save() sets the output and starts ffmpeg itself, so no extra run() is needed
  .save(`${__dirname}/test2.m4a`);
Since I have all files as streams and not physical files, I am looking for a way to abstract this away so that it works with streams. Is this possible with fluent-ffmpeg?
The ALAC codec and the .m4a format are not optional, so I need it to work with those formats.
Turns out that ALAC in an .m4a file does not work with output streams, because the file header has to be accessed at different times, which requires a seekable output. So I had to use it without streams.
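One way to keep a stream-based interface anyway is to let ffmpeg write to a temporary file and hand back a read stream over the finished file. The following is only a sketch using Node's standard library: streamViaTempFile and its produce callback are hypothetical names, and the fluent-ffmpeg call appears only in a comment as an untested illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Runs produce(tmpPath), which is expected to write a complete file to
// tmpPath (for example an ffmpeg job that saves there), then returns a
// readable stream over that file. The temp directory is deleted once the
// returned stream has been closed.
async function streamViaTempFile(extension, produce) {
  const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'transcode-'));
  const tmpPath = path.join(dir, `output${extension}`);
  try {
    await produce(tmpPath);
  } catch (err) {
    await fs.promises.rm(dir, { recursive: true, force: true });
    throw err;
  }
  const stream = fs.createReadStream(tmpPath);
  stream.once('close', () => {
    fs.promises.rm(dir, { recursive: true, force: true }).catch(() => {});
  });
  return stream;
}

// Hypothetical usage with fluent-ffmpeg (assumes `ffmpeg` and `inputStream` exist):
// const m4aStream = await streamViaTempFile('.m4a', (tmpPath) =>
//   new Promise((resolve, reject) => {
//     ffmpeg(inputStream)
//       .withAudioCodec('alac')
//       .on('end', resolve)
//       .on('error', reject)
//       .save(tmpPath);
//   }));
```

This does not make the encoder itself streamable. It only hides the temporary file behind a stream, so the whole output has to be written before the first byte can be read.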

How to input audio as bytes in moviepy

I have audio as bytes in the form of:
b'ID3\x04\x00\x00\x00\x00\x00#TSSE\x00\x00\x00\x0f\x00\x00\x03Lavf57.71.100\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\...
That I got from Amazon Web Services:
import boto3

client = boto3.client('polly')
response = client.synthesize_speech(
    Engine='neural',
    LanguageCode='en-GB',
    OutputFormat='mp3',
    SampleRate='8000',
    Text='hey whats up this is a test',
    VoiceId='Brian'
)
And I want to load it into a moviepy audio clip using
AudioFileClip()
AudioFileClip takes a filename or an array representing a sound. I know I can save the audio as a file and read it, but I would like AudioFileClip to take the bytes output I showed above.
I tried:
AudioFileClip(response['AudioStream'].read())
But this gives the error:
TypeError: endswith first arg must be bytes or a tuple of bytes, not str
What can I do?
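The problem is a type mismatch (that's why it's called a TypeError), but note the direction: the message says endswith was called on a bytes object with a str argument. AudioFileClip treats its argument as a filename, which is a str, while response['AudioStream'].read() returns bytes, so the raw bytes cannot be passed in directly. The simplest workaround is to write the bytes to a temporary file and pass that file's path to AudioFileClip.
For background on the difference between str and bytes, see the bytearray function:
https://docs.python.org/3/library/functions.html#func-bytearray
You can also look at this question:
Best way to convert string to bytes in Python 3?
For more help just comment on this answer, and I'll try to help you as soon as possible.
Hope this can help you on your project,
PythonMasterLua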

Convert from PCM to WAV. Is it Possible?

I have an application for iPad.
This application records the voice from the microphone.
The audio must be available as PCM, MP3 and WAV files. I get the MP3 file by starting from the original raw file and converting it with LAME.
Unfortunately I have not found any example that shows how to convert a PCM file to a WAV file.
I just noticed that if I give the raw file a .wav extension, the application saves it without problems, so I think that no conversion is needed between PCM and WAV files.
Correct?
PS: Sorry for my English ... I use Google Translate
WAV is a kind of box, and PCM is what is in the box. There are many container formats like this, for example MP4. MP4 can contain audio, video or both. It can also contain multiple video or audio streams. Or take zip files. Zip files can contain text files, but they can also contain images, PDFs, and so on. You can't ask "how can I convert a zip file to the text file inside the zip".
If you want to convert PCM data to a WAVE file you should not have many problems, because WAV files are quite simple files. Take a look at this:
(See also WAVE PCM soundfile format.)
You first need that header, and after it you can just append all your PCM data (see the data field).
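As a rough illustration of "header first, then the PCM data", here is a minimal Node.js sketch that builds the 44-byte canonical header for uncompressed PCM and prepends it to a buffer of samples. The function names are made up for this example, and the sample rate, channel count and bit depth are parameters you have to supply yourself, because raw PCM does not carry that information.

```javascript
// Build the 44-byte canonical WAV (RIFF) header for uncompressed PCM.
function buildWavHeader(dataLength, sampleRate, numChannels, bitsPerSample) {
  const blockAlign = numChannels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;             // bytes per second
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + dataLength, 4);  // size of everything after this field
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);              // size of the fmt chunk for PCM
  header.writeUInt16LE(1, 20);               // audio format 1 = PCM
  header.writeUInt16LE(numChannels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(dataLength, 40);      // number of PCM bytes that follow
  return header;
}

// Prepend the header to a Buffer of raw little-endian PCM samples.
function pcmToWav(pcm, sampleRate, numChannels, bitsPerSample) {
  const header = buildWavHeader(pcm.length, sampleRate, numChannels, bitsPerSample);
  return Buffer.concat([header, pcm]);
}
```

For example, pcmToWav(pcmBuffer, 8000, 1, 16) would describe 8 kHz mono 16-bit audio. If those values do not match the actual data, the file will still open but will play at the wrong speed or as noise.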
Converting PCM to WAV isn't too hard. Both formats contain raw PCM data; the only difference is the header (WAV has one, raw PCM doesn't). So adding a WAV header will do the trick: just take the PCM data and put the WAV header in front of it. To add a WAV header to PCM data, check this link.
I was working on a system that accepts only WAV files, but what I was receiving from Amazon Polly was PCM, so I finally did this and got my issue resolved. Hope it helps someone. This is an example in Node.js.
// https://github.com/TooTallNate/node-wav
const Stream = require('stream');
const FileWriter = require('wav').FileWriter;

// Wrap the raw PCM buffer returned by Polly in a readable stream
function bufferToStream(binary) {
  return new Stream.Readable({
    read() {
      this.push(binary);
      this.push(null);
    }
  });
}

// FileWriter writes the WAV header; these values must match the PCM data
const audioStream = bufferToStream(res.AudioStream);
const outputFileStream = new FileWriter(`${outputFileFolder}/wav/${outputFileName}.wav`, {
  sampleRate: 8000,
  channels: 1
});
audioStream.pipe(outputFileStream);

extract each frame from rtsp (mp4) stream

I'm trying to extract each frame from an RTSP MP4 stream and convert it into a JPEG/GIF using ffmpeg. I'm getting the SDP header from 000001b0.....000001b5 and adding that to a byte array, then capturing a frame starting from 000001b6 and appending it to the byte array.
When I flush it to a file (.mpg) and run ffmpeg on it, ffmpeg throws errors and does not convert it.
My header looks like 000001B008000001B58913000001000000012000C488BA98514043C1463F, and after this I'm appending a frame (starting from 000001b6).
I did something similar with FFmpeg, and it seems that the frame data you get from FFmpeg already contains the frame header, which is all you need to transcode the data. Please make sure that you decode the MP4 data to a raw format (RGB24 for instance), then convert it with libswscale to the pixel format the JPEG/GIF encoder expects (probably a YUV format) before passing the data to the encoder.
Depending on the codec, you may not have to add anything, or you may have to add a lot.
This is referred to as de-packetization, and MPEG4-ES has no packetization model... H264 has many, depending on the profile.
Check out the RFCs.
Either 3016 or 3640 should help you.
https://www.rfc-editor.org/rfc/rfc3640
https://www.rfc-editor.org/rfc/rfc3016
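To check what is actually in the byte array before handing it to ffmpeg, it can help to list the MPEG-4 start codes it contains. Below is a small Node.js sketch that scans a buffer for the 000001 prefix and splits it into the configuration header and the individual frames starting at 000001B6, as described in the question. The function names are invented for this example, and it assumes the payload bytes never contain the 000001 sequence themselves. It does no decoding and does not guarantee that ffmpeg will accept the result.

```javascript
// Find every 00 00 01 xx start code and report its offset and code byte.
function findStartCodes(buf) {
  const found = [];
  for (let i = 0; i + 3 < buf.length; i++) {
    if (buf[i] === 0x00 && buf[i + 1] === 0x00 && buf[i + 2] === 0x01) {
      found.push({ offset: i, code: buf[i + 3] });
      i += 3; // continue scanning after this start code
    }
  }
  return found;
}

const VOP_START_CODE = 0xb6; // 000001B6 marks the start of a coded frame (VOP)

// Split a buffer into the bytes before the first frame (the header)
// and one slice per frame, each beginning at a 000001B6 start code.
function splitHeaderAndFrames(buf) {
  const vopOffsets = findStartCodes(buf)
    .filter((s) => s.code === VOP_START_CODE)
    .map((s) => s.offset);
  if (vopOffsets.length === 0) {
    return { header: buf, frames: [] };
  }
  const header = buf.subarray(0, vopOffsets[0]);
  const frames = vopOffsets.map((start, k) =>
    buf.subarray(start, k + 1 < vopOffsets.length ? vopOffsets[k + 1] : buf.length));
  return { header, frames };
}
```

Run against the header from the question, findStartCodes reports the codes B0, B5, 00 and 20 at offsets 0, 5, 11 and 15, which is a quick way to confirm the header bytes were captured intact.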

Does OpenSL support "returning decoded audio buffer" for MP3/AAC?

Does OpenSL support "returning decoded audio buffer" for MP3/AAC, as OpenMAX IL does?
I am creating an app that takes MP3/AAC as input, and I want to use OpenSL as a decoder, not as the player. I need the decoded PCM data back in my app, so that I can play it or do something else with that buffer later.
I can't find any related APIs for this in the OpenSL spec.
I haven't tried this, but perhaps you could set up a data source using a URI data locator with a MIME data format, with an Android audio buffer as the sink, then access the decoded data that way?
