Amazon Alexa Audio Encoding- Few audios are not playing

I am encoding audio for Alexa using ffmpeg as follows:
ffmpeg -i <input-file> -ac 2 -codec:a libmp3lame -b:a 48k -ar 16000 <output-file.mp3>
The problem is that some of the files play properly but others do not. I am using the same settings (Project Rate 16000 Hz and Quality 48 kbps) for every file that needs to be converted. Does anybody know whether the source MP3 must meet some minimum quality to be encoded at a Project Rate of 16000 and a Quality of 48 kbps?
The response I get from Alexa for a faulty file is, "There is a problem with skill response".

Related

Dealing with problems in FLAC audio files with ffmpeg

I have gotten a set of FLAC (audio) files from a friend. I copied them to my Sonos music library, and got set to enjoy a nice album. Unfortunately, Sonos would not play the files. As a result I have been getting to know ffmpeg.
Sonos' complaint with the FLAC files was that it was "encoded at an unsupported sample rate". With rolling eyes and shaking head, I note that the free VLC media player happily plays these files, but the product I've paid for (Sonos) - does not. But I digress...
ffprobe revealed that the FLAC files contain both an Audio channel and a Video channel:
$ ffprobe -hide_banner -show_streams "/path/to/Myaudio.flac"
Duration: 00:02:23.17, start: 0.000000, bitrate: 6176 kb/s
Stream #0:0: Audio: flac, 176400 Hz, stereo, s32 (24 bit)
Stream #0:1: Video: mjpeg (Progressive), yuvj444p(pc, bt470bg/unknown/unknown), 450x446 [SAR 72:72 DAR 225:223], 90k tbr, 90k tbn, 90k tbc (attached pic)
Metadata:
comment : Cover (front)
Cool! I guess this is how some audio players are able to display the 'album artwork' when they play a song? Note also that the Audio stream is reported at 176400 Hz! Apparently I'm out of touch; I thought that 44.1khz sampling rate effectively removed all of the 'sampling artifacts' we could hear. Anyway, I learned that Sonos would support a max of 48kHz sampling rate, and this (the 176.4kHz rate) is what Sonos was unhappy about. I used ffmpeg to 'dumb it down' for them:
$ ffmpeg -i "/path/to/Myaudio.flac" -sample_fmt s32 -ar 48000 "/path/to/Myaudio48K.flac"
This seemed to work - at least I got a FLAC file that Sonos would play. However, I also got what looks like a warning of some sort:
[swscaler # 0x108e0d000] deprecated pixel format used, make sure you did set range correctly
[flac # 0x7feefd812a00] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
A bit more research turned up this answer, which I don't quite understand, and a comment on it says "not to worry", at least with respect to the swscaler part of the warning.
And that (finally) brings me to my questions:
1.a. What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
1.b. How should I use ffmpeg to modify the Video channel to set these specifications (ref. Q 1.a.)?
2.a. How do I remove the Video channel from the .flac audio file?
2.b. How do I add a Video channel into a .flac file?
EDIT:
I asked the above (4) questions after failing to accomplish a 'direct' conversion (a single ffmpeg command) from FLAC at 176.4 kHz to ALAC (.m4a) at 48 kHz (max supported by Sonos). I reasoned that an 'incremental' approach through a series of conversions might get me there. With the advantage of hindsight, I now see I should have posted my original failed direct conversion incantation... we live and learn.
That said, the accepted answer below meets my final objective to convert a FLAC file encoded at 176.4kHz to an ALAC (.m4a) at 48kHz, and preserve the cover art/video channel.

What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
Cover art is just a single frame, so framerate has no relevance in this case. However, you don't want a video stream; it has to remain a single image, so -vsync 0 should be added. "Muxer" is simply the term for the packager used in media file processing. It is determined by the choice of output format, e.g. FLAC, WAV, etc. What's important is the codec for the cover art; usually it's PNG or JPEG. For FLAC, PNG is the default codec.
How do I remove the Video channel from the .flac audio file?
ffmpeg -i "/path/to/Myaudio.flac" -vn -c copy "/path/to/Myaudio48K.flac"
(All this does is skip any video in the input and copy everything else)
How do I add a Video channel into a .flac file?
To add cover art to audio-only formats such as MP3 or FLAC, the video stream has to have a disposition of attached picture. So,
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -sample_fmt s32 -ar 48000 -disposition:v attached_pic -vsync 0 "/path/to/Myaudio48K.flac"
For direct conversion to ALAC, use
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -ar 48000 -c:a alac -disposition:v attached_pic -vsync 0 -c:v png "/path/to/Myaudio48K.m4a"
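The commands in this answer can also be assembled programmatically, for example when a whole album needs the same treatment. The following is a minimal Python sketch and not part of the original answer: the function names and file paths are hypothetical, and the flags are copied from the commands above. It only builds and prints the argument lists; it does not run ffmpeg.

```python
import shlex


def strip_video_args(src, dst):
    """Argument list that drops any video stream and copies the rest."""
    return ["ffmpeg", "-i", src, "-vn", "-c", "copy", dst]


def alac_with_cover_args(src, cover, dst, sample_rate=48000):
    """Argument list for the direct conversion to ALAC with cover art."""
    return [
        "ffmpeg",
        "-i", src,                           # audio source
        "-i", cover,                         # cover image
        "-ar", str(sample_rate),             # resample the audio
        "-c:a", "alac",                      # Apple Lossless audio codec
        "-disposition:v", "attached_pic",    # mark the image as cover art
        "-vsync", "0",                       # keep it a single frame
        "-c:v", "png",
        dst,
    ]


# Hypothetical paths, for illustration only.
print(shlex.join(strip_video_args("Myaudio.flac", "Myaudio-noart.flac")))
print(shlex.join(alac_with_cover_args("Myaudio.flac", "cover.png", "Myaudio48K.m4a")))
```

Passing either list to subprocess.run(args, check=True) would execute it, assuming ffmpeg is on the PATH.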

Convert audio from old video game

I have some audio files from an old video game in a very rare format:
22050 Hz, 2 channels, 4-bit PCM (not ADPCM).
Is there any tool around that can convert them to a modern format?
I tried ffplay -ac 2 -acodec adpcm_ima_apc -i $audiofile, but that did not work out (it played back, but with terrible quality).
I uploaded one of the files here (be sure to use the right download button, not the one from the ad):
https://www.file-upload.net/download-13698824/Morningmood.wav.html

Optimised FFMPEG parameters

How do I optimize the following command to meet the specifications below? The current one lags and pauses randomly while streaming.
ffmpeg -re -y -i FILENAME.mp4 -vcodec libx264 -b:v 600k -filter:v yadif -ac 1 -ar 44100 -f flv "rtmp://..."
Video Format
Maximum 720p (720 x 1280) resolution, at 30 frames per second. (or 1 key frame every 2 seconds).
You must send an I-frame (keyframe) at least once every two seconds throughout the stream.
Recommended max bit rate is 4000 Kbps.
The accepts H264 encoded video and AAC encoded audio only.
Advanced Settings
Pixel Aspect Ratio: Square.
Frame Types: Progressive Scan.
Audio Sample Rate: 44.1 KHz.
Audio Bitrate: 128 Kbps stereo.
Bitrate Encoding: CBR.
My file is an MP4 generated with iMovie. Thanks in advance!

Use VBV by adding -maxrate and -bufsize. Use a -maxrate value that is below your maximum upload rate and leave some room for overhead.
Use the -g option for your keyframe interval with a value that is double that of your frame rate.
Make sure you're not using an ancient version of ffmpeg and x264.
See more details at FFmpeg Wiki: Encoding for Streaming Sites.
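The advice above can be sketched as a small Python helper that derives the keyframe interval from the frame rate and caps the bitrate with -maxrate and -bufsize. This is an illustration under stated assumptions, not a tuned configuration: the function name, the 2500 kbps video bitrate, the buffer size of twice the maximum rate, and the RTMP URL are all hypothetical. The audio settings follow the specification quoted in the question.

```python
import shlex


def streaming_args(src, url, fps=30, video_kbps=2500, audio_kbps=128):
    """Build an ffmpeg argument list for an RTMP stream (assumed values)."""
    gop = 2 * fps  # -g is double the frame rate: one keyframe every 2 seconds
    return [
        "ffmpeg", "-re", "-i", src,
        "-c:v", "libx264",
        "-b:v", f"{video_kbps}k",
        "-maxrate", f"{video_kbps}k",      # VBV cap, keep below the upload rate
        "-bufsize", f"{2 * video_kbps}k",  # assumed starting point: 2 x maxrate
        "-r", str(fps),
        "-g", str(gop),
        "-c:a", "aac",
        "-b:a", f"{audio_kbps}k",
        "-ar", "44100",
        "-ac", "2",
        "-f", "flv", url,
    ]


# Hypothetical input file and endpoint, for illustration only.
print(shlex.join(streaming_args("FILENAME.mp4", "rtmp://example.invalid/live/stream-key")))
```

Setting -b:v equal to -maxrate only approximates constant bitrate; the values would still need adjusting to the actual upload bandwidth.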

These look like the specs for the Facebook Live API.
It is doubtful that the random lags and pauses originate on your end (or at least not in the encoding), but do check your network connection to make sure you have more than enough bandwidth and relatively low latency. Facebook Live operates with low latency, so your connection needs to be close to flawless.

How to convert MP3's to constant bitrate using FFMPEG

I have found that MP3s encoded with a variable bit rate cause the currentTime property to be reported incorrectly, especially when scrubbing. That has wreaked havoc on my app and has been a nightmare to debug.
I believe I need to convert all my MP3s to a constant bit rate. Can FFMPEG (or something else) help me do that efficiently?
Props to Terrill Thompson for attempting to pin this down.

I also had issues with HTML5 audio being inaccurate for large MP3s. Since quality was not a big issue for my audio, I converted to a constant bit rate of 8 kbps, a sample rate of 8 kHz, and mono, and that solved my issues.
You can convert a few files to a constant bit rate using Audacity (export > save to mp3 > constant bit rate).
Or, using FFMPEG:
ffmpeg -i input.wav -codec:a libmp3lame -b:a 8k output.mp3
If you also want to reduce to mono and an 8 kHz sample rate:
ffmpeg -i input.wav -codec:a libmp3lame -b:a 8k -ac 1 -ar 8000 output.mp3
The second command compressed an hour of audio to under 5 MB.
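The file size quoted here follows directly from the bit rate, since a constant-bit-rate stream takes up bit rate times duration. A short Python check of that arithmetic (the function name is hypothetical, and the estimate ignores ID3 tags and other metadata):

```python
def cbr_size_bytes(bitrate_kbps, duration_s):
    """Approximate size of a constant-bit-rate stream in bytes."""
    # kilobits per second -> bits in total -> bytes
    return bitrate_kbps * 1000 * duration_s // 8


one_hour = 60 * 60
size = cbr_size_bytes(8, one_hour)
print(f"{size / 1_000_000:.1f} MB")  # prints "3.6 MB"
```

At 8 kbps an hour of audio comes to about 3.6 MB, which is consistent with the figure above.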

Something else is going on. currentTime should not be influenced by the fact that you are using variable-bit rate MP3s.
Perhaps the context sampleRate is not the same as the sample rate of the MP3s? That will mess up the timing of the audio samples, because WebAudio will resample the MP3s to the context sample rate.

Azure live streaming with already encoded content

I have been looking into Azure live streaming and am very impressed with the features it offers.
What I would like to know is whether we can live stream from an already encoded video asset rather than a live recording.
For example, I might want to stream an event at a specific time using existing VOD content on Azure.
I am not sure whether Wirecast supports this.
Any help or suggestions would be appreciated.
Thanks

I tested this right after reading your question, but so far I have not been able to publish an existing, already-uploaded Asset as a streaming source using Azure Media Services alone.
In the case of Wirecast, it can serve media files for streaming, as the manual describes on page 36:
Wirecast uses the concept of a shot to construct presentations. A shot contains media, along with the settings for that media. In its simplest form, a shot contains one piece of media such as a photo or a video clip. But it can also be something more complex, like a live camera with a title, and background music, or even a Playlist of shots.
But if you only want to serve a file without editing, you can use a simple encoder program such as FFmpeg from your computer (or a virtual machine) to transmit it, as the documentation below suggests.
https://azure.microsoft.com/ko-kr/blog/azure-media-services-rtmp-support-and-live-encoders/
The FFmpeg command-line example at the above link is as follows:
C:\tools\ffmpeg\bin\ffmpeg.exe -v verbose -i MysampleVideo.mp4 -strict -2 -c:a aac -b:a 128k -ar 44100 -r 30 -g 60 -keyint_min 60 -b:v 400000 -c:v libx264 -preset medium -bufsize 400k -maxrate 400k -f flv rtmp://channel001-streamingtest.channel.media.windows.net:1935/live/a9bcd589da4b42409936940/mystream1
