Is there a case where a video file could contain both MJPEG frames and a sound layer? I know that originally, people used to place an 8 kHz uncompressed PCM track alongside their MJPEG movie, since it is streamed/decoded/played frame by frame with no motion prediction needed. Can some decoders accept MJPEG with a more recent audio format?
[EDIT 1]
What I'll try first is to check whether ffmpeg handles the conversion of audio/video movies to MJPEG with audio, and I'll explore the header and the layers with a hex editor.
[EDIT 2]
OK. I've studied an MJPEG with audio:
ffmpeg -i some_movie_with_music.mp4 -f avi -acodec mp3 -vcodec mjpeg mjpegWithSound.avi
And there's an MP3 file split across the total number of frames, one chunk after each JPEG, plus some changes in the header. So it's easy to implement in a context where a mobile application would offer the user the opportunity to add an MP3 file to a series of JPEGs or to a movie. So, one more reason to use MJPEG when a platform has no encoder yet.
It's fun to watch your application take shape. :-) I'm going to assume this is a follow-on to your last question and that you want to write C# code to accomplish this task. Are you still writing this into an AVI container? AVI stands for "Audio Video Interleave" and is designed to transport both audio and video.
So, yes, you should be able to write both MJPEG and audio into an AVI file.
Guess what! You have lots of options for audio codecs too. We haven't cataloged quite as many audio codecs as video codecs (but close). Good news, though: implementing a basic audio encoder in pure C# should be much simpler than trying to port even an MPEG-1 video encoder. Alternatively, check around to see if you can find an MP3 encoder written in pure C#; AVI accommodates MP3. If not, try IMA ADPCM. It's easy to implement and gives you 4:1 compression. Thus, a monophonic, 44100 Hz, 16-bit stream requires 88200 bytes/sec; IMA ADPCM will bring that down to roughly 22050 bytes/sec (plus a small overhead).
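The bitrate arithmetic above can be sanity-checked in a couple of lines of shell (the variable names are just for illustration):

```shell
# Mono, 44100 Hz, 16-bit PCM: 2 bytes per sample.
pcm_bytes_per_sec=$((44100 * 2))
echo "$pcm_bytes_per_sec"          # 88200

# IMA ADPCM stores roughly 4 bits per sample (4:1 vs. 16-bit PCM),
# plus a small per-block header overhead not counted here.
adpcm_bytes_per_sec=$((pcm_bytes_per_sec / 4))
echo "$adpcm_bytes_per_sec"        # 22050
```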
I am recording audio from HTML and it is getting stored in .webm format.
I am feeding that audio to the Google Speech API to get a transcript from it.
I found out that .flac is lossless, so I converted the audio from WebM to FLAC using FFmpeg.
But I have one doubt: converting the audio from WebM to FLAC increases the file size, but if the audio is already lossy in the WebM format, converting it to FLAC will still be lossy, because the information is already lost.
Am I wrong with this assumption?
Am I wrong with this assumption?
No. FLAC conversion will only preserve the data in the source file. Any data lost during the original conversion to the WebM codec (Opus/Vorbis) is gone.
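For reference, the conversion itself is a one-liner; this is a sketch assuming the recording is named input.webm:

```shell
# Decode the lossy Opus/Vorbis stream and re-encode it losslessly as FLAC.
# The FLAC file faithfully preserves the already-degraded signal; it cannot
# restore what the lossy encoder discarded, so the larger file adds no quality.
ffmpeg -i input.webm -vn -c:a flac output.flac
```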
I'm looking to convert some audio files into spectrograms. I'm wondering what the difference is between an M4A and a WAV file. If I have two copies of the same audio recording, one saved as WAV and the other as M4A, will there be a difference in the spectrogram representations of the two?
Both WAV and M4A are container formats, with options for how exactly the audio data is encoded and represented inside the file. A WAV file has one audio track with a variety of encoding options, including some of those possible for the M4A format. However, WAV most often contains uncompressed audio, with the data stored in PCM format.
M4A files are MP4 (MPEG-4 Part 14) files with the implication that there is a single audio track inside. There are far fewer encoding options, even though they still include both compressed and uncompressed ones. Most often M4A carries audio encoded with AAC, which is a lossy encoding. Depending on that loss, roughly speaking on how much information was discarded during encoding, your spectrogram could differ from one built on the original data.
The M4A format uses a lossy compression algorithm, so there may be differences, depending on the compression level and on the resolution and depth of the spectrogram. The WAV format can also be lossy, due to quantization of the sound by an A/D converter or any sample format/rate conversions. So the difference may show up in the noise floor, or in the portions of the sound's spectrum that are usually inaudible to humans (due to masking effects, etc.).
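One direct way to see the difference is to render a spectrogram of each file with FFmpeg's showspectrumpic filter and compare the images (the filenames here are placeholders):

```shell
# Render a spectrogram image for each copy of the recording,
# then compare the two images visually.
ffmpeg -i recording.wav -lavfi showspectrumpic=s=1024x512 wav_spec.png
ffmpeg -i recording.m4a -lavfi showspectrumpic=s=1024x512 m4a_spec.png
```

Differences typically show up in the upper frequency bands, which lossy encoders attenuate or cut first.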
I'm making an mp3 from a flac file with ffmpeg. This is usually hum-dum for me.
Tonight, for some reason, the converted audio is distorting when I use the same commands I've always used. After troubleshooting, it appears the problem is the "-out_sample_rate" flag.
My command:
ffmpeg -i input.flac -write_id3v1 1 -id3v2_version 3 -dither_method modified_e_weighted -out_sample_rate 44.1k -b:a 320k output.mp3
The audio in the MP3 is then incredibly distorted: the gain is jacked up, resulting in digital clipping.
I've tried updating ffmpeg, and the problem remains. I've tried converting various sample rates (44.1k, 48k, and 96k source files) to both 44.1k and 48k MP3s; the problem remains whenever a conversion takes place.
I'm on macOS, and I installed ffmpeg via homebrew.
Any ideas?
Are you sure the distortion comes from resampling?
Even the poorest resampling algorithm doesn't distort. More typical artifacts of poor resampling are harsh high frequencies due to aliasing and quantisation noise.
FFmpeg's resampler isn't the very best, but it isn't bad at all and shouldn't lead to distortion; it's enough for average use.
How much headroom does the source file have?
If there isn't enough, the resampling or the MP3 conversion may lead to clipping. The MP3 encoder removes frequencies from the signal (even at 320 kbps), so the waveform will change.
So reimport the encoded MP3 into an audio editor and look for clipping.
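If you don't have an audio editor at hand, FFmpeg's volumedetect filter reports the peak level directly:

```shell
# Prints max_volume (peak) and mean_volume to stderr without writing a file.
# A max_volume at or near 0.0 dB means there is no headroom left, and
# clipping after resampling or MP3 encoding is likely.
ffmpeg -i output.mp3 -af volumedetect -f null -
```

Run it on the source FLAC as well to see how much headroom you start with.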
If you're not sure which step the distortion comes from, split the command up and check which step leads to clipping:
ffmpeg -i input.flac -write_id3v1 1 -id3v2_version 3 -dither_method modified_e_weighted -out_sample_rate 44.1k intermediate.flac
ffmpeg -i intermediate.flac -b:a 320k output.mp3
There should be at least 1 dB of headroom left before the file is converted to MP3. If not, lower the gain first.
If the resampling to intermediate.flac leads to a significant gain in amplitude, the original input.flac is poorly mastered. If so (and if quality really matters), do the sample-rate conversion in an audio editor (e.g. Audacity, which does a better resampling job than FFmpeg) and apply a limiter between the resampling and dithering steps to tame the few strong peaks nicely.
If that doesn't help: what exactly does input.flac contain? Music? Noise? Speech? And is it self-made or taken from somewhere else?
I need the audio files for the YUV test video sequences like foreman.yuv, akiyo.yuv, etc. None of the YUV sequences available online come with an audio file.
Any other YUV sequences with audio suitable for encoder analysis would also work.
I think there is no audio available for the sequences you mention.
But have a look at https://media.xiph.org/video/derf/; at the bottom of the page there are full sequences with FLAC audio.
When merging audio and video with ffmpeg, the quality of the resulting video goes down. How do we improve the quality (especially the audio quality)?
The audio quality gets degraded to quite an extent with random screeches in between.
I've tried converting the audio to MP3 and the video to MP4 before merging them, and tried various audio/video parameters like bitrate, sample rate, qscale, etc., but I'm still unsuccessful.
Any help would be greatly appreciated!
The -acodec copy command-line option should just copy the audio stream and not re-encode it; for the video stream, use -vcodec copy. You can also use the -sameq option for the video stream (note that -sameq is deprecated and has been removed from recent FFmpeg versions).
See this answer for a little more detail.
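A typical merge with stream copy looks like this (filenames are placeholders); since neither stream is re-encoded, the quality cannot degrade:

```shell
# Mux the existing video and audio streams into one file without re-encoding.
# MKV is used as the output container because it accepts almost any codec pair.
ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -c:a copy merged.mkv
```

Stream copy only works when the container supports the codecs involved; if it doesn't, re-encode just the offending stream with an explicit bitrate (e.g. -b:a 320k).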
I tried using -sameq; yeah, it gives good video quality, but what I noticed was that the file size increases.
Using -q:v 1 applies the best possible quality. The number ranges from 1 to 31, where 1 is the best quality.