I have an .m4a audio file of 805 KB that I want to convert to WAV. My goals are to (1) get the highest-quality .wav audio file and (2) get a 320 kbps .mp3 audio file.
I used an online service (https://www.online-convert.com) to convert it. When I converted it directly without any optional settings, the file size increased to 11.6 MB. When I converted the same audio with the bit resolution changed to 32 bit and the sampling rate changed to 96000 Hz, the file size jumped to 50.7 MB.
There are only two optional settings in the web service:
Bit resolution - no change, 8 bit, 16 bit, 24 bit, 32 bit
Sampling rate - no change, 1000 Hz, 8000, 11025, 16000, 22050, 24000, 32000, 44100, 48000, 96000 Hz
There is also one radio button for "Normalize audio" that can be checked and unchecked.
Can someone explain why the file size increases and what settings I must keep to get the highest quality from the original 805 kb audio?
Thanks
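For reference, the size of an uncompressed PCM WAV file is just sample rate × bytes per sample × channels × duration, so the numbers above can be checked with a little arithmetic. The sketch below is only an illustration: the function names are made up, and it assumes the default conversion kept a 44100 Hz, 16-bit stream, which the question does not state.

```python
def pcm_bytes_per_second(sample_rate: int, bit_depth: int, channels: int) -> int:
    """Bytes needed to store one second of uncompressed PCM audio."""
    return sample_rate * (bit_depth // 8) * channels


def size_ratio(rate_a: int, bits_a: int, rate_b: int, bits_b: int) -> float:
    """How much larger format B is than format A for the same audio.

    Channel count and duration are the same on both sides, so they cancel.
    """
    return (rate_b * bits_b) / (rate_a * bits_a)


# Assumed default output: 44100 Hz / 16 bit. Chosen output: 96000 Hz / 32 bit.
ratio = size_ratio(44100, 16, 96000, 32)
print(f"size ratio: {ratio:.2f}x")
print(f"predicted size: {11.6 * ratio:.1f} MB")
```

Under that assumption the predicted size comes out at roughly 50.5 MB, close to the 50.7 MB reported. The extra bytes come from storing more and wider samples, not from any information that was absent from the 805 KB original.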
I have been trying for a long time to work out how to compute sampleCount, but I cannot find an answer. Is there an algorithm or formula for calculating it? Known values: the duration is 850 ms, the file size is 37 KB, and, for the resolution of the WAV file, the sampleRate is 48000. As a check, the result should be a sampleCount of 40681, which is the value I have in the file. I need this so that I can calculate sampleCount for other audio files. Thanks in advance for your help.
I found it: I get 40800 by multiplying the sample rate by the duration in seconds.
Yes, the sample count is equal to the sample rate, multiplied by the duration.
So for an audio file that is exactly 850 milliseconds, at 48 kHz sample rate:
0.850 s * 48000 samples/s = 40800 samples
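As a rough illustration of that formula, here is a small Python sketch. The function names are hypothetical; the second helper uses the standard-library wave module to read the exact frame count from a PCM WAV header, which is the more reliable source when the displayed duration has been rounded (40681 samples at 48 kHz is about 847.5 ms, not exactly 850 ms).

```python
import wave


def sample_count(duration_ms: float, sample_rate: int) -> int:
    """Samples per channel for a given duration, rounded to a whole sample."""
    return round(duration_ms * sample_rate / 1000)


def frames_in_wav(path_or_file) -> int:
    """Exact number of sample frames stored in a PCM WAV file.

    Accepts a file name or an open binary file object.
    """
    with wave.open(path_or_file, "rb") as wav:
        return wav.getnframes()


print(sample_count(850, 48000))  # 40800
```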
Now, with MP3s you have to be careful. There is some padding at the beginning of the file for cleanly initializing the decoder, and the amount of padding can vary based on the encoder and its configuration. (You can read all about the troubles this has caused on the Wikipedia page for "gapless playback".) Additionally, your MP3 duration will be determined on MP3 frame boundaries, and not arbitrary PCM boundaries... assuming your decoder/player does not support gapless playback.
In my country we always use 25 fps (PAL) for video and for audio.
Yesterday I recorded a TV movie with vdr (MPEG-TS format), and mediainfo reports the following for the audio and video.
The audio is MP2, the video is H.264.
Audio
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 2
Codec ID : 4
Duration : 3 h 58 min
Bit rate mode : Constant
Bit rate : 128 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Frame rate : 41.667 FPS (1152 SPF)
Compression mode : Lossy
Delay relative to video : -406 ms
Stream size : 219 MiB (6%)
Video
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L3
Format settings : CABAC / 4 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 4 frames
Format settings, picture structure : Frame
Codec ID : 27
Duration : 3 h 58 min
Bit rate : 1 915 kb/s
Width : 720 pixels
Height : 576 pixels
Display aspect ratio : 16:9
Frame rate : 25.000 FPS
How is it possible that the audio and video stay in sync when the audio has a frame rate of about 42 FPS?
If I want to re-encode it, do I have to re-encode the audio at 25 fps?
TLDR: You don't have to worry about it. Two different meanings for "frames per second".
MP3 is an interesting file format (and the same framing applies to the MP2 audio in your recording). It doesn't have a global header that represents the entire file. Instead, an MP3 is a concatenation of small individual pieces called "frames". Each frame is a few milliseconds in length. That's why you can often just chop an MP3 file in half and the second half plays just fine. It's also what enables VBR MP3 to exist. The sample rate or encoding parameters can change from one frame to the next.
So your particular audio stream has a "frame rate" of 41.667 frames per second. Now notice the SPF value of 1152 in parentheses. That's "samples per frame". If you do the math, 1152 samples/frame * 41.667 frames/second is almost exactly 48000 samples per second, identical to the sampling rate reported by the mediainfo tool.
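The same arithmetic can be run in the other direction. This is just a small sketch with made-up function names, using the values from the mediainfo output above:

```python
from fractions import Fraction


def audio_frame_rate(sample_rate: int, samples_per_frame: int) -> Fraction:
    """Audio frames per second = samples per second / samples per frame."""
    return Fraction(sample_rate, samples_per_frame)


def frame_duration_ms(sample_rate: int, samples_per_frame: int) -> Fraction:
    """Length of one audio frame in milliseconds."""
    return Fraction(1000 * samples_per_frame, sample_rate)


rate = audio_frame_rate(48000, 1152)
print(float(rate))                             # about 41.667 frames per second
print(float(frame_duration_ms(48000, 1152)))   # 24.0 ms per audio frame
```

So each audio frame covers 24 ms of sound, while each 25 fps video frame covers 40 ms. The two numbers describe how each codec packages its data, not how fast time passes.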
When a media player plays a video file, it decodes the video stream separately from the audio stream and schedules both against the same clock, so it takes very little effort to keep streams with different frame rates in sync.
As for your question about re-encoding: the encoding tool you use will do the right thing. The FPS of the audio is completely orthogonal to the video FPS.
I am trying to use a program called arss to create a spectrogram from a wav file. I have 2 wav files, one works and the other does not (it was converted to wav from mp3).
The error that arss throws at me is:
This WAVE file is not currently supported.
Which is fine, but I have no idea what parts of my wav file to change so that it will be supported. The docs don't help here (as far as I can tell)
When I run mediainfo on both wav files, I get the following specs:
working wav:
General
Complete name : working.wav
Format : Wave
File size : 1.15 MiB
Duration : 6 s 306 ms
Overall bit rate mode : Constant
Overall bit rate : 1 536 kb/s
Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 6 s 306 ms
Bit rate mode : Constant
Bit rate : 1 536 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Bit depth : 16 bits
Stream size : 1.15 MiB (100%)
not working wav:
General
Complete name : not_working.wav
Format : Wave
File size : 5.49 MiB
Duration : 30 s 0 ms
Overall bit rate mode : Constant
Overall bit rate : 1 536 kb/s
Writing application : Lavf57.83.100
Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 30 s 0 ms
Bit rate mode : Constant
Bit rate : 1 536 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Bit depth : 16 bits
Stream size : 5.49 MiB (100%)
Comparing the audio specs of both files, I can't tell any difference between anything other than the file size and duration. I even updated the Sampling rate of the non-working wav using ffmpeg so that it would match the working one at 48.0kHz, but no luck.
Any idea?
Both wav files are available here.
FFmpeg, by default, writes a LIST chunk, with some metadata, before the data chunk. ARSS has a rigid parser and expects the data chunk to start at a fixed byte offset (0x24). FFmpeg can be told to skip writing the LIST chunk using the bitexact option.
ffmpeg -i not_working.wav -c copy -bitexact new.wav
Note that ARSS doesn't check for sampling rate, only that WAVs have little endian PCM.
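To see where the data chunk actually sits in a given file, the chunk layout can be listed with a few lines of Python. This is a rough sketch, not ARSS's or FFmpeg's own code: the helper names are invented, it only walks the top-level RIFF chunks, and make_wav builds a tiny synthetic file purely for demonstration.

```python
import struct


def list_riff_chunks(data: bytes):
    """Return (chunk_id, offset, size) for each top-level chunk of a WAV file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12  # first chunk starts right after "RIFF", <size>, "WAVE"
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), pos, size))
        pos += 8 + size + (size & 1)  # chunk payloads are padded to an even length
    return chunks


def data_chunk_offset(data: bytes):
    """Byte offset of the 'data' chunk header, or None if there is none."""
    for chunk_id, offset, _size in list_riff_chunks(data):
        if chunk_id == "data":
            return offset
    return None


def make_wav(pcm: bytes, extra_chunks: bytes = b"") -> bytes:
    """Build a minimal 16-bit stereo 48 kHz WAV, optionally with chunks before 'data'."""
    fmt = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, 2, 48000, 48000 * 4, 4, 16)
    body = b"WAVE" + fmt + extra_chunks + struct.pack("<4sI", b"data", len(pcm)) + pcm
    return struct.pack("<4sI", b"RIFF", len(body)) + body


plain = make_wav(b"\x00" * 8)
print(hex(data_chunk_offset(plain)))  # 0x24 when 'data' directly follows a 16-byte 'fmt '
```

Reading a real file with open(path, "rb").read() and passing the bytes to list_riff_chunks should show a LIST entry ahead of data for the file that fails, if the explanation above applies to it.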
Here's a related Q, not quite a duplicate, linked for future readers:
ffmpeg - Making a Clean WAV file
I have an older version of ffmpeg, and the -bitexact option alone did not help much. That is, it did remove some of the data from the chunk, but the LIST chunk was still present with other data.
You may have to also ask for no metadata at all with the -map_metadata option like so:
ffmpeg ... -i <input> ... -flags +bitexact -map_metadata -1 ... <output>
(The ... stand for other command-line options as required in your case.)
Adding -map_metadata -1 really removed everything, and the LIST chunk is now fully gone.
We use FreePBX to record a conference line. This line appears not to have disconnected, and it created one continuous WAV file spanning 209 hours.
[matt@ait-debian ~/SLP ]$ mediainfo 7000-7000-always-20170823-162901-1503469728.35757-1503469748.wav
General
Complete name : 7000-7000-always-20170823-162901-1503469728.35757-1503469748.wav
Format : Wave
File size : 11.2 GiB
Duration : 209 h
Overall bit rate mode : Constant
Overall bit rate : 128 kb/s
Audio
Format : PCM
Format settings, Endianness : Little
Format settings, Sign : Signed
Codec ID : 1
Duration : 209 h
Bit rate mode : Constant
Bit rate : 128 kb/s
Channel(s) : 1 channel
Sampling rate : 8 000 Hz
Bit depth : 16 bits
Stream size : 11.2 GiB (100%)
But when I check with sox (Sound eXchange), it shows only 60 hours' worth of audio. VLC shows the same when listening to the file.
[matt@ait-debian ~/SLP ]$ soxi 7000-7000-always-20170823-162901-1503469728.35757-1503469748.wav
Input File : '7000-7000-always-20170823-162901-1503469728.35757-1503469748.wav'
Channels : 1
Sample Rate : 8000
Precision : 16-bit
Duration : 60:22:32.63 = 1738821024 samples ~ 1.63014e+07 CDDA sectors
File Size : 12.1G
Bit Rate : 444k
Sample Encoding: 16-bit Signed Integer PCM
The issue is that some time after those 60 hours, at about the 72-hour mark, another conference call was made, and I need the recording of it.
I would have thought that the conference line continued to record, so it should have captured this audio.
The problem is that VLC and SoX don't see it, but mediainfo says there are 209 h of audio. So which is correct? I would think that VLC and SoX should show a duration of 209 h.
Can anyone help or advise what happened?
I also posted this to reddit - https://www.reddit.com/r/linuxquestions/comments/6xbz8u/wav_file_missing_audio/
And I was able to get an answer:
Using Audacity:
Import the WAV file as Raw Data
Set the Encoding to Signed 16-bit PCM
Set the Start Offset to 44 bytes
Set the Sample Rate to 8000 Hz
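The same idea, ignoring the header and treating everything after byte 44 as raw 8000 Hz, 16-bit mono PCM, can be sketched in Python for anyone who would rather cut out just the part they need than load the whole file. The function names are made up for illustration, and the constants simply mirror the import settings above. One possible reason the players stop at 60 hours, which this thread does not confirm, is that the size fields in a WAV header are only 32 bits wide and cannot describe a file this large.

```python
import wave

HEADER_BYTES = 44   # start offset used in the Audacity import above
SAMPLE_RATE = 8000  # Hz
CHANNELS = 1
SAMPLE_WIDTH = 2    # bytes per sample (signed 16-bit PCM)
BYTES_PER_SECOND = SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH


def duration_seconds(file_size: int) -> float:
    """Duration implied by the file size alone, ignoring the header's size fields."""
    return (file_size - HEADER_BYTES) / BYTES_PER_SECOND


def extract_segment(src, dst, start_s: int, length_s: int) -> None:
    """Copy length_s seconds of raw PCM starting at start_s into a new WAV.

    src is a binary file object opened on the oversized recording;
    dst is a file name or writable, seekable binary file object.
    """
    with wave.open(dst, "wb") as out:
        out.setnchannels(CHANNELS)
        out.setsampwidth(SAMPLE_WIDTH)
        out.setframerate(SAMPLE_RATE)
        src.seek(HEADER_BYTES + start_s * BYTES_PER_SECOND)
        remaining = length_s * BYTES_PER_SECOND
        while remaining > 0:
            block = src.read(min(remaining, 1 << 20))  # copy in 1 MiB blocks
            if not block:
                break  # reached the end of the source file
            out.writeframesraw(block)
            remaining -= len(block)
```

For example, opening the recording with open(path, "rb") and calling extract_segment(f, "call.wav", 71 * 3600, 3 * 3600) would copy three hours starting at the 71-hour mark (call.wav is a placeholder name). duration_seconds applied to the size of an 11.2 GiB file gives roughly 209 hours, in line with what mediainfo reports.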
I have this data:
Bit speed: 276 kilobytes/seconds
File size: 6.17 MB
Channels: 2
Layer: 3
Frequency: 44100 HZ
How can I retrieve the audio duration in seconds or milliseconds?
You can't. To get the duration you need the sampling rate in samples per second, but also the number of channels (mono, stereo, etc.) and the sample size in bytes (usually 1 to 3). And unless it is raw audio, there is also additional data that takes up some space. The 276 kbps figure does not help here. If it is an MP3, the file is compressed, and you simply can't work out the duration just by looking at the file size.