WAV file becomes too large when converting from m4a file - audio

I have a 1.6 MB m4a file which is only 1 min 43 s long. When I converted it to a WAV file, it became a 19.8 MB file.
Why did the file become so large?
m4a:
https://drive.google.com/file/d/1Mc3SmBWZEFW9ZZoCFEz8Xxwy2P-oQs2t/view
wav: https://drive.google.com/file/d/1XFRD5f51BxBtQIBaiYNpTZZifCQLw3Km/view

Simple answer: you are comparing a zipped and an unzipped version of the same media.
wav: PCM data, which is uncompressed (like an unzipped file).
m4a: AAC data, which is compressed (like a zip file).
Advanced Audio Coding (AAC) is an audio coding standard for lossy digital audio compression.

WAV is a raw uncompressed file format. M4A is a compressed format. So if you want to keep quality, the WAV file will have to be bigger.
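The arithmetic roughly matches the sizes in the question. A minimal sketch, assuming 48 kHz stereo 16-bit PCM (an assumption; the actual WAV header would confirm these parameters):

```python
# Rough size check for the numbers in the question: a 1 min 43 s clip,
# assuming 48 kHz stereo 16-bit PCM (assumed parameters, not read from
# the actual file).
duration_s = 103            # 1 min 43 s
sample_rate = 48_000        # samples per second (assumed)
channels = 2                # stereo (assumed)
bytes_per_sample = 2        # 16-bit PCM

wav_bytes = duration_s * sample_rate * channels * bytes_per_sample
print(wav_bytes)                         # 19776000 bytes, i.e. ~19.8 MB
print(round(wav_bytes / 1_600_000, 1))   # ~12.4x larger than the 1.6 MB m4a
```

So the ~19.8 MB WAV is simply the full uncompressed PCM stream; AAC gets the same audio into 1.6 MB by lossy compression.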


What is the maximum number of channels we can create in an audio file with FFmpeg's amerge filter?
We have a requirement to merge multiple single-channel audio files into a single multi-channel audio file.
Each channel represents a speaker in the audio file.
I tried the amerge filter and could do it with up to 8 files. I get a blank audio file when I try it with 10 audio files, and the FFmpeg amerge command doesn't produce any error either.
Can I create a multi-channel audio file from N files? Here N may be 100+. Is it possible?
I am new to this audio API etc., so any guidance is appreciated.
Max inputs is 64. According to ffmpeg -h filter=amerge:
inputs <int> ..F.A...... specify the number of inputs (from 1 to 64) (default 2)
Or look at the source code at libavfilter/af_amerge.c and refer to SWR_CH_MAX.
Can I create N no. of multi-channel audio files with N no. of files? Here N may be 100+? Is it possible?
Chain multiple amerge filters, with a max of 64 inputs per filter. Or use the amix filter, which has a max of 32767 inputs.
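The chaining idea can be sketched as a small helper that builds the -filter_complex string, grouping at most 64 inputs per amerge stage. The helper and its label names are hypothetical; only the amerge=inputs=N syntax and the 64-input cap come from ffmpeg itself:

```python
# Hypothetical sketch: build an ffmpeg -filter_complex string that merges
# N mono inputs by chaining amerge filters, at most 64 inputs per stage
# (the cap reported by `ffmpeg -h filter=amerge`).
def build_amerge_graph(n_inputs, max_per_stage=64):
    inputs = [f"[{i}:a]" for i in range(n_inputs)]
    stages = []
    stage_id = 0
    while len(inputs) > 1:
        merged = []
        for start in range(0, len(inputs), max_per_stage):
            group = inputs[start:start + max_per_stage]
            if len(group) == 1:          # lone leftover passes through
                merged.append(group[0])
                continue
            out = f"[m{stage_id}]"
            stage_id += 1
            stages.append(f"{''.join(group)}amerge=inputs={len(group)}{out}")
            merged.append(out)
        inputs = merged
    return ";".join(stages)

# 100 inputs -> one 64-input stage, one 36-input stage, then a final merge.
print(build_amerge_graph(100))
```

The resulting string would be passed to ffmpeg via -filter_complex, mapping the final label (here [m2]) to the output. Note that very high channel counts may still run into channel-layout limits in the encoder you choose.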

How to maintain normalization when converting normalized WAV files to mp3?

I have a script that uses sox to first normalize a bunch of WAV files. It then takes the normalized WAV files and converts them to MP3. I use the max amplitude stat to check how 'normalized' the files are. The max amplitude stats of the normalized files are within the same range, but when I look at the max amplitude stats of the MP3 files, they no longer stay within that same close range. How can I maintain normalization when converting from WAV to MP3?
The command I use to normalize the files:
sox file.wav --norm=-1 norm.wav
The command I use to convert the files to mp3:
sox norm.wav -c 1 newFile.mp3

Does the unzip algorithm run over the whole compressed data, or over a certain number of bytes?

I have to unzip a file which is being downloaded from a server. I have gone through the zip file structure. What I want to understand is: from how many input bytes is the compressed data constructed? Does the compression algorithm run over the whole file and generate its output, or does it run on, let's say, 256 bytes, output the result, and then move on to the next 256 bytes?
Similarly, do I need to download the whole file before running the decompression algorithm, or can I download 256 bytes (for example) and run the algorithm on that?
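For what it's worth, DEFLATE (the compression method used inside most zip files) does support incremental decompression, so feeding the decompressor one downloaded chunk at a time works. A minimal sketch using Python's zlib streaming interface (this shows the raw DEFLATE stream only; a real zip file also has per-entry headers you must parse and skip first):

```python
# Streaming decompression sketch: decompress in 256-byte chunks, as if
# each chunk had just been downloaded. The payload is synthetic test data.
import zlib

original = b"example payload " * 1000
compressed = zlib.compress(original)

d = zlib.decompressobj()
recovered = b""
for start in range(0, len(compressed), 256):   # pretend 256-byte downloads
    chunk = compressed[start:start + 256]
    recovered += d.decompress(chunk)           # emits output as it arrives
recovered += d.flush()

assert recovered == original
```

So you do not need the whole file before starting; you only need the bytes up to your current position in the stream, consumed in order.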

Merge multiple audio files into one file

I want to merge two audio files and produce one final file. For example, if file1 has a length of 5 minutes and file2 has a length of 4 minutes, I want the result to be a single 5-minute file, because both files will start from 0:00 and run together (i.e. overlapping).
You can use the APIs in the Windows.Media.Audio namespace to create audio graphs for audio routing, mixing, and processing scenarios. For how to create audio graphs, please refer to this article.
An audio graph is a set of interconnected audio nodes. The two audio files you want to merge supply the "audio input nodes", and an "audio output node" is the destination for the audio processed by the graph - here, a single file.
Scenario 4 of the official AudioCreation sample - Submix - provides just the feature you want. Given two files, it will output the mixed audio; just change the output node to an AudioFileOutputNode to save to a new file, since the sample creates an AudioDeviceOutputNode for playback.

splitting a flac image into tracks

This is a follow up question to Flac samples calculation.
Do I implement the offset generated by that formula from the beginning of the file or after the metadata where the stream starts (here)?
My goal is to programmatically divide the file myself - largely as a learning exercise. My thought is that I would write down my flac header and metadata blocks based on values learned from the image and then the actual track I get from the master image using my cuesheet.
Currently in my code I can parse each metadata block and end up where the frames start.
Suppose you are trying to decode starting at M:S.F = 3:45.30. There are 75 frames (CDDA sectors) per second, and obviously there are 60 seconds per minute. To convert M:S.F from your cue sheet into a sample offset value, I would first calculate the number of CDDA sectors to the desired starting point: (((60 * 3) + 45) * 75) + 30 = 16,905. Since there are 75 sectors per second, assuming the audio is sampled at 44,100 Hz there are 44,100 / 75 = 588 audio samples per sector. So the desired audio sample offset where you will start decoding is 588 * 16,905 = 9,940,140.
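The worked example above as code, assuming the CDDA constants it uses (75 sectors per second, 44,100 Hz, hence 588 samples per sector); the function name is just for illustration:

```python
# Convert a cue-sheet M:S.F position into an audio sample offset,
# assuming CDDA timing: 75 sectors/second at 44,100 Hz.
def cue_to_sample_offset(minutes, seconds, frames,
                         sectors_per_second=75, sample_rate=44_100):
    sectors = ((minutes * 60) + seconds) * sectors_per_second + frames
    samples_per_sector = sample_rate // sectors_per_second   # 588
    return sectors * samples_per_sector

print(cue_to_sample_offset(3, 45, 30))   # 9940140, matching the example
```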
The offset just calculated is an offset into the decompressed PCM samples, not into the compressed FLAC stream (nor in bytes). So for each FLAC frame, calculate the number of samples it contains and keep a running tally of your position. Skip FLAC frames until you find the one containing your starting audio sample. At this point you can start decoding the audio, throwing away any samples in the FLAC frame that you don't need.
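The frame-skipping step can be sketched like this; frame_sample_counts stands in for the per-frame sample counts you would read from real FLAC frame headers (the data below is hypothetical, using fixed 4096-sample frames):

```python
# Find which FLAC frame contains a target PCM sample, by keeping a
# running tally of decoded samples. Returns (frame_index, offset_within_frame);
# samples before that offset in the frame are decoded and discarded.
def locate_frame(frame_sample_counts, target_sample):
    position = 0                          # running tally of samples so far
    for i, n in enumerate(frame_sample_counts):
        if position + n > target_sample:  # target falls inside this frame
            return i, target_sample - position
        position += n
    raise ValueError("target sample is past the end of the stream")

# e.g. hypothetical fixed 4096-sample frames, seeking to sample 9,940,140:
frames = [4096] * 3000
print(locate_frame(frames, 9_940_140))    # (2426, 3244)
```

In a real stream the frame sizes can vary, which is exactly why the running tally (or a SEEKTABLE, as noted below) is needed rather than simple division.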
FLAC also supports a SEEKTABLE block, the use of which would greatly speed up (and alter) the process I just described. If you haven't already you can look at the implementation of the reference decoder.
