WAV to AIF conversion - difference between a recorded and a converted AIF

In audio terms there is no difference between AIF and WAV because they're both uncompressed audio. The only difference is the byte order (endianness).
My question is: can any software tell the difference between an AIF that was recorded as such and an AIF that was recorded as WAV and then converted? I've looked in a hex editor and there appears to be a difference in the chunks - the recorded AIF has more empty space in the COMM and SSND chunks, it would seem.
Is there a reason for this?
Many Thanks

"...the recorded AIF has more empty space in the COMM and SSND chunks, it would seem."
That might be a problem with the specific recorder you use.
In general there is no size difference in the uncompressed PCM data. I've tested a 10-second AAC file converted into WAVE and also into AIFF; the result is that both formats have PCM data 1572864 bytes long.
Also, please explain "more empty space in the COMM and SSND chunks", since:
COMM holds only 18 bytes of metadata (channels, frame count, sample size and an 80-bit sample rate), whereas in a WAV file there can be up to 84 bytes of metadata.
SSND is 16 bytes followed by the PCM data; in a .wav file the data chunk is 8 bytes followed by the PCM.
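If it helps to compare the two files beyond scanning them in a hex editor, below is a minimal sketch in Python (standard library only, not tested against your specific files) that walks the top-level chunks of an AIFF or WAV file and prints each chunk ID and size. Chunk IDs and sizes live in an 8-byte chunk header; sizes are big-endian in an AIFF FORM container, little-endian in a WAV RIFF container, and chunk data is padded to an even length.

import struct
import sys

def dump_chunks(path):
    with open(path, "rb") as f:
        magic = f.read(4)                    # b"FORM" for AIFF, b"RIFF" for WAV
        endian = ">" if magic == b"FORM" else "<"
        f.read(4)                            # overall container size
        print(magic, f.read(4))              # form type: b"AIFF" or b"WAVE"
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            ckid, size = struct.unpack(endian + "4sI", header)
            print(ckid, size)
            f.seek(size + (size & 1), 1)     # skip data; chunks are even-padded

if __name__ == "__main__":
    dump_chunks(sys.argv[1])

Running it on the recorded AIFF and on the converted one shows exactly which chunks differ and by how many bytes, which is easier to reason about than judging "empty space" by eye.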

Related

Incorrect determination of the number of channels when decoding Opus

I am sending audio chunks to the Opus decoding function in C++. I was sending chunks of 1024 bytes, but in that case the decoder determined the number of channels incorrectly and, it seems, decoded incorrectly. For some audio you need to send pieces of ~6000 bytes, for others ~2000 bytes; I found these values by trial and error. What could be the reason for this behavior?
I have read the Opus documentation. The codec header can't be that big.
Here is how I think it works: the decoder reads the header of the first chunk and returns the number of channels, but for different audio it needs many more bytes before it can determine the number of channels.
The number of bytes needed to initially decode Ogg Opus files most likely depends on the size of the Ogg pages (up to 64KB). During past decoding tests, I noticed that Ogg files with lower bitrates and frame sizes generated smaller pages, and thus decoded sooner with fewer bytes. It's worth inspecting your Ogg Opus files with opusinfo (see opus-tools) to find the correlation.
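To make the page-size dependence concrete, here is a rough sketch (my own illustration, assuming you are feeding raw Ogg Opus bytes, not code from the question) of how the total size of one Ogg page follows from its 27-byte header and segment table. Until a whole page has arrived there is nothing to hand to the decoder, which is why the number of bytes you need varies per file instead of matching a fixed codec-header size.

def ogg_page_size(buf, offset=0):
    """Return the total byte size of the Ogg page starting at `offset`,
    or None if more data is needed to determine it."""
    if len(buf) < offset + 27:               # fixed page header is 27 bytes
        return None
    if buf[offset:offset + 4] != b"OggS":
        raise ValueError("not positioned at an Ogg page boundary")
    n_segments = buf[offset + 26]            # last byte of the fixed header
    if len(buf) < offset + 27 + n_segments:
        return None
    lacing = buf[offset + 27:offset + 27 + n_segments]
    return 27 + n_segments + sum(lacing)     # header + segment table + body

With up to 255 segments of up to 255 bytes each, a single page can approach 64 KB, which is consistent with the chunk sizes you had to find by hand. The channel count itself is carried in the OpusHead packet on the very first page, so the decoder cannot report it until that whole page is available.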

Batch amplification of PCM audio using sox

I have a large number of .PCM files (248 total) that are all encoded as:
Encoding: Signed 16-bit uncompressed PCM
Byte order: Little-endian
Channels: 2 channel (stereo)
Sample rate: 44100 Hz
8-byte header
I need to apply a -7.5 dB amplification (attenuation, really) to every single one of these files.
The problem I have is that all of these tracks are looped, and I need to preserve the loop data (contained in the 8-byte header).
I've yet to see a batch audio editing problem that sox couldn't handle, so I'm hoping someone on here would know how to use sox to accomplish this, or failing that, know of a program that can do this for me.
Thanks for the help!
*Edit- A bit of research got me the exact encoding of the PCM audio I need to edit:
"The audio tracks are 44.1 kilohertz, 16-bit stereo uncompressed unsigned PCM files in little-endian order, left channel first, with a simple eight-byte header. The first four bytes spell out “MSU1” in ASCII. This is followed by a 32-bit unsigned integer used as the loop point, measured in samples (a sample being four bytes) – if the repeat bit is set in the audio state register, this value is used to determine where to seek the audio track to."
*Edit2 - I've managed to develop the needed sox command, I just have no idea how to turn it into a batch. Also, it turns out the files were 16-bit signed, not unsigned, PCM.
sox -t raw -e signed -b 16 -r 44100 -c 2 -L [filename].pcm -t raw -L [filename].raw vol -7.5dB
I'm fine with either a .BAT I drag and drop files onto or a .BAT that just converts every .PCM file in the folder.
Help appreciated, because I don't even know where to start looking for this one...
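Not a .BAT, but one way to batch it (a sketch only, untested; it assumes sox is on your PATH and writes results to an "amplified" subfolder) is a small Python driver that strips the 8-byte MSU1 header, pipes the raw PCM body through the sox command from the question, and re-attaches the untouched header so the loop point survives:

import pathlib
import subprocess

HEADER_SIZE = 8                              # "MSU1" magic + 32-bit loop point

out_dir = pathlib.Path("amplified")
out_dir.mkdir(exist_ok=True)

for src in pathlib.Path(".").glob("*.pcm"):  # adjust the pattern if your files use .PCM
    data = src.read_bytes()
    header, body = data[:HEADER_SIZE], data[HEADER_SIZE:]

    # Run the sox command on the headerless PCM body, with "-" so sox
    # reads raw audio from stdin and writes raw audio to stdout.
    result = subprocess.run(
        ["sox", "-t", "raw", "-e", "signed", "-b", "16", "-r", "44100",
         "-c", "2", "-L", "-",
         "-t", "raw", "-L", "-",
         "vol", "-7.5dB"],
        input=body, stdout=subprocess.PIPE, check=True)

    # Re-attach the original header so the loop point is preserved.
    (out_dir / src.name).write_bytes(header + result.stdout)

Splitting the header off first matters: run on the whole file as raw audio, sox would treat those 8 header bytes as two stereo samples and scale them along with everything else.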

Can we get a relation between time in seconds and bytes of an audio file?

I want the relation between time and bytes in an Ogg file. If I have a 5-second Ogg file that is 68*1024 bytes long and I cut a chunk out of it and save it, can I know the size of that chunk beforehand? For example, I want to cut from 2.4 s to 3.2 s.
Is there a mathematical calculation that gives an accurate answer in bytes? Can anyone tell me please if this is possible?
Bit rate: 128 kbps, 16-bit, sample rate: 44.1 kHz, stereo
I tried the logic from a linked example but can't get an accurate answer.
Any such direct mapping between file size and play time only works if the codec uses constant bit rate encoding; it breaks down with variable bit rate (VBR). With VBR, how well the audio compresses depends on the informational density of the source: repetitive audio compresses more efficiently than, say, random noise. VBR is typically more efficient, because to maintain a constant bit rate a CBR encoder pads its output with filler data just to keep the throughput at a constant number of bytes per second.
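As a concrete illustration of the constant-bit-rate case (a back-of-the-envelope sketch using the question's own figures; it will not be byte-exact for a VBR Ogg file):

BITRATE_BPS = 128_000                        # 128 kbps, constant bit rate assumed

def byte_offset(seconds, bitrate_bps=BITRATE_BPS):
    return int(seconds * bitrate_bps / 8)    # bits per second -> bytes

start = byte_offset(2.4)                     # 38400 bytes
end = byte_offset(3.2)                       # 51200 bytes
print(start, end, end - start)               # the 0.8 s chunk is about 12800 bytes

For a VBR file, or to account for the Ogg container overhead, you would instead walk the Ogg pages and use their granule positions to locate the byte range for a given time.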

FLAC codec, 2 files, same duration, but different file sizes

So I have 2 audio FLAC files converted from mp4 files. Both are 31 seconds long, but one is 1 MB and the other comes out to be 4 MB. I am using ffmpeg with an 8000 Hz sample rate in exactly the same manner for both. Can anyone explain why this could be happening?
Is there any particular way in which the mp4 source file has to be encoded? Or any other pointers, please?
Thanks already,
asmi
FLAC files are compressed using lossless compression, so the output file size depends on how well that compression works on a particular file. So even for inputs with the same duration you would expect the output size to vary.
It is only if you were producing uncompressed output (such as a WAV file) that you would expect the sizes to be the same.
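A quick sanity check of that point (assuming 16-bit samples and the 8000 Hz rate mentioned in the question; your files' real channel count and bit depth may differ, which ffprobe can confirm):

SAMPLE_RATE = 8000            # Hz, as used in the ffmpeg conversion
BYTES_PER_SAMPLE = 2          # 16-bit samples assumed
DURATION = 31                 # seconds

for channels in (1, 2):
    size = SAMPLE_RATE * BYTES_PER_SAMPLE * channels * DURATION
    print(f"{channels} channel(s): {size} bytes uncompressed")

Both 31-second clips decode to the same uncompressed size for a given channel count and bit depth; only the losslessly compressed FLAC sizes vary with how predictable the audio content is.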

A-law/raw audio data

I have spent the evening messing around with raw A-law audio input/output from the built-in ALSA tools aplay and arecord, and passing it through an offline moving-average filter I have written.
My question is: the audio seems to be encoded using values between 0x2A and 0xAA - a range of 128. I have been reading through this guide, which is informative but doesn't really explain why an offset of 42 (0x2A) has been chosen. The file I used to examine this was a square wave exported from Audacity as unsigned 8-bit 8 kHz audio and examined in a hex editor.
Can anyone shed some light on how A-law is encoded in a file?
This may help;
/dev/dsp
8000 frames per second, 8 bits per frame (1 byte);
# Max volume = \xff (or \x00).
# No volume = \x80 (the middle).
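To shed some light on the actual A-law question: G.711 A-law packs a sign bit, a 3-bit exponent and a 4-bit mantissa into each byte and then inverts the even bits (XOR with 0x55). Below is a minimal sketch of that encoder (a textbook G.711 implementation for illustration, not the exact code ALSA uses); note that a full-scale square wave encodes to exactly the 0x2A and 0xAA extremes observed in the question, and digital silence becomes 0xD5 rather than 0x80 or 0x00.

def linear_to_alaw(pcm16):
    """Encode one signed 16-bit PCM sample as a G.711 A-law byte."""
    sign = 0x80 if pcm16 >= 0 else 0x00      # A-law sign bit: 1 = non-negative
    mag = pcm16 if pcm16 >= 0 else ~pcm16    # magnitude of the sample
    mag >>= 3                                # A-law keeps 12 magnitude bits

    if mag < 32:                             # segment 0: linear region
        code = mag >> 1
    else:                                    # segments 1..7: logarithmic regions
        exponent = mag.bit_length() - 5
        mantissa = (mag >> exponent) & 0x0F
        code = (exponent << 4) | mantissa

    return (sign | code) ^ 0x55              # even-bit inversion per G.711

print(hex(linear_to_alaw(32767)))            # 0xaa  (positive full scale)
print(hex(linear_to_alaw(-32768)))           # 0x2a  (negative full scale)
print(hex(linear_to_alaw(0)))                # 0xd5  (silence)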
