What is the difference between these 2 wav files? - audio

I am trying to use a program called arss to create a spectrogram from a wav file. I have 2 wav files, one works and the other does not (it was converted to wav from mp3).
The error that arss throws at me is:
This WAVE file is not currently supported.
Which is fine, but I have no idea what parts of my wav file to change so that it will be supported. The docs don't help here (as far as I can tell)
When I run mediainfo on both wav files, I get the following specs:
working wav:
General
Complete name : working.wav
Format : Wave
File size : 1.15 MiB
Duration : 6 s 306 ms
Overall bit rate mode : Constant
Overall bit rate : 1 536 kb/s
Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 6 s 306 ms
Bit rate mode : Constant
Bit rate : 1 536 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Bit depth : 16 bits
Stream size : 1.15 MiB (100%)
not working wav:
General
Complete name : not_working.wav
Format : Wave
File size : 5.49 MiB
Duration : 30 s 0 ms
Overall bit rate mode : Constant
Overall bit rate : 1 536 kb/s
Writing application : Lavf57.83.100
Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 30 s 0 ms
Bit rate mode : Constant
Bit rate : 1 536 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Bit depth : 16 bits
Stream size : 5.49 MiB (100%)
Comparing the audio specs of both files, I can't tell any difference between anything other than the file size and duration. I even updated the Sampling rate of the non-working wav using ffmpeg so that it would match the working one at 48.0kHz, but no luck.
Any idea?
Both wav files are available here.

FFmpeg, by default, writes a LIST chunk, with some metadata, before the data chunk. ARSS has a rigid parser and expects the data chunk to start at a fixed byte offset (0x24). FFmpeg can be told to skip writing the LIST chunk using the bitexact option.
ffmpeg -i not_working.wav -c copy -bitexact new.wav
Note that ARSS doesn't check for sampling rate, only that WAVs have little endian PCM.
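To check where the data chunk actually sits in a given file, a small chunk walker is enough. The following is only a sketch: `list_riff_chunks` and `make_wav` are hypothetical helper names, and the two demo files are built in memory rather than taken from the question. In a file with a plain 16-byte fmt chunk and nothing else in front, the data chunk header lands at 0x24; an extra LIST chunk pushes it further back.

```python
import struct

def list_riff_chunks(data: bytes):
    """Return (chunk_id, header_offset, payload_size) for each top-level chunk."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12  # first chunk header follows the 12-byte RIFF/WAVE preamble
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), pos, size))
        pos += 8 + size + (size & 1)  # payloads are padded to an even length
    return chunks

def make_wav(extra_chunk: bytes = b"") -> bytes:
    """Build a tiny 16-bit stereo 48 kHz WAV, optionally with a chunk before 'data'."""
    fmt = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, 2, 48000, 48000 * 4, 4, 16)
    payload = b"\x00" * 8
    body = (b"WAVE" + fmt + extra_chunk
            + struct.pack("<4sI", b"data", len(payload)) + payload)
    return struct.pack("<4sI", b"RIFF", len(body)) + body

plain = make_wav()
with_list = make_wav(struct.pack("<4sI", b"LIST", 4) + b"INFO")

for name, blob in (("plain", plain), ("with LIST", with_list)):
    for chunk_id, offset, size in list_riff_chunks(blob):
        print(f"{name}: {chunk_id!r} at 0x{offset:02x}, {size} bytes")
```

For a real file, something like `list_riff_chunks(open("not_working.wav", "rb").read())` should show whether anything sits between the fmt and data chunks.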
Here's a related Q, not quite a duplicate, linked for future readers:
ffmpeg - Making a Clean WAV file

I have an older version of ffmpeg and the -bitexact option alone did not help much. That is, it did remove some of the data from the chunk, but the LIST chunk was still present with other data.
You may have to also ask for no metadata at all with the -map_metadata option like so:
ffmpeg ... -i <input> ... -flags +bitexact -map_metadata -1 ... <output>
(Each ... stands for any other command-line options required in your case.)
Adding -map_metadata -1 really removed everything, and the LIST chunk is now fully gone.
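If ffmpeg is not an option, the same cleanup can be approximated by hand. This is a different technique from the ffmpeg flags above, shown only as a sketch: `strip_extra_chunks` is a hypothetical helper that keeps the fmt and data chunks, drops everything else, and rewrites the RIFF size. It assumes a small, well-formed file that fits in memory.

```python
import struct

def strip_extra_chunks(data: bytes, keep=(b"fmt ", b"data")) -> bytes:
    """Return a copy of a RIFF/WAVE file containing only the chunks in `keep`."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    kept = b""
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        end = pos + 8 + size + (size & 1)  # include the pad byte, if any
        if chunk_id in keep:
            kept += data[pos:end]
        pos = end
    body = b"WAVE" + kept
    return struct.pack("<4sI", b"RIFF", len(body)) + body

# Tiny in-memory demo file: fmt, then LIST, then data.
fmt = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, 2, 48000, 192000, 4, 16)
lst = struct.pack("<4sI", b"LIST", 4) + b"INFO"
dat = struct.pack("<4sI", b"data", 8) + b"\x00" * 8
body = b"WAVE" + fmt + lst + dat
original = struct.pack("<4sI", b"RIFF", len(body)) + body

cleaned = strip_extra_chunks(original)
print("data header offset before:", original.find(b"data"))
print("data header offset after: ", cleaned.find(b"data"))
```

In the demo the data chunk header moves back to offset 36 (0x24) once the LIST chunk is dropped.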

Related

Sox audio concatenation file length wrong

I am trying to concatenate multiple audio files using Sox. Each file is very hi-res: 4ch, PCM, 256k sampling (yes, not a typo), 24 bit. Each file is approx 2 mins long. I can concatenate up to 9 files successfully with: sox file1.wav file2.wav file3.wav outfile.wav.
After 9 files I have the following sox summary which is correct:
Channels : 4
Sample Rate : 256000
Precision : 24-bit
Duration : 00:21:33.89 = 331236000 samples ~ 97041.8 CDDA sectors
File Size : 3.97G
Bit Rate : 24.6M
Sample Encoding: 24-bit Signed Integer PCM
When I add a 10th ~2 minute file I get:
Channels : 4
Sample Rate : 256000
Precision : 24-bit
Duration : 00:00:25.18 = 6445658 samples ~ 1888.38 CDDA sectors
File Size : 4.37G
Bit Rate : 1.39G
Sample Encoding: 24-bit Signed Integer PCM
You'll note here that we went from a length of 00:21:33.89 to a length of 00:00:25.18 with a corresponding drop in samples. Expected result would be a file of ~00:23:xx.xx with the 2 minutes added. The actual file size grew from 3.97GB to 4.37GB so the data is there, it appears to be a problem in the header.
Does anyone know of an upper limit in sox that we might be meeting?
Alternatively does anyone know how I might fix the file post facto? I tried sox --ignore-length infile.wav outfile.wav but the output file was identical.
Thanks
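A quick consistency check on the two summaries above: RIFF/WAV headers store chunk sizes as unsigned 32-bit integers, so they cannot describe 4 GiB or more of audio data. The sketch below only tests whether the reported sample count fits a size field that wrapped around once. It is arithmetic on the figures quoted in the question, not a confirmed diagnosis, and the helper names are made up for illustration.

```python
CHANNELS = 4
BYTES_PER_SAMPLE = 3            # 24-bit PCM
SAMPLE_RATE = 256_000
BYTES_PER_FRAME = CHANNELS * BYTES_PER_SAMPLE

def frames_if_size_wrapped(reported_frames: int, wraps: int = 1) -> int:
    """Frame count implied if the 32-bit byte count overflowed `wraps` times."""
    reported_bytes = reported_frames * BYTES_PER_FRAME
    return (reported_bytes + wraps * 2**32) // BYTES_PER_FRAME

def as_clock(frames: int) -> str:
    """Format a frame count as MM:SS.ss at SAMPLE_RATE."""
    minutes, seconds = divmod(frames / SAMPLE_RATE, 60)
    return f"{int(minutes):02d}:{seconds:05.2f}"

nine_files = 331_236_000        # samples reported after 9 files
ten_reported = 6_445_658        # samples reported after 10 files
ten_implied = frames_if_size_wrapped(ten_reported)

print("9 files fit in 32 bits:", nine_files * BYTES_PER_FRAME < 2**32)
print("10 files as reported:  ", as_clock(ten_reported))
print("10 files if wrapped:   ", as_clock(ten_implied))
```

The implied length comes out a little under 24 minutes, which is in line with the expected ~00:23:xx.xx.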

Video is 25fps, audio 50 fps?

In my country we always use 25 fps (PAL) for video, and for audio.
Yesterday I recorded a TV movie with vdr (MPEG-TS format), and mediainfo reports the following for audio and video.
The audio is MP2, the video H.264.
Audio
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 2
Codec ID : 4
Duration : 3 h 58 min
Bit rate mode : Constant
Bit rate : 128 kb/s
Channel(s) : 2 channels
Sampling rate : 48.0 kHz
Frame rate : 41.667 FPS (1152 SPF)
Compression mode : Lossy
Delay relative to video : -406 ms
Stream size : 219 MiB (6%)
Video
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L3
Format settings : CABAC / 4 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 4 frames
Format settings, picture structure : Frame
Codec ID : 27
Duration : 3 h 58 min
Bit rate : 1 915 kb/s
Width : 720 pixels
Height : 576 pixels
Display aspect ratio : 16:9
Frame rate : 25.000 FPS
How is it possible that audio and video are in sync with a frame rate of about 50 fps on the audio?
If I want to recode it, do I have to recode the audio at 25 fps?
TLDR: You don't have to worry about it. Two different meanings for "frames per second".
MPEG audio (MP2 in your recording; MP3 works the same way) is an interesting format. It doesn't have a global header that represents the entire file. Instead, it is a concatenation of small individual units called "frames". Each frame is a few milliseconds in length. That's why you can often just chop an MP3 file in half and the second half plays just fine. It's also what enables VBR MP3 to exist. The sample rate or encoding parameters can change at any point in the file.
So your particular audio stream has a "frame rate" of 41.667 frames per second. Now notice the SPF value of 1152 in parentheses. That's "samples per frame". If you do the math, 1152 samples/frame * 41.667 frames/second is almost exactly 48000 samples per second, identical to the sampling rate reported by the mediainfo tool.
When a media player plays a video file, it basically renders the video stream separately from the audio stream, so it needs very little effort to keep the two different frame rates in sync.
As to your question about recoding: the encoding tool you use will do the right thing. The audio frame rate is completely orthogonal to the video FPS.
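The arithmetic above can be checked in a few lines. This is just a sketch using the values from the mediainfo output; the variable names are my own.

```python
SAMPLE_RATE_HZ = 48_000      # "Sampling rate : 48.0 kHz"
SAMPLES_PER_FRAME = 1152     # "1152 SPF"
VIDEO_FPS = 25               # "Frame rate : 25.000 FPS"

# One audio frame carries 1152 samples, so the audio "frame rate" is
# simply how many such blocks are needed per second of sound.
audio_frames_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME
audio_frame_ms = 1000 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ
video_frame_ms = 1000 / VIDEO_FPS

print(f"audio: {audio_frames_per_second:.3f} frames/s, {audio_frame_ms} ms per frame")
print(f"video: {VIDEO_FPS} frames/s, {video_frame_ms} ms per frame")
```

Both streams cover the same wall-clock time; they just slice it into blocks of different length.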

aplay quits before entire song is read

Problem statement: I do not receive the entire song in the output when dumped as a file (I can't hear the song through the jack but I could dump the file contents).
Background: I am new to ALSA programming and I have an embedded board with a limited set of commands. I have gone through the links here: ALSA tutorial required, but I couldn't figure out these timing-related issues.
Setup:
OS: linux 4.14.70
aplay: version 1.1.4 by Jaroslav Kysela <perex@perex.cz>
Advanced Linux Sound Architecture Driver Version k4.14.70.
The audio box involved has a separate hardware and a separate DSP for stand alone processing
Flow of information: Linux -> DSP core
The input song is communicated to the linux core, loading the song into DMA area -> Read DMA into separate DMA ring buffer area used by the DSP and write it to I2S output path into a file
I can see that the size of the song is 960000 bytes, with a sample rate of 48000, S16_LE format, 2 channels and a 16-bit depth, which works out as shown below, as per the page "https://www.colincrawley.com/audio-duration-calculator/"
Bit Rate: 1536 kbps
Duration:
0 Hours : 0 Minutes : 5 Seconds . 34 Milliseconds
According to my logs, my DSP core processes the song for only approx. 1 second before the "aplay" application sends an ioctl call to close the audio interface on the Linux side.
My questions are:
How does aplay keep track of time? For a 5-second song, how can we be sure that it has actually run for 5 seconds?
Is there a way to make it wait until the entire song has been transmitted to the DSP core for processing before the close ioctl is issued?
Some more info about the input file I am feeding:
stream : PLAYBACK
access : RW_INTERLEAVED
format : S16_LE
subformat : STD
channels : 2
rate : 48000
exact rate : 48000 (48000/1)
msbits : 16
buffer_size : 24000
period_size : 6000
period_time : 125000
tstamp_mode : NONE
tstamp_type : MONOTONIC
period_step : 1
avail_min : 6000
period_event : 0
start_threshold : 24000
stop_threshold : 24000
silence_threshold: 0
silence_size : 0
boundary : 6755399441055744000
appl_ptr : 0
hw_ptr : 0
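For reference, the period and buffer figures in the dump above translate into time as follows. This is only a sketch of the arithmetic, using the values from the dump and the 960000-byte payload mentioned earlier; the variable names are mine.

```python
RATE_HZ = 48_000
BYTES_PER_FRAME = 2 * 2      # 2 channels x 16-bit (S16_LE)
PERIOD_FRAMES = 6_000        # period_size
BUFFER_FRAMES = 24_000       # buffer_size
SONG_BYTES = 960_000         # audio payload, without the WAV header

period_us = PERIOD_FRAMES * 1_000_000 // RATE_HZ   # should match period_time
buffer_ms = BUFFER_FRAMES * 1000 // RATE_HZ
song_frames = SONG_BYTES // BYTES_PER_FRAME
song_periods = song_frames / PERIOD_FRAMES
song_seconds = song_frames / RATE_HZ

print(f"one period  = {period_us} us")
print(f"full buffer = {buffer_ms} ms ({BUFFER_FRAMES // PERIOD_FRAMES} periods)")
print(f"whole song  = {song_frames} frames = {song_periods} periods = {song_seconds} s")
```

So at the nominal rate the hardware should consume one period every 125 ms, and the whole song should take 40 periods.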
I would be happy to provide more information to help understand why the aplay application closes the song early. But please be aware that it's a closed-source project.
Command I use:
aplay input.wav -c 2 -r 48000 -t wav
input size: 960044 bytes (including wav header)
output size: 306 KB observed before IOCTL call to close the audio interface occurs.
For a input.wav file of 960044 file size i.e., 938 KB,
time aplay input.wav returns:
real 0m0.988s
user 0m0.012s
sys 0m0.080s
To find the duration of wav file:
fileLength/(sampleRate*channel*bits per sample/8) = 960000/((48000 * 2 * 16)/8) = 5 seconds
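The same calculation as a small sketch, cross-checked against what Python's standard `wave` module reads back from a header. The file here is silence generated in memory with the same format as input.wav (an assumption for illustration, not the real file), and `pcm_duration_seconds` is a hypothetical helper name.

```python
import io
import wave

def pcm_duration_seconds(payload_bytes, sample_rate, channels, bits_per_sample):
    """Duration of uncompressed PCM audio from its payload size."""
    bytes_per_second = sample_rate * channels * bits_per_sample // 8
    return payload_bytes / bytes_per_second

# Build a stand-in for input.wav: 960000 bytes of 16-bit stereo 48 kHz silence.
buf = io.BytesIO()
with wave.open(buf, "wb") as writer:
    writer.setnchannels(2)
    writer.setsampwidth(2)       # bytes per sample
    writer.setframerate(48000)
    writer.writeframes(b"\x00" * 960_000)

total_size = len(buf.getvalue())
buf.seek(0)
with wave.open(buf, "rb") as reader:
    header_duration = reader.getnframes() / reader.getframerate()

print("file size incl. header:", total_size)
print("duration from formula: ", pcm_duration_seconds(960_000, 48000, 2, 16))
print("duration from header:  ", header_duration)
```

Reading the frame count from the header avoids having to assume a fixed header size when the file carries extra chunks.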
If I run the same song on my Ubuntu machine, it is as expected:
real 0m5.452s
user 0m0.025s
sys 0m0.029s
Any hints on why this could occur? As seen above, the aplay application quits after 0.98 seconds, but the song should play for 5 seconds.
Looks like there was some problem with the custom audio hardware that was upsetting the timing while processing the song. It now seems to be fixed. Basically, the sound hardware has to insert sufficient delays, at least as required by the AXI protocol, as the bytes are being read, but in this case it did not. It's now resolved. Thanks for the attention.

WAV File missing Audio and Recordings

We use FreePBX to record a conference line. This Line appears not to have disconnected and it created a continuous WAV file for 209 hours.
[matt@ait-debian ~/SLP ]$ mediainfo 7000-7000-always-20170823-162901-1503469728.35757-1503469748.wav
General
Complete name : 7000-7000-always-20170823-162901-1503469728.35757-1503469748.wav
Format : Wave
File size : 11.2 GiB
Duration : 209 h
Overall bit rate mode : Constant
Overall bit rate : 128 kb/s
Audio
Format : PCM
Format settings, Endianness : Little
Format settings, Sign : Signed
Codec ID : 1
Duration : 209 h
Bit rate mode : Constant
Bit rate : 128 kb/s
Channel(s) : 1 channel
Sampling rate : 8 000 Hz
Bit depth : 16 bits
Stream size : 11.2 GiB (100%)
But when I check with sox (Sound Exchange) it shows only 60hours worth of audio. VLC shows the same when listening to the file.
[matt@ait-debian ~/SLP ]$ soxi 7000-7000-always-20170823-162901-1503469728.35757-1503469748.wav
Input File : '7000-7000-always-20170823-162901-1503469728.35757-1503469748.wav'
Channels : 1
Sample Rate : 8000
Precision : 16-bit
Duration : 60:22:32.63 = 1738821024 samples ~ 1.63014e+07 CDDA sectors
File Size : 12.1G
Bit Rate : 444k
Sample Encoding: 16-bit Signed Integer PCM
The issue is that some time after the 60 hours, at about the 72-hour mark, another conference call was made that I need the recording for.
I would have thought that the conference line continued to record, so it should have captured this audio.
The problem is that VLC and SoX don't see it, but mediainfo says there are 209 hours' worth. So which is correct? I would think that VLC and SoX should also show a duration of 209 h.
Can anyone help or advise what happened?
I also posted this to reddit - https://www.reddit.com/r/linuxquestions/comments/6xbz8u/wav_file_missing_audio/
And was able to get an answer:
Using Audacity
Import the WAV file as RAW Data
Set Encoding to Signed 16-bit PCM
Set the Start Offset to 44 bytes
Set the Sample Rate 8000 Hz
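The same idea can be scripted when only one stretch of the recording is needed: treat everything after the 44-byte header as raw 8000 Hz, 16-bit mono PCM and seek straight to the byte offset of the time you want. This is a minimal sketch under those assumptions; `byte_range` and `extract_clip` are hypothetical helpers, and the demo runs on a small in-memory stand-in rather than the real 11 GiB file.

```python
import io
import wave

HEADER_BYTES = 44        # start offset used in the Audacity import
SAMPLE_RATE = 8000
BYTES_PER_FRAME = 2      # mono, signed 16-bit

def byte_range(start_s: float, end_s: float):
    """Byte offsets in the source file for a time interval in seconds."""
    start = HEADER_BYTES + int(start_s * SAMPLE_RATE) * BYTES_PER_FRAME
    end = HEADER_BYTES + int(end_s * SAMPLE_RATE) * BYTES_PER_FRAME
    return start, end

def extract_clip(src, dst, start_s: float, end_s: float):
    """Copy raw PCM between two times from `src` into a new WAV written to `dst`."""
    start, end = byte_range(start_s, end_s)
    src.seek(start)
    pcm = src.read(end - start)   # fine for short clips; read in blocks for long ones
    with wave.open(dst, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(BYTES_PER_FRAME)
        out.setframerate(SAMPLE_RATE)
        out.writeframes(pcm)

# Demo on a stand-in: a 44-byte dummy header followed by 3 seconds of silence.
source = io.BytesIO(b"\x00" * HEADER_BYTES + b"\x00" * (3 * SAMPLE_RATE * BYTES_PER_FRAME))
clip = io.BytesIO()
extract_clip(source, clip, 1.0, 2.0)

clip.seek(0)
with wave.open(clip, "rb") as check:
    clip_frames = check.getnframes()

print("clip frames:", clip_frames)
print("byte offset of the 72-hour mark:", byte_range(72 * 3600, 72 * 3600)[0])
```

With a real file, `src` would be the recording opened with `open(path, "rb")` and `dst` a new output path.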

Sound issues with RaspberryPi 2 and OSMC (Kodi)

I have sound issues with some video files I want to play from an external HDD with my Raspberry Pi 2 and OSMC. Instead of the video sound I hear a very loud chattering, rustling sound, kind of like a machine gun in an old video game. The error only occurs with certain video files; the others work fine.
As audio output, I use HDMI as the Raspberry is connected to my projector. This is then connected to my speakers.
As I thought it might be a codec problem, I already bought and activated the MPEG-II and VC1 codecs, but this did not help.
Here some information on one of the malfunctioning files:
General
Complete name : hds-hp1-rmx.mkv
Format : Matroska
Format version : Version 2
File size : 24.1 GiB
Duration : 2h 38mn
Overall bit rate mode : Variable
Overall bit rate : 21.7 Mbps
Encoded date : UTC 2011-04-16 09:29:16
Writing application : mkvmerge v4.6.0 ('Still Crazy After All These Years') gebaut am Mar 10 2011 02:50:32
Writing library : libebml v1.2.0 + libmatroska v1.1.0
Video
ID : 1
Format : VC-1
Format profile : Advanced@L3
Codec ID : V_MS/VFW/FOURCC / WVC1
Codec ID/Hint : Microsoft
Duration : 2h 38mn
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 23.976 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Compression mode : Lossy
Default : No
Forced : No
Audio #1
ID : 2
Format : AC-3
Format/Info : Audio Coding 3
Mode extension : CM (complete main)
Format settings, Endianness : Big
Codec ID : A_AC3
Duration : 2h 38mn
Bit rate mode : Constant
Bit rate : 448 Kbps
Channel(s) : 6 channels
Channel positions : Front: L C R, Side: L R, LFE
Sampling rate : 48.0 KHz
Bit depth : 16 bits
Compression mode : Lossy
Stream size : 509 MiB (2%)
Language : German
Default : Yes
Forced : Yes
Audio #2
ID : 3
Format : DTS
Format/Info : Digital Theater Systems
Format profile : MA / Core
Mode : 16
Format settings, Endianness : Big
Codec ID : A_DTS
Duration : 2h 38mn
Bit rate mode : Variable
Bit rate : Unknown / 1 509 Kbps
Channel(s) : 6 channels
Channel positions : Front: L C R, Side: L R, LFE
Sampling rate : 48.0 KHz
Bit depth : 16 bits
Compression mode : Lossless / Lossy
Language : English
Default : No
Forced : No
I have the same problem with files with the following codecs:
V_MPEG4/ISO/AVC (Video) & A_AC3 (Audio)
But V_MPEG4/ISO/AVC (Video) & A_DTS (Audio) work fine...
I am thankful for any hint.
Kind regards,
Mathias
Alright,
after hours of Google Search and even some scripting I found the solution.
As often, it was unbelievably simple...
The problem, in my case, was that my projector simply does not support AC3. So I only had to deactivate AC3 audio passthrough in the Kodi settings, and now it works like a charm.
I hope this helps someone other than just me, which is why I will keep this monologue of mine online.
