We need to extract the volume level for every second of a video file so that we can plot how the volume changes over the course of the video.
I'm trying to use FFmpeg with an audio filter, but I'm stuck on how to extract the volume information for every second (or frame) and then export it to a report file.
Thanks in advance.
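One possible approach, sketched here under assumptions rather than taken from the question: first decode the audio track to mono 16-bit PCM (for instance with `ffmpeg -i input.mp4 -vn -ac 1 -acodec pcm_s16le audio.wav`, where both file names are placeholders), then compute one RMS level per second in Python. The function names below are illustrative, not part of any library.

```python
import math
import sys
import wave
from array import array


def read_mono_16bit_wav(path):
    """Read a mono 16-bit PCM WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        samples = array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
        if sys.byteorder == "big":
            samples.byteswap()  # WAV data is little-endian
        return samples, wav.getframerate()


def per_second_dbfs(samples, sample_rate, full_scale=32768.0):
    """Return one RMS level in dBFS for each second of audio."""
    levels = []
    for start in range(0, len(samples), sample_rate):
        window = samples[start:start + sample_rate]
        rms = math.sqrt(sum(s * s for s in window) / len(window))
        # Digital silence has no finite dB value, so report -inf for it.
        levels.append(20 * math.log10(rms / full_scale) if rms > 0 else float("-inf"))
    return levels
```

The resulting list can be written to a CSV file or handed to any plotting tool. RMS is only one way to define "volume"; a peak or loudness-weighted measure would need a different calculation.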
Related
Is there an audio file format, where I can save all the individual chunks (recorded in javascript) while splitting them up at any point to save them to different files and have them still all playable?
Yes, this is what the WAV format does: if you save the data so that it conforms to the WAV payload format, you can play back the file you create as a WAV file even without its normal 44-byte header.
I store raw audio data in arrays that can be sent to Web Audio API's AudioBuffer. The raw audio data arrays can be manipulated as you wish.
Specifics for obtaining the raw data vary from language to language. I've not obtained raw data from within JavaScript. My experience comes from generating the data algorithmically or from reading .wav files with Java's AudioInputStream, and shipping the data to JavaScript via Thymeleaf.
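As a rough illustration of the idea above (not code from either answer), raw PCM data can be cut at any frame boundary and each piece wrapped in its own WAV header so that every file plays on its own. The sketch uses Python's standard `wave` module; the function names and default parameters are assumptions.

```python
import io
import wave


def split_pcm(pcm_bytes, frames_per_file, channels=1, sample_width=2):
    """Cut raw PCM into pieces, always on a whole-frame boundary."""
    step = frames_per_file * channels * sample_width
    return [pcm_bytes[i:i + step] for i in range(0, len(pcm_bytes), step)]


def pcm_chunk_to_wav_bytes(pcm_bytes, sample_rate=44100, channels=1, sample_width=2):
    """Wrap one raw PCM piece in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_bytes)  # header sizes are filled in on close
    return buffer.getvalue()
```

Each returned byte string can be written to its own `.wav` file. The split point must fall between frames (channels times bytes per sample), otherwise the channels or sample bytes of later pieces end up misaligned.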
I am converting .ima files, collected by an audio logger, into .wav format. It works fine, but in doing so I lose the information about the date/time at which the original .ima files were created. Is there a way of having the .wav files somehow 'timestamped' so that I could recover the date/time at which the audio was recorded?
Many thanks for any hint provided.
As commented, you can either:
Store the date/time information in the file name
For example, store files with file names in the format 2018-09-23-19-53-45.wav, or whatever time format you like.
Store the audio in Broadcast WAV format files (BWF)
Broadcast WAV is based on WAV format but allows for metadata in the file. The difference between a Broadcast WAV file and a normal WAV is the presence of the BEXT chunk, and as such the file is compatible with existing WAV players.
The BEXT chunk contains two appropriate fields called OriginationDate and OriginationTime. The layout for the chunk can be found here: BEXT Audio Metadata Information.
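A minimal sketch of building and reading such a chunk in Python follows. The field sizes reflect my understanding of the version 0 layout in EBU Tech 3285 (a 602-byte fixed part) and should be checked against the specification; the function names are hypothetical.

```python
import struct
from datetime import datetime


def _fixed(text, size):
    """Encode text as ASCII, truncated or NUL-padded to exactly `size` bytes."""
    return text.encode("ascii")[:size].ljust(size, b"\x00")


def build_bext_chunk(recorded_at, description="", originator=""):
    """Build a version 0 'bext' chunk carrying the recording date and time."""
    body = (
        _fixed(description, 256)
        + _fixed(originator, 32)
        + _fixed("", 32)                                  # OriginatorReference
        + _fixed(recorded_at.strftime("%Y-%m-%d"), 10)    # OriginationDate
        + _fixed(recorded_at.strftime("%H:%M:%S"), 8)     # OriginationTime
        + struct.pack("<IIH", 0, 0, 0)                    # TimeReference low/high, Version
        + bytes(254)                                      # reserved
    )
    return b"bext" + struct.pack("<I", len(body)) + body


def read_origination(chunk):
    """Recover the datetime from a chunk produced by build_bext_chunk."""
    body = chunk[8:]  # skip the chunk ID and size
    date = body[320:330].decode("ascii")
    time = body[330:338].decode("ascii")
    return datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S")
```

This only produces the chunk itself. Writing it into a WAV file also means placing it among the RIFF chunks and updating the RIFF size field, which is not shown here.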
Here is the background of the problem I'm trying to solve:
I have a video file (MPEG-2 encoded) sitting on some remote server.
My job is to write a program that runs face detection on this video file. The output is the collection of frames in which one or more faces are detected. The frames are saved as JPEG files.
My current thinking is like this:
Use an HTTP client to download the remote video file;
For each chunk of video data downloaded, split it on the GOP boundary, so that the output of this step is a video segment containing one or more GOPs;
Create an RDD for each video segment aligned on the GOP boundary;
Transform each RDD into a collection of frames;
For each frame, run face detection;
If a face is detected, mark it and save the frame as a JPEG file.
My question is: is Apache Spark the right tool for this kind of work? If so, could someone point me to an example that does something similar?
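The GOP-splitting step can be sketched in plain Python, independently of Spark. This assumes the downloaded bytes are an MPEG-2 video elementary stream, where each GOP begins with the start code `00 00 01 B8`; a transport or program stream would need demultiplexing first. The function name and the three-part return value are illustrative choices.

```python
GOP_START_CODE = b"\x00\x00\x01\xb8"  # MPEG-2 group_start_code


def split_on_gop(stream):
    """Split bytes into (head, complete_gops, remainder).

    head holds whatever precedes the first GOP (e.g. a sequence header),
    complete_gops are segments that run from one start code to the next,
    and remainder is the last, possibly unfinished, GOP.
    """
    offsets = []
    pos = stream.find(GOP_START_CODE)
    while pos != -1:
        offsets.append(pos)
        pos = stream.find(GOP_START_CODE, pos + len(GOP_START_CODE))
    if not offsets:
        # No boundary seen yet: keep everything for the next call.
        return b"", [], stream
    head = stream[:offsets[0]]
    complete = [stream[a:b] for a, b in zip(offsets, offsets[1:])]
    remainder = stream[offsets[-1]:]
    return head, complete, remainder
```

When downloading in chunks, the caller would prepend the previous remainder to the next chunk before calling the function again, so a GOP that straddles two chunks is emitted whole. A decoder working on a single segment would still need the sequence header, which this sketch does not attach.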
How can I compare the audio volume level of two videos?
One of our clients complains that our output video (from a DirectShow-based application) has its audio volume increased by 0.5 dB to 1 dB.
How can I check this? Is there an external tool that can help me check the audio signal level?
Thanks!
You need to inspect your filter graph and identify whether there are any filters in the audio path that could modify the data. You can insert a filter that gives you the audio stream just before the audio renderer, or earlier in the pipeline; once you grab the data, you can calculate volume levels and compare them to reference values.
Small discrepancies (up to 1 dB, or slightly higher) can result from differences in how the level is calculated or from downmixing, whether in your own code or somewhere else along the way.
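Once PCM samples have been grabbed from both the reference and the output, the comparison itself is short. The sketch below uses RMS as the level measure, which is an assumption; a peak or loudness-based measure would give different numbers, which is one source of the small discrepancies mentioned above.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of PCM samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(reference, output):
    """Gain of `output` relative to `reference` in dB (positive means louder)."""
    ref_rms, out_rms = rms(reference), rms(output)
    if ref_rms == 0 or out_rms == 0:
        raise ValueError("cannot compare against digital silence")
    return 20 * math.log10(out_rms / ref_rms)
```

Both sample sequences should cover the same stretch of audio at the same channel layout, otherwise the difference reflects the content rather than a gain change.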
I checked through the questions asked on SO on audio metadata, but could not find one which answers my doubt. Where exactly is the metadata of audio files stored, and in what form? Is it in the form of files or in a database? And where is this database of files stored?
Thank you, Michelle. My basic confusion was whether the metadata is stored as part of the file or in a separate file stored somewhere else in the file system, like an inode on Unix-like systems. ID3 shows that it is stored with the file as a block of bytes after the actual content of the file.
Is this the way of metadata storage for most of the other file types?
As far as I know, audio file formats:
May support metadata standards (e.g. ID3v1, ID3v2, APEtag, iXML)
May also have their own native metadata format (e.g. MP4 boxes / Quicktime atoms, OGG/FLAC/OPUS/Speex/Theora VorbisComment, WMA native metadata, AIFF / AIFC native metadata...)
=> In these two cases, metadata is stored directly in the audio file itself.
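ID3v1 is a simple instance of metadata stored inside the file: a fixed 128-byte block at the very end, beginning with the bytes `TAG`. The sketch below reads it from the file contents; the function name and the returned dictionary keys are illustrative, and ID3v2 (normally placed at the start of the file) is not handled.

```python
def read_id3v1(data):
    """Return the ID3v1 fields found at the end of `data`, or None if absent."""
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    tag = data[-128:]

    def text(start, end):
        # Fields are fixed-width and padded with NULs (sometimes spaces).
        return tag[start:end].split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(3, 33),
        "artist": text(33, 63),
        "album": text(63, 93),
        "year": text(93, 97),
        "comment": text(97, 127),
        "genre": tag[127],  # numeric index into the ID3v1 genre list
    }
```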
HydrogenAudio maintains a field mapping table between the most common formats : http://wiki.hydrogenaud.io/index.php?title=Tag_Mapping
That being said, many audio players (e.g. iTunes, foobar2000) allow their users to edit any metadata field in any file, regardless of whether said fields are supported or not by the underlying tagging standards (e.g. adding an "Album Artist" field in an S3M file).
In order to do that, these audio players store metadata in their internal database, thus giving the illusion that the audio file has been "enriched" while its actual content remains unchanged.
Another classic use of audio player databases is to store the following fields :
Rating
Number of times played
Last time played
=> In that case, you'll find metadata in the audio player's internal database
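To make the idea of a player-side database concrete, here is a hypothetical sketch using SQLite. The table and column names are invented for illustration and do not describe how iTunes, foobar2000, or any other player actually stores its library.

```python
import sqlite3


def open_library(path=":memory:"):
    """Open (or create) a small per-track statistics database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS track_stats ("
        " file_path TEXT PRIMARY KEY,"
        " rating INTEGER,"
        " play_count INTEGER NOT NULL DEFAULT 0,"
        " last_played TEXT)"
    )
    return db


def record_play(db, file_path, played_at):
    """Increment the play count for a track and store when it was last played."""
    cur = db.execute(
        "UPDATE track_stats SET play_count = play_count + 1, last_played = ?"
        " WHERE file_path = ?",
        (played_at, file_path),
    )
    if cur.rowcount == 0:
        # First time this file is seen: create its row.
        db.execute(
            "INSERT INTO track_stats (file_path, play_count, last_played)"
            " VALUES (?, 1, ?)",
            (file_path, played_at),
        )
    db.commit()
```

Because the rows are keyed by file path, the audio file itself is never modified, and the statistics are lost to any other player that opens the same file.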