Compare audio volume from two videos

How can I compare the audio volume level of two videos?
One of our clients complains that our output video (from a DirectShow-based application) increases the audio volume by 0.5 dB to 1 dB.
How can I check this? Is there an external tool that can help me measure the audio volume level?
Thanks!

You need to inspect your filter graph and identify whether there are any filters in the audio path that could modify the data. You can insert a filter that taps the audio stream just before the audio renderer, or earlier in the pipeline; once you grab the data, you can calculate volume levels and compare them to reference values.
Small discrepancies (up to 1 dB, or slightly higher) can result from different level calculations or from downmixing, either yours or one taking place somewhere along the way.
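As a minimal sketch of that level calculation (plain Scala, assuming 16-bit PCM samples you have already grabbed from the graph; the buffer contents below are made up), comparing the RMS levels of the two outputs in dB:

object RmsLevel {
  // RMS level of a block of 16-bit PCM samples, in dBFS (0 dB = full scale).
  def rmsDb(samples: Array[Short]): Double = {
    require(samples.nonEmpty, "empty sample block")
    val sumSquares = samples.foldLeft(0.0) { (acc, s) =>
      val x = s / 32768.0 // normalise to -1.0 .. 1.0
      acc + x * x
    }
    val rms = math.sqrt(sumSquares / samples.length)
    20.0 * math.log10(math.max(rms, 1e-10)) // clamp to avoid log(0)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical buffers grabbed at the same position in the two graphs.
    val reference: Array[Short] = Array.fill(48000)(1000)
    val output: Array[Short] = Array.fill(48000)(1122)
    println(f"difference: ${rmsDb(output) - rmsDb(reference)}%.2f dB") // ~1 dB here
  }
}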

Related

Watermarking by key in Spark structured streaming

I have data coming in on Kafka from IoT devices. The timestamps of the sensor data of these devices are often not in sync due to network congestion, device being out of range, etc.
We have to write streaming jobs to aggregate sensor values over a window of time for each device independently. With the groupBy-with-watermark operation, we lose the data of all devices that lag behind the device with the latest timestamp.
Is there any way that the watermarks could be applied separately to each device based on the latest timestamp for that device, and not the latest timestamp across all devices?
We cannot keep a large lag as the device could be out of range for days. We cannot run an individual query for each device as the number of devices is high.
Would it be achievable using flatMapGroupsWithState? Or is this something that cannot be achieved with Spark Structured Streaming at all?
I think instead of watermarking by event timestamp (which could be lagging behind as you said), you could apply a watermark over the processing timestamp (i.e. the time when you process the data in your Spark job). I faced a very similar problem in a recent project I was working on and that's how I solved it.
Example:
import org.apache.spark.sql.functions.current_timestamp

val dfWithWatermark = df
  .withColumn("processingTimestamp", current_timestamp()) // stamp each row with the time it is processed
  .withWatermark("processingTimestamp", "1 day")
// ... use dfWithWatermark to do aggregations etc
This will keep a state over 1 day of your IoT data, no matter what the timestamp of the data is that you're receiving.
There are some limitations to this of course, for example if there are devices that send data in intervals larger than your watermark. But to figure out a solution for this you'd have to be more specific with your requirements.
By using flatMapGroupsWithState you can be very specific with your state keeping, but I don't think it's really necessary in your case.
Edit: if, however, you decide to go with flatMapGroupsWithState, you can use different timeouts per device group by calling state.setTimeoutDuration() with different intervals, depending on the type of device you process. This way you can be very specific with the state keeping.
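For reference, a rough sketch of that per-device variant (the DeviceReading and DeviceAgg types and the "slow-" device naming are invented for illustration; only the flatMapGroupsWithState / GroupState calls are the actual Spark API):

import java.sql.Timestamp
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

// Invented input and state types, just for illustration.
case class DeviceReading(deviceId: String, value: Double, eventTime: Timestamp)
case class DeviceAgg(deviceId: String, count: Long, sum: Double)

def aggregatePerDevice(
    deviceId: String,
    readings: Iterator[DeviceReading],
    state: GroupState[DeviceAgg]): Iterator[DeviceAgg] = {
  if (state.hasTimedOut) {
    // Timeout fired for this device: emit what was accumulated and drop the state.
    val result = state.get
    state.remove()
    Iterator(result)
  } else {
    val old = state.getOption.getOrElse(DeviceAgg(deviceId, 0L, 0.0))
    val updated = readings.foldLeft(old) { (agg, r) =>
      DeviceAgg(deviceId, agg.count + 1, agg.sum + r.value)
    }
    state.update(updated)
    // Per-device timeout: e.g. keep known slow devices around longer.
    state.setTimeoutDuration(if (deviceId.startsWith("slow-")) "7 days" else "1 day")
    Iterator.empty
  }
}

// readings: Dataset[DeviceReading] built from the Kafka source, e.g.:
// val aggregated = readings
//   .groupByKey(_.deviceId)
//   .flatMapGroupsWithState(OutputMode.Append(), GroupStateTimeout.ProcessingTimeTimeout)(aggregatePerDevice)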

How to use Apache Spark for face detection in a video stream

Here is the background of the problem I'm trying to solve:
I have a video file (MPEG-2 encoded) sitting on some remote server.
My job is to write a program to run face detection on this video file. The output is the collection of frames in which faces are detected. The frames are saved as JPEG files.
My current thinking is like this:
Use an HTTP client to download the remote video file;
For each chunk of video data being downloaded, split it on a GOP boundary, so the output of this step is a video segment that contains one or more GOPs;
Create an RDD for each video segment aligned on the GOP boundary;
Transform each RDD into a collection of frames;
For each frame, run face detection;
If a face is detected, mark it and save the frame to a JPEG file.
My question is: is Apache Spark the right tool for this kind of work? If so, could someone point me to an example that does something similar?
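For what it's worth, a very rough sketch of steps 3 to 6 in Spark could look like the following; decodeFrames and detectFaces are stubbed placeholders (a real implementation would plug in something like JavaCV/OpenCV), so treat this as a structural sketch rather than working detection code:

import org.apache.spark.{SparkConf, SparkContext}

object FaceDetectionJob {
  // Stub: a real implementation would decode the GOP-aligned segment
  // (e.g. via JavaCV/FFmpeg bindings) and return the frames as JPEG bytes.
  def decodeFrames(segment: Array[Byte]): Seq[Array[Byte]] = Seq.empty

  // Stub: a real implementation would run a face detector (e.g. OpenCV cascades) on the frame.
  def detectFaces(frame: Array[Byte]): Boolean = false

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("face-detection"))

    // Step 3: one record per GOP-aligned segment, produced by the download/split stage.
    val segments: Seq[Array[Byte]] = Seq.empty // filled by the HTTP download + GOP split steps
    val segmentRdd = sc.parallelize(segments)

    // Steps 4-5: decode each segment into frames and keep only the frames with faces.
    val framesWithFaces = segmentRdd.flatMap(decodeFrames).filter(detectFaces)

    // Step 6: persist the matching frames as JPEG files (written on the executors here).
    framesWithFaces.zipWithIndex().foreach { case (jpegBytes, i) =>
      java.nio.file.Files.write(java.nio.file.Paths.get(s"frame-$i.jpg"), jpegBytes)
    }

    sc.stop()
  }
}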

How to extract a video file's volume information using FFMPEG?

We need to extract the volume information for every second of a video file in order to produce a graphical representation of volume changes as the video progresses.
I'm trying to use FFMPEG with an audio filter, but I'm stuck on how to extract the volume information for every second (or frame) and then export this information to a report file.
Thanks in advance.
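One approach to try (worth double-checking the filter and option names against your FFMPEG build) is the astats filter with a per-second reset, printed through ametadata; a sketch that drives it from Scala and collects the per-second RMS levels in a report file:

import scala.sys.process._

object VolumeReport {
  def main(args: Array[String]): Unit = {
    val input = args.headOption.getOrElse("input.mp4")
    val report = "volume.log"

    // asetnsamples groups the audio into 1-second frames (44.1 kHz assumed here),
    // astats computes statistics per frame and exposes them as frame metadata,
    // and ametadata prints the chosen key (overall RMS level, in dB) to the report file.
    val audioFilter =
      s"asetnsamples=n=44100,astats=metadata=1:reset=1," +
        s"ametadata=print:key=lavfi.astats.Overall.RMS_level:file=$report"

    val cmd = Seq("ffmpeg", "-i", input, "-af", audioFilter, "-f", "null", "-")
    val exitCode = cmd.! // run ffmpeg; the per-second levels end up in volume.log
    println(s"ffmpeg exited with $exitCode, report written to $report")
  }
}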

Enhance my Core Data design. Experts only!

In AcaniUsers, I'm downloading the closest 20 users to me and displaying their profile pictures as thumbnails in a table view. User & Photo are both Resources because they each have an id (MongoDB BSON ObjectId) on the server. Each user has a unique_id. Each Photo has four different sizes (images) on the server: square: 75x75, square@2x: 150x150, large: 320x480, large@2x: 640x960. But each device will only have two of these sizes, depending on whether it's an iPhone 3 or 4 (retina display). Each of these sizes has its own MongoDB collection. And all four images for each Photo have the same BSON ObjectId across these four collections.
In the future, I may give User a relationship called photos to allow a user to have more than one photo. Also, although I don't foresee this, I may add more Image sizes (types).
The fresh attribute on Image tells me whether I've downloaded the latest Image. I set this to NO whenever the Photo's ID has changed, and then back to YES after I've finished downloading the Image.
Should I store the four different images in Core Data or on the file system and just store their URLs in Core Data? I read somewhere that over 1 or 2MB, you should store in file system, not Core Data. So, I was thinking of storing the square images in Core Data and the large images in the file system, but I'd rather store them all the same way to make things easier. So, maybe I'll just store them all in the file system? What do you think?
Do you think I should discard the 75x75 & 320x480 sizes since pretty soon iPhone 3's will be gone?
How can I improve my design of the entities and their attributes and relationships? For example, is the Resource entity even beneficial at all?
I'm displaying the Users with an NSFetchedResultsController. However, it doesn't know when the User's image gets updated, so the images don't show up until I scroll aggressively the first time. How do I let the NSFetchedResultsController know that a user's thumbnail has finished downloading? Do I have to use KVO?
To answer your questions:
1. I'd store them all in the file system and record the URLs in the database. I've never been a big fan of storing image data in the DB. Plus, it'll simplify things a little to have all of the image storage uniform: that way, in your image-loading code, you don't have to worry about whether a given size is stored in the DB or on the file system.
2. No, I wouldn't do that yet. The iPhone 3 is going to be around for a bit longer. AT&T is still selling them as the cheap entry-level iPhone. I just saw a commercial the other night advertising them for $49.
3. Remove the Resource entity and add the id attribute to each of the classes. The way you did it is actually bad. Abstract entities should only be used when you have a couple of entities that are almost identical and only have a few differences between them. Under the hood, Core Data will make only one table for an abstract entity and all of its children. So right now you're going to end up with a single table containing both your User and Photo entries, which can be bad when you're trying to query just one type of entity.
You should also delete the Image entity and move its attributes into the Photo entity. The Photo will always have those values associated with it, and the same values won't be shared between photos. Having them as a separate entity will cause a slowdown: you'll either need to load them with the photos, which requires a join (slow), or they'll be loaded one at a time when you access either the data or fresh attributes, which is also slow. When each of those faults fires in the latter scenario, a separate query and round trip to the disk happens for each object. So when you loop through your pictures for display in the table, you'll be firing n queries instead of one, which can be a big difference in performance.
4. You can use KVO to do it. Have your table cell observe the User or Picture (depending on whether the Picture is already added to the User and you're changing its data, or you're adding a new Picture to the User on load completion). When the observer gets triggered, update the image being displayed.

Audio metadata storage

I checked through the questions asked on SO about audio metadata, but could not find one that answers my question. Where exactly is the metadata of audio files stored, and in what form? Is it in the form of files or in a database? And where is this database of files stored?
Thank you Michelle. My basic confusion was whether the metadata is stored as part of the file or in a separate file somewhere else in the file system, like an inode in Unix-like systems. ID3 shows that it is stored with the file, as a block of bytes after the actual content of the file.
Is this the way metadata is stored for most other file types?
As far as I know, audio file formats:
May support metadata standards (e.g. ID3v1, ID3v2, APEtag, iXML)
May also have their own native metadata format (e.g. MP4 boxes / Quicktime atoms, OGG/FLAC/OPUS/Speex/Theora VorbisComment, WMA native metadata, AIFF / AIFC native metadata...)
=> In these two cases, metadata is stored directly into the audio file itself.
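As a small illustration of the "stored directly into the audio file itself" case, this is roughly how an ID3v1 tag (a fixed 128-byte block appended after the audio data in an MP3) can be read with plain Scala, no tagging library involved; the file name is just an example:

import java.nio.file.{Files, Paths}

object Id3v1Reader {
  def main(args: Array[String]): Unit = {
    val path = args.headOption.getOrElse("song.mp3")
    val bytes = Files.readAllBytes(Paths.get(path))

    // ID3v1 is the last 128 bytes of the file and starts with the ASCII marker "TAG".
    val tag = bytes.takeRight(128)
    if (tag.length == 128 && new String(tag.take(3), "ISO-8859-1") == "TAG") {
      // Fields are fixed-width, padded with zero bytes or spaces.
      def field(from: Int, len: Int): String =
        new String(tag.slice(from, from + len).takeWhile(_ != 0), "ISO-8859-1").trim

      println(s"Title : ${field(3, 30)}")
      println(s"Artist: ${field(33, 30)}")
      println(s"Album : ${field(63, 30)}")
      println(s"Year  : ${field(93, 4)}")
    } else {
      println("No ID3v1 tag found (the metadata may be in an ID3v2 header or stored elsewhere).")
    }
  }
}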
HydrogenAudio maintains a field mapping table between the most common formats: http://wiki.hydrogenaud.io/index.php?title=Tag_Mapping
That being said, many audio players (e.g. iTunes, foobar2000) allow their users to edit any metadata field in any file, regardless of whether said fields are supported or not by the underlying tagging standards (e.g. adding an "Album Artist" field in an S3M file).
In order to do that, these audio players store metadata in their internal database, thus giving the illusion that the audio file has been "enriched" while its actual content remains unchanged.
Another classic use of audio player databases is to store the following fields:
Rating
Number of times played
Last time played
=> In that case, you'll find the metadata in the audio player's internal database.
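Conceptually (the record layout below is invented, not any particular player's schema), that player-side store is just a small per-track record keyed by the audio file, kept entirely outside the file:

import java.time.Instant

// Invented record type: the kind of per-track row a player keeps in its own
// database, separate from anything written into the audio file itself.
case class PlayerMetadata(rating: Int, playCount: Long, lastPlayed: Option[Instant])

object PlayerLibrary {
  private val library = scala.collection.mutable.Map.empty[String, PlayerMetadata]

  def recordPlay(filePath: String): Unit = {
    val old = library.getOrElse(filePath, PlayerMetadata(rating = 0, playCount = 0L, lastPlayed = None))
    library(filePath) = old.copy(playCount = old.playCount + 1, lastPlayed = Some(Instant.now()))
  }

  def main(args: Array[String]): Unit = {
    recordPlay("music/track01.mp3") // the MP3 file itself is never modified
    println(library("music/track01.mp3"))
  }
}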
