I am trying to create a custom transform to detect and replace PII in videos using Video Indexer and Media Services, but I can't find the correct workflow for combining the services. Is it: Video Indexer detects insights (OCR) -> Text Analytics detects PII -> Media Services encodes and blurs (or overlays) the regions in the video? There is no sample for Media Services that blurs arbitrary regions, only face redaction.
The Media Services API only supports the detection of faces and the blurring of them, in either a single-pass or two-pass process.
The two-pass process returns a JSON file with bounding boxes that can be used to adjust the positioning and choose which areas are blurred or not blurred. That file can be updated and then used in the second pass.
https://learn.microsoft.com/en-us/azure/media-services/latest/analyze-face-redaction-concept
Also see the JSON schema here - https://learn.microsoft.com/en-us/azure/media-services/latest/analyze-face-redaction-concept#elements-of-the-output-json-file
The current .NET sample only shows the single-pass mode being used, though. I don't yet have a detailed sample showing the process of editing and re-submitting the job for the second pass, but I can help with that if you are interested in the details.
The current sample uses the "Redact" mode, but you would want to start with the "Analyze" mode if you only want the JSON file with the bounding boxes to use for blurring adjustments.
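For orientation, here is a rough sketch of the shape of that Analyze-mode JSON. The field names follow the schema linked above, but every value below is purely illustrative; the real file contains one event list per frame.

// Illustrative shape of the face-detection annotations JSON (Analyze mode).
// Field names follow the schema linked above; all values here are made up.
// Face coordinates are normalized to the frame size; times are in "timescale" ticks.
const annotations = {
  version: 1,
  timescale: 24000,      // ticks per second
  offset: 0,
  framerate: 23.976,
  width: 1280,
  height: 720,
  fragments: [
    {
      start: 0,
      duration: 48048,
      interval: 1001,    // one "events" entry per frame in this fragment
      events: [
        [{ index: 13, id: 1138, x: 0.2953, y: 0.1898, width: 0.0914, height: 0.2303 }],
        []               // frames with no detected face get an empty list
      ]
    }
  ]
};

// In the two-pass flow you would edit entries like the one above (for example,
// remove the faces you do not want blurred) before submitting the Redact pass.
console.log(annotations.fragments[0].events[0]);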
There is no support to blur text or OCR related data directly in the service or in Video Indexer.
Related
Is it possible to retrieve the full list of default recognized classes of Azure Video Indexer?
Azure Video Analyzer for Media, a.k.a. Video Indexer, supports thousands of class labels for video frame classification, referred to as Labels. Although the full list is not available online, you can easily infer the classes relevant for your data with a few API calls... A rough sketch of that approach follows below. Feel free to reach out to customer support at visupport@microsoft.com for additional assistance.
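For example, here is a rough sketch of collecting the label names Video Indexer has already assigned to your own uploaded videos. The Get Video Index endpoint and the insights.labels path follow the public API reference; location, account ID, access token and video ID are placeholders.

// Rough sketch: pull the index of an already-indexed video and collect the
// distinct Label names found in it. Replace the placeholders with your own
// account details, and run it over several representative videos.
const location = "trial";
const accountId = "<your-account-id>";
const accessToken = "<your-access-token>";

async function labelsForVideo(videoId) {
  const url = `https://api.videoindexer.ai/${location}/Accounts/${accountId}` +
              `/Videos/${videoId}/Index?accessToken=${accessToken}`;
  const index = await (await fetch(url)).json();
  const names = new Set();
  for (const video of index.videos ?? []) {
    for (const label of video.insights?.labels ?? []) {
      names.add(label.name);
    }
  }
  return [...names];
}

labelsForVideo("<your-video-id>").then(labels => console.log(labels));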
Here is what I am trying to do.
I am analyzing videos, and based on my analysis I know that at certain time intervals I need to capture a screenshot. I want this to be taken care of as part of encoding, but I don't see any documentation that lets me achieve it in v3. Is this even possible in v3?
This feature (keyframes) is available in Video Indexer. More info here:
https://learn.microsoft.com/en-us/azure/media-services/video-indexer/scenes-shots-keyframes
You can use the v3 APIs to generate thumbnails at fixed intervals. In the sample here, you can see how the PngImage and PngFormat elements are used. You can also output JPEG images - the schema details are here.
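If the built-in presets do not fit, a custom Transform preset can request PNG thumbnails on a fixed schedule. Below is an illustrative sketch of such a preset (the body you would use when creating a v3 Transform): the @odata.type values and the start/step/range properties follow the v3 encoding schema, while the timings, sizes and filename pattern are assumptions to adjust.

// Illustrative custom preset for generating PNG thumbnails on a fixed schedule.
// Timings, sizes and the filename pattern are example values; swap
// PngImage/PngFormat for JpgImage/JpgFormat if you prefer JPEGs.
const thumbnailPreset = {
  "@odata.type": "#Microsoft.Media.StandardEncoderPreset",
  codecs: [
    {
      "@odata.type": "#Microsoft.Media.PngImage",
      start: "00:00:05",            // first thumbnail at 5 seconds
      step: "00:00:30",             // then one every 30 seconds
      range: "100%",                // keep going to the end of the video
      layers: [
        { "@odata.type": "#Microsoft.Media.PngLayer", width: "50%", height: "50%" }
      ]
    }
  ],
  formats: [
    {
      "@odata.type": "#Microsoft.Media.PngFormat",
      filenamePattern: "Thumbnail-{Basename}-{Index}{Extension}"
    }
  ]
};

console.log(JSON.stringify(thumbnailPreset, null, 2));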
Context
I have a mobile app that lets our users capture the name plate of our products automatically. For this I use the Azure Cognitive Services OCR service.
I am a bit worried that customers might capture pictures of insufficient quality or of the wrong area of the product (where there is no name plate). To analyse whether this is the case, it would be handy to have a copy of the captured pictures so we can learn what went well or what went wrong.
Question
Is it possible to not only process an uploaded picture but also store it in Azure Storage so that I can analyse it at a later point in time?
What I've tried so far
I configured the Diagnostic settings so that logs and metrics are stored in Azure Storage. As the name suggests, this covers only logs and metrics, not the actual image, so it does not solve my issue.
Remarks
I know that I can implement this manually in the app, but I think it would be better if the picture only has to be uploaded once.
I'm aware that there are data protection considerations that must be made.
No, you can't get automatic image logging from the OCR operation alone; you have to implement it yourself.
But to avoid uploading the picture twice, as you said, you could put that logic on the server side: send the image to your own API, and in that API forward it to OCR while storing it in parallel.
But based on your question, I guess you might not have any server-side component in your app?
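If a server-side component is an option, a minimal sketch of that parallel approach could look like the following. It assumes Node.js 18+ (for the global fetch), Express with multer for the upload, the @azure/storage-blob SDK, and the Computer Vision v3.2 OCR REST endpoint; the route, container name and environment variables are placeholders.

// Minimal sketch: one upload from the app, stored in Blob Storage and sent to
// OCR in parallel. Route, container and env-var names are placeholders.
const express = require("express");
const multer = require("multer");
const { BlobServiceClient } = require("@azure/storage-blob");

const app = express();
const upload = multer({ storage: multer.memoryStorage() });

const blobService = BlobServiceClient.fromConnectionString(process.env.STORAGE_CONNECTION_STRING);
const container = blobService.getContainerClient("captured-nameplates");

app.post("/nameplate", upload.single("image"), async (req, res) => {
  const image = req.file.buffer;
  const blobName = `${Date.now()}-${req.file.originalname}`;

  // Kick off the blob upload and the OCR call at the same time.
  const storePromise = container.getBlockBlobClient(blobName).uploadData(image);
  const ocrPromise = fetch(
    `${process.env.VISION_ENDPOINT}/vision/v3.2/ocr?language=unk&detectOrientation=true`,
    {
      method: "POST",
      headers: {
        "Ocp-Apim-Subscription-Key": process.env.VISION_KEY,
        "Content-Type": "application/octet-stream",
      },
      body: image,
    }
  ).then(r => r.json());

  const [, ocrResult] = await Promise.all([storePromise, ocrPromise]);
  res.json({ blobName, ocrResult });   // the app still uploads the picture only once
});

app.listen(3000);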
I'm working on an IoT project that involves a sensor transmitting its values to an IoT platform. One of the platforms that I'm currently testing is Thingsboard; it is open source and I find it quite easy to manage.
My sensor is transmitting active energy indexes to Thingsboard. Using these values, I would like to calculate and show on a widget the values of the active power (= k * [ActiveEnergy(n) - ActiveEnergy(n-1)] / [Time(n) - Time(n-1)]). This basically means that I want to have access to historical data, use it to generate new data, and inject it back into my device.
Thingsboard uses a Cassandra database to save history values.
One alternative could be to find a way to communicate with the database via a Web API, for example, do the processing there, and send the active power back to my device by MQTT or HTTP using its access token.
Is this possible?
Is there a better alternative to my problem?
There are several options for achieving this, depending on the layer or component of the system:
1) Visualization layer only. Probably the simplest one. There is an option to apply a post-processing function to the widget data. The function has the following signature:
function(time, value, prevValue)
Please note that prevTime is missing, but we may add this in future releases. A sketch of such a function is shown below.
(Screenshot: post-processing function in the widget configuration.)
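A minimal sketch of such a post-processing function for the active-power calculation, assuming the energy readings arrive at a fixed, known interval (necessary because prevTime is not exposed); K and INTERVAL_SECONDS are placeholders to adapt to your sensor:

// Sketch of a widget post-processing function for active power.
// ThingsBoard passes (time, value, prevValue); prevTime is not available,
// so a fixed reporting interval is assumed (adjust K and INTERVAL_SECONDS).
function postProcess(time, value, prevValue) {
    var K = 1;                  // scaling constant k from the formula
    var INTERVAL_SECONDS = 60;  // assumed spacing between two energy readings
    if (prevValue === undefined || prevValue === null) {
        return 0;               // first sample: no delta to compute yet
    }
    return K * (value - prevValue) / INTERVAL_SECONDS;  // active power
}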
2) Data processing layer. Use advanced analytics frameworks like Apache Spark to post-process your data using a sliding time window, for example.
See our integration article about this.
I am attempting to build an interface that allows timing/rhythm (and potentially pitch) input to a Web Audio Oscillator node, in effect creating a 'step sequencer'.
What's the best way to trigger scheduled NoteOn for the Web Audio API Oscillator Nodes?
In a specific pattern, e.g. 1/4 notes, 1/8th notes, or a user-entered pattern.
This is a great question, and in fact I just published an HTML5Rocks article on this very topic: http://www.html5rocks.com/en/tutorials/audio/scheduling/.
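The core idea from that article is a lookahead scheduler: a coarse JavaScript timer wakes up regularly and schedules every note that falls within the next short window using the sample-accurate AudioContext clock (osc.start(time)). Here is a minimal sketch; tempo, pattern, pitch and envelope values are chosen purely for illustration.

// Minimal lookahead-scheduler sketch: a setTimeout "tick" every 25 ms schedules
// all notes that fall within the next 100 ms on the precise AudioContext clock.
// (Most browsers require a user gesture before an AudioContext will start.)
const ctx = new AudioContext();

const tempo = 120;                        // beats per minute
const secondsPerBeat = 60 / tempo;
// Gaps between successive notes, in beats: straight 1/8th notes here.
// A user-entered rhythm would just change these values.
const pattern = [0.5, 0.5, 0.5, 0.5];
const lookahead = 0.1;                    // how far ahead to schedule (seconds)
let nextNoteTime = ctx.currentTime;
let step = 0;

function playNote(time) {
  const osc = ctx.createOscillator();
  const gain = ctx.createGain();
  osc.frequency.value = 440;              // pitch could come from user input
  gain.gain.setValueAtTime(0.5, time);
  gain.gain.exponentialRampToValueAtTime(0.001, time + 0.1); // short blip
  osc.connect(gain).connect(ctx.destination);
  osc.start(time);                        // sample-accurate "NoteOn"
  osc.stop(time + 0.1);
}

function scheduler() {
  while (nextNoteTime < ctx.currentTime + lookahead) {
    playNote(nextNoteTime);
    nextNoteTime += pattern[step] * secondsPerBeat;
    step = (step + 1) % pattern.length;
  }
  setTimeout(scheduler, 25);              // keep ticking on the (imprecise) JS clock
}

scheduler();

The JavaScript timer only decides when to schedule; the actual note-on times always come from ctx.currentTime, which is what keeps the pattern from drifting.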