I need to implement face recognition using the Azure Face API. I have developed a program which can find similar faces using the .NET SDK. For my use case, I need to capture a photo of a person from the webcam and find matching faces among images kept in Azure cloud storage. Now, there could be thousands of images in Azure cloud storage, and in my current implementation I'm iterating through all of those images and matching each one against the webcam image.
The concern here is:
The Face API (provided by Azure) charges 1 dollar per thousand calls. Is there a way the search could be optimized so that I don't have to re-scan the faces I have already scanned in previous searches?
public async Task<List<DetectedFaceAttributes>> FindSimiliarFacesWithAttributesFromContainer(IFaceClient client, string RECOGNITION_MODEL1, string sourceImageFileName)
{
    string url = BlobBaseURL;
    string sourceurl = sourceContainerURL;
    var imagesInNovotraxContainer = await _blobService.GetNames();
    IList<Guid?> targetFaceIds = new List<Guid?>();
    var faceList = new List<DetectedFaceAttributes>();
    // Detect faces from source image url.
    IList<DetectedFace> detectedFaces = await DetectFaceRecognize(client, $"{sourceurl}{sourceImageFileName}", RECOGNITION_MODEL1);
    if (detectedFaces.Any())
    {
        foreach (var targetImageFileName in imagesInNovotraxContainer)
        {
            var faceattribute = new DetectedFaceAttributes();
            // Detect faces from target image url.
            var faces = await DetectFaceRecognizeWithAttributes(client, $"{url}{targetImageFileName}");
            // Add detected faceId to list of GUIDs.
            if (faces.Any())
            {
                targetFaceIds.Add(faces[0].FaceId.Value);
                faceattribute.DetectedFace = faces[0];
                faceattribute.ImageFileName = targetImageFileName;
                faceList.Add(faceattribute);
            }
        }
        // Find similar face(s) in the list of IDs. Comparing only the first in the list for testing purposes.
        IList<SimilarFace> similarResults = await client.Face.FindSimilarAsync(detectedFaces[0].FaceId.Value, null, null, targetFaceIds);
        var similiarFaceIDs = similarResults.Select(y => y.FaceId).ToList();
        var returnDataTypefaceList = faceList.Where(x => similiarFaceIDs.Contains(x.DetectedFace.FaceId.Value)).ToList();
        return returnDataTypefaceList;
    }
    else
    {
        throw new Exception("no face detected in captured photo");
    }
}
public async Task<List<DetectedFace>> DetectFaceRecognize(IFaceClient faceClient, string url, string RECOGNITION_MODEL1)
{
    // Detect faces from image URL. Since only recognizing, use recognition model 1.
    IList<DetectedFace> detectedFaces = await faceClient.Face.DetectWithUrlAsync(url, recognitionModel: RECOGNITION_MODEL1);
    //if (detectedFaces.Any())
    //{
    //    Console.WriteLine($"{detectedFaces.Count} face(s) detected from image `{Path.GetFileName(url)}` with ID : {detectedFaces.First().FaceId}");
    //}
    return detectedFaces.ToList();
}
Your implementation is not totally clear to me in terms of the calls to the Face API / your storage (what's behind "DetectFaceRecognizeWithAttributes"), but I think you are right that you are missing something and your overall processing is overly costly.
What you should do depends on your target:
Is it face "identification"?
Or face "similarity"?
Both follow the same logic, but they use different API operations.
Case 1 - Face identification
Process
The global process is the following: you will use a "Person Group" or "Large Person Group" (depending on the number of persons you have) to store data about the faces that you already know (the ones in your storage), and you will use this group to "identify" a new face. That way, you will do a "1-n" search, not "1-1" comparisons as you do right now.
Initial setup (group creation):
Choose whether you need a Person Group or a Large Person Group; here are the current limits depending on your pricing tier:
Person Group:
Free-tier subscription quota: 1,000 person groups. Each holds up to 1,000 persons.
S0-tier subscription quota: 1,000,000 person groups. Each holds up to 10,000 persons.
Large Person Group:
Each can hold up to 1,000,000 persons.
Free-tier subscription quota: 1,000 large person groups.
S0-tier subscription quota: 1,000,000 large person groups.
Here I am using Person Group in the explanation, but the methods are the same for both.
When you know which one you need, create it using the "PersonGroup - Create" operation.
Then, for each person, you will have to create a "PersonGroup Person" using "PersonGroup Person - Create", and add the corresponding faces to it using "PersonGroup Person - Add Face". Once that is done, you never need to run the "detect" operation on those faces again.
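As a rough illustration, here is a minimal setup sketch with the .NET Face SDK you are already using. The group ID and person name are placeholders to adapt to your blob listing, and note that the group must be trained before it can be used for identification:
// One-time setup: create the group, add persons and their faces, then train.
// Sketch only: "my-person-group" and "John Doe" are placeholder values.
const string personGroupId = "my-person-group";
await client.PersonGroup.CreateAsync(personGroupId, "My persons", recognitionModel: RECOGNITION_MODEL1);

// One "person" per individual; each person can hold several faces.
Person person = await client.PersonGroupPerson.CreateAsync(personGroupId, "John Doe");
await client.PersonGroupPerson.AddFaceFromUrlAsync(personGroupId, person.PersonId, $"{url}{targetImageFileName}");

// Train once after adding faces; Identify requires a trained group.
await client.PersonGroup.TrainAsync(personGroupId);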
Then, for the "run" part:
When you have a new image that you want to compare:
Detect faces in your image with Detect endpoint of Face API
Get the face Ids of your result
Call Identify endpoint of Face API to try to identify those face Ids with your (large) person group
To limit the number of calls, you can even batch identification calls (up to 10 "input" face IDs in 1 call - see the doc).
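A minimal "run" sketch under the same assumptions (same placeholder group ID; method names from the current .NET Face SDK):
// Run: detect faces in the new webcam image, then identify them against the group.
IList<DetectedFace> faces = await client.Face.DetectWithUrlAsync($"{sourceurl}{sourceImageFileName}", recognitionModel: RECOGNITION_MODEL1);
IList<Guid> faceIds = faces.Where(f => f.FaceId.HasValue).Select(f => f.FaceId.Value).ToList();

// One call identifies up to 10 input face IDs against the whole group.
IList<IdentifyResult> results = await client.Face.IdentifyAsync(faceIds, personGroupId);
foreach (IdentifyResult result in results)
{
    foreach (IdentifyCandidate candidate in result.Candidates)
    {
        Console.WriteLine($"Face {result.FaceId} -> person {candidate.PersonId} (confidence {candidate.Confidence})");
    }
}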
Case 2 - Face similarity
Here you can use a "Face List" or "Large Face List" to store the faces that you already know, and pass the ID of this list when calling the "Find Similar" operation. Example with FaceList:
Start with "FaceList - Create" to create your list (doc)
Use "FaceList - Add Face" to add all the faces that you currently have in your blob (doc)
Then, for the run, when you call "Find Similar", provide the ID of your FaceList in the "faceListId" parameter and the ID of the face you want to compare (from the Face - Detect call)
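Again as a rough .NET sketch under the same assumptions (placeholder list ID; the setup part runs once, the FindSimilar part runs on every search). One detail worth noting: matches against a face list come back as PersistedFaceId rather than FaceId, so key your metadata accordingly:
// One-time setup: create the face list and add the faces from your blob storage.
const string faceListId = "my-face-list"; // placeholder id
await client.FaceList.CreateAsync(faceListId, "Known faces", recognitionModel: RECOGNITION_MODEL1);
await client.FaceList.AddFaceFromUrlAsync(faceListId, $"{url}{targetImageFileName}");

// Run: detect the webcam face, then search the stored list in a single call.
IList<DetectedFace> detected = await client.Face.DetectWithUrlAsync($"{sourceurl}{sourceImageFileName}", recognitionModel: RECOGNITION_MODEL1);
IList<SimilarFace> similar = await client.Face.FindSimilarAsync(detected[0].FaceId.Value, faceListId: faceListId);
// similar[i].PersistedFaceId identifies the matching stored face.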
I'd like to load all existing VM images for a publisher, but it seems impossible to filter them by publisher. It requires a chain of calls:
PagedList<VirtualMachinePublisher> publishers = azure
        .virtualMachineImages()
        .publishers()
        .listByRegion("useast");

// "publisher" below is the entry of interest picked from the list above
final PagedList<VirtualMachineOffer> offers = publisher.offers().list();
offers.loadAll();
return offers.stream()
        .flatMap(offer -> {
            final PagedList<VirtualMachineSku> skus = offer.skus().list();
            return skus.stream();
        })
        .flatMap(sku -> {
            final PagedList<VirtualMachineImage> images = sku.images().list();
            return images.stream();
        })
        .collect(Collectors.toList());
Unfortunately, it takes too much time; I suspect sku.images().list() loads the images one by one instead of in one request.
Is there a more efficient way to do this?
If I'm not mistaken, the new version of the API (azure-resourcemanager) also doesn't have a filter by publisher.
Hi, I'm querying for a specific video by title, and at the moment I get mixed results.
My videos are all named with a consecutive number at the end, i.e. ANDNOW2022_00112, ANDNOW2022_00113, etc.
When I search /videos/?fields=uri,name&query=ANDNOW2022_00112 I get all of the videos returned.
I've also tried query_fields using
/me/videos?query_fields=title&sort=alphabetical&query=ANDNOW2022_00112
I just want the one I've searched for, or no results returned.
At the moment I get all of the videos with ANDNOW2022 in the title/name. Usually the one I searched for is at the top of the list, but not every time.
Any tips appreciated.
Okay, I'm not going mad :)
This is from Vimeo and is here for those with the same issue. Basically, to get it to work you need to understand that:
After speaking with our engineers: the current search capability is not an "exact" search.
When adding numbers or underscores, the query is split into parts, so "ANDNOW2022_00112" is transformed into the parts "andnow2022", "andnow", "2022", and "00112". This is why you're seeing these results. Our engineering team is in the process of improving the search capabilities and hopes to provide a release in the near future.
Which means for now I'll have to rename my files.
Preface:
Vimeo does not currently offer an API endpoint for exact title search — but even if it did — it's possible to upload multiple videos and assign them identical titles. There's no way to use the API to positively identify a video by title — this is why every uploaded video is assigned a unique ID.
Solution:
Because the API returns data which includes an array of video objects, you can solve this problem in the same way you'd solve any similar problem in JavaScript where you have to find an element in an array: Array.prototype.find()
Here's how you can apply it to your problem:
Query the API using the parameters you described in your question.
You might also be interested in using the sort and direction parameters for greater control over a deterministic sort order.
Find the first item in the returned array of video objects that matches your expected text exactly, and return it (or undefined if it doesn't exist)
Here's a code example with some static data from the API that was used to search for the video Mercedes Benz from the user egarage — note that I've omitted quite a few (irrelevant) fields from the response in order to keep the example small:
// Mocking fetch for this example:
function fetch (_requestInfo, _init) {
  const staticJson = `{"total":2,"page":1,"per_page":25,"paging":{"next":null,"previous":null,"first":"/users/egarage/videos?query_fields=title&query=Mercedes%20Benz&sort=alphabetical&direction=asc&page=1","last":"/users/egarage/videos?query_fields=title&query=Mercedes%20Benz&sort=alphabetical&direction=asc&page=1"},"data":[{"uri":"/videos/61310450","name":"50th Anniversary of the Pagoda SL -- Mercedes-Benz Classic Vehicles","description":"Penned by designer Paul Bracq, the W113 SL had big shoes to fill: it had the incredible task of succeeding the original and instantly iconic 300 SL Gullwing. But you can't copy a legend, so Bracq designed one of his own. Straight lines replaced curves and a low-slung roof was replaced by a high top design that gave the car its nickname: the Pagoda.\\n\\nMUSIC: Developer Over Time","type":"video","link":"https://vimeo.com/61310450"},{"uri":"/videos/55837293","name":"Mercedes Benz","description":"To celebrate Mercedes Benz 125th birthday, the 2011 Pebble Beach Concours d’Elegance showcased the models that trace the lineage to Benz and Daimler —particularly Mercedes-Benz. This tribute chronicled early racing greats, coachbuilt classics, and preservation cars. Produced in association with DriveCulture.","type":"video","link":"https://vimeo.com/55837293"}]}`;
  return Promise.resolve(new Response(staticJson));
}

async function fetchVideoByTitle (token, userId, videoTitle) {
  const url = new URL(`https://api.vimeo.com/users/${userId}/videos`);
  url.searchParams.set("query_fields", "title");
  url.searchParams.set("query", videoTitle);
  url.searchParams.set("sort", "alphabetical");
  url.searchParams.set("direction", "asc");

  const headers = new Headers([
    ["Authorization", `Bearer ${token}`],
  ]);

  const response = await fetch(url.href, {headers});
  const parsedJson = await response.json();

  // Find the video that matches (if it exists):
  const maybeFirstVideoObj = parsedJson.data.find(video => video.name === videoTitle);
  return maybeFirstVideoObj;
}

async function main () {
  const video = await fetchVideoByTitle(
    "YOUR_ACTUAL_TOKEN",
    "egarage",
    "Mercedes Benz",
  );
  console.log(video); // {name: "Mercedes Benz", link: "https://vimeo.com/55837293", ...}
}

main();
I'm trying to search the existing Customers and return the CustomerID if it exists. This is the code I'm using, which works:
var CustomerToFind = new Customer
{
    MainContact = new Contact
    {
        Email = new StringSearch { Value = emailIn }
    }
};

var sw = new Stopwatch();
sw.Start();
// see if any results
var result = (Customer)soapClient.Get(CustomerToFind);
sw.Stop();
Debug.WriteLine(sw.ElapsedMilliseconds);
However, I'm finding it extremely slow, to the point of being unusable. For example, on the DEMO dataset, on my i7-6700K @ 4 GHz with 24 GB RAM and an SSD, running SQL Server 2016 Developer Edition locally, a simple email search takes between 3-4 seconds. On my production dataset with 10k Customer records, it takes over 60 seconds and times out.
Is this typical using contract-based SOAP? Screen-based SOAP seems much faster, almost instant. If I perform a SQL SELECT on the database tables in Microsoft Management Studio, I can also return the result instantly.
Is there a better, quicker way to query whether a Customer with email address "test@test.com" exists and return the CustomerID?
Try using GetList instead of Get. It's better suited for "search for something" scenarios.
When using GetList, depending on which endpoint you're using, there are two more optimizations. In the Default/5.30.001 endpoint there's a second parameter to GetList which you should set to false. In the Default/6.00.001 endpoint there's no second parameter, but there is an additional property on the entity itself, called ReturnBehavior. Either set it to OnlySpecified and then add *Return to the required fields, like this:
var CustomerToFind = new Customer
{
    ReturnBehavior = ReturnBehavior.OnlySpecified,
    CustomerID = new StringReturn(),
    MainContact = new Contact
    {
        Email = new StringSearch { Value = emailIn }
    }
};
or set it to OnlySystem and then use the ID of the returned entity to request the full entity.
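A rough sketch of how the call site might then look (the exact client type depends on your generated proxy, so treat this as an outline rather than exact code):
// GetList returns an array of entities matching the search criteria.
var results = soapClient.GetList(CustomerToFind);

// With OnlySpecified + StringReturn above, only CustomerID is populated.
var found = results.OfType<Customer>().FirstOrDefault();
string customerId = found?.CustomerID?.Value;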
I have designed an application which brings the users from Active Directory into a MySQL database and shows them in a GUI. It also brings in the groups of which each user is a member.
So, my program works this way:
for (String domain : allConfiguredADomains) {
    LdapContext domainCtx = getDomainCtx(domain);

    // Bring all users from this domain and store them in DB
    getAllUsersForDomain(domain, domainCtx);

    // Bring all the groups for every user
    getAllGroupsForUsersInTheDomain(domain, domainCtx);
}

void getAllUsersForDomain(String domain, LdapContext domainCtx) {
    String filter = "(objectClass=User)";
    NamingEnumeration<SearchResult> result = domainCtx.search(domain, filter, ..);
    while (result.hasMoreElements()) {
        SearchResult searchResult = result.nextElement();
        // Process and store in database
        storeUserInDatabase(searchResult);
    }
}

void getAllGroupsForUsersInTheDomain(String domain, LdapContext domainCtx) {
    List<String> userDistinguishedNames = getAllUsersFromDatabase("distinguishedName");
    for (String userDn : userDistinguishedNames) {
        // Find the groups that list this user as a member
        String filter = "(&(objectClass=Group)(member=" + userDn + "))";
        NamingEnumeration<SearchResult> result = domainCtx.search(domain, filter, ..);
        List<String> allGroupsOfUser = new ArrayList<>();
        while (result.hasMoreElements()) {
            SearchResult searchResult = result.nextElement();
            String groupDistinguishedName = (String) searchResult.getAttributes().get("distinguishedName").get();
            allGroupsOfUser.add(groupDistinguishedName);
        }
        // Store them in database
        storeAllGroupsOfUserInDatabase(userDn, allGroupsOfUser);
    }
}
This application, however, takes a lot of time when there are many users in Active Directory, so I decided to introduce parallelism (threading). I partitioned the user fetch using a search filter on the distinguishedName of the user:
String filter = "(&(objectClass=User)(distinguishedName=a*))";
and so on, one prefix per thread, while fetching users.
I got better performance, but it's still not good enough. Can someone suggest a better way? Also, I have no idea how to introduce parallelism while fetching the groups.
If someone has suggestions to do this better with PowerShell or C#, please share; I am open to other technologies.
Please note: reading the user attribute memberOf does not provide all groups, hence I am fetching groups separately.
I'm not an Active Directory expert; just wanted to share some thoughts.
Threading by alphabet letter allows a maximum of 26 threads. Have you considered creating search threads by some other attribute, group membership, etc.? This might let you create more threads (see the sketch below).
Review the Active Directory docs to see whether there is a way to improve search performance (for example, with a database we could create an index).
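Since you mentioned being open to C#, here is a minimal sketch of the partitioned-search idea using System.DirectoryServices. The LDAP path, the partitioning attribute (sAMAccountName prefixes here), and the page size are all assumptions to adapt to your environment:
using System.Collections.Concurrent;
using System.DirectoryServices;
using System.Linq;
using System.Threading.Tasks;

class AdUserFetcher
{
    // Partition the user search by the first character of sAMAccountName.
    // Two-character prefixes would give even more partitions/threads.
    static readonly string[] Prefixes =
        "abcdefghijklmnopqrstuvwxyz0123456789".Select(c => c + "*").ToArray();

    public static ConcurrentBag<string> FetchUserDns(string ldapPath)
    {
        var userDns = new ConcurrentBag<string>();
        Parallel.ForEach(Prefixes, prefix =>
        {
            // ldapPath is a placeholder, e.g. "LDAP://DC=example,DC=com"
            using var root = new DirectoryEntry(ldapPath);
            using var searcher = new DirectorySearcher(root)
            {
                Filter = $"(&(objectClass=user)(sAMAccountName={prefix}))",
                PageSize = 1000 // server-side paging avoids the default size limit
            };
            searcher.PropertiesToLoad.Add("distinguishedName");
            using SearchResultCollection results = searcher.FindAll();
            foreach (SearchResult r in results)
                userDns.Add((string)r.Properties["distinguishedName"][0]);
        });
        return userDns;
    }
}
The same pattern could parallelize the group fetch: partition the stored user DNs into chunks and run one (member=...) search per chunk inside Parallel.ForEach.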
I am trying to keep a history of the data (at least one step back) in DocumentDB.
For example, if I have a property called Name in a document with the value "Pieter" and I then change it to "Sam", I have to keep a record that it was "Pieter" previously.
As of now I am thinking of a pre-trigger. Any other solutions?
Cosmos DB (formerly DocumentDB) now offers change tracking via its Change Feed. With the Change Feed, you can listen for changes on a particular collection, ordered by modification time within a partition.
Change feed is accessible via:
Azure Functions
DocumentDB (SQL) SDK
Change Feed Processor Library
For example, here's a snippet from the Change Feed documentation, on reading from the Change Feed, for a given partition (full code example in the doc here):
IDocumentQuery<Document> query = client.CreateDocumentChangeFeedQuery(
    collectionUri,
    new ChangeFeedOptions
    {
        PartitionKeyRangeId = pkRange.Id,
        StartFromBeginning = true,
        RequestContinuation = continuation,
        MaxItemCount = -1,
        // Set reading time: only show change feed results modified since StartTime
        StartTime = DateTime.Now - TimeSpan.FromSeconds(30)
    });

while (query.HasMoreResults)
{
    FeedResponse<dynamic> readChangesResponse = query.ExecuteNextAsync<dynamic>().Result;

    foreach (dynamic changedDocument in readChangesResponse)
    {
        Console.WriteLine("document: {0}", changedDocument);
    }

    checkpoints[pkRange.Id] = readChangesResponse.ResponseContinuation;
}
If you're trying to build an audit log, I'd suggest looking into Event Sourcing. Building your domain from events ensures a correct log. See https://msdn.microsoft.com/en-us/library/dn589792.aspx and http://www.martinfowler.com/eaaDev/EventSourcing.html
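To illustrate the idea with your Name example (a minimal sketch, not tied to any particular framework; the type and event names are made up): instead of overwriting Name, you append an event, so the previous value is always recoverable.
using System;
using System.Collections.Generic;

// Each change is recorded as an immutable event.
record NameChanged(string OldName, string NewName, DateTime At);

class CustomerAggregate
{
    private readonly List<NameChanged> _events = new();

    public string Name { get; private set; } = "Pieter";

    public void ChangeName(string newName)
    {
        // The old value is preserved in the event stream, not lost.
        _events.Add(new NameChanged(Name, newName, DateTime.UtcNow));
        Name = newName;
    }

    public IReadOnlyList<NameChanged> History => _events;
}

// Usage: after ChangeName("Sam"), History still shows that Name was "Pieter".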