Microsoft Cognitive Services Face API - Azure

What is the suggested (optimal) image size to work with the Face API? I can't find anything about this.
It looks like images should not be too small, but not too large either. Is there any recommendation on how to prepare images before training a model?
Thanks.

This may help from the "Add Face" documentation:
JPEG, PNG, GIF (the first frame), and BMP format are supported. The allowed image file size is from 1KB to 4MB.
"targetFace" rectangle should contain one face. Zero or multiple faces will be regarded as an error. If the provided "targetFace" rectangle is not returned from Face - Detect, there’s no guarantee to detect and add the face successfully.
Out of detectable face size (36x36 - 4096x4096 pixels), large head-pose, or large occlusions will cause failures.
Adding/deleting faces to/from the same face list is processed sequentially, while adding/deleting to/from different face lists happens in parallel.
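For reference, here is a rough Python sketch (using the requests library) of calling Face - Detect before adding a face, so you can confirm that a prepared image falls inside the detectable range. The endpoint, key, and file name below are placeholders:

import requests

# Placeholders: substitute your own resource endpoint and subscription key.
ENDPOINT = "https://YOUR-RESOURCE.cognitiveservices.azure.com"
KEY = "YOUR-SUBSCRIPTION-KEY"

with open("face.jpg", "rb") as f:
    image_bytes = f.read()  # allowed size is 1KB to 4MB (JPEG, PNG, GIF, BMP)

resp = requests.post(
    ENDPOINT + "/face/v1.0/detect",
    params={"returnFaceId": "true"},
    headers={
        "Ocp-Apim-Subscription-Key": KEY,
        "Content-Type": "application/octet-stream",
    },
    data=image_bytes,
)
resp.raise_for_status()
for face in resp.json():
    # each rectangle can be passed as "targetFace" when calling Add Face
    print(face["faceRectangle"])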

Related

How should I handle large video datasets in Google Cloud ML Engine?

I am experimenting with video classification using Keras in Cloud ML Engine. My dataset consists of video sequences saved as separate images (e.g. seq1_frame1.png, seq1_frame2.png, ...) which I have uploaded to a GCS bucket.
I use a CSV file referencing the start and end frames of different subclips, and a generator which feeds batches of clips to the model. The generator is responsible for loading frames from the bucket, reading them as images, and concatenating them into numpy arrays.
My training is fairly long, and I suspect the generator is my bottleneck due to the numerous read operations.
In the examples I found online, people usually save pre-formatted clips as TFRecord files directly to GCS. I feel like this solution isn't ideal for very large datasets, as it implies duplicating the data, even more so if we decide to extract overlapping subclips.
Is there something wrong with my approach? And more importantly, is there a "gold standard" for using large video datasets in machine learning?
PS: I explained my setup for reference, but my question is not bound to Keras, generators, or Cloud ML.
In this, you are almost always going to be trading time for space. You just have to work out which is more important.
In theory, for every frame you have height*width*3 bytes, assuming 3 colour channels. One possible way to save space is to use only one channel (probably green, or, better still, convert your complete dataset to greyscale). That would reduce your full-size video data to one third of its size. Colour data in video tends to be at a lower resolution than luminance data, so it might not affect your training, but it depends on your source files.
As you probably know, PNG is a lossless image compression format. Every time you load one, the generator has to decompress it first and then concatenate it into the clip. You could save even more space by using a different compression codec, but that would mean every clip would need full decompression, which would probably add to your time. You're right that the repeated decompression will take time, and saving the video uncompressed will take up quite a lot of space. There are places you could save space, though:
reduce to greyscale (or green scale as above)
temporally subsample frames (do you need EVERY consecutive frame, or could you sample every second one?)
do you use whole frames or just patches? Can you crop or rescale the video sequences?
are you using optical flow? It's pretty processor-intensive; consider doing it as a pre-processing step too, so you only have to do it once per clip (again, this is trading space for time)
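To make the greyscale and temporal-subsampling points concrete, here is a rough Python sketch of a preprocessing pass (the file names and the stride of 2 are just illustrative):

import numpy as np
from PIL import Image

def load_clip(frame_paths, stride=2):
    # Load every `stride`-th frame as greyscale and stack into one array.
    frames = []
    for path in frame_paths[::stride]:           # temporal subsampling
        img = Image.open(path).convert("L")      # greyscale: one third of the RGB bytes
        frames.append(np.asarray(img, dtype=np.uint8))
    return np.stack(frames)                      # shape: (n_frames, height, width)

clip = load_clip(["seq1_frame%d.png" % i for i in range(1, 17)])
print(clip.shape, clip.nbytes, "bytes in memory")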

How to process KML/GeoJSON in Nodejs?

I have run out of Google searches and am reaching out for help here. We are currently processing a KML file using geoXml3 on the client side, but ideally I would want to pre-process it on the server side and send the polygons to the client. The KML file is 18MB, so it takes forever to download on the client side, and then the client parses it and draws the polygons on a Google map.
We converted the KML files to GeoJSON and reduced the size, then compressed it; after all that circus the response time is still not good. I just want to know if there is a way / library in Node that can do this.
When you say that you are compressing the file, what do you mean? If you mean an algorithm such as zip or lha, that won't necessarily reduce the size of the file that much. What you want to do is to remove line segments from the KML file. In reducing some geographic information, I found that there were lengths of many miles that varied less than a foot from a straight line. Since the data points were spaced every few feet, that meant that the vast majority of the points in the KML file could be removed without making an appreciable change in the appearance of the geometry. Looking for straight line segments is relatively simple.
You should also keep in mind the scale of the map you are viewing and the spacing of the data points in the KML file. Even if the lines are complex curves, it may be possible to remove large numbers of points by trying to curve fit segments of the features and reduce the size of the data in this way.
You seem to be implying that downloading the data from the server to the client takes far more time than processing the data on the server. If this is correct, reducing the number of points is the most efficient method.
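To illustrate the straight-segment pruning described above, here is a rough Python sketch that drops any point lying within a tolerance of the line joining its neighbours (the tolerance value is made up; it assumes projected coordinates in metres):

import math

def prune_collinear(points, tolerance=0.3):
    # points: list of (x, y) pairs in a projected (metre) coordinate system.
    # Keep a point only if it deviates more than `tolerance` from the line
    # joining the previously kept point and the next point.
    if len(points) < 3:
        return points[:]
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        (x1, y1), (x2, y2) = kept[-1], points[i + 1]
        px, py = points[i]
        # perpendicular distance from (px, py) to the line (x1, y1)-(x2, y2)
        dist = abs((y2 - y1) * px - (x2 - x1) * py + x2 * y1 - y2 * x1) / (
            math.hypot(x2 - x1, y2 - y1) or 1.0)
        if dist > tolerance:
            kept.append(points[i])
    kept.append(points[-1])
    return kept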
There are a number of tricks you can try to reduce the download & rendering time of a KML polygon file.
As already suggested in previous answers the key is to reduce the size of your data. This can be accomplished in many ways, depending on your use case:
Reduce the number of points that each line segment / polygon boundary is made up of. There are plenty of algorithms for this; the Douglas-Peucker line simplification algorithm is the best known.
Reduce the precision of your data points. If your coordinates are stored to a high degree of precision (i.e. latitudes / longitudes with several decimal places), you can round them to fewer decimal places. Note that you might have to play with this, as it will make the boundaries of your polygons choppy / jagged if you go too far.
Compression. It looks as though you have already experimented with this. Gzip compression should be able to reduce the wire payload size of KML significantly.
Finally, if you are still not getting the results you want, you could consider generalizing your data further by removing small / insignificant polygons. Again, this will depend on your use case.
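Since the question asks about Node: Turf.js (@turf/simplify) covers the simplification step there. To illustrate the first two reductions, here is a rough sketch in Python with shapely (the tolerance and precision values are made up and need tuning for your data):

import json
from shapely.geometry import shape, mapping

def round_coords(obj, ndigits=5):
    # Recursively round coordinate values to `ndigits` decimal places.
    if isinstance(obj, float):
        return round(obj, ndigits)
    if isinstance(obj, (list, tuple)):
        return [round_coords(v, ndigits) for v in obj]
    return obj

with open("regions.geojson") as f:              # hypothetical input file
    collection = json.load(f)

for feature in collection["features"]:
    geom = shape(feature["geometry"])
    # Douglas-Peucker simplification; tolerance is in coordinate units (degrees here)
    simplified = geom.simplify(0.0001, preserve_topology=True)
    g = dict(mapping(simplified))
    g["coordinates"] = round_coords(g["coordinates"])
    feature["geometry"] = g

with open("regions.min.geojson", "w") as f:     # then serve this gzipped
    json.dump(collection, f, separators=(",", ":"))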

Gamma-correct zoomable image (dzi, etc) generator

I'm using MS Deep Zoom Composer to generate tiled image sets for megapixel-sized images. Right now I'm preparing a densely detailed black and white line drawing.
The lack of gamma correction during resizing is very apparent: while zooming, the tiles appear to become brighter at higher zoom levels. This makes the boundaries between tiles quite apparent during the loading stage. While it does not in any way hurt usability, it is a bit unsightly.
I am wondering if there are any alternatives to Deep Zoom Composer that do gamma-correct resizing?
The vips deepzoom creator can do this.
You make a deepzoom pyramid like this:
vips dzsave somefile.tif pyr_name
and it'll read somefile.tif and write pyr_name.dzi and pyr_name_files, a folder containing the tiles. You can use a .zip extension on the pyramid name and it'll directly write an uncompressed zip file containing the whole pyramid; this is a lot faster on Windows. There's a blog post with some more examples and explanation.
To make it shrink gamma corrected, you need to move your image to a linear colourspace for saving. The simplest is probably scRGB, that is, sRGB with linear light. You can do this with:
vips colourspace somefile.tif x.tif scrgb
and it'll write x.tif, an scRGB float tiff.
You can run the two operations in a single command by using .dz as the output file suffix. This will send the output of the colourspace transform to the deepzoom writer for saving. The deepzoom writer will use .jpg to save each tile; the jpeg writer knows that jpeg files can only be RGB, so it'll automatically turn the scRGB tiles back into plain sRGB for saving.
Put that all together and you need:
vips colourspace somefile.tif mypyr.dz scrgb
And that should build a pyramid with a linear-light shrink.
You can pass options to the deepzoom saver in square brackets after the filename, for example:
vips colourspace somefile.tif mypyr.dz[container=zip] scrgb
The blog post has the details.
Update: the Windows binary is here, to save you hunting. Unzip it somewhere, and vips.exe is in the /bin folder.
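If you'd rather script this than shell out to the CLI, the same pipeline can be written with pyvips, the Python binding (a minimal sketch, assuming pyvips is installed; the file names are placeholders):

import pyvips

image = pyvips.Image.new_from_file("somefile.tif")
# shrink in linear light; the deepzoom writer turns tiles back into sRGB for JPEG
linear = image.colourspace("scrgb")
linear.dzsave("mypyr", container="zip")          # writes mypyr.zip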
pamscale1 from the netpbm suite is quite well known for not screwing up scaled images in the way you describe. It uses gamma correction instead of ill-conceived "high-quality filters" and other magic used to paper over incorrect scaling algorithms.
Of course you will need some scripting - it's not a direct replacement.
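If you do end up scripting it yourself, the core of gamma-correct resizing is just: decode to linear light, resize, re-encode. Here is a rough Python sketch using a simple 2.2 gamma approximation (the real sRGB transfer curve is piecewise, so treat this as an approximation):

import numpy as np
from PIL import Image

def gamma_correct_halve(path_in, path_out, gamma=2.2):
    # Downscale by 2x, averaging pixels in (approximately) linear light.
    srgb = np.asarray(Image.open(path_in).convert("RGB"), dtype=np.float64) / 255.0
    linear = srgb ** gamma                       # decode to linear light
    h, w = (linear.shape[0] // 2) * 2, (linear.shape[1] // 2) * 2
    linear = linear[:h, :w]
    small = (linear[0::2, 0::2] + linear[1::2, 0::2] +
             linear[0::2, 1::2] + linear[1::2, 1::2]) / 4.0   # 2x2 box filter
    out = (small ** (1.0 / gamma) * 255.0).round().astype(np.uint8)
    Image.fromarray(out).save(path_out)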
We maintain a list of DZI creation tools here:
http://openseadragon.github.io/examples/creating-zooming-images/
I don't know if any of them do gamma correction, but some of them might not have that issue to begin with. Also, many of them come with source, so you can add the gamma correction in yourself if need be.

Scaling an image up in Corona SDK without it becoming fuzzy

I am working on a classic RPG that requires a pixelated style of graphics. I want to do this by making a small image and scaling it up. However, when I do this, it gets fuzzy. Is there any way to scale it while keeping a crisp edge for every pixel, or do I just need to make a bigger image?
You cannot scale an image up and expect it to keep a crisp appearance if it wasn't made at a big enough resolution in the first place. In your case you would have to make a bigger image and scale it down to produce the small one.
If you do not use the large image all the time, however, you should consider having two versions of the same image (one small / one large) for optimization's sake.

DICOM Image is too dark with ITK

I am trying to read an image with ITK and display it with VTK, but there is a problem that has been haunting me for quite some time.
I read the images using the classes itkGDCMImageIO and itkImageSeriesReader. After reading, I can do two different things:
1.
I can convert the ITK image to vtkImageData using itkImageToVTKImageFilter and then use vtkImageReslicer to get all three axes. Then I use the classes vtkImageMapper, vtkActor2D, vtkRenderer and QVTKWidget to display the image.
In this case, when I display the images, there are several problems with colors. Some of them are shown very bright, others are so dark you can barely see them.
2.
The second scenario is the registration pipeline. Here I read the image as before, then use the classes shown in the ITK Software Guide chapter about registration. Then I resample the image and use the itkImageSeriesWriter.
And that's when the problem appears. After writing the image to a file, I compare this new image with the image I used as input in the XMedcon software. If the image I wrote was shown too bright in my software, there are no changes when I compare both of them in XMedcon. Otherwise, if the image was too dark in my software, it appears all messed up in XMedcon.
I noticed, when comparing both images (the original and the new one), that in both cases there are changes in modality, pixel dimensions and glmax. I suppose the problem is with the glmax, as the major changes occur with the darker images.
I really don't know what to do. Does this have something to do with color level/window? The strangest thing is that all the images are very similar, with identical tags, and only some of them display errors when shown/written.
I'm not familiar with the particulars of VTK/ITK specifically, but it sounds to me like the problem is more general than that. Medical images have a high dynamic range and often the images will appear very dark or very bright if the window isn't set to some appropriate range. The DICOM tags Window Center (0028, 1050) and Window Width (0028, 1051) will include some default window settings that were selected by the modality. Usually these values are reasonable, but not always. See part 3 of the DICOM standard (11_03pu.pdf is the filename) section C.11.2.1.2 for details on how raw image pixels are scaled for display. The general idea is that you'll need to apply a linear scaling to the images to get appropriate pixel values for display.
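As an illustration of that linear scaling, here is a sketch of applying the window center/width to raw pixel values with numpy, following the formula in section C.11.2.1.2 (the example values match an abdominal CT window):

import numpy as np

def apply_window(pixels, center, width, y_min=0.0, y_max=255.0):
    # Linear VOI windowing as described in DICOM PS3.3, C.11.2.1.2.
    x = pixels.astype(np.float64)
    lower = center - 0.5 - (width - 1) / 2.0
    upper = center - 0.5 + (width - 1) / 2.0
    y = ((x - (center - 0.5)) / (width - 1) + 0.5) * (y_max - y_min) + y_min
    y = np.where(x <= lower, y_min, y)
    y = np.where(x > upper, y_max, y)
    return y.astype(np.uint8)

# e.g. display = apply_window(raw_slice, center=50, width=350)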
What pixel types do you use? In most cases it's simpler to use a floating point type while working with ITK, but raw medical images are often stored as short, so that could be your problem.
You should also write the image to disk after each step (in MHD format, for example) and inspect it with a viewer that's known to work properly, such as vv (http://www.creatis.insa-lyon.fr/rio/vv). You could also post the images here, along with your code, for further review.
Good luck!
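For the pixel type and the step-by-step inspection, a minimal sketch with ITK's Python wrapping (reading a single slice as float and dumping an intermediate MHD file; adapt the reader setup to your series):

import itk

# Read with a float pixel type to avoid short-integer surprises.
image = itk.imread("slice.dcm", itk.F)
itk.imwrite(image, "debug_step1.mhd")   # inspect this in vv or another trusted viewer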
For what you describe as your first issue:
I can convert the ITK image to vtkImageData using itkImageToVTKImageFilter and then use vtkImageReslicer to get all three axes. Then I use the classes vtkImageMapper, vtkActor2D, vtkRenderer and QVTKWidget to display the image.
In this case, when I display the images, there are several problems with colors. Some of them are shown very bright, others are so dark you can barely see them.
I suggest the following: check your window/level in VTK; they probably aren't adequate for your images. If they are abdominal tomographies, window = 350 and level = 50 should be a nice color level.
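In VTK that window/level is set on the image mapper; through the Python bindings it would look something like this (values as suggested above):

import vtk

mapper = vtk.vtkImageMapper()
mapper.SetColorWindow(350)   # width of the displayed intensity range
mapper.SetColorLevel(50)     # centre of the displayed intensity range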
