I am working on an OCR solution using the Azure Read API, which provides an out-of-the-box solution for raster PDFs:
https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text#read-api
but I can't tell whether it supports vector-based PDFs. I have other solutions using third-party libraries such as Aspose and PDFxStream, but I would prefer to stay within the Azure Vision API ecosystem.
So my question is: is it possible to use the Read API for vector PDFs, and if not, what is the most practical approach I could use?
To answer my own question: yes, it supports vector-based PDFs, although this is not explicitly mentioned in the API documentation. We checked both through the Azure portal and through API code, and it works. There is no problem with mixing raster and vector-based PDFs.
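For reference, the check through API code can be sketched roughly as follows. This is a minimal sketch, not the code we actually ran: it assumes the v3.2 REST route (`/vision/v3.2/read/analyze`), and the function names, the `endpoint` value and the `key` are placeholders you would supply yourself. The same calls are used whether the PDF is raster or vector.

```python
import json
import time
import urllib.request


def submit_pdf(endpoint: str, key: str, pdf_bytes: bytes) -> str:
    """Submit a PDF to the Read API and return the URL to poll for results."""
    request = urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/vision/v3.2/read/analyze",  # assumed v3.2 route
        data=pdf_bytes,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
    )
    with urllib.request.urlopen(request) as response:
        # The service accepts the job and reports where to poll for the result.
        return response.headers["Operation-Location"]


def wait_for_result(operation_url: str, key: str, delay_s: float = 1.0) -> dict:
    """Poll the operation URL until the job has either succeeded or failed."""
    request = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    while True:
        with urllib.request.urlopen(request) as response:
            result = json.load(response)
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(delay_s)


def extract_lines(result: dict) -> list[str]:
    """Flatten the recognised text lines of every page into one list."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]
```

Typical usage would be `extract_lines(wait_for_result(submit_pdf(endpoint, key, data), key))`, where `data` is the raw content of the PDF file.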
I am using the Azure Custom Vision service (customvision.ai) for data labelling. It works well for automatically labelling data.
I want to use this data for a custom neural network, so I want to download the tags. Is there a way to download the tags, either from the GUI or API?
The closest thing I have found is the GetTaggedImages API request (https://southcentralus.dev.cognitive.microsoft.com/docs/services/Custom_Vision_Training_3.2/operations/5dddfe4dc8d30b100855c60c).
Thanks!
If you want to get all tags for a given project and iteration, I would recommend using the API directly. The GetTags API can help with this; please check the following API reference:
https://southcentralus.dev.cognitive.microsoft.com/docs/services/Custom_Vision_Training_3.2/operations/5dddfe4ec8d30b100855c626
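As a rough illustration, GetTags and GetTaggedImages can be called over plain REST and the result flattened into label rows for your own training pipeline. This is only a sketch under assumptions: the helper names and the output row layout are made up for this example, the `endpoint` and `training_key` values are your own, and the field names (`tags`, `regions`, `tagName`, `left`, `top`, `width`, `height`) follow the v3.2 response format.

```python
import json
import urllib.parse
import urllib.request


def training_get(endpoint: str, training_key: str, path: str, **params) -> list:
    """Send a GET request to the Custom Vision training API and decode the JSON."""
    query = f"?{urllib.parse.urlencode(params)}" if params else ""
    request = urllib.request.Request(
        f"{endpoint.rstrip('/')}/customvision/v3.2/training/{path}{query}",
        headers={"Training-Key": training_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def get_tags(endpoint: str, training_key: str, project_id: str) -> list:
    """GetTags: list the tags defined in a project."""
    return training_get(endpoint, training_key, f"projects/{project_id}/tags")


def get_tagged_images(endpoint, training_key, project_id, take=50, skip=0) -> list:
    """GetTaggedImages: fetch one page of tagged images (page with take/skip)."""
    path = f"projects/{project_id}/images/tagged"
    return training_get(endpoint, training_key, path, take=take, skip=skip)


def to_label_rows(images: list) -> list[dict]:
    """Flatten images into rows: one per region if present, else one per tag."""
    rows = []
    for image in images:
        regions = image.get("regions") or []
        if regions:
            # Object detection: keep the normalised bounding box of each region.
            for region in regions:
                box = (region["left"], region["top"], region["width"], region["height"])
                rows.append({"image_id": image["id"], "tag": region["tagName"], "box": box})
        else:
            # Classification: the image carries tags only.
            for tag in image.get("tags") or []:
                rows.append({"image_id": image["id"], "tag": tag["tagName"], "box": None})
    return rows
```

The rows returned by `to_label_rows` can then be written to CSV or JSON in whatever format your custom network expects.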
I have images with important file metadata (e.g. provenance and processing history) stored locally or in Azure blob storage.
I would like to import (POST) these to the Azure Custom Vision environment (via the API or GUI) (see e.g. https://southcentralus.dev.cognitive.microsoft.com/docs/services/Custom_Vision_Training_3.0) for training while (i) retaining those image metadata and (ii) being able to retrieve them via (a) the Custom Vision API and (b) the Custom Vision GUI.
An example use case would be to purge images of a certain provenance from the Custom Vision store because of a GDPR-related customer request [Aside: I appreciate that Azure Cognitive Services can anyway use the data for improving their models etc.].
As far as I can tell the only way to reference an image POSTed to Custom Vision is via its UUID. Is there any other way to reference metadata stored with that image or:
Would that constitute a feature request?
Could the image metadata be stored inside the image (e.g. JPEG EXIF) (assuming it is possible to retrieve the image itself from the Custom Vision "environment", which it may not be)?
Otherwise, is the only solution to store the returned Custom Vision image UUID in a database elsewhere alongside the required metadata?
NB In the above, by metadata I do not mean tags/labels in the image model-side sense, but rather data-side file metadata.
[Note that Azure Cognitive Services is using stackoverflow for Q&A, so this question is I believe appropriate for stackoverflow.]
Thanks as ever!
I am not part of Microsoft, so it is only my opinion based on my usage of Custom Vision.
I understand your use-case, especially regarding GDPR as you mentioned, but currently adding metadata is not a feature, whether through the API or the GUI.
To answer your questions:
Would that constitute a feature request?
Definitely. You can create an item on UserVoice for this feature (but first check whether a related item already exists): https://cognitive.uservoice.com/forums/598141-custom-vision-service
Could the image metadata be stored inside the image (e.g. JPEG EXIF)
(assuming it is possible to retrieve the image itself from the Custom
Vision "environment", which it may not be)?
You can get the images you previously posted using the GetImagesByIds method from the API, for example (or GetTaggedImages / GetUntaggedImages). Remember that the images you post are processed: thumbnails and resized images are generated from what you posted. These methods provide links to the images.
I made a quick try by:
- Uploading an image to CustomVision and adding a tag
- Getting its id
- Getting the image through the API
Good news: on the image downloaded with the "originalImageUri" link, I still have some EXIF data available (I just needed to rename the file to ".jpg" after download).
Otherwise, is the only solution to store the returned Custom Vision
image UUID in a database elsewhere alongside the required metadata?
Right now, it is clearly the best solution in my opinion.
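A minimal sketch of that approach, assuming a local SQLite database; the table layout, column names and function names are hypothetical and only there to illustrate the idea. Each image id returned by Custom Vision is stored next to its provenance, so that all ids of a given provenance can be looked up later (for example, to delete them after a GDPR request).

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the side database that maps image ids to metadata."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS image_metadata (
               image_id   TEXT PRIMARY KEY,  -- UUID returned by Custom Vision
               provenance TEXT NOT NULL,     -- where the image came from
               history    TEXT               -- free-form processing history
           )"""
    )
    return conn


def record_image(conn, image_id: str, provenance: str, history: str = "") -> None:
    """Store the metadata for one uploaded image, overwriting any previous row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO image_metadata VALUES (?, ?, ?)",
            (image_id, provenance, history),
        )


def ids_for_provenance(conn, provenance: str) -> list[str]:
    """Return the ids of every image recorded with the given provenance."""
    rows = conn.execute(
        "SELECT image_id FROM image_metadata WHERE provenance = ? ORDER BY image_id",
        (provenance,),
    )
    return [row[0] for row in rows]
```

The list returned by `ids_for_provenance` is what you would then pass to the Custom Vision API to fetch or delete the corresponding images.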
Is there any library that can parse and generate a PNG from a Doc, Docx and PDF file?
We're implementing a training system using Node, Sails.js, Express and SQL and would like to generate some PNG image tiles for training modules based on a file upload.
I've done some searching and found some C# libraries that can do all three, as well as a PDF-only implementation for Node, but I can't find anything that does more than that.
A point towards any 3rd party libraries or standard implementations of this method would be great.
Thanks
You can do that sort of stuff with C# (probably only on Windows) because C# is from MS stables, the same stable that churns out doc and docx. I am not sure whether the same implementation would work on Linux or Mac (even with Mono).
If you want to achieve this in NodeJS, just create the app in C#, wrap it in a ReSTful cover and call this ReSTful service in NodeJS (via Kue or something similar).
Honestly, converting file formats is a compute-intensive process. I wouldn't recommend doing it on the main thread anyway. If you're going to spawn a worker anyway, you might as well do it in C#, where it's perhaps faster.
Not necessarily an exact match for your requirement, but since you mentioned training purpose, I would recommend Watson Developer Cloud - it has document conversion among many other features which may be relevant and useful for your objective as a whole.
Speaking of the current problem, please see Document conversion overview to see how we can convert a PDF into a desired format such as HTML. Then you could actually get the PNG files from the HTML resource bundle.
Hope this helps.
Or, can I create/modify Google Docs from a third-party application?
Google does not share info on their native, proprietary format--possibly called "kix" according to this StackExchange answer.
You CAN programmatically create, modify and destroy Google Drive document files in 3rd party apps (or build your own) by manipulating representations of those files exposed by various Google APIs and scripting services. It took a bit of truffle-hunting through the online documentation, but I did find a description of the structure of a Google Doc here: Extending Google Docs.
Again, this is a description of a representation of the file, not the file itself.
A client wants us to develop a Picture Library system for them. The requirements are pretty typical - need to add pictures, tag them with metadata, store different sized versions and so on.
The client is keen on it being developed as a component which plugs into their existing SharePoint system. However, my feeling is that we would be better served building a standalone app - that way we don't have to shoehorn it into a SharePoint page and muck about integrating with SharePoint's APIs.
I am trying to look at this objectively and would welcome any arguments either way that people have.
Using an existing framework like Sharepoint imposes a lot of constraints on the design which makes the software architecture more uniform.
It does require some work on the part of the developer, because the developer has to understand the API architecture, the APIs themselves, etc.
However, developing standalone applications is how a business's software architecture becomes a mix of 200 applications, using 20 different languages/architectures/platforms, half of which were developed by people no longer there - in short, a mess.
Sharepoint is documented, and will be supported probably long after you leave the company. Can you guarantee support for the application that you develop for as long as Microsoft will support Sharepoint?
You should do a cost/benefit analysis of integrating with SharePoint. You have listed some cons for integrating with SharePoint. Here are some pros.
Widely adopted platform.
Existing functionality to store/retrieve/update images in the data store.
Existing functionality to tag images.
Existing functionality to group several images together and treat as one virtual document (if using SharePoint 2010).
Keep in mind that you can integrate any custom ASP.NET page/application into SharePoint, so you can approach development like a standalone app. Your client's wishes might include synchronization with SharePoint's own picture library functionality, and in that case you'll have to work with its API.
It seems with SharePoint you are already done because it can more or less do what you describe already. What requirements do you have that cannot be met by OOB SharePoint?
I've used picture libraries for something similar before. While they have their quirks, you do get a lot 'for free', like a UI, bulk uploading, metadata and 2 alternate sizes rendered. My biggest gripe is that they don't support the datagrid view, so I cannot edit list metadata en masse like you can with other list types.