Data extraction using Azure Cognitive Services

Good day!
I've got a library full of documents. I would like to be able to know some more about these documents without reading through all of them. On the first page of each document I already put some relevant information about the document such as year of creation, author, number of pages and confidentiality.
I want to use some AI to browse through these documents and decide which is what and return it to me so I can store it in a CSV or something. This way I'll be able to easily find the right document.
I have used LUIS and some of the TAA service of Azure's Cognitive Services before, but I cannot figure out how to accomplish this.
Is it theoretically possible to send the first page (using a PowerShell script to cut off the rest of the document and send it through) to LUIS, which can then return the entities? I'd like them in a form such as: Date of creation: 18/10/2018, pages: 100.
Or is there a better way to use Azure's powerful AI to get this done?
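If the first-page fields always follow a fixed "label: value" layout, plain pattern matching on the extracted text may be enough to produce the CSV rows before any AI service is involved. This is only a minimal sketch in Python under that assumption; the label spellings, the field names and the helper names `extract_metadata` and `to_csv` are hypothetical and would need adjusting to the real documents.

```python
import csv
import io
import re

# Hypothetical labels: adjust these to match how the first page is written.
FIELD_PATTERNS = {
    "year": re.compile(r"year of creation\s*:\s*(\d{4})", re.IGNORECASE),
    "author": re.compile(r"author\s*:\s*(.+)", re.IGNORECASE),
    "pages": re.compile(r"number of pages\s*:\s*(\d+)", re.IGNORECASE),
    "confidentiality": re.compile(r"confidentiality\s*:\s*(.+)", re.IGNORECASE),
}


def extract_metadata(first_page_text):
    """Return one dict per document; missing fields become empty strings."""
    row = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(first_page_text)
        row[field] = match.group(1).strip() if match else ""
    return row


def to_csv(rows):
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATTERNS))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Free-form first pages would not fit this approach; that is where an entity-extraction service such as LUIS would come in.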

Related

Azure component that compares 2 word documents?

I am looking to build a solution that can automatically compare a Word document we email to a client with the Word document the client emails back to us.
Any suggestions please, we will be using MS Azure to create the solution.
Because there is no direct automation for this feature, you can take an indirect route to the solution.
Note:
This solution invokes Word using COM automation, so it is expected to run on a VM that has Word installed.
Since this solution simplifies the content being compared, it might lose details that you want to show up in your comparison.
You can create a PowerShell script that does the following:
Convert the documents to a simpler (txt) format - See this or this.
Compare the text files - See this.
The PowerShell script will get the comparison done. After that, it's up to you how fancy you want to get when exposing this functionality outside that VM, e.g. you could create an HTTP-invokable API that calls this PowerShell script and returns the results.
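The comparison step on the converted text can be sketched as follows. This is an illustration in Python rather than the PowerShell the answer has in mind, it assumes the Word-to-text conversion has already happened, and the function name `compare_text` and the file labels are hypothetical.

```python
import difflib


def compare_text(sent_text, returned_text):
    """Return a unified diff of the two plain-text versions as a list of lines."""
    return list(
        difflib.unified_diff(
            sent_text.splitlines(),
            returned_text.splitlines(),
            fromfile="sent.txt",          # label only, no file is read
            tofile="returned.txt",        # label only, no file is read
            lineterm="",
        )
    )
```

An empty list means the two text versions are identical; any formatting-only changes in the original Word files are lost in the conversion and will not show up here.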
I think your best bet would be Microsoft Azure Logic Apps! Azure Logic Apps lets you develop advanced automation workflows in the cloud, and it supports a vast array of connectors out of the box, including email triggers and the Microsoft Word connector. If that's not enough, you could even develop your own connectors too...

Using Azure Search to index and search an Orchard CMS site

I am working on an Orchard CMS system that is hosted in Azure. However, using the built-in Lucene search, it has proved difficult to implement a search algorithm that filters out documents that are links to files (e.g. PDFs/images), filters out documents that do not belong to certain taxonomies, and filters by association with a certain lat/long square and date/time of occurrence. To get an idea of the data that I am dealing with, the website is https://ahdb.org.uk/. Consequently, I am looking into implementing Azure Search to index and provide the search functionality for the site. For reference, the installed version of Orchard is 1.10.1.0.
I have searched the web to the best of my ability and there seems to be nothing out there.
Graham Harris
While there's no direct integration of Orchard with Azure Cognitive Search, it should still be possible with a little work. It looks like you have custom rules about what you need to index. You might need to create a custom database view that normalizes the data and is specific about your use case, and then feed that into the Azure Search pipeline. The Orchard 1.x schema is very relational, and will require some understanding of how parts and content items are related, as well as how versioning is implemented. A good way to do that is to install the miniprofiler module and look at some of the queries being generated by Orchard itself as it's doing similar tasks (such as a projection of data that looks like what you want to feed into search).
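Once a normalized view exists, its rows still have to be shaped into the batch body that the Azure Search document indexing endpoint accepts: a JSON object with a `value` array in which every document carries an `@search.action`. The following is a rough sketch of that shaping step only; the function name `build_index_batch` and the column names in the usage line are hypothetical, and sending the body (service URL, index name, API key) is left out.

```python
import json


def build_index_batch(rows, key_field="id"):
    """Turn rows from a (hypothetical) normalized view into an indexing batch body."""
    actions = []
    for row in rows:
        if key_field not in row:
            raise ValueError("every document needs the index key field")
        # mergeOrUpload updates an existing document or creates it if missing.
        action = {"@search.action": "mergeOrUpload"}
        action.update(row)
        actions.append(action)
    return json.dumps({"value": actions})
```

For example, `build_index_batch([{"id": "42", "title": "Some page"}])` yields a body with a single `mergeOrUpload` action for document 42.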

PowerBI Embedded API functionality

I have some questions about the PowerBI Embedded API, mainly whether certain functionality exists and, if so, where I can find it.
In particular, I am trying to find out which of the APIs (PowerBI, Embedded or Azure) lets me complete the following functions:
View the number of Rendered Views within a Workspace Collection
Delete a report/import which has been uploaded
Ability to find out how many renders a single report would create - I would find this especially useful given it is billable per render.
Additional functionality I am looking for is the ability to save the rendered chart to an image or PDF, and responsiveness in the dashboards.
I do realise it's still in public preview; however, has anyone managed to find the above functionality within the current APIs?
Thanks
David
View Number of Rendered Views within a Workspace Collection:
Make a POST request to the following ARM API with Content-Length: 0:
https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroup}/providers/Microsoft.PowerBI/workspaceCollections/{workspaceCollectionName}/billingUsage?api-version=2016-01-29
Delete import:
Make a DELETE request to the following Power BI API:
https://api.powerbi.com/beta/collections/{workspaceCollectionName}/workspaces/{workspaceId}/datasets('{datasetKey}')
Number of renders for a single report:
There is no API for this yet.
Consider making the suggestion at https://ideas.powerbi.com/.
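The two requests described above (billing usage and dataset deletion) can be sketched as request builders. This is a minimal illustration, not a tested client: the function names are hypothetical, and the `Bearer` and `AppKey` authorization schemes are assumptions about how each endpoint expects credentials, so check them against the API documentation before use. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import urllib.request

ARM_BILLING_URL = (
    "https://management.azure.com/subscriptions/{subscription_id}"
    "/resourceGroups/{resource_group}/providers/Microsoft.PowerBI"
    "/workspaceCollections/{collection}/billingUsage?api-version=2016-01-29"
)
DELETE_DATASET_URL = (
    "https://api.powerbi.com/beta/collections/{collection}"
    "/workspaces/{workspace_id}/datasets('{dataset_key}')"
)


def build_billing_usage_request(subscription_id, resource_group, collection, token):
    """POST with an empty body, as the answer requires Content-Length: 0."""
    url = ARM_BILLING_URL.format(
        subscription_id=subscription_id,
        resource_group=resource_group,
        collection=collection,
    )
    headers = {"Authorization": "Bearer " + token, "Content-Length": "0"}  # assumed scheme
    return urllib.request.Request(url, data=b"", method="POST", headers=headers)


def build_delete_dataset_request(collection, workspace_id, dataset_key, access_key):
    """DELETE request for an uploaded import, addressed by its dataset key."""
    url = DELETE_DATASET_URL.format(
        collection=collection, workspace_id=workspace_id, dataset_key=dataset_key
    )
    headers = {"Authorization": "AppKey " + access_key}  # assumed scheme
    return urllib.request.Request(url, method="DELETE", headers=headers)
```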

Need to find reports on Search. E.g. Top searched keyword, categories

I am using Azure Search, which creates an index on my database tables and shows results as expected.
Now I have a requirement to find out which words or items users have searched for most, and what the peak time for searches was.
Is it possible to get such reports from Azure Search, either through its portal or using the API or code?
I'm on the Azure Search team, thanks for using the service. Currently it's not possible; however, we understand the importance of this feature and we're working to deliver it. No exact dates yet. For now, you'd have to collect and aggregate the information you need on the client side.
For feature requests like this, feel free to use our UserVoice page to help us prioritize work: http://feedback.azure.com/forums/263029-azure-search
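Collecting and aggregating on the client side, as suggested above, could look roughly like this. It is a minimal in-memory sketch; the class name `SearchLog` and its methods are hypothetical, and a real application would persist the counts (for example in the same database) instead of keeping them in memory.

```python
from collections import Counter
from datetime import datetime


class SearchLog:
    """Counts search terms and the hour of day at which searches happen."""

    def __init__(self):
        self.terms = Counter()
        self.hours = Counter()

    def record(self, query, when):
        # Call this right before the query is forwarded to Azure Search.
        self.terms[query.strip().lower()] += 1
        self.hours[when.hour] += 1

    def top_terms(self, n=10):
        """Most frequently searched terms, most common first."""
        return self.terms.most_common(n)

    def peak_hour(self):
        """Hour of day (0-23) with the most searches, or None if empty."""
        return self.hours.most_common(1)[0][0] if self.hours else None
```

Calling `record(query, datetime.now())` from the code path that issues the search is enough to answer both the "top searched keyword" and the "peak time" questions later.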

Intranet search engine frontend?

We are currently using a number of open source and commercial products to store different types of information (in our internal network). Each of these products comes with its own repository (usually a database) and its own search capabilities.
Currently the list of products is as follows:
Wordpress
Jira
Confluence
Sharepoint
Dynamics AX
Moodle
The problem we are facing is that when one needs to search for information, one needs to log in to all these different systems and execute a search on each one.
I Googled for "search engine frontend", "meta search engine", etc., but I was not able to find anything obvious that solves our problem. At this point, I have to say that we are not interested in building one "central repository" to be searched; instead, we need a frontend that will accept the query from the user, "package it" into the format that each of the individual search engines understands, receive the response (JSON or XML) and present it to the user.
Any suggestions on how we could solve it?
Your strategy is right: If you are not interested in building a central index, you will need an application that accepts the query from the user, converts it to the format that each of the individual search engines understand, receives the responses and presents them to the user. This is exactly what a meta search engine does. Even if you use a framework (e.g. Carrot2), much work will probably remain to write those query and result transformers, and you will probably experience slow results because the meta search can never be faster than the underlying search modules of the components you search through.
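The fan-out part of such a meta search can be sketched as follows. This is only an outline under stated assumptions: each backend is represented by a hypothetical adapter function that takes the user's query, talks to one system in that system's own format and returns a list of result dicts; the adapters themselves (the query and result transformers mentioned above) are the real work and are not shown.

```python
from concurrent.futures import ThreadPoolExecutor


def meta_search(query, backends):
    """Query all backends in parallel and merge the hits.

    `backends` maps a source name to a callable(query) -> list of dicts.
    """

    def run(item):
        name, search = item
        try:
            # Tag every hit with the system it came from.
            return [dict(hit, source=name) for hit in search(query)]
        except Exception:
            # One failing backend should not break the whole search.
            return []

    with ThreadPoolExecutor(max_workers=max(len(backends), 1)) as pool:
        batches = list(pool.map(run, backends.items()))
    return [hit for batch in batches for hit in batch]
```

Running the backends in parallel means the total latency is roughly that of the slowest backend rather than the sum of all of them, but it still cannot be faster than that slowest one.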
Instead of querying each backend separately you can put your data into one backend.
You could export your data to an Apache Solr server and use a frontend like CorePages, http://www.corepages.biz. You could add a backlink to your data so that you can jump directly from a search result to its source entry, e.g. a Jira ticket or a wiki article.
