I've been exploring Azure Search recently as I'd like to use it in some of our apps. I've created an index, imported the data and have begun querying the data using both the Search Explorer and the REST APIs. All well and good.
I changed the underlying data to test out the fuzzy searching capabilities. However, I was getting incorrect results as the data being returned was still the old data. I eventually found out how to forcibly refresh the underlying data from the Azure portal, but is there a way to do this using a REST API, or to automate it in some way? I don't want to have to keep manually refreshing the Azure Search index going forward.
An indexer normally runs once, immediately after it is created. You can run it again on demand using the portal, the REST API, or the .NET SDK. You can also configure an indexer to run periodically on a schedule.
Source data will change over time, and you want the Azure Cognitive Search indexers to automatically process the changed data. To do this, schedule the indexer in Azure Cognitive Search with a custom interval (between 5 minutes and 24 hours).
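As a rough sketch of the REST route, the snippet below builds (without sending) the two requests involved: one that runs an indexer on demand and one that creates or updates an indexer with a schedule. The service, indexer, data source and index names are placeholders, and the api-version value is an assumption; use whichever version your service supports.

```python
import json
import urllib.request

API_VERSION = "2020-06-30"  # assumption: substitute the api-version you target


def build_run_indexer_request(service, indexer, api_key):
    """Build the POST request that runs an indexer on demand."""
    url = (f"https://{service}.search.windows.net/indexers/{indexer}/run"
           f"?api-version={API_VERSION}")
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"api-key": api_key})


def build_schedule_request(service, indexer, api_key, data_source, index,
                           interval="PT5M"):
    """Build the PUT request that creates or updates an indexer with a schedule.

    The interval is an ISO 8601 duration, e.g. PT5M (5 minutes) or P1D (1 day).
    """
    url = (f"https://{service}.search.windows.net/indexers/{indexer}"
           f"?api-version={API_VERSION}")
    body = {
        "name": indexer,
        "dataSourceName": data_source,   # placeholder data source name
        "targetIndexName": index,        # placeholder index name
        "schedule": {"interval": interval},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"api-key": api_key, "Content-Type": "application/json"},
    )
```

To actually send either request, pass it to urllib.request.urlopen with an admin key for the service.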
I have opted for the free plan, where I am using Azure Cognitive Search along with Azure Cosmos DB. When I delete a product from the database, it still shows up in the search results.
I have set the indexer to refresh every 5 minutes.
I also tried tracking deletions.
Still, nothing is helping...
Hard deletes are not picked up by Azure Search. The only way to do this today is through soft deletes.
You can learn more in the documentation here
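For illustration, here is a sketch of a Cosmos DB data source definition with a soft-delete policy. The field name isDeleted and the marker value "true" are assumptions; use whatever flag your documents carry. The idea is that the application sets the flag on the document instead of deleting it, so the indexer can see the change and remove the document from the index.

```python
def build_cosmos_data_source(name, connection_string, collection,
                             soft_delete_field="isDeleted", marker="true"):
    """Return a data source definition (as a dict) with soft-delete detection.

    soft_delete_field and marker are hypothetical defaults for this sketch.
    """
    return {
        "name": name,
        "type": "cosmosdb",
        "credentials": {"connectionString": connection_string},
        "container": {"name": collection},
        # Incremental indexing based on the Cosmos DB _ts timestamp property.
        "dataChangeDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
            "highWaterMarkColumnName": "_ts",
        },
        # Documents whose flag equals the marker are removed from the index.
        "dataDeletionDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
            "softDeleteColumnName": soft_delete_field,
            "softDeleteMarkerValue": marker,
        },
    }
```

The resulting dict would be sent as the JSON body when creating or updating the data source. Once the indexer has processed the flagged documents, they can be physically deleted from the database.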
I am looking for a proper solution architecture for a data transfer scenario from SQL Server to an external API and then from the API back to SQL Server. We are thinking of using Azure technologies.
We have a database hosted on an Azure VM. When the value of the author column in the book table changes, we would like to get all the data for that book from the related table and transfer it to an external API. The number of rows to be transferred (the select-join) is huge, so the select-join query takes a long time to execute. After this data is read, it is transformed and then sent to an external API (over which we have no control). The transfer of the data to the API could take up to an hour. After the data is written to this API, we read some reports from the API and write them back into the original database.
We must repeat this process more than 50 times per day.
We are thinking of using a Logic App to detect the trigger from SQL Server (as it is hosted on Azure VMs), publishing this event to Azure Event Grid, and then using Azure Durable Functions to handle reading the SQL data, transforming it, and sending it to the external API.
Does this make sense? Does anybody have any better ideas?
Thanks in advance
At the moment, the Logic App SQL connector can't detect when a particular row changes. It performs a select (which you provide) and then checks for changes at an interval that you specify.
In other words, SQL Database doesn't offer a change feed like CosmosDB where you can subscribe to events and trigger an Azure Function.
Things you can do:
1-Add a trigger on the SQL table (after insert / update) that inserts the new or changed row into a separate table. You can then use a Logic App or Azure Functions to query this table and retrieve the data.
2-Migrate to Cosmos DB and use the change feed + Azure Functions.
3-Change your code so that, after inserting into SQL Database, it also adds a message containing the identifier of the row being inserted / updated to a queue, which will be consumed by an Azure Function.
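Option 1 can be sketched roughly as follows. This uses Python's built-in sqlite3 as a stand-in so the example runs anywhere; the trigger syntax on SQL Server differs, and the table and column names (book, book_changes, processed) are made up for the illustration. A Logic App or Azure Function would play the role of the polling functions.

```python
import sqlite3


def setup_schema(conn):
    """Create the book table, a change table, and a trigger linking them."""
    conn.executescript("""
        CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT);
        CREATE TABLE book_changes (
            change_id INTEGER PRIMARY KEY AUTOINCREMENT,
            book_id   INTEGER NOT NULL,
            processed INTEGER NOT NULL DEFAULT 0
        );
        -- Record a change row only when the author value actually changes.
        CREATE TRIGGER book_author_changed
        AFTER UPDATE OF author ON book
        WHEN OLD.author IS NOT NEW.author
        BEGIN
            INSERT INTO book_changes (book_id) VALUES (NEW.id);
        END;
    """)


def fetch_pending_changes(conn):
    """Return (change_id, book_id) pairs that have not been processed yet."""
    return conn.execute(
        "SELECT change_id, book_id FROM book_changes "
        "WHERE processed = 0 ORDER BY change_id"
    ).fetchall()


def mark_processed(conn, change_ids):
    """Flag change rows as handled so the next poll skips them."""
    conn.executemany(
        "UPDATE book_changes SET processed = 1 WHERE change_id = ?",
        [(change_id,) for change_id in change_ids],
    )
    conn.commit()
```

The poller only reads the small change table, so the expensive select-join runs once per changed book rather than on every polling interval.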
I have a dataset of 442k JSON documents in a single ~2.13 GB file in Azure Data Lake Store.
I've uploaded it to a collection in CosmosDB via an Azure Data Factory pipeline. The pipeline completed successfully.
But when I went to CosmosDB in the Azure Portal, I noticed that the collection size is only 1.5 GB. I tried running SELECT COUNT(c.id) FROM c on this collection, but it returns only 19k. I've also seen complaints that this count function is not reliable.
If I open the collection preview, the first ~10 records match my expectations (ids and content are the same as in the ADLS file).
Is there a way to quickly get the real record count? Or some other way to be sure that nothing was lost during import?
According to this article:
When using the Azure portal's Query Explorer, note that aggregation queries may return the partially aggregated results over a query page. The SDKs produce a single cumulative value across all pages.
In order to perform aggregation queries using code, you need .NET SDK 1.12.0, .NET Core SDK 1.1.0, or Java SDK 1.9.5 or above.
So I suggest you first try using the Azure DocumentDB SDK to get the count value.
For more details about how to use it, you could refer to this article.
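As a hedged sketch of the SDK approach in Python (the answer above refers to the .NET and Java SDKs), the helper below assumes a container object shaped like the azure-cosmos package's ContainerProxy, whose query_items method accepts query and enable_cross_partition_query. It sums whatever values come back, so it gives the total whether the service returns one cumulative count or several partial ones.

```python
def count_documents(container):
    """Return the total document count for a Cosmos DB container.

    `container` is assumed to expose query_items(query=...,
    enable_cross_partition_query=...) as azure-cosmos's ContainerProxy does.
    """
    partial_counts = container.query_items(
        query="SELECT VALUE COUNT(1) FROM c",
        enable_cross_partition_query=True,
    )
    # Summing covers both a single cumulative value and per-page partials.
    return sum(partial_counts)
```

With the azure-cosmos package, the container would typically come from CosmosClient(url, credential=key).get_database_client(...).get_container_client(...).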
I have an index in Azure Search, let's say called Hotels.
I have a hotels table in Azure SQL with the same schema as the hotels index in Azure Search.
I push from my back-end to the Azure SQL table and to Azure Search on create/update/delete.
If my data was pushed to Azure SQL but failed to be pushed to Azure Search, is it possible to use my Azure SQL hotels table as the data source for an indexer, so that the indexer syncs the data that my back-end failed to push to my Azure Search index (hotels)?
Yes, you can both mix push and pull as well as have multiple pull indexers targeting the same index. We see this done often when part of the data is in one data source and part in another, where the index is the point where they converge, coordinated by their key.
The pattern you're describing is not as common, but generally speaking it should work. You'd have to account for cases where your write conflicts with an indexer write, and make sure the writes you do as they happen ultimately win. Also, if you go down this path, make sure to configure a change detection policy (and a deletion detection policy if you delete rows) so we index from SQL incrementally and don't read everything on every run.
An alternative approach if you're worried about missing writes is to push all your writes into a queue, and then pull from the queue and into Azure Search. That way you have a single stream of writes instead of two.
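A minimal sketch of the queue idea, using Python's in-process queue.Queue as a stand-in for a real message queue. Each queued item is assumed to be an (action, document) pair, and the drained batches have the shape of the body expected by the documents index endpoint (a "value" array whose items carry "@search.action"). The default batch size of 1000 reflects the per-request document limit.

```python
import queue


def drain_to_batches(write_queue, batch_size=1000):
    """Yield index batches built from queued (action, document) pairs.

    action is one of "upload", "merge", "mergeOrUpload" or "delete".
    """
    batch = []
    while True:
        try:
            action, doc = write_queue.get_nowait()
        except queue.Empty:
            break
        batch.append({"@search.action": action, **doc})
        if len(batch) == batch_size:
            yield {"value": batch}
            batch = []
    if batch:  # flush the final, partially filled batch
        yield {"value": batch}
```

Each yielded dict would be POSTed as JSON to the index's docs/index endpoint. A message is removed from the real queue only after its batch succeeds, so a failed push is retried instead of lost.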
I am using Azure Search and would like to make sure I can recover from a self-inflicted disaster before I push more docs in there. How do I back up my index?
Is creating Azure Search replicas equivalent to making a backup?
How would one restore that?
Thanks
Microsoft has released a console app on GitHub that can be used to back up and restore Azure Search indexes. It's not perfect, but I use it almost daily for backups and restores from prod to CI/QC/Dev instances.
https://learn.microsoft.com/en-us/samples/azure-samples/azure-search-dotnet-samples/azure-search-backup-restore-index/
Right now you can't do that from the API or the portal. Just save a copy of the JSON schema to a .js file, for example. See the Get Index API.
Normally you don't need to touch the index very often, only add, update or remove documents.
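Saving the schema could look something like the sketch below, which builds the Get Index request and serializes the returned definition. The service and index names are placeholders and the api-version default is an assumption. Note that this captures only the index definition, not the documents stored in it.

```python
import json
import urllib.request


def build_get_index_request(service, index, admin_key,
                            api_version="2020-06-30"):
    """Build the GET request that returns an index definition as JSON."""
    url = (f"https://{service}.search.windows.net/indexes/{index}"
           f"?api-version={api_version}")
    return urllib.request.Request(
        url, method="GET", headers={"api-key": admin_key})


def schema_to_json(schema):
    """Serialize an index definition (a dict) to text ready to save to a file."""
    return json.dumps(schema, indent=2, sort_keys=True)
```

In practice you would open the request with urllib.request.urlopen, parse the response with json.load, and write the output of schema_to_json to a file kept under source control.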
You would need to use an indexer from an external source to push the data into Search and be able to create backups at the same time.
If it's an Azure SQL database, this may do it for you automatically, depending on your subscription.
Create a table with the same fields as the Azure Search index and add a deleted flag and a last update date, then import all of your data into the database. Set the last update date to the time that you imported the data.
At the top of the Azure Search bar, there is an 'Import Data' option. This lets you connect the data source and create an index and indexer that use the last update date and deleted flag you specify when you create the connection.
The wizard will take you through all of the options.
From there, just update the SQL table with your changes and the indexer will automatically push them to Azure Search.
Thank you for the answer about https://learn.microsoft.com/en-us/rest/api/searchservice/Get-Index
Sometimes the Azure Search index is the only source from which data can be restored.
For example, in Microsoft QnA Maker, if you delete the Azure web app or Azure App Service, you can no longer even export the knowledge base from QnA Maker.
To restore the data from QnA Maker, I used the Azure Search index.