Azure Search for SQL Server Blob Column

I have a table [Assets] on Azure SQL Server with columns (Id, Name, Owner, Asset). The [Asset] column is a varbinary (blob) column that stores PDF files.
I would like to use Azure Search to search through the content of this column. Currently Azure Search can be used directly with Blob storage or with Table storage, but I am not able to find a solution for my scenario. Any help in terms of approach is greatly appreciated.

Is it possible for you to create a SQL VM, sync your data on SQL Azure to the VM with SQL Data Sync, and then sync the data on the SQL VM with Azure Search as explained here?
Another option is to move your SQL Azure database to a SQL VM on Azure, then sync the data on the SQL VM with Azure Search as explained here.
Hope this helps.

The Azure Search SQL indexer doesn't support document extraction from varbinary/blob columns.
One approach is to upload the file data to Azure Blob storage and then use the Azure Search blob indexer.
Another approach is to use Apache Tika or iTextSharp to extract text from the PDFs in your own code, and then index that text with Azure Search.
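A minimal sketch of the second approach in Python, assuming a pyodbc connection to the database and an existing Azure Search index; the connection string, index name, and index field names are placeholders, and pypdf stands in here for Tika/iTextSharp as the text extractor:

```python
import io

import pyodbc
from pypdf import PdfReader
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Placeholder connection details -- replace with your own.
SQL_CONN_STR = "Driver={ODBC Driver 17 for SQL Server};Server=<server>;Database=<db>;Uid=<user>;Pwd=<pwd>"
search_client = SearchClient(
    endpoint="https://<service>.search.windows.net",
    index_name="assets-index",  # placeholder index name
    credential=AzureKeyCredential("<admin-key>"),
)

def pdf_to_text(blob_bytes: bytes) -> str:
    """Extract plain text from the PDF bytes stored in the varbinary column."""
    reader = PdfReader(io.BytesIO(blob_bytes))
    return "\n".join(page.extract_text() or "" for page in reader.pages)

with pyodbc.connect(SQL_CONN_STR) as conn:
    rows = conn.cursor().execute("SELECT Id, Name, Owner, Asset FROM Assets")
    # Field names assume an index with id/name/owner/content fields.
    docs = [
        {"id": str(r.Id), "name": r.Name, "owner": r.Owner, "content": pdf_to_text(r.Asset)}
        for r in rows
    ]

# Push the extracted text into the index so it becomes full-text searchable.
search_client.upload_documents(documents=docs)
```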

Related

How to take a backup & Restore of Azure SQL table in Azure Blob storage and vice versa

I want to take an archival (backup) of an Azure SQL table to Azure Blob Storage. I have done the backup to Azure Blob storage via a pipeline, in CSV file format, and from Azure Blob Storage I have restored the data into the Azure SQL table successfully using the Bulk Insert process.
But now I want to retrieve the data from this CSV file using some kind of filter criteria. Is there any way that I can apply a filter query on Azure Blob storage to retrieve the data?
Is there any other way to take a backup differently and then retrieve the data from Azure Storage?
My end goal is to take a backup of the Azure SQL table in Azure Storage and retrieve the data directly from Azure Storage with a filter.
Note
I know that I can take a backup using SSMS, but that is not a requirement; I want this process through some kind of pipeline or using SQL commands.
AFAIK, there is no such filtering option available when restoring the database. But since you are asking for another way to back up and restore, SQL Server Management Studio (SSMS) is one of the most convenient platforms for almost all SQL Server related activities.
You can use SSMS to access an Azure SQL database using the server name and login password.
Find this official tutorial from Microsoft about how to take a backup of your Azure SQL Database, store it in a storage account, and then restore it.
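If the CSV backup is already sitting in Blob storage, one client-side workaround (not a server-side filter) is to download the blob and filter it in code. A minimal sketch in Python, assuming the azure-storage-blob SDK and pandas; the container, blob, and column names are placeholders:

```python
import io

import pandas as pd
from azure.storage.blob import BlobClient

# Placeholder account/container/blob names -- replace with your own.
blob = BlobClient.from_connection_string(
    conn_str="<storage-connection-string>",
    container_name="backups",
    blob_name="assets-backup.csv",
)

# Download the CSV backup and load it into a DataFrame.
csv_bytes = blob.download_blob().readall()
df = pd.read_csv(io.BytesIO(csv_bytes))

# Apply the filter criteria locally, e.g. rows belonging to one owner.
filtered = df[df["Owner"] == "alice"]
print(filtered.head())
```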

How to move dataset from Azure Managed Instance to Azure SQL using SSIS with change data capture

How can I move data from Azure SQL Managed Instance to Azure SQL Database using an SSIS package with the Change Data Capture feature?
Please help me with links or documents.
Good news: CDC is now implemented for Managed Instance and Azure SQL Database, as of yesterday:
"Change data capture (CDC) records insert, update, and delete activity that applies to tables in SQL Server, Azure SQL Managed Instance or Azure SQL Databases [...]"
You can see some common scenarios in this video.
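For reference, CDC is enabled per database and then per table with system stored procedures. A minimal sketch, here executed from Python over pyodbc; the connection string, schema, and table name are placeholders:

```python
import pyodbc

# Placeholder connection string -- point it at the source database.
conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};Server=<server>;Database=<db>;Uid=<user>;Pwd=<pwd>",
    autocommit=True,
)
cur = conn.cursor()

# Step 1: enable CDC at the database level.
cur.execute("EXEC sys.sp_cdc_enable_db")

# Step 2: enable CDC for the table whose changes you want to capture.
cur.execute(
    "EXEC sys.sp_cdc_enable_table "
    "@source_schema = N'dbo', "
    "@source_name = N'Assets', "
    "@role_name = NULL"
)

# Changes now accumulate in cdc.dbo_Assets_CT, which an SSIS package
# (or any ETL job) can poll to move deltas to the target database.
conn.close()
```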

Azure Search - Which is the best way to follow, API or portal, when there are two data sources, one SQL on a VM and the other Blob storage?

We have the following scenario and we need to implement Azure Search; we have to finalize the method/workflow for the process.
We have two data sources, one SQL on a VM and the other Blob storage. We need to combine data from both sources into a single index that can then be searched. Which is the better way to implement this, the API or the portal?
Unless you use two different indexes, there's no way to combine both using the portal. So you need to write some code that merges information from both sources and pushes it to your Azure Search index.
Here's a sample using Cosmos DB and Blob Storage; all you need to do is use SQL rather than Cosmos DB and model your index properly:
https://learn.microsoft.com/en-us/azure/search/tutorial-multiple-data-sources
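A minimal sketch of the push-style merge in Python with the azure-search-documents SDK, assuming both sources share a common document key; the index name, field names, and the helper functions `load_sql_rows` and `load_blob_docs` are placeholders:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint="https://<service>.search.windows.net",
    index_name="combined-index",  # placeholder index name
    credential=AzureKeyCredential("<admin-key>"),
)

def load_sql_rows() -> list[dict]:
    """Placeholder: read rows from the SQL VM, keyed by a shared id."""
    return [{"id": "1", "name": "Asset one"}]

def load_blob_docs() -> list[dict]:
    """Placeholder: read extracted text from Blob storage, same keys."""
    return [{"id": "1", "content": "text extracted from the blob"}]

# Upload the SQL fields first, then merge in the blob fields for the
# same keys, so one index document ends up holding data from both sources.
search_client.merge_or_upload_documents(documents=load_sql_rows())
search_client.merge_or_upload_documents(documents=load_blob_docs())
```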

Bulk upload Excel to SQL Azure daily

I have a requirement to bulk upload data from an Excel file to an Azure SQL table on a daily basis. I did some research and found that we could create a VM, install full SQL Server, and use an SSIS package to do this.
Is there any other reliable way to go about this? The Excel file may contain up to 10,000 rows.
I have also read that we could upload the file to blob storage and read it from there, but found that it's not a very robust approach.
Can anyone suggest if this is a feasible approach:
Place the Excel file on an Azure Website, accessed via FTP
An Azure timer job uses SQL bulk copy code to update the SQL table
Any help would be highly appreciated!
You could use Azure Data Factory; check out the documentation here. Place your files in Azure Data Lake and ADF will process them.
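If a scheduled script is acceptable instead, a minimal sketch in Python, assuming pandas and pyodbc; the file, table, and column names are placeholders, and at ~10,000 rows `fast_executemany` keeps the insert quick:

```python
import pandas as pd
import pyodbc

# Placeholder file name -- reading .xlsx requires the openpyxl package.
df = pd.read_excel("daily_upload.xlsx")  # ~10,000 rows is fine in memory

# Placeholder connection details -- replace with your own.
conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};Server=<server>;Database=<db>;Uid=<user>;Pwd=<pwd>"
)
cur = conn.cursor()
cur.fast_executemany = True  # batch the inserts instead of row-by-row

# Assumes a target table with columns matching the spreadsheet.
cur.executemany(
    "INSERT INTO dbo.DailyUpload (Col1, Col2, Col3) VALUES (?, ?, ?)",
    list(df[["Col1", "Col2", "Col3"]].itertuples(index=False, name=None)),
)
conn.commit()
conn.close()
```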

Azure Data Sync - Copy Each SQL Row to Blob

I'm trying to understand the best way to migrate a large set of data, ~6M text rows, from (an Azure hosted) SQL Server to Blob storage.
For the most part, these records are archived records that are rarely accessed; blob storage made sense as a place to hold these.
I have had a look at Azure Data Factory and it seems to be the right option, but I am unsure whether it fulfils the requirements.
Simply put, the scenario is: for each row in the table, I want to create a blob with the contents of one column from that row.
I see the tutorial (i.e. https://learn.microsoft.com/en-us/azure/data-factory/data-factory-copy-activity-tutorial-using-azure-portal) is good at explaining a bulk-to-bulk data migration pipeline, but I would like to migrate from one bulk dataset to many individual blobs.
Hope that makes sense and someone can help?
As of now, Azure Data Factory does not have anything built in like a For Each loop in SSIS. You could use a custom .NET activity to do this, but it would require a lot of custom code.
I would ask: if you were transferring this to another database, would you create 6 million tables all with the same structure? What is to be gained by having the separate items?
Another alternative might be converting it to JSON, which would be easy using Data Factory. Here is an example I did recently moving data into DocumentDB.
Copy From OnPrem SQL server to DocumentDB using custom activity in ADF Pipeline
SSIS 2016 with the Azure Feature Pack gives you Azure tasks such as the Azure Blob Upload Task and the Azure Blob Destination. You might be better off using this; an OLE DB command or a For Each loop with an Azure Blob destination could be another option.
Good luck!
Azure Data Factory now has a ForEach activity, which can be placed after a Lookup or Get Metadata activity to write each row from SQL to a blob; see the ForEach activity documentation.
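For the custom-code route, a minimal sketch in Python with pyodbc and azure-storage-blob, creating one blob per row from the text column; the connection strings, container, table, and column names are placeholders:

```python
import pyodbc
from azure.storage.blob import ContainerClient

# Placeholder storage details -- replace with your own.
container = ContainerClient.from_connection_string(
    conn_str="<storage-connection-string>",
    container_name="archived-rows",
)

# Placeholder SQL connection details -- replace with your own.
conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};Server=<server>;Database=<db>;Uid=<user>;Pwd=<pwd>"
)
cursor = conn.cursor()

# Stream rows from the cursor rather than loading all ~6M into memory.
cursor.execute("SELECT Id, TextContent FROM ArchiveTable")
for row in cursor:
    # One blob per row, named after the row's key.
    container.upload_blob(
        name=f"row-{row.Id}.txt",
        data=row.TextContent or "",
        overwrite=True,
    )

conn.close()
```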
