Does Azure Databricks support stream access from Azure PostgreSQL? - azure

I asked a similar question before, but I would like to know whether I can use Microsoft Azure to achieve my goal.
Is streaming input from external database (postgresql) supported in Apache Spark?
I have a database deployed on Microsoft Azure PostgreSQL, and it contains a table that I want to read as a stream. With Kafka Connect it seems that I could stream the table; however, looking at the online documentation, I could not find a database (PostgreSQL) listed as a data source.
Does Azure Databricks support stream reading of a PostgreSQL table? Or is it better to use Azure HDInsight with Kafka and Spark?
I would appreciate any help.
Best Regards,
Yu Watanabe

Unfortunately, Azure Databricks does not support stream reading from an Azure PostgreSQL database.
Azure HDInsight with Kafka and Spark is the right choice for your requirement.
HDInsight provides managed Kafka and integrates with other HDInsight offerings, which together can make up a complete data platform.
Azure also offers a range of other managed services needed in a data platform, such as SQL Server, PostgreSQL, Redis and Azure IoT Event Hub.
From my research, there is also a third-party tool named "Panoply" that integrates Databricks and PostgreSQL.
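For the Kafka and Spark route, the Spark side could look roughly like the minimal sketch below. It assumes the table's changes are already being published to a Kafka topic (for example by a Kafka Connect source connector, which is not shown). The broker address, the topic name, and both helper function names are hypothetical and chosen only for illustration. The `kafka` format and the `kafka.bootstrap.servers`, `subscribe` and `startingOffsets` options belong to Spark Structured Streaming's Kafka source.

```python
# Sketch only: broker address and topic name are hypothetical placeholders.


def kafka_source_options(bootstrap_servers, topic, starting_offsets="earliest"):
    """Build the option map for Spark's built-in Kafka streaming source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def read_table_changes(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame with the raw Kafka records for `topic`.

    `spark` is an existing SparkSession. The Kafka source package
    (spark-sql-kafka-0-10) is assumed to be available on the cluster.
    """
    options = kafka_source_options(bootstrap_servers, topic)
    stream = spark.readStream.format("kafka").options(**options).load()
    # Kafka delivers key and value as binary; cast them to strings
    # so that downstream code can parse the payload.
    return stream.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
```

On a cluster this might be called as `read_table_changes(spark, "broker:9092", "pg.public.orders")`, after which the value column still has to be parsed according to whatever format the connector writes.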
Hope this helps.

Related

Can I use Azure Synapse functionality outside the Azure environment?

Forum,
I am currently looking into Azure Synapse as an option for migrating our on-prem data architecture. I am excited by the functionality it offers: SQL Pools, Spark Pools, and the accompanying notebooks. I get that Synapse can function as an all-in-one data platform, where my data scientists and data analysts can use its functionality to deliver insights at will. However, a large part of the work my team does is creating data products.
We currently have a Kubernetes cluster with several stand-alone APIs that perform data-science operations in the larger whole of our software. They can be thought of as microservices. Most of the ETL is done in our SQL Server, and the microservices in our K8S cluster (usually Python + some Python packages + FastAPI) typically get the required data from our SQL Server through some SQL query with an ODBC connector.
Now my question is, how suitable is Synapse for such an architecture? Can I call upon the SQL pool or Spark pool to do the heavy data-lifting from outside the Azure environment, say from a Kubernetes pod?
Unfortunately you can't integrate Azure Synapse Analytics with Kubernetes Services.
While Synapse SQL helps perform SQL queries, Apache Spark executes batch/stream processing on Big Data. SQL Pool is used to work with data stored in Dedicated SQL Pool while Spark SQL can be integrated with existing data preparation or data science projects that you may hold in Azure Databricks or Azure Machine Learning Services.
Also, as per this third-party document, Azure Synapse Analytics can't integrate with Kubernetes Services.
As a workaround, you can copy/move your data from Kubernetes to Azure Services like Azure Dedicated SQL Pool, Azure Blob Storage or Azure Data Lake Storage and then integrate it with Azure Synapse pipeline or Spark Pool.

How to read/load data into Azure DataBricks or Azure DataLake using Spring Batch?

I am looking to read from CSV/XML/a Postgres DB and write into Azure Data Lake or Databricks using Spring Batch. I don't see any API yet that does this. Does anyone know how we can do it using Spring Batch?
Here: https://github.com/spring-projects/spring-batch/issues/4074
Unfortunately, this feature isn't available in Azure Databricks yet, so you need to change your approach.
There is already a similar feature request on GitHub; you can add your vote here.

Azure Migration Service with Cassandra and Cosmos

Does anyone know when Azure's Migration Service is going to be compatible with migrating Cassandra data over to Cosmos DB? I heard the team might be working on it a while ago and I'm wondering if there have been any updates as to when it will be available/if it's still happening?
Based on this official document, there are two options to copy data from existing Cassandra workloads to Azure Cosmos DB:
1. Using the cqlsh COPY command
2. Using Spark
However, the data migration tool does not support the Cassandra API so far. You could submit feedback here to push for progress on this.
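As a rough illustration of the first option, the sketch below builds the two cqlsh COPY statements for a CSV round trip: an export to run against the source Cassandra cluster and an import to run against the Cosmos DB Cassandra API account. The keyspace, table, column names and file path are hypothetical, and the helper names are made up for this example. Only the basic `COPY ... TO/FROM ... WITH HEADER = TRUE` form is used.

```python
# Sketch only: keyspace, table, columns and path are hypothetical placeholders.


def copy_to_statement(keyspace, table, columns, csv_path):
    """Build a cqlsh COPY ... TO statement that exports a table to CSV."""
    cols = ", ".join(columns)
    return f"COPY {keyspace}.{table} ({cols}) TO '{csv_path}' WITH HEADER = TRUE;"


def copy_from_statement(keyspace, table, columns, csv_path):
    """Build a cqlsh COPY ... FROM statement that imports the CSV again."""
    cols = ", ".join(columns)
    return f"COPY {keyspace}.{table} ({cols}) FROM '{csv_path}' WITH HEADER = TRUE;"


if __name__ == "__main__":
    columns = ["id", "name", "created_at"]
    # Run the first statement in cqlsh connected to the source cluster,
    # the second in cqlsh connected to the target account.
    print(copy_to_statement("shop", "customers", columns, "customers.csv"))
    print(copy_from_statement("shop", "customers", columns, "customers.csv"))
```

Listing the columns explicitly in both statements keeps the CSV column order the same on export and import.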

Azure table storage with loopbackJS

I want to use Azure Table Storage with LoopBackJS. Is there a library for this, or can anyone please help me with how to use it?
Loopback doesn't have a commercial Azure Table Storage connector available currently, and neither is there a community connector for it.
You can write a new connector yourself, but that may be overkill/difficult for you.
If you're flexible about your Azure NoSQL technology you could look at using Azure DocumentDB which has protocol support for MongoDB, so you could use the Loopback MongoDB connector.

What are the Azure ML output formats?

Does Azure ML only provide output through its web services?
Is it possible to feed the output to an Azure SQL database?
Is it possible to feed the output to a Redshift database?
Essentially, I am looking to know if I can integrate Azure ML Studio with our existing Redshift analytics database.
Yes, you can write to SQL DB in Azure.
You can also use a Python module to make REST calls, so in theory you can write to Redshift.
Writing to SQL DB is possible in Azure ML, and so is writing directly to Azure Blob Storage.
However, unlike @Hai, I do not believe you can write to a Redshift DB, since the "Python Module" documentation from Microsoft clearly states that the Python execution is sandboxed and therefore cannot access resources outside the virtual machine it runs on (e.g. Internet resources, on-premises resources, ...).
