Connecting Azure Data Factory with InfluxDB

I'm working on some time series data that I want to migrate to the cloud. I'm working in Australia, and PIE is stopping me from using Time Series Insights, so I've decided to use InfluxDB as my time series database.
I've set up a Grafana VM on Azure and installed InfluxDB in it.
The steps are:
1. Import a CSV file (with time series data) to blob storage using Azure Data Factory (have done this).
2. Use ADF to transfer the files to InfluxDB (need help here).
3. Do cool stuff with the data (I have nice people on the team who are experts in this task).
I need help with point 2. I appreciate you taking the time to help me.
Thanks

Currently, ADF doesn't support InfluxDB as a data source or sink.
Here is the list of data stores that ADF supports.
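One common workaround is to let ADF land the CSV in blob storage (your step 1) and then run a small script, for example from an Azure Function or a custom activity after the copy step, that reads the file and writes it to InfluxDB with the Python client. A minimal sketch, assuming the azure-storage-blob and influxdb packages; the connection string, container, file, database, and measurement names are all placeholders:

    import csv
    import io

    from azure.storage.blob import BlobServiceClient   # pip install azure-storage-blob
    from influxdb import InfluxDBClient                 # pip install influxdb (1.x client)

    # All of these names are placeholders - adjust to your environment.
    BLOB_CONN_STR = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;"
    CONTAINER = "timeseries"
    BLOB_NAME = "sensor_data.csv"          # the file ADF copied into blob storage
    MEASUREMENT = "sensor_readings"        # target InfluxDB measurement

    # 1. Pull the CSV that ADF landed in blob storage.
    blob = (BlobServiceClient.from_connection_string(BLOB_CONN_STR)
            .get_blob_client(container=CONTAINER, blob=BLOB_NAME))
    text = blob.download_blob().readall().decode("utf-8")

    # 2. Convert each CSV row into an InfluxDB point.
    #    Assumes a 'timestamp' column plus numeric value columns.
    points = []
    for row in csv.DictReader(io.StringIO(text)):
        points.append({
            "measurement": MEASUREMENT,
            "time": row.pop("timestamp"),              # RFC3339 or epoch
            "fields": {k: float(v) for k, v in row.items()},
        })

    # 3. Write the points to the InfluxDB instance on the Grafana VM.
    influx = InfluxDBClient(host="your-vm-ip", port=8086, database="mydb")
    influx.write_points(points, batch_size=5000)

ADF can still drive the whole flow: the copy activity lands the file, and a subsequent Azure Function activity (or a custom activity on Azure Batch) runs a script along these lines.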

Related

Can you drop Azure SQL Tables from Azure ML?

I am currently developing an Azure ML pipeline that is fed data and triggered using Power Automate and outputs to a couple of SQL Tables in Azure SQL. One of the tables that is generated by the pipeline needs to be refreshed each time the pipeline is run, and as such I need to be able to drop the entire table from the SQL database so that the only data present in the table after the run is the newly calculated data.
Now, at the moment I am dropping the table as part of the Power Automate flow that feeds the data into the pipeline initially. However, due to the size of the dataset, this means that there is a 2-6 hour period during which the analytics I am calculating are not available for the end user while the pipeline I created runs.
Hence my question: is there any way to perform the "DROP TABLE" SQL command from within my Azure ML pipeline? If this is possible, it would allow me to move the drop to immediately before the export, which would be a great improvement in performance.
EDIT: From discussions with Microsoft Support, it does appear that this is not possible due to how the current ML platform is designed. I'm leaving this question open in case someone does solve it, but adding this note so that people who come along with the same problem know.
Yes, you can do anything you want inside an Azure ML pipeline with a Python Script Step. I'd recommend using the pyodbc library; you'd just have to pass the credentials to your script as environment variables or script arguments.
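A minimal sketch of what that script step could run, assuming the ODBC Driver 17 for SQL Server is available on the compute target and the credentials arrive as script arguments (the server, database, and table names here are placeholders):

    import argparse

    import pyodbc  # pip install pyodbc

    # Credentials and names are passed in rather than hard-coded;
    # all defaults here are placeholders.
    parser = argparse.ArgumentParser()
    parser.add_argument("--server", required=True)      # e.g. myserver.database.windows.net
    parser.add_argument("--database", required=True)
    parser.add_argument("--user", required=True)
    parser.add_argument("--password", required=True)
    parser.add_argument("--table", default="dbo.AnalyticsResults")
    args = parser.parse_args()

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={args.server};DATABASE={args.database};"
        f"UID={args.user};PWD={args.password}"
    )

    # Drop the old results immediately before the pipeline re-exports them,
    # minimising the window in which the table is unavailable.
    cur = conn.cursor()
    cur.execute(f"DROP TABLE IF EXISTS {args.table}")
    conn.commit()
    conn.close()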

Databricks or Functions with ADF?

I'm using ADF to output some reports to PDF (at least that's the goal).
I'm using ADF to output a csv to a storage blob and I would like to ingest that, do some formatting and stats work (with scipy and matplotlib in python) and export as a pdf to the same container. This would be run once a month, and I may do a few other things like this, but they are periodical reports at the most, no streaming or anything like that.
From an architectural standpoint, would this be a good application for an Azure Function (which I have some experience with), or Azure Databricks (which I would like to gain some experience in)?
My first thought is Azure Functions, since they are serverless and pay-as-you-go. But I don't know much about Databricks except that it's primarily used for big data and long-running jobs.
Databricks would almost certainly be overkill for this. So yes, an Azure Function for Python sounds like a perfect fit for your scenario.
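A minimal sketch of such a function, assuming the Python v1 programming model with a blob trigger bound to the incoming CSV and a blob output binding for the resulting PDF (the bindings themselves live in function.json; container and path names are placeholders):

    import io

    import azure.functions as func
    import matplotlib
    import pandas as pd

    matplotlib.use("Agg")            # headless backend for Functions
    import matplotlib.pyplot as plt


    def main(inputblob: func.InputStream, outputblob: func.Out[bytes]) -> None:
        """Runs when ADF drops a CSV in the container; writes a PDF report back."""
        # Load the CSV that ADF produced.
        df = pd.read_csv(io.BytesIO(inputblob.read()))

        # Do the formatting / stats work (placeholder: plot every numeric column).
        fig, ax = plt.subplots(figsize=(8, 5))
        df.select_dtypes("number").plot(ax=ax)
        ax.set_title("Monthly report")

        # Render the figure to PDF in memory and hand it to the output binding.
        buf = io.BytesIO()
        fig.savefig(buf, format="pdf")
        outputblob.set(buf.getvalue())

Since this only runs once a month, the consumption plan keeps the cost negligible, which is the main reason a Function beats Databricks here.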

Snowflake Connection & Setup

I'm working with a client to manage an integration between their closed-system CRM, their email platform (called MYGuestList), and their BI reporting platform (Tableau).
Our CRM is going to push a data replica to a SQL Server instance (on Microsoft Azure). We've been interested in bringing Snowflake into the tech stack to house all data points (Google Analytics, email marketing efforts, media, etc.) to produce this "single customer view" and manage our integrations.
I'm a bit lost as to what we do next (I can't seem to reach out to Snowflake for any sort of support) and would love any guidance on the following questions:
How do I go about connecting our Azure server (once set up) to Snowflake?
How might I suggest our third party developers look to integrate with our data points in Snowflake?
Do I need to purchase a third party connector (FiveTran/Stitch) to ETL data from Google Analytics?
Thank you in advance for helping this very big newbie trying to solve this problem for my client!
Kate
You have these 3 options to load data:
1. Use an ETL tool, or your own app (e.g. in Python), to get data from the source system and insert it into Snowflake (either directly or by PUTting a file to a Snowflake stage).
2. Extract the data as CSV to a cloud directory or a Snowflake stage, and use a bulk load (COPY INTO) to load it (see the sketch below).
3. Use a PIPE (Snowpipe) to automatically load data from files landed as in the second option.
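As an illustration of the second option (with the PUT from the first), here is a rough sketch using the snowflake-connector-python package; the account, warehouse, stage, table, and file names are all placeholders:

    import snowflake.connector  # pip install snowflake-connector-python

    # Connection details are placeholders for your Snowflake account.
    conn = snowflake.connector.connect(
        account="xy12345.australia-east.azure",
        user="LOADER",
        password="...",
        warehouse="LOAD_WH",
        database="CRM",
        schema="PUBLIC",
    )
    cur = conn.cursor()

    # 1. Stage the extracted CSV (an internal stage here; an external stage
    #    pointing at Azure blob storage works the same way with COPY INTO).
    cur.execute("CREATE STAGE IF NOT EXISTS crm_stage")
    cur.execute("PUT file:///tmp/customers.csv @crm_stage AUTO_COMPRESS=TRUE")

    # 2. Bulk load the staged file into the target table.
    cur.execute("""
        COPY INTO customers
        FROM @crm_stage/customers.csv.gz
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
    """)

    conn.close()

For Google Analytics specifically, a connector such as Fivetran or Stitch saves you writing and maintaining that extraction yourself, but it is not strictly required.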

SSIS alternatives for ETL in Azure Data Factory

Please could you all assist us in answering what we believe to be a rather simple question, but which is proving really difficult to get a solution to. We have explored things like Databricks and Snowflake for our Azure-based data warehouse, but keep getting stuck at the same point.
Do you have any info you could share with us on how you would move data from an Azure database (source) to another Azure database (destination) without using SSIS?
We would appreciate any info you would be able to share with us on this matter.
Looking forward to hearing from you
Thanks
Dom

Application Insights -> export -> Power BI Data Warehouse Architecture

Our team has recently started using Application Insights to add telemetry data to our Windows desktop application. This data is sent almost exclusively in the form of events (rather than page views etc.). Application Insights is useful only up to a point; to answer anything other than basic questions we are exporting to Azure storage and then using Power BI.
My question is one of data structure. We are new to analytics in general and have just been reading about star/snowflake structures for data warehousing. This looks like it might help in providing the answers we need.
My question is quite simple: is this the right approach? Have we overcomplicated things? My current feeling is that a better approach would be to pull the latest data and transform it into a SQL database of facts and dimensions for Power BI to query. Does this make sense? Is this what other people are doing? We have realised that this is more work than we initially thought.
Definitely pursue Michael Milirud's answer, if your source product has suitable analytics you might not need a data warehouse.
Traditionally, a data warehouse has three advantages: it integrates information from different data sources, both internal and external; data is cleansed and standardised across sources; and the history of change over time ensures that data is available in its historic context.
What you are describing is becoming a very common case in data warehousing, where star schemas are created for access by tools like PowerBI, Qlik or Tableau. In smaller scenarios the entire warehouse might be held in the PowerBI data engine, but larger data might need pass through queries.
In your scenario, you might be interested in some tools that appear to handle at least some of the migration of Application Insights data:
https://sesitai.codeplex.com/
https://github.com/Azure/azure-content/blob/master/articles/application-insights/app-insights-code-sample-export-telemetry-sql-database.md
Our product Ajilius automates the development of star schema data warehouses, speeding the development time to days or weeks. There are a number of other products doing a similar job; we maintain a complete list of industry competitors to help you choose.
I would continue with Power BI - it actually has a very sophisticated and powerful data integration and modeling engine built in. Historically I've worked with SQL Server Integration Services and Analysis Services for these tasks - Power BI Desktop is superior in many aspects. The design approaches remain consistent - star schemas etc, but you build them in-memory within PBI. It's way more flexible and agile.
Also are you aware that AI can be connected directly to PBI Web? This connects to your AI data in minutes and gives you PBI content ready to use (dashboards, reports, datasets). You can customize these and build new reports from the datasets.
https://powerbi.microsoft.com/en-us/documentation/powerbi-content-pack-application-insights/
What we ended up doing was not sending events from our WinForms app directly to AI, but to an Azure Event Hub.
We then created a job that reads from the Event Hub and sends the data to:
AI, using the SDK
Blob storage, for later processing
Azure Table storage, to create Power BI reports
You can of course add more destinations.
So basically all events are sent to one destination and from there stored in many destinations, each for its own purpose. We definitely did not want to be restricted to 7 days of raw data; storage is cheap, and blob storage can be used by many Azure and Microsoft analytics solutions.
The Event Hub can be linked to Stream Analytics as well.
More information about Event Hubs can be found at https://azure.microsoft.com/en-us/documentation/articles/event-hubs-csharp-ephcs-getstarted/
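The original job was a C# worker, but as an illustration of the same fan-out pattern, here is a rough Python sketch, assuming the azure-eventhub, azure-storage-blob, and applicationinsights packages; the connection strings, hub name, instrumentation key, and container are placeholders:

    import uuid

    from azure.eventhub import EventHubConsumerClient   # pip install azure-eventhub
    from azure.storage.blob import ContainerClient      # pip install azure-storage-blob
    from applicationinsights import TelemetryClient     # pip install applicationinsights

    # Placeholder connection details.
    EVENTHUB_CONN_STR = "Endpoint=sb://...;SharedAccessKeyName=...;SharedAccessKey=..."
    EVENTHUB_NAME = "telemetry"
    BLOB_CONN_STR = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;"

    blobs = ContainerClient.from_connection_string(BLOB_CONN_STR, container_name="raw-events")
    ai = TelemetryClient("<instrumentation-key>")


    def on_event(partition_context, event):
        body = event.body_as_str()

        # Destination 1: Application Insights, via the SDK.
        ai.track_event("DesktopAppEvent", {"payload": body})
        ai.flush()

        # Destination 2: blob storage, raw copy for later processing.
        blobs.upload_blob(name=f"{uuid.uuid4()}.json", data=body)

        # (Destination 3, Table storage for the Power BI reports, would go here.)
        partition_context.update_checkpoint(event)


    client = EventHubConsumerClient.from_connection_string(
        EVENTHUB_CONN_STR, consumer_group="$Default", eventhub_name=EVENTHUB_NAME
    )
    with client:
        client.receive(on_event=on_event, starting_position="-1")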
You can start using the recently released Application Insights Analytics feature. In Application Insights we now let you write any query you would like so that you can get more insights out of your data. Analytics runs your queries in seconds, lets you filter / join / group by any possible property, and you can also run these queries from Power BI.
More information can be found at https://azure.microsoft.com/en-us/documentation/articles/app-insights-analytics/