I have a PySpark notebook running in Azure Synapse, which creates an interactive visualisation of the data.
I want to make this visualisation available to others, but can't manage to export the visualisation as html code.
There are manual options to achieve what I want: I can export the entire notebook as html, and I can print the html code of the visualisation and copy the string into a new local file. Obviously, this is not really what I am looking for.
I could store the html code as string and write it to the Blob Storage. However, while the preview option does show the string as desired, the downloaded file does not contain meaningful content.
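A minimal sketch of that idea (assuming the visualisation is a Plotly figure, which may not be the case, and with placeholder storage details):

    # Sketch only: assumes a Plotly figure and placeholder storage details
    from azure.storage.blob import BlobClient

    html_str = fig.to_html(full_html=True)           # fig: the interactive figure built earlier

    blob = BlobClient.from_connection_string(
        conn_str="<storage connection string>",      # placeholder
        container_name="visualisations",             # placeholder
        blob_name="viz.html",
    )
    blob.upload_blob(html_str, overwrite=True)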
The connection strings are hidden in my Azure Portal, unfortunately, which makes most of the workarounds (mssparkutils etc.) not feasible.
Is there any way in which I can store the visualisation in a regular .html or .txt file programmatically?
Alternatively, are there ways in which a blob can be read once downloaded?
Our organization is migrating our routine work onto the Azure cloud platform. One of my tasks is to use Python to read many PDF files and convert all the text/unstructured data into tables, e.g.
the first column holds the file name and the second column holds all the text data, etc.
Just wondering, is there a service on the Azure platform that can achieve this automatically? I am a new user of Azure, so not quite familiar with it. Thanks heaps for any help.
I would recommend looking at Azure Form Recognizer. You can train it to recognize tables and extract data from PDF files.
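A rough sketch of what that could look like with the azure-ai-formrecognizer Python SDK (endpoint, key, and file name are placeholders):

    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(
        endpoint="https://<your-resource>.cognitiveservices.azure.com/",   # placeholder
        credential=AzureKeyCredential("<your-key>"),                       # placeholder
    )

    with open("report.pdf", "rb") as f:                                    # placeholder file
        poller = client.begin_analyze_document("prebuilt-layout", document=f)
    result = poller.result()

    # One row per file: (file name, all extracted text), as described in the question
    rows = [("report.pdf", result.content)]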
I am looking to import data from a publicly available Excel sheet into ADF. I have set up the dataset using an HTTP linked service (see first screenshot), with AutoResolveIntegrationRuntime. However, when I attempt to preview the data, I get an error suggesting that the source file is not in the correct format (second screenshot).
I'm wondering if I may have something set incorrectly in my configuration?
The .xls format is not supported when using HTTP.
Since the API downloads the file, you can't preview the data. You can load the file to Blob or Azure Data Lake Storage using a copy activity and then create a dataset on top of that file to preview it.
The workaround is to save your .xlsx file as a .csv file, because Azure Data Factory does not support reading .xlsx files explicitly for the HTTP connector.
Furthermore, there is no need to convert the .xlsx file to .csv if you only want to copy it; simply select the Binary Copy option.
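If the file does need to be parsed rather than just copied, a quick way to do the .xlsx-to-.csv conversion before staging the data is a short pandas script, sketched here with hypothetical file names:

    import pandas as pd

    # Hypothetical paths; reading .xlsx requires the openpyxl package
    df = pd.read_excel("source.xlsx", sheet_name=0)
    df.to_csv("source.csv", index=False)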
Here is a similar discussion where an MS FTE has confirmed with the Product Team that it's not supported yet for the HTTP connector.
Please submit a proposal in the QnA thread to allow this functionality in future versions; it will be actively monitored by the Data Factory product team and evaluated for adoption.
Please check the issue at the QnA thread: Here.
We are using Office365 Excel and manually creating some data that we need in BigQuery. What solution would you create to automatically load the data from this Excel file to a table in BQ? We are not allowed to use Google Sheets (which would solve all our problems).
We use Matillion and GCP products.
I have no idea how to solve this, and I can't find any information about it, so any suggestion or idea is appreciated.
Cheers,
Cris
You can save your data as CSV and then load it into BigQuery.
Then, you can load data by using one of the following:
The Google Cloud Console
The bq command-line tool's bq load command
The API
The client libraries
You can find more details here: Loading data from local files.
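As an illustration of the client-library route, a minimal sketch with the Python client (project, dataset, table, and file name are placeholders, and the dataset/table are assumed to already exist):

    from google.cloud import bigquery

    client = bigquery.Client()                       # uses default credentials
    table_id = "my-project.my_dataset.my_table"      # placeholder

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,                         # skip the header row
        autodetect=True,                             # infer the schema from the CSV
    )

    with open("data.csv", "rb") as f:                # placeholder file
        job = client.load_table_from_file(f, table_id, job_config=job_config)
    job.result()                                     # wait for the load to finish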
As a different approach, you can also try this option:
BigQuery: loading excel file
For this you will need to use Google Drive and federated tables.
Basically, you will upload your Excel files to Google Drive with the option "Convert uploaded files to Google Docs editor format" checked in your settings, and then load them into BigQuery from Google Drive.
I want to upload a table [Excel file] to Windows Azure Mobile Services without coding. Can server-side scripting be used for this? Is there any other option to upload it to Azure Mobile Services data?
No, there's no automatic way to do that. You will need to read your table from the excel file and upload the rows to the server. That should be fairly easy to implement - save your file as a comma-separated value list (or tab-separated value list, which should make parsing easier). In a program which uses the mobile service SDK you'd read the lines from the CSV (or TSV) file, convert it to the appropriate structure (either directly to JSON via the JObject type or a typed class) and call the InsertAsync method in the client to insert the data to the server.
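The same inserts can also be done without the SDK by posting rows to the table's REST endpoint; a rough Python sketch (service URL, application key, table name, and file name are all placeholders):

    import csv
    import requests

    SERVICE_URL = "https://<your-service>.azure-mobile.net"     # placeholder
    APP_KEY = "<application key>"                                # placeholder

    with open("data.csv", newline="") as f:
        for row in csv.DictReader(f):                            # one dict per CSV row
            resp = requests.post(
                f"{SERVICE_URL}/tables/MyTable",                 # hypothetical table name
                json=row,
                headers={"X-ZUMO-APPLICATION": APP_KEY},
            )
            resp.raise_for_status()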
What would be the best way to use IFilter to extract textual content from pdf/word/whatever in an Azure solution?
I've seen examples of IFilter that use a stream, but what should the content of the stream be?
Should it contain some sort of OLE headers and what not?
Sending the raw file content as a stream to IFilter doesn't seem to work.
Or would it be better to save the files to local file storage and let the IFilter read them from that location?
Using IFilter in Azure will be tricky because several of the IFilters that are common on a desktop aren't available in an Azure web/worker role.
You could create a durable VM in Azure and install the missing IFilters.
However, if you're going to build your Lucene index via a web upload, you could just process the files into text as they are uploaded, then index the text and save the file off separately. Add a field to your index that lets you get back to the original source document.
There might be an easier way, but that's how I solved the same issue.
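A rough sketch of that upload-time flow; extract_text, blob_container, and index_writer are hypothetical stand-ins for whatever extraction library, storage client, and Lucene client you use:

    import uuid

    def extract_text(file_bytes, content_type):
        """Hypothetical helper: run the appropriate text extractor for this file type."""
        raise NotImplementedError

    def handle_upload(file_bytes, file_name, content_type, blob_container, index_writer):
        # Save the original file off to blob storage under a stable id
        doc_id = str(uuid.uuid4())
        blob_path = f"originals/{doc_id}/{file_name}"
        blob_container.upload_blob(blob_path, file_bytes)

        # Extract text at upload time and index it, keeping a pointer back to the source
        text = extract_text(file_bytes, content_type)
        index_writer.add_document({
            "id": doc_id,
            "source_blob": blob_path,     # field to get back to the original document
            "content": text,
        })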