Office365 Excel as source for GCP BigQuery

We are using Office365 Excel and manually creating some data that we need in BigQuery. What solution would you create to automatically load the data from this Excel file into a table in BQ? We are not allowed to use Google Sheets (which would solve all our problems).
We use Matillion and GCP products.
I have no idea how to solve this, and I don't find any information about this, so any suggestion or idea is appreciated.
Cheers,
Cris

You can save your data as CSV and then load it into BigQuery.
Then, you can load data by using one of the following:
The Google Cloud Console
The bq command-line tool's bq load command
The API
The client libraries
You can find more details in the documentation: Loading data from local files.
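For illustration, here is a minimal sketch of the client-library route using the Python google-cloud-bigquery package; the table ID and CSV path are placeholders, and the autodetect settings are assumptions you would tune for your data:

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses your default GCP credentials and project
table_id = "my_project.my_dataset.my_table"  # placeholder destination table

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # assumes the first row is a header
    autodetect=True,      # let BigQuery infer the schema
)

# Load the CSV exported from Excel into the destination table
with open("data.csv", "rb") as source_file:
    load_job = client.load_table_from_file(
        source_file, table_id, job_config=job_config
    )

load_job.result()  # wait for the load job to finish
```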

As a different approach, you can also try this other option:
BigQuery: loading excel file
For this you will need to use Google Drive and federated tables.
Basically, you will upload your Excel files to Google Drive with the option "Convert uploaded files to Google Docs editor format" checked in your Drive settings, and then query them from BigQuery as federated (external) tables.
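A rough sketch of defining such a federated table with the Python client follows; the Drive URI and table ID are placeholders, and it assumes your credentials carry the Drive scope in addition to the BigQuery scope:

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my_project.my_dataset.my_external_table"  # placeholder

# Point BigQuery at the converted file in Drive instead of loading it
external_config = bigquery.ExternalConfig("GOOGLE_SHEETS")
external_config.source_uris = [
    "https://docs.google.com/spreadsheets/d/<file-id>"  # placeholder URI
]
external_config.autodetect = True

table = bigquery.Table(table_id)
table.external_data_configuration = external_config
client.create_table(table)  # queries now read live from the Drive file
```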

Related

Is there an Azure platform service that can convert text from pdf files and save those unstructured data in database?

Our organization is migrating our routine work onto the Azure cloud platform. One of my tasks is using Python to read many PDF files and convert all the text/unstructured data into tables, e.g.
the first column shows the file name and the second column stores all the text data, etc.
Just wondering, is there a service in the Azure platform that can achieve this automatically? I am a new user of Azure, so not quite familiar with this. Thanks heaps for any help.
I would recommend looking at Azure Form Recognizer. You can train it to recognize tables and extract data from PDF files.
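As a hedged sketch of what that looks like with the azure-ai-formrecognizer Python SDK (endpoint, key, and file path are placeholders, and the prebuilt layout model is used here rather than a custom-trained one):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"  # placeholder
key = "<your-key>"                                                 # placeholder

client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))

# Analyze a PDF with the prebuilt layout model, which detects tables
with open("report.pdf", "rb") as f:  # placeholder file
    poller = client.begin_analyze_document("prebuilt-layout", document=f)
result = poller.result()

# Walk the extracted tables cell by cell
for table in result.tables:
    for cell in table.cells:
        print(cell.row_index, cell.column_index, cell.content)
```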

Access worksheet names from Excel file with Google Apps Script (without Drive.Files.insert)

In a Google App Script attached to a Google Sheet, I have the file ID of an excel file. I want to read the worksheet names of that excel file. The tutorials I've seen on conversion load the excel file as a blob then write it to Drive as a Google Sheet, then read it.
Is there a way to do this that does not to create artifacts that I then need to delete? The reasoning is that I am concerned with the following: safety if there's a bug (the wrong thing gets deleted), additional processing time (I need to process a long list of excel files), and leftover artifacts if the script aborts unexpectedly between inserting and deleting.
Thank you!
Answering your question: the reason the tutorials first convert the Excel file to a Google Sheet before interacting with it (in your case, to gather the worksheet names) is that the Google APIs and Apps Script cannot read the Excel file as raw data; Google needs to convert the file into something readable by the Google APIs.
A workaround is to use the Excel JavaScript API to read the information from the original Excel file. You can call external APIs from Apps Script since it is based on JavaScript, so you would effectively be using Apps Script as an IDE.
However, you can do the same with any other IDE that works with JavaScript.
There are some examples on how to list the worksheets using the Excel JavaScript API in this blog.
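As an aside, outside of Apps Script the same idea works in any language that can parse .xlsx directly. For example, here is a minimal Python sketch with openpyxl (a swapped-in illustration, not the Excel JavaScript API route described above) that reads the worksheet names from the file bytes without creating any Drive artifacts:

```python
import io
from openpyxl import load_workbook

def worksheet_names(xlsx_bytes: bytes) -> list[str]:
    """Return the worksheet names of an .xlsx file given its raw bytes."""
    # read_only avoids loading cell data, so this stays fast on long file lists
    workbook = load_workbook(io.BytesIO(xlsx_bytes), read_only=True)
    return workbook.sheetnames
```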
If you would like to keep using the Google APIs and the Google Apps Script built-in services, you will need to convert the file to Google Sheets.
Update:
You can read more about the Excel Services API here.

Azure Data Factory Excel read via HTTP fails

I am looking to import data from a publicly available Excel sheet into ADF. I have set up the dataset using an HTTP linked service (see first screenshot), with AutoResolveIntegrationRuntime. However, when I attempt to preview the data, I get an error suggesting that the source file is not in the correct format (second screenshot).
I'm wondering if I may have something set incorrectly in my configuration?
The .xls format is not supported when using HTTP.
Since the API downloads the file, you can't preview the data. You can load the file to Blob or Azure Data Lake Storage using a Copy activity and then define a dataset on top of that file to preview it.
The workaround is to save your .xlsx file as a .csv file, because Azure Data Factory does not support parsing .xlsx files over the HTTP connector.
Furthermore, there is no need to convert the .xlsx file to .csv if you only want to copy it; simply select the Binary Copy option.
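If a small scripted staging step is acceptable, here is a rough Python sketch of the stage-to-Blob idea (URL, container, and connection string are placeholders; it simply lands the file where an ADF Excel dataset can read it):

```python
import requests
from azure.storage.blob import BlobClient

SOURCE_URL = "https://example.com/data.xlsx"       # placeholder public file
CONNECTION_STRING = "<storage-connection-string>"  # placeholder

blob = BlobClient.from_connection_string(
    CONNECTION_STRING, container_name="staging", blob_name="data.xlsx"
)

# Download the workbook over HTTP and land it in Blob Storage,
# where the ADF Excel dataset format is supported
response = requests.get(SOURCE_URL, timeout=60)
response.raise_for_status()
blob.upload_blob(response.content, overwrite=True)
```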
Here is a similar discussion where the MS FTE has confirmed with the Product Team that it's not supported yet for the HTTP connector.
Please submit a proposal in the QnA thread to allow this functionality in future versions; it will be actively monitored by the Data Factory product team and evaluated for adoption.
Please check the issue in the QnA thread here.

Is there a way to use excel power query to query a SAS dataset on a local drive?

I've searched everywhere and can't seem to find an answer, so hopefully someone here can assist. We have a SAS program set to run weekly that outputs a dataset to a local drive. Is there a way to get Excel Power Query to see it? I can connect fine to datasets that are housed within the database, but stored locally is an issue. Outputting this to the database isn't an option for us. Any ideas?
If you have the Stored Process server you can create a web query to access it, as described here: https://www.rawsas.com/sas-as-a-service-an-easy-way-to-get-sas-into-excel-power-bi-and-000s-of-other-tools-languages/
This functionality also comes bundled with https://datacontroller.io (free for up to 10 users)
Disclosure - I wrote the blog and created the product.
Alternatives:
Update your job to export your data as CSV or some other format that can be read natively by Excel.
Use the IOM interface and VBA.
Use the SAS Add-In for Excel.
All these options require server SAS. In short, there is no way for Excel Power Query to connect directly to a SAS dataset on a local drive, as the .sas7bdat format is a proprietary SAS format optimised for use by SAS.
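One more swapped-in alternative, if a small scripting step alongside the weekly job is acceptable: Python's pandas can parse .sas7bdat directly, so a conversion script could produce a CSV that Power Query reads natively (file names here are placeholders):

```python
import pandas as pd

# pandas ships a native .sas7bdat parser, so no SAS server is needed
df = pd.read_sas("weekly_output.sas7bdat")   # placeholder path
df.to_csv("weekly_output.csv", index=False)  # Power Query reads this natively
```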

How to open or download an excel file with Knime from SharePoint?

I have a corporate SharePoint platform, and I want to:
- Open or download (and soon modify) some Excel files
- Download and upload files after making some changes
Also, is there a way to get KNIME to retrieve data from SAP GUI?
I don't get how to do this; it seems KNIME cannot open an Excel or CSV file online.
Any suggestions? Thank you very much.
In general, the Excel Reader (XLS) node will allow you to read files from a URL. You can, for example, simply enter an http: or https: URL instead of a local file path, and if the web server does not require any authentication, the node will download and parse the file.
Speaking of SharePoint, however, the files you need to access are likely protected by some form of login. So there are two options:
If your SharePoint exposes a REST API, you should be able to access your files with the KNIME REST nodes or the Palladian nodes. Uploading should also be possible this way. Check your SharePoint's REST API documentation for the appropriate endpoints, interfaces, formats, and authentication mechanism (a rough sketch of the download call is shown after these options).
Alternatively, you can “script” your web browser and automate the action which you would do as a user through a KNIME workflow. For that approach, you can use the Selenium Nodes.
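For orientation, here is a hedged Python sketch of the SharePoint REST file-download call referenced above (site URL, file path, and bearer token are placeholders; in KNIME you would issue the same GET request with the REST nodes):

```python
import requests

SITE = "https://contoso.sharepoint.com/sites/mysite"  # placeholder site
FILE = "/sites/mysite/Shared Documents/report.xlsx"   # placeholder file path
TOKEN = "<oauth-bearer-token>"                        # placeholder credential

# SharePoint's REST endpoint for downloading a file's raw contents
url = f"{SITE}/_api/web/GetFileByServerRelativeUrl('{FILE}')/$value"

response = requests.get(url, headers={"Authorization": f"Bearer {TOKEN}"})
response.raise_for_status()

with open("report.xlsx", "wb") as f:
    f.write(response.content)
```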
