Load .asc file into Azure Machine Learning

For my Azure Machine Learning experiment I want to load a .asc file into an Execute R Script module. It is in fact a tab-delimited file with some comments in the first couple of rows. Can anyone tell me how to do this?
A CSV file loads fine, but with this file I get an error.

You need to upload this file as part of a zip file. Follow the steps provided under the heading "Script Bundle": https://azure.microsoft.com/en-us/documentation/articles/machine-learning-r-quickstart/
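Once the bundle is attached, the remaining work is just skipping the comment rows and splitting on tabs. A minimal sketch of that parsing step (in Python rather than the R the question uses, and assuming the bundle is extracted under src/ and the file has two leading comment rows):

```python
import pandas as pd

# Read a tab-delimited .asc file, skipping the comment rows at the
# top. The path and the number of comment rows are assumptions;
# adjust them to match your bundle and your file.
df = pd.read_csv("src/mydata.asc", sep="\t", skiprows=2)
print(df.head())
```

If the comments start with a fixed marker such as #, passing comment="#" is a more robust alternative to a hard-coded skiprows count.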

You need to create a new dataset and upload the file after selecting the right data type, "GenericTSVNoHeader" (or the with-header variant).
In your experiment you will then be able to view or visualize the dataset, and for any further manipulation you can add an Execute R Script module.

If you plan to send each line of the text file as a parameter to the web service, then you can also use the "Enter Data" module to provide the data.
If you want to send the whole file as a parameter to the web service, then I would recommend using the Reader module with the SQL or blob option, cleaning the first couple of rows first, and then using the SQL script or blob credentials as a web service parameter.
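For the blob route, the row cleanup can also happen in whatever client reads the blob. A sketch using the current azure-storage-blob Python SDK (the connection string, container, and blob names are hypothetical, and the two dropped rows assume the comments sit at the top of the file):

```python
from azure.storage.blob import BlobClient

# Hypothetical connection string and names; in a web service these
# would arrive as parameters rather than being hard-coded.
blob = BlobClient.from_connection_string(
    conn_str="<connection-string>",
    container_name="mycontainer",
    blob_name="mydata.asc",
)

text = blob.download_blob().readall().decode("utf-8")
# Drop the comment rows before treating the rest as tab-delimited data.
lines = text.splitlines()[2:]
rows = [line.split("\t") for line in lines]
```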

Related

How to use a tab-delimited UTF-16LE file as source in a Microsoft Azure Data Factory dataflow

I am working for a customer in the medical business (so excuse the many redactions in the screenshots), and I am pretty new here, so please excuse any mistakes I might make.
We are trying to fill a SQL database table with data coming from 2 different sources (CSV files). Both are delivered to a blob storage account where we have read access.
The first flow I built to do this with Azure Data Factory works perfectly, so I thought I would just clone that flow and point it at the second source. However, the CSV files from the second source are tab-delimited and UTF-16LE encoded. Luckily you can set these parameters when you create a dataset:
[screenshot: dataset settings]
When I verify the dataset using the "Preview Data" option, I see a nice list with data coming from the CSV file [screenshot: Preview Data output], so it appears to work fine!
Now I create a new dataflow, and in the source I use the newly created dataset. I left all settings at their defaults. [screenshot: data flow settings]
Now when I open Data Preview and click refresh, I get garbage and NULL outputs instead of the nice data I saw when testing the dataset. [screenshot: output from the source block in the dataflow] The first dataflow I created does produce the expected data from its CSV file, but somehow the data is now scrambled.
Could someone please help me with what I am missing or doing wrong here?
I tried to repro this, and if you set the dataset's encoding to UTF-8 instead of UTF-16, you will be able to preview the data.
[screenshot: Data Preview inside the dataflow]
Even when I enable UTF-16LE for the encoding I hit the same issues, so for now you could change the encoding and use the pipeline.
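If you cannot change the encoding at the source, one workaround is to transcode the files to UTF-8 before the dataflow reads them. A sketch with the azure-storage-blob Python SDK (connection string, container, and blob names are hypothetical):

```python
from azure.storage.blob import BlobClient

# Read the UTF-16LE source blob and write a UTF-8 copy for the dataflow.
src = BlobClient.from_connection_string("<connection-string>", "input", "data.csv")
dst = BlobClient.from_connection_string("<connection-string>", "staging", "data-utf8.csv")

raw = src.download_blob().readall()
# "utf-16" honours the byte-order mark; use "utf-16-le" if your files
# are written without one.
dst.upload_blob(raw.decode("utf-16").encode("utf-8"), overwrite=True)
```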

Azure Data Factory - Recording file name when reading all files in folder from Azure Blob Storage

I have a set of CSV files stored in Azure Blob Storage. I am reading the files into a database table using the Copy Data task. The source is set to the folder where the files reside, so it grabs each file and loads it into the database. The issue is that I can't seem to map the file name in order to read it into a column. I'm sure there are more complicated ways to do it, for instance reading the file metadata first and then reading the files in a loop, but surely the file metadata should be available to use while traversing through the files?
Thanks
This is not possible in a regular copy activity. Mapping Data Flows has this possibility; it's still in preview, but maybe it can help you out. If you check the documentation, you will find an option to specify a column to store the file name.
It looks like this: [screenshot: the source option in Mapping Data Flows]

In NetSuite using SuiteTalk, is it possible to create a CSV file from a saved search?

Background:
I am a newbie in the NetSuite world. We are trying to integrate NetSuite with our ERP, and I am doing some preliminary research to find out what would be the best option moving ahead. The primary objective of the first task is to download a huge volume of data from NetSuite to our end and evaluate alternative approaches.
I did some research on SuiteScript/SuiteTalk/SuiteAnalytics; the facts I have found so far and my questions are below:
A custom search can be created and saved via SuiteScript/SuiteTalk.
This saved search can be invoked via both SuiteScript and SuiteTalk.
One point of confusion: is the saved search the view that SuiteAnalytics can access? (Not my main question, though!)
Using SuiteScript, the results of a saved search execution can be saved as a flat file, and that file can be moved to the File Cabinet. By exposing a REST API using a RESTlet, this file can be downloaded. [But I have not implemented this yet!]
[MAIN QUESTION] Is it possible to do the same with SuiteTalk, i.e. create a flat file at the NetSuite end? And how do I save/move the file to the File Cabinet after that?
I have not researched the File Cabinet further, or how files created there are indexed.
Or is it better to load the whole result set from the SOAP call?
Your comments are highly appreciated!
Thank you!
You can certainly execute a saved search via SuiteTalk. You can also loop through all the results of the saved search and do whatever you'd like with them, such as creating a text file.
The SuiteTalk API also allows for accessing the File Cabinet to create or retrieve files, with limitations on file size.
SuiteTalk can be used to create a file and to move a file from one folder to another by changing the folder internalId of the file object.
Since you are using SuiteTalk to create/load the saved search, you are required to create and save the CSV at your end using the search results, and then move the file to the File Cabinet.
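The client-side CSV step is plain file handling once the search results are in hand. A sketch (the rows here are a literal stand-in for what your SuiteTalk saved-search call returns after paging through all results, and the column names are hypothetical):

```python
import csv

# Stand-in for the rows returned (and paged through) via SuiteTalk.
results = [
    {"internal_id": "101", "name": "Widget", "amount": "19.99"},
    {"internal_id": "102", "name": "Gadget", "amount": "4.50"},
]

with open("export.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["internal_id", "name", "amount"])  # assumed columns
    for row in results:
        writer.writerow([row["internal_id"], row["name"], row["amount"]])
```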
Since your objective is to get a huge amount of data out of NetSuite, I would recommend the option below:
Use a scheduled script or map/reduce script to build a file and place it in the required folder of the File Cabinet.
Using SuiteTalk you can then extract that file. (Note: you don't need REST for this job. You can get the fileContents and store the result at your end; you cannot directly store the file itself, you will have to store the fileContents.)
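SuiteTalk returns File Cabinet file contents base64-encoded inside the SOAP response, so storing the result at your end is a decode-and-write. A sketch (the content string below is a stand-in for the fileContents field your SOAP client hands back):

```python
import base64

# Stand-in for the base64 fileContents field from the SuiteTalk
# get-file response; decodes to "internal_id,name\n101,Widget\n".
file_contents = "aW50ZXJuYWxfaWQsbmFtZQoxMDEsV2lkZ2V0Cg=="

with open("saved_search_export.csv", "wb") as f:
    f.write(base64.b64decode(file_contents))
```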
Thanks #netsuite-guru and #suite-resources!
After doing some further research, and considering your recommendations, server-side scripting (at the NetSuite end) can only be done using SuiteScript to achieve the goal of automating this: read from NetSuite and write a file to the File Cabinet.
I also found another good thread to read as an alternative to map/reduce: link.
But I would go with the scheduled script/map reduce approach at this time.

Export SharePoint list to .csv and upload to Azure Data Lake using Flow

I am trying to use Microsoft Flow to export a SharePoint list to Azure Data Lake.
I want it so that anytime a particular online list is changed, its entire contents are loaded into a file in Data Lake. If the file already exists, I want to overwrite it. Can someone please explain how I can go about doing this? I have tried multiple ways, but they are not getting the job done.
Thanks
I was able to get the items in the SharePoint list exported to near perfection. I will post the Flow here in case anyone needs it in the future.
What I did is that every 5 minutes I "create" a file in Azure Data Lake, which overwrites the file if it exists. The content of the file cannot be blank, so I added a newline as the content. Then I use Get Items to retrieve all the items in the SharePoint list. From there, using an Apply to each loop, I append the content of the current row of the SharePoint list to the Data Lake file (fields separated by | and each row ending with a newline). This works to near perfection, with the only caveat being the newline at the beginning of the file, which I eliminate using Power Query.
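For anyone consuming the file, this is the shape the Flow produces; a small sketch reproducing it (the field names are hypothetical), including the leading blank line mentioned above:

```python
# Reproduce the format the Flow builds: a leading newline (the created
# file's content cannot be blank), then one pipe-delimited line per
# SharePoint list item.
rows = [
    {"Title": "Item 1", "Status": "Open"},
    {"Title": "Item 2", "Status": "Closed"},
]

content = "\n"  # the placeholder newline the file starts with
for row in rows:
    content += "|".join([row["Title"], row["Status"]]) + "\n"

# The leading blank line is the caveat noted above; it gets stripped
# later (here, in Power Query) before the data is used.
print(repr(content))
```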
This is exactly what I needed. If anybody sees a way to make this better, please post so that we can get this to perfection.

How do I create a file from a result set?

I'm using Oracle 11g
I'm trying to create a flat file (CSV or TXT) from a result set but am struggling with where to even start. It seems like I have to create a stored proc and use UTL_FILE. After doing some research, I have two questions:
Where does the file get created? According to this question I need access to the Oracle user directory, but where is that in a Windows and in a Linux environment? I have to test on Windows, and the script will eventually run in a Linux environment.
What would be the basic format of a SQL script to create the aforementioned file, and load data into it from a fairly basic SELECT query? I'm not seeing a UTL_FILE function to write the records to the file; do I have to iterate through the entire result set and use PUT or can I somehow just push the entire result to a file?
I think using "spool" can do the trick.
Check this out https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:9518534700346581975
And more information is here http://www.dba-oracle.com/t_sqlplus_spool.htm
The file will get created in the directory from which you launch sqlplus, unless you give SPOOL an explicit path.
If you're using SQL Developer you can create a view for your query, then right-click the view in the schema browser, choose Export, and export as CSV.
But personally I would go for spool, as the previous answer said. SQL*Plus is the most basic client, so I doubt you won't have it.
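If a client-side script is an option instead of spool or UTL_FILE, the export is a few lines of Python with the cx_Oracle driver; a sketch (connection details and query are hypothetical):

```python
import csv

import cx_Oracle

# Connect and run the query; credentials and DSN are placeholders.
conn = cx_Oracle.connect("scott", "tiger", "dbhost/orclpdb1")
cursor = conn.cursor()
cursor.execute("SELECT employee_id, last_name FROM employees")

# Write the result set wherever this script runs, header row first.
with open("employees.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)

conn.close()
```

This also sidesteps the directory question: the file lands wherever the script runs, not on the database server.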
