SSIS - Power Query Source: setting connection at runtime - Excel

I'm trying to use the Power Query source component in a generic way from SSIS (VS2019).
The idea is to use a Foreach Loop container to load and transform Excel files. At run time, I need to set the connection manager properties for each file, as well as the PQY (Power Query) script to be executed on the file.
What I have done so far is create a JSON connection string inside a script component and assign it to the connection manager. It keeps saying that the file requires credentials.
Has anyone already worked on this kind of development? All the files have the same structure so far; does the metadata need to be refreshed too?
[Edit]
1. In the control flow, I'm retrieving the PQY script I want to apply from a DB.
Before transformations, the script starts like this:
let
    Source = Excel.Workbook(File.Contents("path_to_a_file.xlsx"), null, true),
    RawData_Sheet = Source{[Item="Table1",Kind="Table"]}[Data],
    ...
In the C# script task, I'm replacing the path to the Excel file with the current file variable. The M script is stored in a variable used by the PQY component.
2. The C# script then updates the PQY connection manager to target the appropriate file:
ConnectionManager _conn = Dts.Connections["Power Query Connection Manager"];
// The connection string must be valid JSON, so keys and values need quotes (escaped in C#).
String _ConnectString = "[{\"kind\":\"File\",\"path\":\"path_to_a_file.xlsx\",\"AuthenticationKind\":\"Windows\",\"Username\":\"myusername\",\"Password\":\"mypassword\"}]";
_conn.ConnectionString = _ConnectString;
The PQY component is left as it is, connected to ["Power Query Connection Manager"] and getting its script from the variable I set.
[Screenshot: PQY component configuration screen]
Thanks for any tips on this,
Olivier

I can't address the specifics of the PQ source, but a generic anything in a Data Flow will not work.
The Data Flow Task works because it makes a strict contract between the source(s) and the destination(s): these columns with these data types will be in play during the run. It's a design-time contract because that allows the run-time engine to allocate resources based on how many buffers of data the system can support. Each row is X bytes, we have Y bytes of memory available, so we get Z buffers' worth of data, plus parallelism considerations.
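To put rough numbers on that contract (the row width here is hypothetical): SSIS defaults to 10 MB buffers (DefaultBufferSize) capped at 10,000 rows each (DefaultBufferMaxRows). With 200-byte rows, 10 MB would fit about 52,000 rows, so the 10,000-row cap wins and each buffer carries roughly 2 MB. The engine can only do this arithmetic because the column list and widths are fixed at design time, which is exactly what a runtime-generic source takes away.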
Wish I had a better story to tell you.

Related

Import data to OMNET++

I am trying to model a network in OMNET++. What I have is a text file (it can be in Excel format) with node names, a list of interfaces, and interface connections. What I would like to do is write a program (perhaps a plug-in) to feed this file to OMNET++ and automatically create the .ned and .cc files based on it. The rationale is that the list of nodes/interfaces is long, which makes it difficult to do manually, and a change in the connections makes it difficult to recreate unless it is done automatically. Could you point me to some links/websites/documents so that I can learn how to write a plugin that reads the information and creates the nodes and their connections automatically? Obviously, the node types and characteristics could be modified in the plugin later as necessary.
An example is like:
(some other information there)...
cr1.atl-cr1.hst cr1.atl cr1.hst 2488
cr1.kcy-cr1.wdc cr1.kcy cr1.wdc 2488
cr1.atl-cr2.atl cr1.atl cr2.atl 10000
cr2.atl-cr1.wdc cr1.wdc cr2.atl 2488
...
where the second column is the source node, the third column is the destination node, and the first column is the link name (firstNode-secondNode). The fourth column is the capacity/delay or other information about the link.
If you want this to be as flexible as possible, I would recommend writing a small Python script that reads a .csv file and renders .ned files as needed.
You might even consider using a templating engine like Mako. Quoting from its website, Mako is pretty straightforward to use:
from mako.template import Template
print(Template("hello ${data}!").render(data="world"))

Select the connection manager to be used in SSIS source by means of a parameter

I created 3 different connection managers in an SSIS package, say connA, connB and connC. How can I select the connection to use in an ADO.NET source by means of a parameter?
Thank you in advance
Carlo
The way that I would go about it is to wrap your logic in a Script Task with simple If/Else logic, as sketched below.
Return your source via a SqlDataAdapter and store it in an Object variable.
This variable is then available as a table to read from in any part of your package, assuming that you assign the scope of the variable correctly.
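A minimal sketch of that Script Task, assuming the three connection managers are ADO.NET connections; the variable names (User::ConnChoice, User::SourceData) and the query are placeholders, not part of the original answer:

// Inside the Script Task's Main(); ScriptResults comes from the Script Task template.
// User::ConnChoice and User::SourceData must be listed in the task's ReadOnly/ReadWrite variables.
public void Main()
{
    // Simple If/Else picking the connection manager from a parameter value ("A", "B" or "C").
    string choice = (string)Dts.Variables["User::ConnChoice"].Value;
    string connName;
    if (choice == "A")
        connName = "connA";
    else if (choice == "B")
        connName = "connB";
    else
        connName = "connC";

    // AcquireConnection on an ADO.NET connection manager hands back a SqlConnection.
    var conn = (System.Data.SqlClient.SqlConnection)Dts.Connections[connName].AcquireConnection(Dts.Transaction);

    // Fill a DataSet and park it in an Object variable for downstream tasks.
    var ds = new System.Data.DataSet();
    using (var adapter = new System.Data.SqlClient.SqlDataAdapter("SELECT * FROM dbo.SourceTable", conn))
    {
        adapter.Fill(ds);
    }
    Dts.Connections[connName].ReleaseConnection(conn);

    Dts.Variables["User::SourceData"].Value = ds;
    Dts.TaskResult = (int)ScriptResults.Success;
}

A DataSet (rather than a DataTable) is used here because a Foreach ADO enumerator can read it directly if you later want to loop over the rows.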

SSIS: Why do I lose the connection when I add an expression variable to Connection Manager properties?

I have an SSIS package that needs to enumerate Excel files in SharePoint. When I set up the Foreach Loop container at the package level and then set up the file path in my Excel Source at the task level, everything is fine. When I add my expression variable to the Excel Connection Manager properties, I lose the connection and can't view my input table anymore. I have "Delay Validation" set to True on the Excel Connection Manager.
Has anybody experienced this?
EDIT:
The following are screenshots of my ForEach Loop Container configuration:
The following is a screenshot of where I am passing the variable. It is in the Expression box of the Excel Connection Manager:
Do you have a default value set for the variable, pointing to a sample file? Even though you have set Delay Validation to True, that only means validation of the connection is postponed until the package runs. It does not mean that, while you are working on the package in the designer, it would not use the variable to resolve the connection. So you should go ahead and set the default value of your connection variable to the path of the file you are working on.
I had set DelayValidation at the connection level, at the data flow level, and at the Foreach Loop level in the control flow, but none of them worked. I had spent 3-4 hours until I found this thread, and setting a default value for the variable finally worked.
So setting a default value for the connection variable does solve the problem if you have a similar issue.
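For reference, a minimal sketch of the whole setup (the variable name and sample path are illustrative, not taken from the screenshots):

Variable: User::ExcelFile (String), default value C:\Temp\Sample.xlsx (must point at a real file)
Foreach Loop (Foreach File Enumerator): Variable Mappings -> fully qualified file name to User::ExcelFile, index 0
Excel Connection Manager -> Properties -> Expressions: ExcelFilePath = @[User::ExcelFile]
Excel Connection Manager -> DelayValidation = True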

Executing a script file from an Azure blob and writing its results to a file

I'll explain the task that was requested of me:
I have two containers in Azure, one called "data" and one called "script". In the "data" container there's a txt file with data, and in the "script" container there's a script file.
Now, I need to programmatically (with a WorkerRole) execute the script file, with the content of the data file as its parameters (example: a script file that accepts a string 's' and prints "Hello, 's'" to the screen, where 's' is the string given and the data file contains a string), and save the result of the run into another file, which needs to be saved in another container called "result".
How do I do all this? I've already uploaded the files and created the blobs programmatically, but I can't seem to understand how to execute the file or how to save its result to another file.
Can I please have some help?
Thanks in advance
Here are the steps in pseudo code:
1. Retrieve the script from the blob (using DownloadToStream()).
2. Compile the script (I will leave this to you as I have no idea what format your script is).
3. Load the parameters from the blob (same as step 1).
4. Execute the script with those parameters.
If your scripts can be written as lambda expressions, then this becomes a lot easier, as you can turn them into Actions.
Edit based on your questions:
DownloadText() is no longer included in Azure Storage 2.0; you only have access to DownloadToStream(). Even if you are using an older version (say 1.7), I would recommend using DownloadToStream() in the event you ever upgrade in the future. This will prevent having to refactor your code.
In terms of executing your script, it depends on what type of script it is. If it is C# code, you can use this example: Is it possible to dynamically compile and execute C# code fragments?. If you need to execute a different type of script, you would need to run it using Process.Start, and you can look at this example: http://www.dotnetperls.com/process-start
I do not have much experience with point number 2, but those are the approaches I have heard of and seen used.
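To make steps 1, 3 and 4 concrete, here is a rough sketch using the 2.x storage client the answer refers to. The container names come from the question; the blob names (script.cmd, data.txt, result.txt) and the assumption that the script is directly executable via Process.Start are mine:

using System.Diagnostics;
using System.IO;
using System.Text;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public static class ScriptRunner
{
    public static void RunScriptFromBlobs(string storageConnectionString)
    {
        CloudBlobClient client = CloudStorageAccount.Parse(storageConnectionString).CreateCloudBlobClient();

        // Step 1: download the script to a local temp file via DownloadToStream().
        string scriptPath = Path.Combine(Path.GetTempPath(), "script.cmd");
        using (FileStream fs = File.Create(scriptPath))
        {
            client.GetContainerReference("script").GetBlockBlobReference("script.cmd").DownloadToStream(fs);
        }

        // Step 3: load the parameter data the same way.
        string data;
        using (var ms = new MemoryStream())
        {
            client.GetContainerReference("data").GetBlockBlobReference("data.txt").DownloadToStream(ms);
            data = Encoding.UTF8.GetString(ms.ToArray());
        }

        // Step 4: run the script with the data as its argument and capture stdout.
        var psi = new ProcessStartInfo(scriptPath, "\"" + data + "\"")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        string output;
        using (Process p = Process.Start(psi))
        {
            output = p.StandardOutput.ReadToEnd();
            p.WaitForExit();
        }

        // Finally, write the captured output to the "result" container.
        CloudBlockBlob resultBlob = client.GetContainerReference("result").GetBlockBlobReference("result.txt");
        using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(output)))
        {
            resultBlob.UploadFromStream(ms);
        }
    }
}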

Odd Oracle + .net behaviour when comparing types

My workplace has a .NET application supplied to us by a postal service; it connects to an Oracle database running on the same machine and is responsible for registering, storing and printing shipping labels.
Seeing as the database host etc. is configurable, we asked the company if the application could be used over the network (simply copying it over to another machine resulted in "literal does not match format string" errors); all we were told is "it isn't possible". Not wanting to take no for an answer, I poked around the exe with Reflector.
Together with Oracle's v$sqlarea view, I pinpointed the errors to a few date comparison functions, but I have no idea why the application was working in the first place on the original machine.
The original application uses queries similar to
SELECT * FROM shipping WHERE date = '2011/03/28' --error
easily fixed with something like
SELECT * FROM shipping WHERE to_char(date, 'yyyy/mm/dd') = '2011/03/28'
Why does the original application work without throwing any errors? The incorrect query pops up in the v$sqlarea view when the application is used on the original host. If I copy the query and run it manually anywhere else, it throws the error; if I run the application on any other machine, it throws the error too. Is there some setting in Oracle that modifies queries on the fly, but only for queries originating from the local machine, while storing the original query in v$sqlarea?
This sounds like a regional settings difference between the two client machines, since formatting of dates will be dependent on the culture used to convert the date to a string in .NET, and unless the application specifies a culture, it will use the settings of the current logged on user running the application. This is obviously a problem if the database engine is expecting them in a certain format. This problem is less likely to arise with parametrized queries, where the date parameters are passed separate from the query and as a date datatype instead of a string.
If you work with dates, you must avoid String.Format-based query generation. Use parametrized SELECTs and set those values through parameters.
// Bind the date as a real DATE parameter instead of embedding a formatted string in the SQL.
OracleCommand cmd = new OracleCommand("SELECT * FROM shipping WHERE date = :dateParam", connection);
var param = cmd.Parameters.Add("dateParam", OracleDbType.Date);
param.Value = DateTime.Now;
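Note that ODP.NET binds parameters by position by default (BindByName is false), so the name passed to Parameters.Add is only matched against the :dateParam placeholder if you set cmd.BindByName = true; with a single parameter either mode works, but keeping the names aligned avoids surprises in queries with several parameters.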
It worked because the format happened to match the date/time settings on the developer machine and on the target database.
In other words, the issue comes down to an incorrect date/time format being provided.
This could be because of regional settings on the server. Please check that the new server is configured for the same locale (EN-GB, EN-US, or whatever the original server is configured to use).
