I have set up an Execute SQL Task that loads the full result set of names into an object variable, and I have connected it to a Foreach Loop that iterates over the object row by row. I'm unsure about the next steps, though. It would be nice if I could create a Data Flow Task and somehow set its destination from the variable mapped in the Foreach Loop. Any tips?
Based on what you described, all you need to do is the following:
1: The Execute SQL Task returns a list of Excel file names, which you already have.
2: Connect its output to a Foreach Loop Container, which iterates over each name.
3: Inside the container, the first task you need is a Script Task, which creates each Excel file.
I assume the Excel format is the same for every file you need to populate. Create a template workbook with the desired column header names.
For the Script Task, add the variable mapped by the container as a read-only variable. Create another variable, set as read/write (suppose it is named A), to store the dynamic Excel file path for each iteration, then edit the script.
If you are familiar with C#, it is easy to copy the template once for each name in the loop.
Code will be like this:
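using System.IO;
...
...
...
// Full path to the template workbook that already contains the header row.
string source = "C:\\template.xlsx";
// Build the target path from the read-only variable mapped by the Foreach Loop.
string target = "C:\\" + Dts.Variables["that read only variable"].Value.ToString() + ".xlsx";
// Copy the template; File.Copy throws an IOException if the target already exists.
File.Copy(source, target);
// Important: hand the new path to the read/write variable used by the Excel connection.
Dts.Variables["A"].Value = target;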
After the Script Task, add a Data Flow Task connected by a precedence constraint. Inside it you need an Excel Destination. The tricky part is (1): you need to set the ExcelFilePath property of that Excel connection manager dynamically, through its Expressions property. I suggest pointing the connection at an existing Excel file the first time, so the column mappings are cached; then, for the dynamic part, select A, the read/write variable set by the Script Task.
To populate Excel, you need to convert all varchar columns to nvarchar; this can be done with either a Derived Column or a Data Conversion transformation.
Last but not least, set DelayValidation to True for the connection manager, the Excel Destination, and the entire Data Flow Task. This is very important for a dynamic process.
This is only a brief explanation, but it is the main idea.
PS: (1) Excel is very picky in SSIS. If you do not have a data access engine installed, the data might not be populated successfully. Excel needs either the Jet (older) or the ACE (newer) provider.
(2) If your header row is not in the first row, you might also need to look at the OpenRowset property.
I am trying to find the name and full path of the current excel file where the power query is run.
I don't need the file name as such; I just want access to a sheet that has no data table, only raw data.
When I try Excel.CurrentWorkbook(), it only gives a list of the tables in the current workbook. But when I access the file by its name and full path using File.Contents(), all the sheet objects are returned, including the sheets that contain raw data (not converted into a data table).
So my plan is: if I can get the file name and path of the current workbook, I can use them to access the sheet. I can't hardcode the file name because it changes every day, with the date as a suffix.
Is there any other way around it?
I don't think this is currently possible using Excel.CurrentWorkbook().
It's possible to use a substring of CELL("filename") as a named range to read the current path and workbook name into Power Query for use with File.Contents, but at that point it's probably easier just to convert the sheet to a named range instead (only a few keys/clicks: select all data and hit the From Table button in the Get & Transform section of the Data tab).
The scenario: I have a folder that contains at least 4 to 5 Excel workbooks. Each workbook name starts with a standard prefix; the rest of the name varies. I need to count the Excel workbooks, then read the data in each workbook and store it in a different DataTable each time. This has to be done in UiPath.
I would recommend creating this activity as a Library. It is a pattern that can be reused everywhere.
You can find a complete example here. There you can also download it.
To summarize it:
User Select Folder activity -> yourFolder
Create variable with value Directory.GetFiles(yourFolder) -> fileArray
Go through the files via a For Each fileArray
And if you would like to use it as a library, I would recommend adding these:
variable "FilterFileExtentions" to filter for specific file extensions
variable "NameStartsWith" to filter for files whose name starts with a specific string
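The two filters could be sketched in C# roughly as follows. This is only an illustration, not the code from the linked example: the class name WorkbookFinder and the method Find are hypothetical, and the sketch assumes that matching should ignore case.

```csharp
using System;
using System.IO;
using System.Linq;

public static class WorkbookFinder
{
    // Hypothetical helper: returns the files in `folder` whose name starts with
    // `nameStartsWith` and whose extension is one of `extensions` (e.g. ".xlsx").
    // Both comparisons ignore case; the result is sorted by path.
    public static string[] Find(string folder, string nameStartsWith, params string[] extensions)
    {
        return Directory.GetFiles(folder)
            .Where(path => Path.GetFileName(path)
                .StartsWith(nameStartsWith, StringComparison.OrdinalIgnoreCase))
            .Where(path => extensions.Contains(
                Path.GetExtension(path), StringComparer.OrdinalIgnoreCase))
            .OrderBy(path => path, StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }
}
```

A call such as WorkbookFinder.Find(yourFolder, "Report", ".xlsx", ".xls") would return the matching paths ("Report" is a made-up prefix), and the Length of the returned array gives the workbook count.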
It looks like you want to work with the files first, to determine which Excel workbook to open. To do that, you could get a list of all files in a specific folder by using the .NET System.IO.Directory.GetFiles method.
So assuming you are working with your project folder you will have an Assign activity looking like this:
ListOfFiles = System.IO.Directory.GetFiles(System.IO.Directory.GetCurrentDirectory())
Where ListOfFiles is a variable declared as System.String[]
You could then iterate over this array using a For Each activity, or get the count of workbooks from its .Length property
I am trying to combine (or perhaps append is a better term) a group of 10 Excel files with identical columns into one master file.
I have tried a very simple process using a Foreach Loop in the control flow and simply doing an Excel Source to an Excel Destination. The process was not only slow (about 1 record pasted per second), it also died after about 50k records.
It looks like:
Foreach Loop Container --> Data Flow task
where the Data Flow Task is Excel Source --> Excel Destination
At the end, I'd want to see one master file with all files appended. I recognize there are other tools that can do this like PowerQuery directly in Excel, but I'm trying to better understand SSIS and I have a lot of processing that would be better done in SQL Server.
Is there a better way to do this? I searched high and low online but couldn't find an example of this in SSIS.
This is very simple. The one thing I would suggest is to load to a flat file in CSV format, which opens easily in Excel.
1: Add a Foreach Loop that enumerates on file name.
2: In the Foreach GUI, set:
3: The path of the Excel files
4: The structure of the file name (e.g. myfiles*.xls)
5: Go to Variable Mappings and map the fully qualified name to a variable
6: Create an Excel connection to any one of the files.
7: In the Excel connection properties, open Expressions and set ExcelFilePath to the variable from step 5
8: Also in the properties, set DelayValidation to True
9: Add a Data Flow Task to the Foreach Loop Container
10: Go to the data flow
11: Use the Source Assistant to read the Excel source
12: Use the Destination Assistant to load to a flat file (make sure not to overwrite the destination, or you will only get the last workbook)
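The warning about overwriting can be illustrated outside SSIS with a small C# sketch. It swaps in plain CSV text files for the Excel sources, so it shows only the append logic, not the Excel reading; the names MasterFileBuilder and AppendAll are hypothetical, and the sketch assumes every source file starts with the same header row.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class MasterFileBuilder
{
    // Hypothetical helper: appends the data rows of every source file to masterPath.
    // The header row is written once, taken from the first non-empty source file,
    // unless the master file already exists. Returns the number of data rows appended.
    public static int AppendAll(IEnumerable<string> sourceFiles, string masterPath)
    {
        int rowsWritten = 0;
        bool headerWritten = File.Exists(masterPath);
        foreach (string file in sourceFiles)
        {
            string[] lines = File.ReadAllLines(file);
            if (lines.Length == 0) continue;
            // Skip this file's header once the master file already has one.
            IEnumerable<string> rows = headerWritten ? lines.Skip(1) : lines;
            // AppendAllLines adds to the end of the file instead of replacing it.
            File.AppendAllLines(masterPath, rows);
            rowsWritten += lines.Length - 1;
            headerWritten = true;
        }
        return rowsWritten;
    }
}
```

Replacing File.AppendAllLines with File.WriteAllLines in this sketch would reproduce the problem described in step 12: each pass would replace the master file, leaving only the last source.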
I've written an SSIS package to upload data from an Excel Source to an OLE DB Destination. However, when I use a Foreach Loop Container to load data from multiple Excel files, I get an error. I have followed the tutorial at the link below:
https://msdn.microsoft.com/en-us/library/ms345182.aspx
All of the configurations are correct apart from the Variable strFileName which needs to be dynamically populated. As can be seen from the screen shot below my variable remains blank:
I am unsure how to do this. Is there an expression or function that can be used to dynamically populate this variable?
If you want the file name stored dynamically for each file in your folder, use the Variable Mappings page of your loop, like this:
Mapping
And for your loop:
ForEach
Note that your variable appears blank at design time because it is only populated at run time, once the loop starts iterating.
I am trying to import a number of metric values from an Excel file into SSIS.
I have named each of the cells with data and was hoping to be able to configure a Connection, that would be updated in a ForEach container, to point to each Named Range in turn, in order to bring over the data one value at a time.
I see many articles on how to connect to a Sheet or Table in Excel, but none to a Named Range? I saw one article on how to bring over one single cell, but that cell was a part of a table.
Can I set up a Connection in SSIS to a single cell, named or otherwise, and bring back that value?
JK
I can see you implementing this in one of two ways. The first is a straight Execute SQL Task that returns a single row. The other is a Data Flow with, probably, a Script Component as your source.
With each pass through your loop, you'd probably need to modify the Excel connection manager and/or your query string to point to the correct named range.
In the section "To create a linked server against an Excel spreadsheet":
To access data from an Excel spreadsheet, associate a range of cells with a name. A named range can be accessed by using the name of the range as the table name. The following query can be used to access a named range called SalesData using the linked server set up in the previous example.
This article also describes how to access Excel programmatically via C#, albeit from ASP.NET, but the principle should be the same. My hazy recollection is that a worksheet name has a $ appended to it, thus sheet1$, while a named range is accessed without the $.
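As a rough C# sketch of that naming difference, the helpers below build the connection string and query text only. The class and method names are hypothetical, and the sketch assumes an .xlsx workbook read through the ACE OLE DB provider with HDR=NO (a single named cell has no header row).

```csharp
public static class ExcelQueries
{
    // Hypothetical helper: ACE OLE DB connection string for an .xlsx workbook.
    // HDR=NO tells the provider not to treat the first row as column names.
    public static string ConnectionString(string workbookPath)
    {
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + workbookPath +
               ";Extended Properties=\"Excel 12.0 Xml;HDR=NO\";";
    }

    // A worksheet is addressed with a trailing $, e.g. [Sheet1$].
    public static string SelectFromSheet(string sheetName)
    {
        return "SELECT * FROM [" + sheetName + "$]";
    }

    // A named range is addressed by its name alone, e.g. [SalesData].
    public static string SelectFromNamedRange(string rangeName)
    {
        return "SELECT * FROM [" + rangeName + "]";
    }
}
```

Inside the loop, the query text could be rebuilt for each range name and run through System.Data.OleDb (OleDbConnection plus OleDbCommand.ExecuteScalar for a single cell), provided the ACE provider is installed on the machine.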
One thing we ran into with our implementation was that our servers did not have the appropriate drivers on them, which required us to install the Access engine.
Lots of generalities in this answer, so if you run into specifics, feel free to ping me.
Take a look at the top two answers from this question:
Want to insert excel file data into table using ssis - format problem
which explain two different approaches to doing what you ask.
Here is how to do it: http://www.mssqltips.com/sqlservertip/1930/use-ssis-to-import-one-cell-of-an-excel-file-into-sql-server/#comments
Unfortunately, it didn't help me, because I want to set a single variable with the values and use it later on.