I am using Excel Online in the browser and have set up a workbook link from a source file to my main file. My main file has table headers and additional columns with formulas. I just need the source data from A2 to column AC, all the way down. The issue is that the source file changes daily: there might be more rows the next day, or fewer. I need to reference a fixed set of columns, detect how many rows the data source has, and update the main file accordingly.
So far, I have something like this:
='https://sharepoint.com/personal/myFolder/Documents/[data_source.xlsx]in'!A2
With the equivalent formulas in B2 and C2, this loads the first row. I can select a range from the source data so that it loads all of it, but if there are more rows the next day, it won't load those, and if there are fewer, the missing rows display as blanks.
How can I tell the formula to select columns A to C starting at row 2 and extend down, or refresh the data the way Excel desktop does when using data connections?
As you can see in the source data, Day 2 has extra rows that won't be loaded in my main file.
You can use Power Automate and two Office Scripts to link the two workbooks together.
You'd start the flow with a recurrence, so you'd pick how often you'd like the flow to run (weekly, daily, etc.).
After you set the recurrence, you have to write an Office Script that works with the table data. You can get the table's data body range with the table's getRangeBetweenHeaderAndTotal() method. Once you have that, you can resize the range to get the data you need. Next, you read the values with the getValues() method. getValues() returns a 2D array, which you can't return from a Power Automate Run script step. Since you can return a string, you get around that by converting the 2D array to a JSON string. You can see the code below:
function main(workbook: ExcelScript.Workbook): string {
    let sh: ExcelScript.Worksheet = workbook.getActiveWorksheet();
    //get table
    let tbl: ExcelScript.Table = sh.getTable("Table1");
    //get table's column count
    let tblColumnCount: number = tbl.getColumns().length;
    //set number of columns to keep
    let columnsToKeep: number = 3;
    //set the number of rows to remove from the bottom
    let rowsToRemove: number = 0;
    //resize the table range; getResizedRange takes deltas, so negative numbers shrink the range
    let tblRange: ExcelScript.Range = tbl.getRangeBetweenHeaderAndTotal().getResizedRange(-rowsToRemove, columnsToKeep - tblColumnCount);
    //get the table values
    let tblRangeValues: string[][] = tblRange.getValues() as string[][];
    //create a JSON string
    let result: string = JSON.stringify(tblRangeValues);
    //return JSON string
    return result;
}
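The round trip through a string can be checked outside Excel. The following is a minimal sketch in plain TypeScript, assuming hypothetical sample values in place of a real table. It only shows that JSON.stringify and JSON.parse preserve the row and column layout of a 2D array; the names CellValue, toJson, fromJson and sample are made up for illustration.

```typescript
// Values a cell can hold once read into a script.
type CellValue = string | number | boolean;

// Serialize a 2D array so it can be passed along as a single string.
function toJson(values: CellValue[][]): string {
  return JSON.stringify(values);
}

// Rebuild the 2D array on the receiving side.
function fromJson(text: string): CellValue[][] {
  return JSON.parse(text) as CellValue[][];
}

// Hypothetical sample data: two rows, three columns.
const sample: CellValue[][] = [
  ["a", 1, true],
  ["b", 2, false],
];

const restored = fromJson(toJson(sample));
console.log(restored.length, restored[0].length);
```

The row count and column count of restored match those of sample, which is what the second script relies on when it sizes the target range.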
Once you've created your script, consider naming it something you'll remember when you call it in Power Automate (I called mine getTableValues). Next, after the recurrence in Power Automate, add a Run script step. Fill out the values and select the script like so:
Next, you have to create the script that takes the output of the previous script and completes the final steps. This script needs a parameter that receives the string returned from the previous script (I called it tableValues in mine). In the script, you have to parse the JSON string to create a 2D array, resize the initial range, and then set the values of the resized range. You can see a script that does that below:
function main(workbook: ExcelScript.Workbook, tableValues: string)
{
    let sh: ExcelScript.Worksheet = workbook.getWorksheet("Sheet1");
    //parses the JSON string to create array
    let tableValuesArray: string[][] = JSON.parse(tableValues);
    //gets row delta from the array (row count minus the starting cell's row)
    let valuesRowCount: number = tableValuesArray.length - 1;
    //gets column delta from the array (column count minus the starting cell's column)
    let valuesColumnCount: number = tableValuesArray[0].length - 1;
    //resizes the range
    let rang: ExcelScript.Range = sh.getRange("A1").getResizedRange(valuesRowCount, valuesColumnCount);
    //sets the value of the resized range to the array
    rang.setValues(tableValuesArray);
}
In Power Automate, you have to create a second Run script step. After you've selected the script in that step, you should be prompted for a value (it is called tableValues in my step). In the tableValues input, enter the Result value from the dynamic content. Once this is done, you can save the flow and test it.
One thing to note is that the second script doesn't delete old range values from previous runs. This can be done in a number of different ways, and the preferred way may depend on how the workbook is structured. I'd recommend writing code near the beginning of the second script to clear the range. Or better yet, add the output of the first script into an Excel table, and empty out the table every time you run the second script.
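As a rough illustration of the "clear first, then write" idea, here is a minimal sketch. RangeLike and SheetLike are hypothetical stand-ins that name only the three range methods and one worksheet method the sketch uses (chosen to match the like-named ExcelScript methods); replaceValues, clearAddress and topLeftAddress are made-up names, and the example addresses are assumptions about the workbook layout.

```typescript
type CellValue = string | number | boolean;

// Hypothetical minimal shapes of the range and worksheet methods used below.
interface RangeLike {
  clear(): void;
  getResizedRange(deltaRows: number, deltaColumns: number): RangeLike;
  setValues(values: CellValue[][]): void;
}
interface SheetLike {
  getRange(address: string): RangeLike;
}

// Clears a fixed output area, then writes the new values from its top-left cell.
function replaceValues(
  sheet: SheetLike,
  clearAddress: string,   // assumed output area, e.g. "A2:C5000"
  topLeftAddress: string, // assumed first output cell, e.g. "A2"
  values: CellValue[][]
): void {
  // Remove whatever an earlier run left behind.
  sheet.getRange(clearAddress).clear();
  if (values.length === 0 || values[0].length === 0) {
    return; // nothing to write; the old rows are already gone
  }
  // Grow the single starting cell to the size of the data, then write it.
  sheet
    .getRange(topLeftAddress)
    .getResizedRange(values.length - 1, values[0].length - 1)
    .setValues(values);
}
```

Clearing a fixed area is only one option; it assumes the data never grows past that area and that nothing else lives inside it.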
If you'd like to see how you might do that, you can take a look at this post here
I am working on legacy code that uses C++ Excel Automation to output our analysis data to an Excel spreadsheet. From the following article,
https://support.microsoft.com/en-us/topic/how-to-use-mfc-to-automate-excel-and-create-and-format-a-new-workbook-6f2450bc-ba35-a36a-df2f-c9dd53d7aef1
I know we can use the range.SetFormula() function to calculate formula results from specific cells, for example:
range = sheet.GetRange(COleVariant("C2"), COleVariant("C6"));
range.SetFormula(COleVariant("=A2 & \" \" & B2"));
My question is how I can use the SetFormula function to point to cells whose row and column are unknown in advance and are only determined as the program runs. Specifically, a number of cells are populated as my analysis runs, and different analyses output different numbers of elements to the Excel spreadsheet. For example, if I have kw data items, then the Excel output will be populated as kw rows by 6 columns, and I also need to output some summary results, based on these elements, underneath the populated cells. Something like this:
int kw = var_length; // the number of rows changes depending on the analysis
DWORD numElements[2];
long index[2];
COleSafeArray saRet;
Range range;
range = sheet.GetRange(COleVariant(_T("A3")),COleVariant(_T("A3")));
numElements[0]= kw; //Number of rows in the range.
numElements[1]= 6; //Number of columns in the range.
saRet.Create(VT_R8, 2, numElements);
for(int iRow = 0;iRow < kw; iRow++)
{
    for (int iCol = 0; iCol < 6; iCol++)
    {
        index[0] = iRow;
        index[1] = iCol;
        saRet.PutElement(index, &somevalue);
    }
}
range.SetValue2(COleVariant(saRet));
CString TStr;
TStr.Format(_T("A%d"), kw+2);
range = sheet.GetRange(COleVariant(TStr), COleVariant(TStr));
CString t1, t2;
t1.Format(_T("A%d"), kw/2);
t2.Format(_T("A%d"), kw);
range.SetFormula(COleVariant(L"=SUM(A&t1: A&t2)")); // Calculate the sum of the second half of the elements. Apparently this didn't work. How can I fix this?
Here I want to sum the second half of the elements, but when writing the SetFormula call I don't know the exact row numbers of these elements, e.g. A25 to A50. The row numbers depend on kw, which is given as input to the program and differs between analyses. I attempted to use CString formatting to build the row references, but the resulting variables cannot be used inside the formula string passed to SetFormula. Ideally I want to use a formula for my summary output so that if I change the populated element values, the summary output changes accordingly. I searched MSDN but couldn't find a solution.
Can someone help me with the issue?
Thanks in advance.
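One way to think about this, sketched in TypeScript rather than MFC: the formula passed to Excel is just text, so the row numbers have to be written into that text before the call; variable names inside the string literal are never substituted. The function name sumSecondHalf and its parameters are hypothetical. In the MFC code the same idea would presumably be a single CString Format call with a pattern along the lines of "=SUM(A%d:A%d)", whose result is then passed to SetFormula.

```typescript
// Builds a SUM formula over the second half of a block of populated rows.
// firstRow and rowCount are hypothetical parameters: the first populated
// row (3 in the snippet above) and the number of populated rows (kw).
function sumSecondHalf(column: string, firstRow: number, rowCount: number): string {
  const lastRow = firstRow + rowCount - 1;
  // With an odd rowCount the middle row is counted in the second half.
  const halfStart = firstRow + Math.floor(rowCount / 2);
  return `=SUM(${column}${halfStart}:${column}${lastRow})`;
}

console.log(sumSecondHalf("A", 3, 50)); // prints "=SUM(A28:A52)"
```

Because the result is an ordinary formula, the summary cell still recalculates when the populated values change.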
I need to know if there is any function that can import data from Excel row by row.
I used to work with xlsread, but it won't work for this case unless I use it in a function that takes all the columns and groups all the elements in the same row together...
Edit: I was able to do it using simple xlsread by the following code:
num = xlsread(excel_file,'B2:BI174');
row1=num(1:173:end);
It is tempting to read the data one row at a time, but that means you will waste time due to file access overhead. It's a lot faster to read all at once and re-pack into a cell array:
allData = xlsread('filename.xls');
oneRowPerElementCell = mat2cell(allData, ones(size(allData,1),1), size(allData,2));
See the xlsread documentation for how to read a block from an Excel file.
Example: to read the first row from the 1st to the 26th column, use
row1 = xlsread('filename.xlsx',sheet_no,'A1:Z1');
I have an Excel file with approximately 7000 rows to read. The file contains a table of contents, followed by the actual content data in detail below it.
I would like to skip all the table of contents rows and start reading from the actual content data. This is because if I need to read the data for "CPU_INFO", the loop finds the search string twice: (1) in the table of contents and (2) in the actual content.
So I would like to know if there is any way to specify a start row index from which to begin reading the Excel file, thus skipping the whole table of contents section.
As taken from the Apache POI documentation on iterating over rows and cells:
In some cases, when iterating, you need full control over how missing or blank rows or cells are treated, and you need to ensure you visit every cell and not just those defined in the file. (The CellIterator will only return the cells defined in the file, which is largely those with values or stylings, but it depends on Excel).
In cases such as these, you should fetch the first and last column information for a row, then call getCell(int, MissingCellPolicy) to fetch the cell. Use a MissingCellPolicy to control how blank or null cells are handled.
If we take the example code from that documentation, and tweak it for your requirement to start on row 7000, and assuming you want to not go past 15k rows, we get:
// Decide which rows to process: start no earlier than row 7000
// and stop no later than row 15000
int rowStart = Math.max(7000, sheet.getFirstRowNum());
int rowEnd = Math.min(15000, sheet.getLastRowNum());
for (int rowNum = rowStart; rowNum <= rowEnd; rowNum++) {
    Row r = sheet.getRow(rowNum);
    if (r == null) {
        // This whole row is empty
        continue;
    }
    int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);
    for (int cn = 0; cn < lastColumn; cn++) {
        Cell c = r.getCell(cn, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);
        if (c == null) {
            // The spreadsheet is empty in this cell
        } else {
            // Do something useful with the cell's contents
        }
    }
}
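The row window in a loop like this is a clamp: the start is the larger of the wanted start and the first row present, and the end is the smaller of the wanted end and the last row present. A minimal sketch of that arithmetic, in TypeScript for illustration only; rowWindow and its parameter names are hypothetical and not part of Apache POI.

```typescript
// Returns the inclusive [start, end] row indexes to visit, or null when the
// wanted window and the rows present in the sheet do not overlap.
function rowWindow(
  firstRowNum: number, // index of the first row present in the sheet
  lastRowNum: number,  // index of the last row present in the sheet
  wantedStart: number, // earliest row the caller wants to read
  wantedEnd: number    // latest row the caller wants to read
): [number, number] | null {
  const start = Math.max(wantedStart, firstRowNum);
  const end = Math.min(wantedEnd, lastRowNum);
  return start <= end ? [start, end] : null;
}

console.log(rowWindow(0, 9500, 7000, 15000));
```

For a sheet whose rows run from 0 to 9500, the window 7000 to 15000 is cut down to 7000 to 9500; for a sheet that ends before row 7000 there is nothing to read and the function returns null.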
I have reviewed the questions that may have had my answer and unfortunately they don't seem to apply. Here is my situation. I have to import worksheets from my client. In columns A, C, D, and AA the client has the information I need. The balance of the columns have what to me is worthless information. The column headers are consistent in the four columns I need, but are very inconsistent in the columns that don't matter. For example cell A1 contains Division. This is true across all of the spreadsheets. Cell B1 can contain anything from sleeve length to overall length to fit. What I need to do is to import only the columns I need and map them to an SQL 2008 R2 table. I have defined the table in a stored procedure which is currently calling an SSIS function.
The problem is that when I try to import a spreadsheet that has different column names, the SSIS package fails and I have to go back in and run it manually to get the fields set up right.
I cannot imagine that what I am trying to do has not been done before. Just so the magnitude is not lost, I have 170 users who have over 120 different spreadsheet templates.
I am desperate for a workable solution. I can do everything after getting the file into my table in SQL. I have even written the code to move the files back to the FTP server.
I put together a post describing how I've used a Script task to parse Excel. It has allowed me to import decidedly non-tabular data into a data flow.
The core concept is that you use the JET or ACE provider and simply query the data out of an Excel worksheet or named range. Once you have that, you have a dataset you can walk through row by row and perform whatever logic you need. In your case, you can skip row 1 for the header and then import only columns A, C, D and AA.
That logic would go in the ExcelParser class. So, the Foreach loop on line 71 would probably be distilled down to something like this (code approximate):
// This gets the value of column A
current = dr[0].ToString();
// this assigns the value of current into our output row at column 0
newRow[0] = current;
// This gets the value of column C
current = dr[2].ToString();
// this assigns the value of current into our output row at column 1
newRow[1] = current;
// This gets the value of column D
current = dr[3].ToString();
// this assigns the value of current into our output row at column 2
newRow[2] = current;
// This gets the value of column AA
current = dr[26].ToString();
// this assigns the value of current into our output row at column 3
newRow[3] = current;
You might obviously need to do type conversions and such here, but that's the core of the parsing logic.
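The hard-coded indexes 0, 2, 3 and 26 are the zero-based positions of columns A, C, D and AA. As a small illustration of where those numbers come from, here is a sketch in TypeScript (not the SSIS Script task's language); columnIndex and pickColumns are hypothetical helper names.

```typescript
// Converts an Excel column label such as "A" or "AA" to a zero-based index.
function columnIndex(label: string): number {
  const upper = label.toUpperCase();
  let n = 0;
  for (let i = 0; i < upper.length; i++) {
    // "A" has char code 65, so subtracting 64 maps A..Z to 1..26.
    n = n * 26 + (upper.charCodeAt(i) - 64);
  }
  return n - 1;
}

// Keeps only the requested columns of one row, in the order given.
// Columns missing from a short row come back as empty strings.
function pickColumns(row: string[], labels: string[]): string[] {
  return labels.map((label) => {
    const value = row[columnIndex(label)];
    return value === undefined ? "" : value;
  });
}

console.log(columnIndex("AA")); // prints 26
```

Deriving the indexes from the labels keeps the list of wanted columns readable if the requirement changes.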
I had to fill an Excel range from SQL Server according to the following scheme:
C1 C2 C3....C29
L1
L2
L3
L4
L5
.....
L120
I wondered what the fastest way would be to fetch the value for each pair (Li, Cj), where each value is stored in SQL Server.
I could not iterate over each cell.
What would your solution have been?
I have to say that I managed to retrieve the data in less than 3 seconds.
The fastest way VSTO provides is a helper function described in this article:
http://social.msdn.microsoft.com/Forums/en/vsto/thread/5cfc24cd-cbeb-4583-b6c8-ad1521e31267
If all you have is an array, then you can assign the array directly to the Value2 of a range, and it populates fairly quickly.
You can set Application.EnableEvents=False and Application.ScreenUpdating=False to speed up your process dramatically. Remember to reset them after the process.
Rather than make an individual stored procedure call for each (L,C) pair, make one call that fetches all the pairs in the table. Hopefully there is no precondition that prevents this from happening. Otherwise, because of the SQL calling overhead alone, you will not be able to get the data back in < 3 seconds. Pull the data in to a SqlDataReader if you can.
Next, populate a 2D array variable according to the (L,C) relationship in your fetched data. Excel uses 1-based arrays, which you can emulate (although this is not strictly required) as below:
// this creates a 1-based 2D array with 5 rows, 2 columns (5,2)
var my2DArray = Array.CreateInstance(
typeof(object), new int[] { 5, 2 }, new int[] { 1, 1 });
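Filling the array from a single fetched result set could look roughly like the sketch below. It is written in TypeScript for illustration, not in the C# used above, and it uses an ordinary 0-based array. The CellRecord shape and its field names are assumptions about what the one-call query returns.

```typescript
// One fetched record: 1-based line (L) and column (C) numbers plus the value.
// The field names are hypothetical.
interface CellRecord {
  line: number;
  column: number;
  value: number;
}

// Places every record at its (L, C) position in a single pass.
// Positions with no record stay null.
function toGrid(
  records: CellRecord[],
  lineCount: number,
  columnCount: number
): (number | null)[][] {
  const grid: (number | null)[][] = [];
  for (let l = 0; l < lineCount; l++) {
    const row: (number | null)[] = [];
    for (let c = 0; c < columnCount; c++) {
      row.push(null);
    }
    grid.push(row);
  }
  for (const rec of records) {
    grid[rec.line - 1][rec.column - 1] = rec.value;
  }
  return grid;
}

const grid = toGrid(
  [
    { line: 1, column: 2, value: 5 },
    { line: 3, column: 1, value: 7 },
  ],
  3,
  2
);
console.log(JSON.stringify(grid)); // prints [[null,5],[null,null],[7,null]]
```

The point is that the grid is built in memory from one query result and written to the sheet once, instead of one query and one cell write per (L,C) pair.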
After populating the 2D array, set the array to the Excel worksheet. The code would look roughly like below:
// not sure what your cell refs are, so I'll be arbitrary...
var rng = myWorksheet.get_Range("A1", "B1");
rng = rng.get_Resize(my2DArray.GetUpperBound(0), my2DArray.GetUpperBound(1));
rng.Value2 = my2DArray;
This should be the fastest way, as compared to setting cell values one by one.