C++ Excel Automation: How to use SetFormula for cells whose row and column are unknown until run time? - excel

I am working on legacy code that uses C++ Excel Automation to write our analysis data to an Excel spreadsheet. From the following article,
https://support.microsoft.com/en-us/topic/how-to-use-mfc-to-automate-excel-and-create-and-format-a-new-workbook-6f2450bc-ba35-a36a-df2f-c9dd53d7aef1
I learned that range.SetFormula() can be used to fill cells with a formula that refers to specific cells, for example:
range = sheet.GetRange(COleVariant("C2"), COleVariant("C6"));
range.SetFormula(COleVariant("=A2 & \" \" & B2"));
My question is how to use SetFormula to refer to cells whose row and column are unknown in advance and are only determined as the program runs. Specifically, my analysis populates a block of cells, and different analyses write different numbers of elements to the spreadsheet. For example, if I have kw data items, the output occupies kw rows by 6 columns, and I also need to write summary results, computed from those elements, underneath them. Something like this:
int kw = var_length; // the row changes depending on different analysis
DWORD numElements[2];
Range range;
range = sheet.GetRange(COleVariant(_T("A3")),COleVariant(_T("A3")));
numElements[0]= kw; //Number of rows in the range.
numElements[1]= 6; //Number of columns in the range.
saRet.Create(VT_R8, 2, numElements);
for(int iRow = 0;iRow < kw; iRow++)
{
for (iCol = 0; iCol < 6; iCol++)
{
index[0] = iRow;
index[1] = iCol;
saRet.PutElement(index, &somevalue);
}
}
range.SetValue2(COleVariant(saRet));
CString TStr;
TStr.Format(_T("A%d"), kw+2);
range = sheet.GetRange(COleVariant(TStr), COleVariant(TStr));
CString t1, t2;
t1.Format(_T("A%d"), kw/2);
t2.Format(_T("A%d"), kw);
range.SetFormula(COleVariant(L"=SUM(A&t1: A&t2)")); // Calculate the sum of second half of whole elements, Apparently, this didn't work, How can I fix this?
Here I want to sum the second half of the elements, but when I write the SetFormula call I do not know the exact row numbers of those elements, e.g. A25 to A50. The row numbers depend on kw, which is an input to the program and differs between analyses. I tried to build the row numbers with CString::Format (t1 and t2 above), but they cannot be referenced inside the string passed to SetFormula. Ideally I want to use a formula for the summary output so that if I change the populated values, the summary updates accordingly. I searched MSDN but couldn't find a solution.
Can someone help me with the issue?
Thanks in advance.
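One possible approach, sketched here rather than taken from an accepted answer: SetFormula receives an ordinary string, so the row numbers have to be formatted into that string before the call; variable names written inside the string literal are not substituted. The helper below is hypothetical and uses only standard C++. The parameters firstDataRow and count are assumptions about where the data block starts and how many rows it has. In MFC the same string could be produced with CString::Format(_T("=SUM(A%d:A%d)"), first, last) and passed to SetFormula through COleVariant.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the Excel automation wrappers):
// builds "=SUM(<col><first>:<col><last>)" covering the second half of
// `count` data rows that start at sheet row `firstDataRow`.
std::string secondHalfSumFormula(const std::string& col, int firstDataRow, int count) {
    assert(firstDataRow >= 1 && count >= 1);
    const int first = firstDataRow + count / 2;  // first row of the second half
    const int last = firstDataRow + count - 1;   // last populated row
    return "=SUM(" + col + std::to_string(first) + ":" +
           col + std::to_string(last) + ")";
}
```

For example, with 50 rows of data starting in row 1, secondHalfSumFormula("A", 1, 50) yields "=SUM(A26:A50)". Because the cell receives a formula and not a precomputed number, Excel recalculates the summary whenever the populated values change.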

Related

How to convert matrix data into columns?

I have a 100 x 101 data matrix. I want to convert it into a series: all column values for the first row, then all column values for the second row, and so on. The result will have only three columns: the first with the row number, the second with the column number, and the third with the value at that row and column.
Could you please help me do this conversion in MATLAB?
The data are available in ASCII format and can be opened in both MATLAB and Excel.
This can be done by find:
A = rand(100,101);
[data(:,1), data(:,2), data(:,3)] = find(A);
data = sortrows(data,[1 2]);
Note that this is highly inefficient, as you are storing 3 values where you only need to store 1 (the element's actual value). For accessing a specific element, say row 31, column 43, you simply do A(31,43), where you index the matrix.
The size of data in memory is indeed three times that of A:
whos
Name Size Bytes Class Attributes
A 100x101 80800 double
data 10100x3 242400 double
You can use the ind2sub function, which is faster and makes more sense in this situation:
tic
A = rand(100,101);
[data(:,1), data(:,2), data(:,3)] = find(A);
data = sortrows(data,[1 2]);
toc
tic
B = A' ;
[colIdx, rowIdx] = ind2sub(size(B), (1:numel(B))'); % B = A', so ind2sub yields (column of A, row of A)
data_B = [rowIdx, colIdx, B(:)]; % row number, column number, value
toc
The timing output is as follows:
Elapsed time is 0.002130 seconds (first method)
Elapsed time is 0.000525 seconds (second method).
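The index arithmetic behind the ind2sub approach can also be sketched outside MATLAB. The C++ snippet below is only an illustration under stated assumptions: the matrix is stored row-major in a flat vector, and the Triplet struct and toTriplets function are hypothetical names introduced here. It emits one (row, column, value) entry per element, row by row, with 1-based indices as in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one matrix element.
struct Triplet {
    std::size_t row;  // 1-based row number
    std::size_t col;  // 1-based column number
    double value;
};

// Converts a rows x cols matrix stored row-major in `a` into a list of
// (row, column, value) triplets, visiting the matrix row by row.
std::vector<Triplet> toTriplets(const std::vector<double>& a,
                                std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    std::vector<Triplet> out;
    out.reserve(a.size());
    for (std::size_t k = 0; k < a.size(); ++k) {
        // Linear index k maps to row k / cols and column k % cols (0-based).
        out.push_back({k / cols + 1, k % cols + 1, a[k]});
    }
    return out;
}
```

As with the MATLAB version, the triplet form stores three numbers per element, so it is only worth producing when a downstream tool needs that layout.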

What is the most efficient format for storing strings from a for loop?

I have a script that runs through a series of strings and uses regex to pull out certain substrings (approx. 4 output strings per input string).
e.g. HelloStackOverflowWorld
-> Hello; Stack; Overflow; World;
The final output would ideally be a table where I can filter based upon the strings in the columns. Using the case above, column 1 row 1 would have 'Hello', column 2 row 1 would have 'Stack' and so on.
The problem is, the size of the output will change depending on the input so I am unsure of what output format to use.
At the moment I used something similar to this:
if strfind(missing{ii},'hello')
miss.exch = [miss.exch;'hello'];
temp.exc = regexp(missing{ii},'(?<=\d[Q|T])(\w*?)(?=[q])','match');
miss.exc = [miss.exc;temp.exc];
temp.TQ= regexp(missing{ii},'(Qc|Tc)','match');
if strcmp(temp.TQ{1,1}, 'Tc')
miss.TQ = [miss.TQ;'variableA'];
elseif temp.TQ{1,1} == 'Qc'
miss.TQ = [miss.TQ;'variableB'];
end
else if .........
end
Which obviously results in a 1x1 struct consisting of a number of fields each with many cells. This makes filtering on strings an issue!
How can I define and add data into a 'table of strings' that I can then filter?
I think you are just looking for a cell array. Here is a simple example of what they can do:
C = {'Abc','Bcd';'Cde',[]}
strcmp(C,'Cde')
Results in:
ans =
0 0
1 0
Make sure to check doc cell to see how you can access them.

Read from a specific row onwards from Excel File

I have an Excel file with around 7000 rows to read. The file contains a table of contents followed by the actual content data in detail below it.
I would like to skip all the table-of-contents rows and start reading from the actual content data. This is because if I search for, say, "CPU_INFO", the loop finds the string twice: (1) in the table of contents and (2) in the actual content.
So I would like to know whether there is a way to specify a start row index from which to begin reading the Excel file, thus skipping the whole table-of-contents section.
As taken from the Apache POI documentation on iterating over rows and cells:
In some cases, when iterating, you need full control over how missing or blank rows or cells are treated, and you need to ensure you visit every cell and not just those defined in the file. (The CellIterator will only return the cells defined in the file, which is largely those with values or stylings, but it depends on Excel).
In cases such as these, you should fetch the first and last column information for a row, then call getCell(int, MissingCellPolicy) to fetch the cell. Use a MissingCellPolicy to control how blank or null cells are handled.
If we take the example code from that documentation, and tweak it for your requirement to start on row 7000, and assuming you want to not go past 15k rows, we get:
// Decide which rows to process
// Start no earlier than row index 7000 (and no earlier than the first row in the sheet)
int rowStart = Math.max(7000, sheet.getFirstRowNum());
// Exclusive end: stop at index 15000 or just past the last row, whichever comes first
int rowEnd = Math.min(15000, sheet.getLastRowNum() + 1);
for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
    Row r = sheet.getRow(rowNum);
    if (r == null) {
        // This whole row is empty
        continue;
    }
    int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);
    for (int cn = 0; cn < lastColumn; cn++) {
        Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
        if (c == null) {
            // The spreadsheet is empty in this cell
        } else {
            // Do something useful with the cell's contents
        }
    }
}

Reading and Combining Excel Time Series in Matlab- Maintaining Order

I have the following code to read off time series data (contained in sheets 5 to 19 in an excel workbook). Each worksheet is titled "TS" followed by the number of the time series. The process works fine apart from one thing- when I study the returns I find that all the time series are shifted along by 5. i.e. TS 6 becomes the 11th column in the "returns" data and TS 19 becomes the 5th column, TS 15 becomes the 1st column etc. I need them to be in the same order that they are read- such that TS 1 is in the 1st column, TS 2 in the 2nd etc.
This is a problem because I read off the titles of the worksheets ("AssetList") which maintain their actual order throughout subsequent codes. Therefore when I recombine the titles and the returns I find that they do not match. This complicates further manipulation when, for example column 4 is titled "TS 4" but actually contains the data of TS 18.
Is there something in this code that I have wrong?
XL='TimeSeries.xlsx';
formatIn = 'dd/mm/yyyy';
formatOut = 'mmm-dd-yyyy';
Bounds=3;
[Bounds,~] = xlsread(XL,Bounds);
% Determine the number of worksheets in the xls-file:
FirstSheet=5;
[~,AssetList] = xlsfinfo(XL);
lngth=size(AssetList,2);
AssetList(:,1:FirstSheet-1)=[];
% Loop through the number of sheets and RETRIEVE VALUES
merge_count = 1;
for I=FirstSheet:lngth
[FundValues, ~, FundSheet] = xlsread(XL,I);
% EXTRACT DATES AND DATA AND COMBINE
% (TO REMOVE UNNECCESSARY TEXT IN ROWS 1 TO 4)
Fund_dates_data = FundSheet(4:end,1:2);
FundDates = cellstr(datestr(datevec(Fund_dates_data(:,1),...
formatIn),formatOut));
FundData = cell2mat(Fund_dates_data(:,2));
% CREATE TIME SERIES FOR EACH FUND
Fundts{I}=fints(FundDates,FundData,['Fund',num2str(I)]);
if merge_count == 2
Port = merge(Fundts{I-1},Fundts{I},'DateSetMethod','Intersection');
end
if merge_count > 2
Port = merge(Port,Fundts{I},'DateSetMethod','Intersection');
end
merge_count = merge_count + 1;
end
% ANALYSE PORTFOLIO
Returns=tick2ret(Port);
q = Portfolio;
q = q.estimateAssetMoments(Returns)
[qassetmean, qassetcovar] = q.getAssetMoments
This is probably due to merge. By default, it sorts columns alphabetically. Unfortunately, as your naming pattern is "FundN", this means that, for example, Fund10 will be sorted before Fund9. So as you're looping over I from 5 to 19, you will get Fund10 through Fund19, followed by Fund5 through Fund9.
One way of solving this would be to always use zero padding (Fund01, Fund02, etc.) so that alphabetical order and numerical order are the same. Alternatively, force it to stay in the order you read/merge the data by setting SortColumns to 0:
Port = merge(Port,Fundts{I},'DateSetMethod','Intersection','SortColumns',0);
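The ordering effect itself is easy to reproduce with any plain lexicographic sort. The C++ sketch below is an illustration only; it does not call MATLAB's merge, and sortedFundNames is a hypothetical helper. It builds the names used in the loop and sorts them as strings, with and without two-digit zero padding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: builds "Fund<first>" .. "Fund<last>", optionally
// zero-padded to two digits, and returns the names sorted as plain strings.
std::vector<std::string> sortedFundNames(int first, int last, bool zeroPad) {
    assert(first >= 0 && first <= last);
    std::vector<std::string> names;
    for (int i = first; i <= last; ++i) {
        char buf[32];
        std::snprintf(buf, sizeof buf, zeroPad ? "Fund%02d" : "Fund%d", i);
        names.emplace_back(buf);
    }
    std::sort(names.begin(), names.end());  // lexicographic comparison
    return names;
}
```

For the range 5 to 19, the unpadded names sort as Fund10 through Fund19 followed by Fund5 through Fund9, while the padded names sort as Fund05, Fund06, and so on up to Fund19, which matches numerical order.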

Excel VBA - Referring between ranges

Here's my problem:
I have two ranges, r_products and r_ptypes which are from two different sheets, but of same length i.e.
Set r_products = Worksheets("Products").Range("A2:A999")
Set r_ptypes = Worksheets("SKUs").Range("B2:B999")
I'm searching for something in r_products and I have to select the value at the same position in r_ptypes. The result of the Find method is stored in cellfound. Now, consider the following data:
Sheet: Products
A B C D
1 Product
2 S1
3 P1
4 P2
5 S2
6 S3
Sheet: SKUs
A B C D
1 SKU
2 S1-RP003
3 P1-BQ900
4 P2-HE300
5 S2-NB280
6 S3-JN934
Now, when I search for S1, cellfound.Row gives me the value 2, which is, as I understand it, the 2nd row of the whole worksheet, but is actually the 1st row of the range (A2:A999).
When I use this cellfound.Row value in r_ptypes.Cells(cellfound.Row), it is treated as an index into the range and returns B3 (P1-BQ900) instead of what I want, i.e. B2 (S1-RP003).
My question is: how can I find the index of cellfound within the range? If that is not possible, how can I use the row number to extract data from r_ptypes?
Dante's solution works fine. Also, I managed to get the index value using the built-in Excel function Match instead of the Find method of a range. Listing it here for reference.
indexval = Application.WorksheetFunction.Match("searchvalue", r_products, 0)
Using the above, I'm now able to refer to the rows in r_ptypes:
skuvalue = r_ptypes.Rows(indexval).Value
This happens because .Row always returns the absolute row number on the sheet, not the offset (i.e. index) within the range.
So just subtract the range's first row to compensate.
For your example,
r_ptypes.Cells(cellfound.Row - r_ptypes.Cells(1).Row + 1)
or, a little neater:
With r_ptypes
.Cells(cellfound.Row - .Cells(1).Row + 1)
End With
That is, take the row difference between cellfound and the first cell of the range, and add 1 because Excel counts cells from 1.
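The arithmetic is independent of VBA and can be written as a one-line function. The C++ sketch below uses a hypothetical name, indexInRange, and assumes both arguments are 1-based sheet row numbers.

```cpp
#include <cassert>

// Hypothetical helper: converts an absolute sheet row into a 1-based index
// within a range whose first cell sits on sheet row `rangeFirstRow`.
int indexInRange(int sheetRow, int rangeFirstRow) {
    assert(rangeFirstRow >= 1 && sheetRow >= rangeFirstRow);
    return sheetRow - rangeFirstRow + 1;
}
```

With the data above, S1 is found on sheet row 2 and the range starts on row 2, so indexInRange(2, 2) returns 1, which selects the first cell of r_ptypes (B2).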
