I am working on legacy code that uses C++ Excel Automation to write our analysis data to an Excel spreadsheet. From the following article,
https://support.microsoft.com/en-us/topic/how-to-use-mfc-to-automate-excel-and-create-and-format-a-new-workbook-6f2450bc-ba35-a36a-df2f-c9dd53d7aef1
I learned that the range.SetFormula() function can compute results from specific cells, for example:
range = sheet.GetRange(COleVariant("C2"), COleVariant("C6"));
range.SetFormula(COleVariant("=A2 & \" \" & B2"));
My question is how to use SetFormula to refer to cells whose row and column are unknown until the program runs. Specifically, a number of cells are populated as my analysis runs, and different analyses write different numbers of elements to the spreadsheet. For example, if I have kw data points, the output occupies kw rows and 6 columns, and I also need to write some summary results, based on these elements, underneath them. Something like this:
int kw = var_length; // the row changes depending on different analysis
DWORD numElements[2];
Range range;
range = sheet.GetRange(COleVariant(_T("A3")),COleVariant(_T("A3")));
numElements[0]= kw; //Number of rows in the range.
numElements[1]= 6; //Number of columns in the range.
saRet.Create(VT_R8, 2, numElements);
for(int iRow = 0;iRow < kw; iRow++)
{
    for (iCol = 0; iCol < 6; iCol++)
    {
        index[0] = iRow;
        index[1] = iCol;
        saRet.PutElement(index, &somevalue);
    }
}
range.SetValue2(COleVariant(saRet));
CString TStr;
TStr.Format(_T("A%d"), kw+2);
range = sheet.GetRange(COleVariant(TStr), COleVariant(TStr));
CString t1, t2;
t1.Format(_T("A%d"), kw/2);
t2.Format(_T("A%d"), kw);
range.SetFormula(COleVariant(L"=SUM(A&t1: A&t2)")); // Intended: sum the second half of the elements. This does not work. How can I fix it?
Here I want to sum the second half of the elements, but when writing the SetFormula call I do not know the exact row numbers, e.g. A25 to A50. The row numbers depend on kw, which the program receives as input and which differs between analyses. I tried to build the cell references with CString::Format (t1 and t2 above), but they cannot be used inside the SetFormula string like this. Ideally I want a formula for the summary output so that it updates whenever I change the populated values. I searched MSDN but could not find a solution.
Can someone help me with the issue?
Thanks in advance.
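One possible direction, sketched in Python purely as a language-neutral illustration: Excel only ever sees the final formula text, so the row numbers have to be formatted into the string before it is handed to SetFormula. The helper name sum_formula is hypothetical and not part of any Excel API; in the MFC code the same string could presumably be built with a single CString::Format call and then wrapped in COleVariant.

```python
def sum_formula(column: str, first_row: int, last_row: int) -> str:
    """Build an Excel SUM formula over one column, rows first_row..last_row inclusive."""
    if first_row > last_row:
        raise ValueError("first_row must not exceed last_row")
    # The row numbers become part of the text, e.g. "A" + "25" -> "A25".
    return f"=SUM({column}{first_row}:{column}{last_row})"


kw = 50  # number of elements; known only at run time
formula = sum_formula("A", kw // 2, kw)
print(formula)
```

With kw = 50 this yields the A25 to A50 range mentioned in the question. If the data block starts at row 3, as in the GetRange call above, both row numbers would need the corresponding offset.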
First, the DataFrame: the columns '0-156', '156-234', '234-546', ..., '>76830' hold the percentage distribution over distance ranges in meters, totaling 100% per row.
The 'Cell Name' column identifies the element each row describes, and the 'Distance' column determines which range columns are summed.
I need to sum the values of the range columns '0-156', '156-234', '234-546', ..., '>76830' whose range lies below the value in the 'Distance' column (in meters).
Below is the code to create the DataFrame for testing.
import pandas as pd
# initialize list of lists
data = [['Test1',0.36516562,19.065996,49.15094,24.344206,0.49186087,1.24217,5.2812457,0.05841639,0,0,0,0,158.4122868],
['Test2',0.20406325,10.664485,48.70978,14.885571,0.46103176,8.75815,14.200708,2.1162114,0,0,0,0,192.553074],
['Test3',0.13483211,0.6521175,6.124511,41.61725,45.0036,5.405257,1.0494527,0.012979688,0,0,0,0,1759.480042]
]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Cell Name','0-156','156-234','234-546','546-1014','1014-1950','1950-3510','3510-6630','6630-14430','14430-30030','30030-53430','53430-76830','>76830','Distance'])
Example of what should be done:
For the first row, 'Distance' = 158.412286772863, so the columns to sum are '0-156' and '156-234', giving 19.43116162%.
Thanks so much!
As I understand it, you want to sum all the percentage values in a row for which the lower bound in the column name (0 for '0-156', 156 for '156-234', and so on) is smaller than the value in the 'Distance' column.
First, I would suggest converting your string column names into numbers, for example:
lowerlimit=df.columns[2]
>>'156-234'
Then read the string only till the '-' and make it a number
int(lowerlimit[:lowerlimit.find('-')])
>> 156
You can loop this over all your columns and build a new row holding the lower limits.
For simplicity, I left out the first column of your example and added a first row with the lower limit of each column, which you could generate as described above. Then this code works:
data = [[0,156,234,546,1014,1950,3510,6630,14430,30030,53430,76830,1e-23],
[0.36516562,19.065996,49.15094,24.344206,0.49186087,1.24217,5.2812457,0.05841639,0,0,0,0,158.4122868],
[0.20406325,10.664485,48.70978,14.885571,0.46103176,8.75815,14.200708,2.1162114,0,0,0,0,192.553074],
[0.13483211,0.6521175,6.124511,41.61725,45.0036,5.405257,1.0494527,0.012979688,0,0,0,0,1759.480042]
]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['0-156','156-234','234-546','546-1014','1014-1950','1950-3510','3510-6630','6630-14430','14430-30030','30030-53430','53430-76830','76830-','Distance'])
df['lastindex']=None
df['sum']=None
After creating what is essentially your DataFrame, I add two columns, 'lastindex' and 'sum'.
Then, for every row, I look for the last column whose lower limit is below the distance given in that row (df.iloc[x,-3]) and sum the columns of that row up to and including it.
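import numpy as np
for i in np.arange(1,len(df)):
    # position of the last column whose lower limit (row 0) is below this row's distance
    df.at[i,'lastindex']=np.where(df.iloc[0,:-3]<df.iloc[i,-3])[0][-1]
    # sum the percentages up to and including that column
    df.at[i,'sum']=sum(df.iloc[i][0:df.at[i,'lastindex']+1])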
I hope this is helpful. Best, lepakk
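As an alternative to the row-by-row loop, the same rule (sum every bin whose lower limit is strictly below the row's distance) can be sketched with a boolean mask. This is only an illustration under assumptions: the function name sum_below_distance is hypothetical, the lower limits are parsed from the original column names (including '>76830'), and the demo frame is a shortened subset of the question's data with just three bins.

```python
import numpy as np
import pandas as pd


def sum_below_distance(df, bin_cols, distance_col="Distance"):
    """Per row, sum the bins whose lower limit is strictly below the row's distance."""
    # Lower limit from each column name: '156-234' -> 156.0, '>76830' -> 76830.0
    lower = np.array([float(name.lstrip(">").split("-")[0]) for name in bin_cols])
    distance = df[distance_col].to_numpy(dtype=float)
    # mask[r, c] is True when bin c starts below the distance of row r
    mask = lower[np.newaxis, :] < distance[:, np.newaxis]
    values = df[bin_cols].to_numpy(dtype=float)
    return pd.Series((values * mask).sum(axis=1), index=df.index)


# Shortened subset of the question's data (first three bins only), for illustration
bins = ["0-156", "156-234", "234-546"]
demo = pd.DataFrame(
    [["Test1", 0.36516562, 19.065996, 49.15094, 158.4122868],
     ["Test2", 0.20406325, 10.664485, 48.70978, 192.553074]],
    columns=["Cell Name", *bins, "Distance"],
)
demo["sum"] = sum_below_distance(demo, bins)
print(demo[["Cell Name", "sum"]])
```

For Test1 only '0-156' and '156-234' qualify, which reproduces the 19.43116162 from the question. The comparison is strict, as in the loop above; changing `<` to `<=` would also include a bin that starts exactly at the distance.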
I'm reading an Excel file in MATLAB with
[NUM,TXT,RAW]=xlsread(DATENEXCEL,sSheet_Data);
The Excel file contains different data matrices on different sheets, in the following form
Date Firm1 Firm2 Firm3 ...
1.1.16 12 12 12
... ... ... ...
Currently I'm handling the pure data with the NUM object and the header row with the TXT object. My first issue is how to combine the header row with the data rows. Looping does not work, since I predefine the data matrix with
daten=zeros([length(sDatesequence) size(RAW,2)]);
because I want to be able to add more data from different sources to that object. Preallocating with zeros, however, makes MATLAB expect doubles rather than characters. Converting the cell array TXT with cell2mat gives an unsatisfying result:
cell2mat(TXT(1,:))=Firm1Firm2Firm3...
hence only a long string vector.
Question: Is there another way to combine character vectors and double matrices?
Regards,
Richard
You can combine them in a cell array.
c{1,1} = 'Firm1';
c{1,2} = datavector;
c{2,1} = 'Firm2';
c{2,2} = datavector;
But as far as I know, it is not possible to add text headers to a numerical matrix unless you do something with typecasting, which I would not recommend.
d(1:8)='Firm1 '; %must have exactly eight characters (a double has a length of 8 bytes)
y = typecast(uint8(d),'double') %now you have a number that would fit in a matrix of doubles
x=char(typecast(y,'uint8')) %now it's converted back to text
I am trying to figure out how to specify a common range for the xlsread() function in MATLAB.
Usually I use n=xlsread('filename','#sheet','A1:A10'), but I have quite a bit of data in the same sheet, and I'd like to know whether I can specify the row range once, i.e. if all my data is in rows 1:10, I want to give 1:10 as the range and only supply the column letter for each read.
I was thinking to do it as follows:
function [a,b,c]=getdata(filename,'1:10')
a=xlsread(filename,1,'A:A'???)
b=xlsread(filename,1,'B:B'???)
c=xlsread(filename,1,'C:C'???)
end
After some research I could not find any information as to how this is done.
Thanks in advance,
Greg
If you want to read rows 1 to 10 of column A, use:
data = xlsread(filename, 1, 'A1:A10');
If you want to read rows 1 to 10 of all columns, use:
data = xlsread(filename, 1, '1:10');
If you want to read rows 1 to 10 of, say, the first three columns A, B, and C, use:
data = xlsread(filename, 1, 'A1:C10');
Using dynamic variable names is always a bad idea. Read this for explanation. But if you still want to create a, b, and c and so on depending on the number of columns in the Excel file, you can use:
for k=1:size(data,2)
    assignin('caller', char(96+k), data(:,k)); %or char(64+k) for uppercase letters
end
The above works only if the number of columns is less than or equal to 26, so it is only feasible when you're dealing with a few columns. I still recommend avoiding it.
I have the following code to read time series data (contained in sheets 5 to 19 of an Excel workbook). Each worksheet is titled "TS" followed by the number of the time series. The process works fine apart from one thing: when I study the returns, I find that all the time series are shifted along by 5, i.e. TS 6 becomes the 11th column in the "returns" data, TS 19 becomes the 5th column, TS 15 becomes the 1st column, etc. I need them to be in the same order in which they are read, so that TS 1 is in the 1st column, TS 2 in the 2nd, etc.
This is a problem because I read off the titles of the worksheets ("AssetList") which maintain their actual order throughout subsequent codes. Therefore when I recombine the titles and the returns I find that they do not match. This complicates further manipulation when, for example column 4 is titled "TS 4" but actually contains the data of TS 18.
Is there something in this code that I have wrong?
XL='TimeSeries.xlsx';
formatIn = 'dd/mm/yyyy';
formatOut = 'mmm-dd-yyyy';
Bounds=3;
[Bounds,~] = xlsread(XL,Bounds);
% Determine the number of worksheets in the xls-file:
FirstSheet=5;
[~,AssetList] = xlsfinfo(XL);
lngth=size(AssetList,2);
AssetList(:,1:FirstSheet-1)=[];
% Loop through the number of sheets and RETRIEVE VALUES
merge_count = 1;
for I=FirstSheet:lngth
    [FundValues, ~, FundSheet] = xlsread(XL,I);
    % EXTRACT DATES AND DATA AND COMBINE
    % (TO REMOVE UNNECESSARY TEXT IN ROWS 1 TO 4)
    Fund_dates_data = FundSheet(4:end,1:2);
    FundDates = cellstr(datestr(datevec(Fund_dates_data(:,1),...
        formatIn),formatOut));
    FundData = cell2mat(Fund_dates_data(:,2));
    % CREATE TIME SERIES FOR EACH FUND
    Fundts{I}=fints(FundDates,FundData,['Fund',num2str(I)]);
    if merge_count == 2
        Port = merge(Fundts{I-1},Fundts{I},'DateSetMethod','Intersection');
    end
    if merge_count > 2
        Port = merge(Port,Fundts{I},'DateSetMethod','Intersection');
    end
    merge_count = merge_count + 1;
end
% ANALYSE PORTFOLIO
Returns=tick2ret(Port);
q = Portfolio;
q = q.estimateAssetMoments(Returns)
[qassetmean, qassetcovar] = q.getAssetMoments
This is probably due to merge. By default, it sorts columns alphabetically. Unfortunately, as your naming pattern is "FundN", this means that, for example, Fund10 will normally be sorted before Fund9. So as you're looping over I from 5 to 19, you will have Fund10 through Fund19, followed by Fund5 through Fund9.
One way of solving this would be to always use zero padding (Fund01, Fund02, etc.) so that alphabetical order and numerical order are the same. Alternatively, force the columns to stay in the order in which you read and merge the data by setting SortColumns to 0 in each merge call:
Port = merge(Port,Fundts{I},'DateSetMethod','Intersection','SortColumns',0);
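The ordering effect itself can be checked outside MATLAB. The sketch below uses Python's built-in sorted as a stand-in for an alphabetical column sort; it does not call merge, and the names simply mirror the "FundN" pattern from the question.

```python
# Names in the order the sheets are read (I = 5 ... 19)
names = [f"Fund{i}" for i in range(5, 20)]
# Same names with zero padding to two digits: Fund05 ... Fund19
padded = [f"Fund{i:02d}" for i in range(5, 20)]

sorted_names = sorted(names)    # plain alphabetical (character-by-character) order
sorted_padded = sorted(padded)

print(sorted_names[:3], sorted_names[-2:])
print(sorted_padded == padded)
```

Without padding, the alphabetical order starts with Fund10 and ends with Fund9, because '1' sorts before '5'. With padding, the alphabetical order is identical to the read order, which is why the zero-padding suggestion keeps titles and data aligned.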