Reading and Combining Excel Time Series in MATLAB: Maintaining Order

I have the following code to read time series data from sheets 5 to 19 of an Excel workbook. Each worksheet is titled "TS" followed by the number of the time series. The process works fine apart from one thing: when I study the returns, I find that all the time series are shifted along by 5, i.e. TS 6 becomes the 11th column in the "returns" data, TS 19 becomes the 5th column, TS 15 becomes the 1st column, etc. I need them to be in the same order in which they are read, so that TS 1 is in the 1st column, TS 2 in the 2nd, etc.
This is a problem because I read the worksheet titles ("AssetList"), which keep their actual order throughout the subsequent code. When I recombine the titles and the returns, they do not match. This complicates further manipulation when, for example, column 4 is titled "TS 4" but actually contains the data of TS 18.
Is there something wrong in this code?
XL='TimeSeries.xlsx';
formatIn = 'dd/mm/yyyy';
formatOut = 'mmm-dd-yyyy';
Bounds=3;
[Bounds,~] = xlsread(XL,Bounds);
% Determine the number of worksheets in the xls-file:
FirstSheet=5;
[~,AssetList] = xlsfinfo(XL);
lngth=size(AssetList,2);
AssetList(:,1:FirstSheet-1)=[];
% Loop through the number of sheets and RETRIEVE VALUES
merge_count = 1;
for I=FirstSheet:lngth
    [FundValues, ~, FundSheet] = xlsread(XL,I);
    % EXTRACT DATES AND DATA AND COMBINE
    % (TO REMOVE UNNECESSARY TEXT IN ROWS 1 TO 4)
    Fund_dates_data = FundSheet(4:end,1:2);
    FundDates = cellstr(datestr(datevec(Fund_dates_data(:,1),...
        formatIn),formatOut));
    FundData = cell2mat(Fund_dates_data(:,2));
    % CREATE TIME SERIES FOR EACH FUND
    Fundts{I}=fints(FundDates,FundData,['Fund',num2str(I)]);
    if merge_count == 2
        Port = merge(Fundts{I-1},Fundts{I},'DateSetMethod','Intersection');
    end
    if merge_count > 2
        Port = merge(Port,Fundts{I},'DateSetMethod','Intersection');
    end
    merge_count = merge_count + 1;
end
% ANALYSE PORTFOLIO
Returns=tick2ret(Port);
q = Portfolio;
q = q.estimateAssetMoments(Returns)
[qassetmean, qassetcovar] = q.getAssetMoments

This is probably due to merge. By default, it sorts columns alphabetically. Unfortunately, as your naming pattern is "FundN", this means that, for example, Fund10 will be sorted before Fund9. So as you're looping over I from 5 to 19, you will have Fund10 through Fund19, followed by Fund5 through Fund9.
One way of solving this would be to always use zero padding (Fund01, Fund02, etc.) so that alphabetical order and numerical order are the same. Alternatively, force the columns to stay in the order you read/merge the data by setting SortColumns to 0 in each merge call, for example:
Port = merge(Port,Fundts{I},'DateSetMethod','Intersection','SortColumns',0);
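The ordering effect described above can be reproduced outside MATLAB. The following is a minimal Python sketch (not the fints/merge API) that sorts the same "FundN" names as plain strings, then again with zero padding; the variable names are illustrative only.

```python
# Names as generated by ['Fund', num2str(I)] for I = 5..19.
names = [f"Fund{i}" for i in range(5, 20)]

# Plain string sort: "Fund10" comes before "Fund5" because '1' < '5'.
alphabetical = sorted(names)

# Zero-padded names: string order and numeric order now agree.
padded = sorted(f"Fund{i:02d}" for i in range(5, 20))

print(alphabetical[:3], alphabetical[10:12])
print(padded[:3])
```

With unpadded names, the ten two-digit funds come first and Fund5 lands in the 11th position, which is consistent with the shift reported in the question.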

Related

C++ Excel Automation: How to use the SetFormula function for cells whose row and column are unknown until the program runs?

I am working on legacy code that uses C++ Excel Automation to output our analysis data to an Excel spreadsheet. From the following article,
https://support.microsoft.com/en-us/topic/how-to-use-mfc-to-automate-excel-and-create-and-format-a-new-workbook-6f2450bc-ba35-a36a-df2f-c9dd53d7aef1
I learned that we can use the range.SetFormula() function to set a formula on specific cells, for example:
range = sheet.GetRange(COleVariant("C2"), COleVariant("C6"));
range.SetFormula(COleVariant("=A2 & \" \" & B2"));
My question is how I can use the SetFormula function to refer to cells whose row and column are unknown in advance but are determined as the program runs. Specifically, a number of cells are populated as my analysis runs, and different analyses output different numbers of elements to the spreadsheet. For example, if I have kw data points, the Excel output will occupy kw rows by 6 columns, and I also need to output some summary results, based on these elements, underneath them. Something like this:
int kw = var_length; // the row changes depending on different analysis
DWORD numElements[2];
Range range;
range = sheet.GetRange(COleVariant(_T("A3")),COleVariant(_T("A3")));
numElements[0]= kw; //Number of rows in the range.
numElements[1]= 6; //Number of columns in the range.
saRet.Create(VT_R8, 2, numElements);
for (int iRow = 0; iRow < kw; iRow++)
{
    for (int iCol = 0; iCol < 6; iCol++)
    {
        index[0] = iRow;
        index[1] = iCol;
        saRet.PutElement(index, &somevalue);
    }
}
range.SetValue2(COleVariant(saRet));
CString TStr;
TStr.Format(_T("A%d"), kw+2);
range = sheet.GetRange(COleVariant(TStr), COleVariant(TStr));
CString t1, t2;
t1.Format(_T("A%d"), kw/2);
t2.Format(_T("A%d"), kw);
range.SetFormula(COleVariant(L"=SUM(A&t1: A&t2)")); // Sum the second half of the elements. This does not work. How can I fix it?
Here I want to sum the second half of the elements, but when calling SetFormula I do not know the exact row numbers for these elements, e.g. A25 to A50. The row numbers depend on kw, which is an input to the program and differs between analyses. I attempted to use TStr.Format to get the row number, but the result cannot be referenced inside the SetFormula string. Ideally I want to use a formula for my summary output so that if I change the populated values, the summary changes accordingly. I searched MSDN but couldn't find a solution.
Can someone help me with this issue?
Thanks in advance.
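No answer is recorded for this question. One observation is that the argument to SetFormula is ordinary text, so the row numbers can be formatted into the formula string before it is passed in; in MFC this could presumably be done with the same CString::Format call the question already uses for cell addresses. The sketch below shows only the string-building step in Python; `sum_formula` is a hypothetical helper, and the row bounds follow the question's own kw/2 and kw.

```python
def sum_formula(column: str, first_row: int, last_row: int) -> str:
    """Build an A1-style SUM formula for one column and a row range."""
    return f"=SUM({column}{first_row}:{column}{last_row})"


kw = 50  # number of populated rows; known only at run time
formula = sum_formula("A", kw // 2, kw)
print(formula)  # =SUM(A25:A50)
```

The resulting string is what would be handed to the formula setter, so the spreadsheet still recalculates the sum when the underlying cells change.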

Need an Expression to Split a DateRange column into 3 bins

I want to show the count of IDs against a DateRange column. I tried to split it into bins, but the only option was to split into a number of equal parts, not at 3 specific date values. I want to split the Date Range column into 3 parts so that I can show the data in a bar chart as Current, Past and Future. The 3 bins are defined as:
1. Current - Count of ID for the current month [Dec 2016]. The current month should be calculated dynamically, so that when the next month comes the bin points to that month.
2. Past - Count of ID for data before the current month [Data < Dec 2016]. I need to be able to change the number of months dynamically so that the user can choose how many months to go back. Ideally this is set through a custom expression; if not, the number can be changed in the expression itself.
3. Future - Count of ID for data after the current month [Data > Dec 2016]. I need to be able to change the number of months dynamically so that the user can choose how many months to look ahead. Future dates are available because the data covers maintenance scheduled for the future.
These 3 bins need to be a custom/binned column so that the data is represented as shown in the attached picture.
You just need to create a calculated column, where NumOfMonthsBack and NumOfMonthsAhead are properties the user can change:
case
    when DatePart('month',[DateColumn]) = DatePart('month',DateTimeNow()) and DatePart('year',[DateColumn]) = DatePart('year',DateTimeNow()) then "Current"
    when [DateColumn] < DateTimeNow() and [DateColumn] >= DateAdd('month',${NumOfMonthsBack} * (-1),DateTimeNow()) then "Past"
    when [DateColumn] > DateTimeNow() and [DateColumn] <= DateAdd('month',${NumOfMonthsAhead},DateTimeNow()) then "Future"
end as [MonthRange]
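To check the binning logic of the case expression above outside the tool, here is a rough Python sketch. It is a simplification, not the same expression: it compares whole months rather than exact timestamps, and `month_range`, `months_back` and `months_ahead` are hypothetical names standing in for the calculated column and the two user-set properties.

```python
from datetime import date
from typing import Optional


def month_range(d: date, today: date, months_back: int, months_ahead: int) -> Optional[str]:
    """Classify a date as Current, Past or Future relative to today's month."""
    # Signed distance in whole months between d's month and today's month.
    diff = (d.year * 12 + d.month) - (today.year * 12 + today.month)
    if diff == 0:
        return "Current"
    if -months_back <= diff < 0:
        return "Past"
    if 0 < diff <= months_ahead:
        return "Future"
    return None  # outside both windows, like a case with no matching branch


today = date(2016, 12, 20)
labels = [month_range(d, today, 3, 3)
          for d in (date(2016, 12, 5), date(2016, 10, 1), date(2017, 2, 1), date(2016, 1, 1))]
print(labels)
```

Dates that fall outside both windows get no label, so they drop out of a count grouped by this column.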

How to convert matrix data into columns?

I have a 100 x 101 data matrix. I want to convert it into a series: all column values for the first row, then all column values for the 2nd row, and so on. The result will have only three columns: the first with the row number, the second with the column number, and the third with the value at that row and column.
Could you please help me do this conversion in MATLAB?
The data are available in ASCII format and can be opened in both MATLAB and Excel.
This can be done with find (which returns only the nonzero elements, so it assumes A contains no zeros):
A = rand(100,101);
[data(:,1), data(:,2), data(:,3)] = find(A);
data = sortrows(data,[1 2]);
Note that this is inefficient in terms of storage, as you are storing 3 values per element where you only need to store 1 (the element's actual value). To access a specific element, say row 31, column 43, you can simply index the matrix with A(31,43).
The memory footprint of data is indeed three times that of A:
whos
  Name        Size        Bytes  Class     Attributes
  A         100x101       80800  double
  data    10100x3        242400  double
You can use the ind2sub function instead, which is faster and makes more sense in this situation:
tic
A = rand(100,101);
[data(:,1), data(:,2), data(:,3)] = find(A);
data = sortrows(data,[1 2]);
toc
tic
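B = A.';  % transpose so that linear indexing walks along the rows of A
% Rows of B are columns of A, and columns of B are rows of A.
[colIdx, rowIdx] = ind2sub(size(B), 1:numel(B));
data_B = [rowIdx(:), colIdx(:), B(:)];
toc
The output for the timing is as follows: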
Elapsed time is 0.002130 seconds (first method)
Elapsed time is 0.000525 seconds (second method).
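For comparison, the same row-by-row flattening can be written in a few lines of plain Python. This is only a sketch on a small list-of-lists matrix, with `to_triplets` as a hypothetical helper name. Unlike a find-based approach, it keeps zero-valued elements.

```python
def to_triplets(matrix):
    """Return (row, column, value) tuples in row-major order, 1-based."""
    return [
        (r, c, value)
        for r, row in enumerate(matrix, start=1)
        for c, value in enumerate(row, start=1)
    ]


A = [
    [0.0, 1.5, 2.5],
    [3.5, 0.0, 4.5],
]
triplets = to_triplets(A)
for t in triplets:
    print(t)
```

Every element of the matrix appears exactly once, so a 100 x 101 input would give 10100 rows.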

What is the most efficient format for storing strings from a for loop?

I have a script that runs through a series of strings and, using regex, pulls out certain substrings (approximately 4 output strings per input string).
e.g. HelloStackOverflowWorld
-> Hello; Stack; Overflow; World;
The final output would ideally be a table where I can filter based upon the strings in the columns. Using the case above, column 1 row 1 would have 'Hello', column 2 row 1 would have 'Stack' and so on.
The problem is, the size of the output will change depending on the input so I am unsure of what output format to use.
At the moment I used something similar to this:
if strfind(missing{ii},'hello')
    miss.exch = [miss.exch;'hello'];
    temp.exc = regexp(missing{ii},'(?<=\d[Q|T])(\w*?)(?=[q])','match');
    miss.exc = [miss.exc;temp.exc];
    temp.TQ= regexp(missing{ii},'(Qc|Tc)','match');
    if strcmp(temp.TQ{1,1}, 'Tc')
        miss.TQ = [miss.TQ;'variableA'];
    elseif strcmp(temp.TQ{1,1}, 'Qc')
        miss.TQ = [miss.TQ;'variableB'];
    end
else if .........
end
This obviously results in a 1x1 struct with a number of fields, each containing many cells, which makes filtering on strings difficult!
How can I define and add data into a 'table of strings' that I can then filter?
I think you are just looking for a cell array. Here is a simple example of what they can do:
C = {'Abc','Bcd';'Cde',[]}
strcmp(C,'Cde')
Results in:
ans =
0 0
1 0
Make sure to check doc cell to see how you can access them.
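The same idea, a rectangular grid of strings with empty slots where a row has fewer matches, can be sketched in Python to show how padding and filtering fit together. The sample rows and the names `rows`, `table` and `matches` are made up for illustration.

```python
# Each inner list holds the strings extracted from one input; lengths vary.
rows = [
    ["Hello", "Stack", "Overflow", "World"],
    ["Hello", "World"],
    ["Goodbye", "Stack", "World"],
]

# Pad every row with empty strings so the table is rectangular.
width = max(len(r) for r in rows)
table = [r + [""] * (width - len(r)) for r in rows]

# Filter on the second column, analogous to comparing one column with strcmp.
matches = [r for r in table if r[1] == "Stack"]
print(matches)
```

Padding up front means the number of columns does not need to be known before the loop runs.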

Separating Data in Excel

I receive data that is numbered consecutively, but under each number there are zero, 1, 2 or 3 pieces of information before the next number. I need to separate the data so that the numbers are evenly spaced!
Any help would be gratefully received. I should have done Excel at university, but that was decades ago!
Thank you,
Graeme
Hi Robert, as an Australian new user I am not allowed to insert images! Hence imagine each comma as separating cells and a dash as an empty Excel cell, with the first row being what I initially receive. The "g,h" etc. are sentences. The 2nd row is what I would like to achieve. Unfortunately it won't format to show the cells going down rather than across, so it has to go across here, as it would in Excel or an inserted image. Thanks!
Receive 1,g,h,2,k,3,a,c,v,4,5,r,t
Require 1,g,h,-,2,k,-,-,3,a,c,v,4,-,-,-,5,r,t,-,
Hence if I received 3 pieces of information for each "number" (client), the numbers would automatically be evenly spaced, but unfortunately this does not occur!
Graeme, let me know if you can create VBA macros based on an example.
I just wrote one that does the job, although it is far from elegant.
<-----------CODE START--------->
Sub uniformdata()
    ReceiveCol = 1 'Column where received data starts
    ReceiveRow = 1 'Row where received data starts
    PublishCol = 1 'Column where published data starts
    PublishRow = 2 'Row where published data starts
    Do Until Cells(ReceiveRow, ReceiveCol).Value = ""
        For counter = 1 To 4
            If counter = 1 Then
                'First slot of each group: copy the number itself
                Cells(PublishRow, PublishCol).Value = Cells(ReceiveRow, ReceiveCol).Value
                ReceiveCol = ReceiveCol + 1
            Else
                If IsNumeric(Cells(ReceiveRow, ReceiveCol).Value) Then
                    'Next number reached early: pad the gap with a dash
                    Cells(PublishRow, PublishCol).Value = "-"
                Else
                    Cells(PublishRow, PublishCol).Value = Cells(ReceiveRow, ReceiveCol).Value
                    ReceiveCol = ReceiveCol + 1
                End If
            End If
            PublishCol = PublishCol + 1
        Next
    Loop
End Sub
<---------CODE END--------->
The macro goes through your list from left to right until it finds an empty cell. For each number it runs an inner loop 4 times. On the first pass it assumes that your number is there and writes it immediately. On the other 3 passes, it checks whether the next received cell is a number. If it is a number, it fills the gap with a dash; if it is not, it copies the value found.
Let me know if this solves your issue. In my example, it worked like a charm.
Regards,
Robert Ilbrink
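The padding logic of the macro can also be expressed as a short Python sketch, which makes it easy to check against the example rows in the question. This is an illustration rather than a replacement for the VBA: `pad_records` is a hypothetical name, cells are plain strings, and a cell counts as a number only if it consists of digits (a narrower test than VBA's IsNumeric).

```python
def pad_records(cells, width=4, filler="-"):
    """Lay out each number and its items in exactly `width` slots."""
    out = []
    i = 0
    while i < len(cells):
        # First slot of each group: the number itself.
        out.append(cells[i])
        i += 1
        # Remaining slots: copy items until the next number or the end.
        for _ in range(width - 1):
            if i < len(cells) and not cells[i].strip().isdigit():
                out.append(cells[i])
                i += 1
            else:
                out.append(filler)
    return out


received = "1,g,h,2,k,3,a,c,v,4,5,r,t".split(",")
required = pad_records(received)
print(",".join(required))
```

As in the macro, a group with more than three items would overflow into the next group's number slot, so the sketch assumes at most three items per number.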
