How to write data to Excel using Scriptom in Groovy? - excel

I am reading properties and their values from soapUI and writing them to an Excel file.
I am able to write the unique property names into the Excel file:
def oExcel = new ActiveXObject('Excel.Application')
Thread.sleep(1000)
assert oExcel != null, "Excel object not initialized"
def openWb = oExcel.Workbooks.Open(excelPath) //excelPath complete path to the excel
def dtUsedRange = openWb.Sheets(dataSheetName).UsedRange //dataSheetName is the name of the sheet which will ultimately hold the data
//add property names to xlMapSheet under col d or col# 4
for(int r = 1;r<=uniqPropName.size().toInteger();r++){ //uniqPropName is a list that holds all the unique property names in a test suite
openWb.Sheets(xlMapSheet).Cells(r,4).Value = uniqPropName[r-1]
}
oExcel.DisplayAlerts = false
openWb.Save
oExcel.DisplayAlerts = true
openWb.Close(false,null,false)
oExcel.Quit()
Scriptom.releaseApartment()
However, now I have to write all the properties to the same Excel file. I have already created a map of the Excel column names and soapUI properties, so now I just have to find the matching Excel column name in the map and write the property value under that column.
I am using a function to do this. The function is called from within a for loop that iterates over all the properties in a test case. To this function I pass:
sheetName //sheet where data has to be written
sheet //path of the excel file
pName //property name
pValue //property value
xMap //excel col name/heading map
tName //test case name
tsNum //step number
The relevant code for this function is below.
def write2Excel(sheetName,sheet,pName,pValue,xMap,tName,tsNum){
//find the xl Col Name from the map
def xl = new ActiveXObject('Excel.Application')
assert xl != null, "Excel object not initialized"
//open excel
def wb = xl.Workbooks.Open(sheet)
def rng = wb.Sheets(sheetName).UsedRange
//get row count
int iColumn = rng.Columns.Count.toInteger()
int iRow = rng.Rows.Count.toInteger()
//find column number using the col name
//find the row with matching testcase name and step#
//write data to excel
if(rFound){ //if a row matching test case name and step number is found
rng.Cells(r,colId).Value = pValue
}else{
rng = rng.Resize(r+1,iColumn) //if the testcase and step# row doesn't exist then the current range has to be extended to add one more row of data.
rng.Cells(r+1,colId).Value = pValue
}
//save and close
xl.DisplayAlerts = false
wb.Save
xl.DisplayAlerts = true
wb.Close(false,null,false)
xl.Quit()
Scriptom.releaseApartment()
}
The code is currently running. It has been running since yesterday evening (2pm EST), so even if the code works it is not optimal. I can't wait this long to write data.
The curious thing is that the size of the Excel file keeps increasing, which would suggest that data is being written to it, but I have checked the file and it has no new data. Nothing, zilch!
Evidence of increasing size of the file.
20/02/2014 04:23 PM 466,432 my_excel_file.xls
20/02/2014 04:23 PM 466,944 my_excel_file.xls
20/02/2014 04:38 PM 470,016 my_excel_file.xls
20/02/2014 04:45 PM 471,552 my_excel_file.xls
20/02/2014 04:47 PM 472,064 my_excel_file.xls
20/02/2014 05:01 PM 474,112 my_excel_file.xls
20/02/2014 05:01 PM 474,112 my_excel_file.xls
21/02/2014 07:23 AM 607,232 my_excel_file.xls
21/02/2014 07:32 AM 608,768 my_excel_file.xls
21/02/2014 07:50 AM 611,328 my_excel_file.xls
My questions are:
1. Why is data not being written when I call the function from within the for loop, but written when I call it linearly?
2. In the first piece of code the Excel process goes away when it's done writing, but when the function is run the Excel process remains, even though its memory utilization goes up and down.
I am going to kill the Excel process and, instead of looping, try to write only one or two sets of data using the function. I will update this question accordingly.

Opening an Excel file, writing to a cell, saving the file, and closing it is a time-consuming sequence, and when you multiply it by 300 test cases and ~15 properties per test, it can take a very long time. That is what was happening in my case, and hence the process was taking forever to complete.
I am not 100% sure why the size of the Excel file was increasing while nothing was getting written, but I would guess that the data was being kept in memory and would have been written once the last cell was written, the workbook saved, and Excel closed. This never happened because I didn't let it complete; I killed it when I realized it had been running for an exceptionally long time.
To make this work, I changed my approach to the following:
1. Generate a map of column name and property name.
2. Generate a map of property name and property value for each test case. As one test case can have multiple property test steps, I create a multi-map like this:
[Step#:[propname:propvalue,....propname:propvalue]]
3. Create another map of column name and column id.
4. Create a new map of column id and property value, built from the maps above.
5. Write the data to Excel. Because I already have the column id and the value that goes into it, I don't do any checks and just write the data.
These steps are repeated for all test cases in the test suite. Using this process, I was able to complete my task within a few minutes.
I know I am using quite a few maps, but this is the approach I could come up with. If anyone has a better approach, I would really like to try that out too.
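The map-based approach described above can be sketched roughly as follows. This is an illustration in Python rather than the Groovy/Scriptom code actually used, and every name and sample value in it (col_to_prop, col_to_id, steps, build_rows, to_cells) is hypothetical. It only shows how the lookups can be resolved in memory first, so that the workbook needs to be opened, written, and saved once per batch rather than once per cell.

```python
# Hypothetical inputs; none of these names or values come from the original script.
col_to_prop = {"UserName": "user", "Password": "pass"}  # Excel heading -> soapUI property
col_to_id = {"UserName": 4, "Password": 5}              # Excel heading -> column number
steps = {                                               # step# -> {property: value}
    1: {"user": "alice", "pass": "secret"},
    2: {"user": "bob"},
}


def build_rows(steps, col_to_prop, col_to_id):
    """Return {step#: {column id: value}} using plain dictionary lookups."""
    prop_to_id = {prop: col_to_id[col] for col, prop in col_to_prop.items()}
    rows = {}
    for step, props in steps.items():
        rows[step] = {prop_to_id[p]: v for p, v in props.items() if p in prop_to_id}
    return rows


def to_cells(rows, first_row=2):
    """Flatten to (row, column, value) triples, one sheet row per step."""
    cells = []
    for offset, step in enumerate(sorted(rows)):
        for col, value in sorted(rows[step].items()):
            cells.append((first_row + offset, col, value))
    return cells


rows = build_rows(steps, col_to_prop, col_to_id)
cells = to_cells(rows)
```

The resulting list of triples could then be handed to a single open/write/save/close cycle, which is where the time saving is expected to come from.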

Related

MATLAB: Save multiple tables in Excel using a while loop

I have the following while loop in which an image is read and analyzed, then the results are saved in a table, that is saved in an Excel worksheet. I initially did this code for one single image/table, then realized I need to develop it for n images.
I basically want the results to be saved in the same Excel worksheet without overwriting, ideally the tables are vertically separated by an empty row.
Here's my effort so far:
while(1)
...
%code code code
...
message = sprintf('Do you want to save the results in an Excel worksheet?');
reply = questdlg(message,'Run Program?','OK','Cancel', 'OK');
if strcmpi(reply, 'Cancel')
% User canceled so exit.
return;
end
% Table creation.
% code code code
% Saving table to Excel
T = table(Diameter,BandWidth,n1,n2,P1,P2,Damage,...
'RowNames',Band);
filename = 'Results.xlsx';
writetable(T, filename, 'Sheet',1, 'Range', 'A1','WriteRowNames',true);
% Create a while loop to save more experiments to the Excel worksheet.
promptMessage = sprintf('Do you want to process another photo?');
button = questdlg(promptMessage, 'Continue', 'Continue', 'Cancel', 'Continue');
if strcmpi(button, 'Cancel')
break;
end
end
If it can help you to get an idea, each table is a 6x8.
Prior to your while loop, declare a cell array to hold the table that you will eventually write to an excel file.
cellArrayOfTableToBeWritten = {};
Also prior to the loop, define a cell array that will serve as a blank row.
rowWidth = 8;
blankrow = repmat({''},1,rowWidth);
Where you currently write the table, instead add what you would have written to the cell array with a blank row at the bottom.
cellArrayOfTableToBeWritten = [cellArrayOfTableToBeWritten;
T.Properties.VariableNames;
table2cell(T);
blankrow];
Once your while loop is done, write the combined cell array to a file as an excel file.
xlswrite(filename, cellArrayOfTableToBeWritten);
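The accumulate-then-write-once idea in this answer is not specific to MATLAB. As a rough sketch in Python (the shortened header, the sample numbers, and the stack_tables helper are all made up for illustration), each table is appended with its header and a trailing blank row, and the file is written a single time after the loop:

```python
import csv
import io


def stack_tables(tables, header):
    """Stack tables vertically: header, data rows, then one blank row per table."""
    blank = [""] * len(header)
    combined = []
    for rows in tables:
        combined.append(list(header))
        combined.extend(list(r) for r in rows)
        combined.append(blank)
    return combined


header = ["Diameter", "BandWidth"]                 # shortened, hypothetical header
tables = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]  # two small made-up tables
combined = stack_tables(tables, header)

buf = io.StringIO()                  # stands in for the output file
csv.writer(buf).writerows(combined)  # a single write after the loop
```

Note that the blank row has to be exactly as wide as the tables, otherwise the vertical stacking no longer lines up.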

How to loop through excel sheets, perform calculations, and compile results

I have roughly 70,000 sheets that all have to have calculations done, and then all results compiled into a new sheet (which would be 70,000 lines long).
It needs to be sorted by date.
I'm VERY very very poor at MATLAB, but I have what I need the script to do for each Excel sheet; I'm just unsure how to make it do that for all of them.
Thank you!!! (I took out some of the not important code)
%Reading in excel sheet
B = xlsread('24259893-008020361800.TorqueData.20160104.034602AM.csv');
%Creating new matrix
[inYdim, inXdim] = size(B);
Ydim = inYdim;
[num,str,raw]=xlsread('24259893-008020361800.TorqueData.20160104.034602AM.csv',strcat('A1:C',num2str(Ydim)));
%Extracting column C
C=raw(:,3);
for k = 1:numel(C)
if isnan(C{k})
C{k} = '';
end
end
%Calculations
TargetT=2000;
AvgT=mean(t12);
TAcc=((AvgT-TargetT)/TargetT)*100 ;
StdDev=std(B(ind1:ind2,2));
ResTime=t4-t3;
FallTime=t6-t5;
DragT=mean(t78);
BreakInT=mean(t910);
BreakInTime=(t10-t9)/1000;
BreakInE=BreakInT*BreakInTime*200*.1047;
%Combining results
Results=[AvgT TAcc StdDev ResTime FallTime DragT BreakInT BreakInTime BreakInE]
I think I need to do something along the lines of:
filenames=dir('*.csv')
and I found this that may be useful:
filenames=dir('*.csv');
for file=filenames'
csv=load(file.name);
% with stuff in here
end
You have the right idea, but you need to index your file names in order to be able to step through them in the for loop.
FileDir = 'Your Directory';
FileNames = {'Test1';'Test2';'Test3'};
for k=1:length(FileNames)
file=[FileDir,'/',FileNames{k}];
[outputdata]=xlsread(file,sheet,range); % sheet and range are placeholders for the sheet number and data location you need
% THE REST OF YOUR LOOP, indexed by k
end
How you choose to get the file names and directory is up to you.
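For comparison, here is a minimal sketch of the same loop-and-compile pattern in Python. It assumes purely numeric CSV files whose names follow the pattern shown in the question (with the date as the third dot-separated field), and it uses the mean of the second column as a stand-in for the real calculations; summarize and compile_results are hypothetical names.

```python
import csv
import glob
import os
import statistics


def summarize(path):
    """Return (date string taken from the file name, mean of the second column)."""
    # Assumes names like '<id>.TorqueData.20160104.034602AM.csv', as in the question.
    date = os.path.basename(path).split(".")[2]
    with open(path, newline="") as f:
        values = [float(row[1]) for row in csv.reader(f) if len(row) > 1]
    return date, statistics.mean(values)


def compile_results(folder):
    """Summarize every CSV file in the folder and sort the results by date."""
    paths = glob.glob(os.path.join(folder, "*.csv"))
    return sorted((summarize(p) for p in paths), key=lambda result: result[0])
```

Because the dates are written as YYYYMMDD, sorting the strings also sorts them chronologically.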

Working with Excel sheets in MATLAB

I need to import some Excel files in MATLAB and work on them. My problem is that each Excel file has 15 sheets and I don't know how to "number" each sheet so that I can make a loop or something similar (because I need to find the average on a certain column on each sheet).
I have already tried importing the data and building a loop but MATLAB registers the sheets as chars.
Use xlsfinfo to get the sheet names, then use xlsread in a loop.
[status,sheets,xlFormat] = xlsfinfo(filename);
for sheetindex=1:numel(sheets)
[num,txt,raw]=xlsread(filename,sheets{sheetindex});
data{sheetindex}=num; %keep for example the numeric data to process it later outside the loop.
end
I've just remembered that I posted this question almost 2 years ago, and since I figured it out, I thought that posting the answer could prove useful to someone in the future.
So to recap: I needed to import a single column from 4 Excel files, with each file containing 15 worksheets. The columns were of variable lengths. I figured out two ways to do this. The first one is by using the xlsread function with the following syntax.
for count_p = 1:2
a = sprintf('control_group_%d.xls',count_p);
[status,sheets,xlFormat] = xlsfinfo(a);
for sheetindex=1:numel(sheets)
[num,txt,raw]=xlsread(a,sheets{sheetindex},'','basic');
data{sheetindex}=num;
FifthCol{count_p,sheetindex} = (data{sheetindex}(:,5));
end
end
for count_p = 3:4
a = sprintf('exercise_group_%d.xls',(count_p-2));
[status,sheets,xlFormat] = xlsfinfo(a);
for sheetindex=1:numel(sheets)
[num,txt,raw]=xlsread(a,sheets{sheetindex},'','basic');
data{sheetindex}=num;
FifthCol{count_p,sheetindex} = (data{sheetindex}(:,5));
end
end
The files were obviously named control_group_1, control_group_2, etc. I used the 'basic' input in xlsread because I only needed the raw data from the files, and it proved to be much faster than using the full functionality of the function.
The second way to import the data, and the one that I ended up using, is building your own ActiveX server and running a single Excel application on it. xlsread "opens" and "closes" an ActiveX server each time it's called, so it's rather time-consuming (using the 'basic' input avoids this, though). The code I used is the following.
Folder=cd(pwd); %getting the working directory
d = dir('*.xls'); %finding the xls files
N_File=numel(d); % Number of files
hexcel = actxserver ('Excel.Application'); %starting the ActiveX server and running an Excel application on it
hexcel.DisplayAlerts = true;
for index = 1:N_File %Looping through the workbooks(xls files)
Wrkbk = hexcel.Workbooks.Open(fullfile(pwd, d(index).name)); %VBA functions & commands
WorkName = Wrkbk.Name; %getting the workbook name
display(WorkName)
Sheets=Wrkbk.Sheets; %sheets handle
ShCo(index)=Wrkbk.Sheets.Count; %counting them for use in the next loop
for j = 1:ShCo(index) %looping through each sheet
itemm = hexcel.Sheets.Item(sprintf('sheet%d',j)); %VBA commands
itemm.Activate;
robj = itemm.Columns.End(4); %getting the column i needed
numrows = robj.row; %counting to the end of the column
dat_range = ['E1:E' num2str(numrows)]; %data range
rngObj = hexcel.Range(dat_range);
xldat{index, j} = cell2mat(rngObj.Value); %getting the data in a cell
end;
end
%invoke(hexcel);
Quit(hexcel);
delete(hexcel);

How to import lots of data into matlab from a spreadsheet?

I have an excel spreadsheet with lots of data that I want to import into matlab.
filename = 'for_matlab.xlsx';
sheet = (13*2)+ 1;
xlRange = 'A1:G6';
all_data = {'one_a', 'one_b', 'two_a', 'two_b', 'three_a', 'three_b', 'four_a', 'four_b', 'five_a', 'five_b', 'six_a', 'six_b', 'seven_a', 'seven_b', 'eight_a', 'eight_b', 'nine_a', 'nine_b', 'ten_a', 'ten_b', 'eleven_a', 'eleven_b', 'twelve_a', 'twelve_b', 'thirteen_a', 'thirteen_b', 'fourteen_a'};
%read data from excel spreadsheet
for i=1:sheet,
all_data{i} = xlsread(filename, sheet, xlRange);
end
Each element of the 'all_data' vector has a corresponding matrix in separate excel sheet. The code above imports the last matrix only into all of the variables. Could somebody tell me how to get it so I can import these matrices into individual matlab variables (without calling the xlsread function 28 times)?
You define a loop using i but then put sheet in the actual xlsread call, which will just make it read repeatedly from the same sheet (the value of the variable sheet is not changing). Also not sure whether you intend to somehow save the contents of all_data, as written there's no point in defining it that way as it will just be overwritten.
There are two ways of specifying the sheet using xlsread.
1) Using a number. If you intended this then:
all_data{i} = xlsread(filename, i, xlRange);
2) Using the name of the sheet. If you intended this and the contents of all_data are the names of sheets, then:
data{i} = xlsread(filename, all_data{i}, xlRange); %avoiding overwriting
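The underlying fix, passing the loop index rather than the fixed sheet count and storing each result under its own key, can be sketched outside MATLAB as well. In this Python illustration read_sheet is a stand-in for xlsread, and fake_reader is an invented placeholder so the snippet runs without Excel:

```python
def read_all(read_sheet, sheet_names):
    """Read every sheet once, keyed by name; read_sheet is a stand-in for xlsread."""
    data = {}
    for i, name in enumerate(sheet_names, start=1):
        data[name] = read_sheet(i)  # pass the loop index, not the sheet count
    return data


# Fake reader so the sketch runs without Excel: sheet i -> a 1x1 "matrix".
def fake_reader(i):
    return [[i * 10]]


data = read_all(fake_reader, ["one_a", "one_b", "two_a"])
```

Keeping the matrices in one container keyed by name avoids creating dozens of separate variables while still letting you fetch any of them by name.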

Excel UDF calculation should return 'original' value

I have created a VSTO plugin with my own RTD implementation that I am calling from my Excel sheets. To avoid having to use the full-fledged RTD syntax in the cells, I have created a UDF that hides that API from the sheet.
The RTD server I created can be enabled and disabled through a button in a custom Ribbon component.
The behavior I want to achieve is as follows:
If the server is disabled and a reference to my function is entered in a cell, I want the cell to display Disabled.
If the server is disabled, but the function had been entered in a cell when it was enabled (and the cell thus displays a value), I want the cell to keep displaying that value.
If the server is enabled, I want the cell to display Loading.
Sounds easy enough. Here is an example of the - non functional - code:
Public Function RetrieveData(id as Long)
Dim result as String
' This returns either 'Disabled' or 'Loading'
result = Application.WorksheetFunction.RTD("SERVERNAME", "", id)
RetrieveData = result
If(result = "Disabled") Then
' Obviously, this recurses (and fails), so that's not an option
If(Not IsEmpty(Application.Caller.Value2)) Then
' So does this
RetrieveData = Application.Caller.Value2
End If
End If
End Function
The function will be called in thousands of cells, so storing the 'original' values in another data structure would be a major overhead and I would like to avoid it. Also, the RTD server does not know the values, since it also does not keep a history of it, more or less for the same reason.
I was thinking that there might be some way to exit the function which would force it to not change the displayed value, but so far I have been unable to find anything like that.
EDIT:
Due to popular demand, some additional info on why I want to do all this:
As I said, the function will be called in thousands of cells and the RTD server needs to retrieve quite a bit of information. This can be quite hard on both network and CPU. To allow the user to decide for himself whether he wants this load on his machine, they can disable the updates from the server. In that case, they should still be able to calculate the sheets with the values currently in the fields, yet no updates are pushed into them. Once new data is required, the server can be enabled and the fields will be updated.
Again, since we are talking about quite a bit of data here, I would rather not store it somewhere in the sheet. Plus, the data should be usable even if the workbook is closed and loaded again.
Different tack=new answer.
A few things I've discovered the hard way, that you might find useful:
1. In a UDF, returning the RTD call like this
' excel equivalent: =RTD("GeodesiX.RTD",,"status","Tokyo")
result = excel.WorksheetFunction.rtd( _
"GeodesiX.RTD", _
Nothing, _
"geocode", _
request, _
location)
behaves as if you'd inserted the commented function in the cell, and NOT the value returned by the RTD. In other words, "result" is an object of type "RTD-function-call" and not the RTD's answer. Conversely, doing this:
' excel equivalent: =RTD("GeodesiX.RTD",,"status","Tokyo")
result = excel.WorksheetFunction.rtd( _
"GeodesiX.RTD", _
Nothing, _
"geocode", _
request, _
location).ToDouble ' or ToString or whatever
returns the actual value, equivalent to typing "3.1418" in the cell. This is an important difference; in the first case the cell continues to participate in RTD feeding, in the second case it just gets a constant value. This might be a solution for you.
2. MS VSTO makes it look as though writing an Office Addin is a piece of cake... until you actually try to build an industrial, distributable solution. Getting all the privileges and authorities right for a Setup is a nightmare, and it gets exponentially worse if you have the bright idea of supporting more than one version of Excel. I've been using Addin Express for some years. It hides all this MS nastiness and lets me focus on coding my addin. Their support is first-rate too, worth a look. (No, I am not affiliated or anything like that).
3. Be aware that Excel can and will call Connect / RefreshData / RTD at any time, even when you're in the middle of something - there's some subtle multi-tasking going on behind the scenes. You'll need to decorate your code with the appropriate SyncLock blocks to protect your data structures.
4. When you receive data (presumably asynchronously on a separate thread) you absolutely MUST call back Excel on the thread on which you were initially called (by Excel). If you don't, it'll work fine for a while and then you'll start getting mysterious, unsolvable crashes and, worse, orphan Excels in the background. Here's an example of the relevant code to do this:
Imports System.Threading
...
Private _Context As SynchronizationContext = Nothing
...
Sub New
_Context = SynchronizationContext.Current
If _Context Is Nothing Then
_Context = New SynchronizationContext ' try valiantly to continue
End If
...
Private Delegate Sub CallBackDelegate(ByVal GeodesicCompleted)
Private Sub GeodesicComplete(ByVal query As Query) _
Handles geodesic.Completed ' Called by asynchronous thread
Dim cbd As New CallBackDelegate(AddressOf GeodesicCompleted)
_Context.Post(Function() cbd.DynamicInvoke(query), Nothing)
End Sub
Private Sub GeodesicCompleted(ByVal query As Query)
SyncLock query
If query.Status = "OK" Then
Select Case query.Type
Case Geodesics.Query.QueryType.Directions
GeodesicCompletedTravel(query)
Case Geodesics.Query.QueryType.Geocode
GeodesicCompletedGeocode(query)
End Select
End If
' If it's not resolved, it stays "queued",
' so as never to enter the queue again in this session
query.Queued = Not query.Resolved
End SyncLock
For Each topic As AddinExpress.RTD.ADXRTDTopic In query.Topics
AddinExpress.RTD.ADXRTDServerModule.CurrentInstance.UpdateTopic(topic)
Next
End Sub
5. I've done something apparently akin to what you're asking in this addin. There, I asynchronously fetch geocode data from Google and serve it up with an RTD shadowed by a UDF. As the call to GoogleMaps is very expensive, I tried 101 ways and several months of evenings to keep the value in the cell, like what you're attempting, without success. I haven't timed anything, but my gut feeling is that a call to Excel like "Application.Caller.Value" is an order of magnitude slower than a dictionary lookup.
In the end I created a cache component which saves and re-loads values already obtained from a very-hidden spreadsheet which I create on the fly in Workbook OnSave. The data is stored in a Dictionary(of string, myQuery), where each myQuery holds all the relevant info.
It works well, fulfils the requirement for working offline and even for 20'000+ formulas it appears instantaneous.
HTH.
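To make the cache idea concrete, here is a very rough sketch in Python rather than VB. It is not the component described above (which also persists its values to a hidden sheet); it only illustrates keeping the last known value per id in a dictionary and falling back to it while the server is disabled. The class name, the fetch callback, and the decision to skip caching the "Loading" placeholder are all assumptions made for illustration.

```python
class LastValueCache:
    """Remembers the last real value seen per id for use while the server is off."""

    def __init__(self):
        self._values = {}

    def resolve(self, key, server_enabled, fetch):
        if server_enabled:
            value = fetch(key)       # fetch stands in for the RTD call
            if value != "Loading":   # assumption: do not cache the placeholder
                self._values[key] = value
            return value
        # Server disabled: show the last known value, or 'Disabled' if there is none.
        return self._values.get(key, "Disabled")


cache = LastValueCache()
first = cache.resolve(1, True, lambda k: 42)       # server on, value arrives
kept = cache.resolve(1, False, lambda k: None)     # server off, old value kept
unknown = cache.resolve(2, False, lambda k: None)  # server off, never seen
```

This mirrors the three behaviours asked for in the question: a fresh value while enabled, the previous value while disabled, and "Disabled" for a cell that never received one.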
Edit: Out of curiosity, I tested my hunch that calling Excel is much more expensive than doing a dictionary lookup. It turns out that not only was the hunch correct, but frighteningly so.
Public Sub TimeTest()
Dim sw As New Stopwatch
Dim row As Integer
Dim val As Object
Dim sheet As Microsoft.Office.Interop.Excel.Worksheet
Dim dict As New Dictionary(Of Integer, Integer)
Const iterations As Integer = 100000
Const elements As Integer = 10000
For i = 1 To elements + 1
dict.Add(i, i)
Next
sheet = _ExcelWorkbook.ActiveSheet
sw.Reset()
sw.Start()
For i As Integer = 1 To iterations
row = 1 + Rnd() * elements
Next
sw.Stop()
Debug.WriteLine("Empty loop " & (sw.ElapsedMilliseconds * 1000) / iterations & " uS")
sw.Reset()
sw.Start()
For i As Integer = 1 To iterations
row = 1 + Rnd() * elements
val = sheet.Cells(row, 1).value
Next
sw.Stop()
Debug.WriteLine("Get cell value " & (sw.ElapsedMilliseconds * 1000) / iterations & " uS")
sw.Reset()
sw.Start()
For i As Integer = 1 To iterations
row = 1 + Rnd() * elements
val = dict(row)
Next
sw.Stop()
Debug.WriteLine("Get dict value " & (sw.ElapsedMilliseconds * 1000) / iterations & " uS")
End Sub
Results:
Empty loop 0.07 uS
Get cell value 899.77 uS
Get dict value 0.15 uS
Once the empty-loop overhead is subtracted (899.70 uS vs 0.08 uS), looking up a value in a 10'000 element Dictionary(Of Integer, Integer) is over 11'000 times faster than fetching a cell value from Excel.
Q.E.D.
Maybe... Try making your UDF wrapper function non-volatile, that way it won't get called unless one of its arguments changes.
This might be a problem when you enable the server, you'll have to trick Excel into calling your UDF again, it depends on what you're trying to do.
Perhaps explain the complete function you're trying to implement?
You could try Application.Caller.Text. This has the drawback of returning the formatted value from the rendering layer as text, but seems to avoid the circular reference problem. Note: I have not tested this hack under all possible circumstances ...
