I'll start by saying I'm not very good at programming, especially at extracting data, so please bear with me. I think my problem is simple; I just can't figure out how to do it.
I want to extract part of the data from a series of Excel files stored in the same folder. To be specific, let's say I have 10 Excel files with 1000 values in each (A1:A1000). I want to extract the first 100 values (A1:A100) from each file and store them in a single 10x100 variable (each row represents one file).
I would really appreciate any help. This would make my data processing a lot faster.
EDIT: I have figured out the code, but my next problem is to create another loop that rereads the 10 files, this time extracting A101:A200, and so on up to A901:A1000.
Here's the code I've written:
for k=1:1:10
file=['',int2str(k),'.xlsx'];
data=(xlsread(file,'A1:A100'))';
z(k,:)=data(1,:);
end
I'm not sure how to edit the line data=(xlsread(file,'A1:A100'))' to do the loop I want.
my next problem is to create another loop such that it will reread again the 10 files but this time extract A101:A200 until A901:A1000.
Why? Why not extract A1:A1000 in one block and then reshape or otherwise split up the data?
data(k,:)=(xlsread(file,'A1:A1000'))';
Then the A1:A100 data is in data(k,1:100), and so on. If you do this:
data = reshape(data, [10 100 10]);
Then data(:,:,1) should be your A1:A100 values as in your original loop, and so on until data(:,:,10).
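A minimal sketch of the reshape approach, using synthetic numbers in place of the Excel files so it can be run without any data. The names allData and blocks are illustrative and not from the original code:

```matlab
% Synthetic stand-in for the data read from 10 files: row k plays the role
% of file k, columns 1..1000 play the role of cells A1..A1000.
nFiles = 10;
nValues = 1000;
allData = zeros(nFiles, nValues);
for k = 1:nFiles
    allData(k, :) = 1000*k + (1:nValues);   % e.g. file 3 holds 3001..4000
end

% Split the 1000 columns into 10 consecutive blocks of 100 columns each.
blocks = reshape(allData, [nFiles, 100, 10]);

% blocks(:,:,1) corresponds to A1:A100, blocks(:,:,2) to A101:A200, etc.
firstBlock = blocks(:, :, 1);
lastBlock  = blocks(:, :, 10);
```

Because MATLAB stores arrays column by column, the reshape keeps each file on its own row and only regroups the columns, so no second pass over the files is needed.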
This should do it:
for sec = 1:10
    for k = 1:10
        file = ['',int2str(k),'.xlsx'];
        section = ['A', num2str(1+100*(sec-1)), ':A', num2str(100*sec)];
        data = (xlsread(file, section))';
        z(k,:) = data(1,:);
    end
    output{sec} = z; % one 10x100 block per 100-row section
end
Here's a suggestion to loop through the different cells to read. Obviously, you can change how you arrange the collected data in z. I have done it as the first index representing the different cells to read (1 for 1:100, 2 for 101:200, etc...), the second index being the file number (as per your original code) and the third index the data (100 data points).
% pre-allocate data
z = zeros(10,10,100);
for kk=1:10
cells_to_read = ['A' num2str(kk*100-99) ':A' num2str(kk*100)];
for k=1:10
file=['',int2str(k),'.xlsx'];
data=(xlsread(file,cells_to_read))';
z(kk,k,:)=data(1,:);
end
end
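As a small illustration of how the range strings in the loop above come out, here is a sketch that only builds them and reads no files. The names nSections, blockSize and ranges are made up for this example:

```matlab
nSections = 10;    % number of 100-row blocks per file
blockSize = 100;   % rows per block
ranges = cell(1, nSections);
for kk = 1:nSections
    firstRow = (kk - 1)*blockSize + 1;   % 1, 101, 201, ...
    lastRow  = kk*blockSize;             % 100, 200, 300, ...
    ranges{kk} = sprintf('A%d:A%d', firstRow, lastRow);
end
```

Each entry of ranges can then be passed to xlsread as the range argument, in the same way as cells_to_read.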
I collect data into an Excel sheet through a LabVIEW program. The data is collected continuously at a regular interval, and events are marked in one of the columns, with TaskA_0 representing the start of an event and TaskA_1 representing the end. This is a snippet of the data:
Time          Data 1       Data 2       Data 3       Data 4       Event Name
13:38:41.888  0.719460527  0.701654664  0.221332969  0.012234448  Task A_0
13:38:41.947  0.437707516  0.588673334  0.524042112  0.309975646  Task A_1
13:38:42.021  0.186847503  0.589175696  0.393891242  0.917737946  Task B_0
13:38:42.115  0.44490411   0.073132298  0.897701096  0.633815257  Task B_1
13:38:42.214  0.833793601  0.004524633  0.40950937   0.808966844  Task C_0
13:38:42.314  0.953997375  0.055717025  0.914080619  0.166492915  Task C_1
13:38:42.414  0.245698313  0.066643778  0.515709814  0.606289696  Task D_0
13:38:42.514  0.248038367  0.862138045  0.025489223  0.352926629  Task D_1
Currently I load this into MATLAB using xlsread, and then run strfind to locate the row indices of the event markers in order to break my data up into tasks, where each task is the data in the adjacent columns between TaskA_0 and TaskA_1 (here there is no data in between, but normally there is; there are also normally blank cells between event names). Is this the best method for doing this? Once I have the data in separate variables, I perform identical actions on each variable, usually basic statistics and some data plotting. If I want to batch process my data, I have to rewrite these lines over and over to get the data broken up by task, which even I know is wrong and horribly inefficient, but I don't know a better way to do this.
[Data,Text]= xlsread('C:\TestData.xlsx',2); %time column and event name column end up in text, as does the data headers, hence the +1 for the row indices
IndexTaskAStart = find(~cellfun(@isempty,strfind(Text(:,2),'TaskA_0')))+1;
IndexTaskAEnd = find(~cellfun(@isempty, strfind(Text(:,2),'TaskA_1')))+1;
TaskAData = Data(IndexTaskAStart:IndexTaskAEnd,:);
Now I can perform analysis on columns in TaskAData, and repeat the process for the remaining tasks.
Presuming you cannot change the format of the files, but do know which tasks you're searching for, you can still automate the search by creating a list of task names, just appending _0 and _1 onto the task names to search. Then do not create individual named variables but store in a cell array for easier looping:
tasknames = {'Task A', 'Task B', 'Task C'};
for n = 1:numel(tasknames)
    first = find(~cellfun(@isempty,strfind(Text(:,2),[tasknames{n},'_0'])))+1;
    last = find(~cellfun(@isempty, strfind(Text(:,2),[tasknames{n},'_1'])))+1;
    task_data{n} = Data(first:last, :);
    % whatever other analysis you require goes here
end
If there are a large number of tasknames but they follow some pattern, you might prefer to create them on the fly instead of preallocating a list in tasknames.
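As a rough sketch of creating the names on the fly, the example below builds 'Task A', 'Task B' from a list of letters and slices a small synthetic data set. The events and Data values are invented for illustration. It matches markers with strcmp rather than strfind, and it uses no +1 offset because this synthetic event column has no header row:

```matlab
% Hypothetical event column: one entry per data row, blank between markers.
events = {'Task A_0'; ''; 'Task A_1'; 'Task B_0'; ''; ''; 'Task B_1'};
Data = (1:numel(events))' * [1 10];     % 7x2 synthetic measurements

% Build 'Task A', 'Task B', ... from the letters instead of typing them.
letters = 'AB';
tasknames = cell(1, numel(letters));
for n = 1:numel(letters)
    tasknames{n} = sprintf('Task %c', letters(n));
end

% Slice the data between each start marker (_0) and end marker (_1).
task_data = cell(1, numel(tasknames));
for n = 1:numel(tasknames)
    first = find(strcmp(events, [tasknames{n}, '_0']));
    last  = find(strcmp(events, [tasknames{n}, '_1']));
    task_data{n} = Data(first:last, :);
end
```

The same analysis code can then run over task_data{n} inside the loop, however many tasks there are.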
I am creating a program that opens an image and uses the MATLAB ginput command to store x and y coordinates. Inside a loop, the coordinates are tested by a series of if statements, which output a number or string corresponding to the region clicked during the ginput session. At the same time, I use the input command to read a string from the command window relating to these numbers. The ginput session is placed in a while loop so that a click in a specific area ends the input session. For each session (while loop), only one or two inputs from the command window are needed. Finally, I am trying to store all the data in a csv or txt file, and I would like it to be tabulated so it is easy to read, i.e. rows and columns with headers. I am including some sample code.
My questions are:
1, how can an input of x and y coordinates be translated to a string? It is simple to do this for a number, but I cannot get it to work with a string.
2, any help on printing the strings and numbers to a tabulated text or csv file would be appreciated.
Command line input:
prompt='Batter:';
Batter=input(prompt,'s');
While Loop:
count=1;
flag=0;
while(flag==0)
    [x,y]= ginput(1);
    if (y>539)
        flag=1;
    end
    if x<594 && x>150 && y<539 && y>104
        %it's in the square
        X=x;
        Y=y;
    end
    if x<524 && x>207 && y<480 && y>163
        result='strike'
    else
        result='ball'
    end
    [x,y]= ginput(1);
    pitch=0;
    if x<136 && x>13
        %it's the pitch column
        if y<539
            pitch=6;
        end
        if y<465
            pitch=5;
        end
        if y<390
            pitch=4;
        end
        if y<319
            pitch=3;
        end
        if y<249
            pitch=2;
        end
        if y<175
            pitch=1;
        end
    end
    if pitch==0
    else
        plot(X,Y,'o','MarkerFaceColor',colors(pitch),'MarkerSize',25);
        text(X,Y,mat2str(count));
    end
    count=count+1
    M(count,:)=[X,Y,pitch];
end
For the above series of if statements, I would prefer a string output rather than the numbers 1-6 if the condition is satisfied.
The fprintf function is used to print to a file, but I have issues combining the strings and numbers using it:
fileID = fopen('pitches.csv','w');
fid = fopen('gamedata.txt','w');
fmtString = [repmat('%s\t',1,size(Batter,2)-1),'%s\n'];
fprintf(fid,fmtString,Batter,result);
fclose(fid);
for i=1:length(M)
fprintf(fileID,'%6.2f %6.2f %d\n',M(i,1),M(i,2),M(i,3));
end
fclose(fileID);
I have tried adding the string handles to the fprintf command along with the columns of M, but get errors. I either need to store them in an array (How?) and print all the array columns to the file, or use some other method. I also tried a version of the writetable method:
writetable(T,'tabledata2.txt','Delimiter','\t','WriteRowNames',true)
but I can't get everything to work right. Thanks very much for any help.
Let's tackle your questions one at a time:
1, how can an input of x and y coordinates be translated to a string?
You can use the sprintf command in MATLAB. This takes exactly the same syntax as fprintf, but the output of this function will give you a string / character array of whatever you desire.
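A short sketch of that idea, assuming made-up click coordinates and a hypothetical list of pitch labels (the question does not say what the codes 1 to 6 stand for). It formats the coordinates with sprintf and looks the label up in a cell array instead of keeping the numeric code:

```matlab
x = 312.5;  y = 240;       % example click coordinates
pitch = 3;                 % numeric code produced by the if-chain

% Coordinates as text
coordStr = sprintf('(%.1f, %.1f)', x, y);

% Hypothetical labels, one per pitch code 1..6
pitchNames = {'fastball', 'curveball', 'slider', 'changeup', 'sinker', 'cutter'};
pitchStr = pitchNames{pitch};

% Strings and numbers combined into one character array
summary = sprintf('%s at %s', pitchStr, coordStr);
```

Indexing the cell array with the pitch code avoids a second chain of if statements just to turn the number into a name.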
2, any help on printing the strings and numbers to a tabulated text or csv file would be appreciated.
You can still use fprintf but you can specify a matrix as the input. As such, you can do this:
fprintf(fileID,'%6.2f %6.2f %d\n', M.');
This writes the entire matrix to the file in one call. Care is needed here because MATLAB reads matrix elements in column-major order when feeding them to fprintf: it goes down all the rows of the first column before moving on to the next column. To write the data row by row, transpose the matrix first (hence the M.' above), so that each column of the transposed matrix is one row of M. Keep this in mind before you start writing strings to a file. For the strings, I would recommend placing each one in a cell array, then looping through the cell array and writing each string line by line.
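One possible way to combine the strings and the numbers is sketched below: a header row, then one fprintf call per row that mixes %s and numeric specifiers, with tabs as delimiters. The batter name, results and matrix values are invented, and the file goes to a temporary location rather than gamedata.txt:

```matlab
% Synthetic session data (all values are made up for illustration).
batter  = 'Smith';
results = {'strike'; 'ball'; 'strike'};             % one string per pitch
M = [312.5 240.0 3; 150.2 410.7 1; 498.0 99.5 6];   % columns: X, Y, pitch

outFile = [tempname, '.txt'];
fid = fopen(outFile, 'w');
fprintf(fid, 'Batter\tResult\tX\tY\tPitch\n');      % header row
for i = 1:size(M, 1)
    fprintf(fid, '%s\t%s\t%.2f\t%.2f\t%d\n', ...
            batter, results{i}, M(i,1), M(i,2), M(i,3));
end
fclose(fid);

% Read the file back to check what was written, then clean up.
written = fileread(outFile);
rows = strsplit(strtrim(written), sprintf('\n'));
delete(outFile);
```

Writing row by row keeps each string next to its numbers, which is harder to arrange when the whole numeric matrix is passed to fprintf in one call.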
Hopefully this helps push you in the right direction. Reply back to me in a comment and we can keep talking if you need more help.
I have a variable that is created by a loop. The variable is large enough and in a complicated enough form that I want to save the variable each time it comes out of the loop with a different name.
PM25 is my variable, but I want to save it as PM25_year, where the year is given by str = fname(13:end).
PM25 = permute(reshape(E',[c,r/nlay,nlay]),[2,1,3]); % Reshape and permute to achieve the right shape. Each face of the 3D should be one day
str = fname(13:end); % The year
% Third dimension is organized so that the data for each site is on a face
save('PM25_str', 'PM25_Daily_US.mat', '-append')
The str would be a year, like 2008. So the variable saved would be PM25_2008, then PM25_2009, etc. as it is created.
Defining new variables based on data isn't considered best practice, but you can store your data more efficiently using a cell array. You can store even a large, complicated variable like your PM25 variable within a single cell. Here's how you could go about doing it:
Place your PM25 data for each year into the cell array C using your loop:
for i = 1:numberOfYears
C{i} = PM25;
end
Resulting in something like this:
C = { PM25_2005, PM25_2006, PM25_2007 };
Now let's say you want to obtain your variable for the year 2006. This is easy (assuming you aren't skipping years). The first year of your data will correspond to position 1, the second year to position 2, etc. So to find the index of the year you want:
minYear = 2005;
yearDesired = 2006;
index = yearDesired - minYear + 1;
PM25_2006 = C{index};
You can do this using eval, but note that it's often not considered good practice. eval may be a security risk, as it allows user input to be executed as code. A better way to do this may be to use a cell array or an array of objects.
That said, I think this will do what you want:
for year = 2008:2014
eval(sprintf('PM25_%d = permute(reshape(E'',[c,r/nlay,nlay]),[2,1,3]);',year)); % the transpose quote must be doubled inside the string
save('PM25_Daily_US.mat',sprintf('PM25_%d',year),'-append');
end
I do not recommend creating variables like this: there is no way to keep track of them, and it prevents all the error checking that MATLAB normally does beforehand, since this kind of code is resolved entirely at runtime.
Anyway, if you have a really good reason for doing this, I recommend using the function assignin.
assignin('caller', ['myvar',num2str(1)], 63);
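Another option that avoids both eval and assignin, offered here only as a sketch and not taken from the answers above, is to put the array in a struct under a dynamic field name and save it with the '-struct' option, which stores each field as its own variable in the MAT-file. The array contents and the temporary file name are placeholders:

```matlab
outFile = [tempname, '.mat'];   % placeholder for PM25_Daily_US.mat
for year = 2008:2010
    PM25 = year * ones(2, 3);            % synthetic stand-in for the real array
    s = struct();
    s.(sprintf('PM25_%d', year)) = PM25; % dynamic field name, e.g. PM25_2008
    if exist(outFile, 'file')
        save(outFile, '-struct', 's', '-append');   % add to the existing file
    else
        save(outFile, '-struct', 's');              % first year creates the file
    end
end

loaded = load(outFile);   % struct with fields PM25_2008, PM25_2009, PM25_2010
delete(outFile);
```

Loading the file into a struct, as in the last lines, also keeps the per-year variables out of the workspace until a specific one is needed.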
I have the code below in MATLAB. I want to write all 162 rows and 4 columns that it calculates into an Excel file.
When I use xlswrite in the code, I get only one row and 4 columns, because the value of P is overwritten in each iteration.
If I use another loop inside the for loop, the execution time increases drastically. Please help me at least write the values of P into an array that I can later write to an Excel file (when I tried, the error 'In an assignment A(I) = B, the number of elements in B and I must be the same' appeared).
Please help.
function FitSMC_BC
clc
% Parameters: P(1)=theta_S; P(2)=theta_r; P(3)=psib; P(4)=lamda;
smcdata=xlsread('asimdata');
nn=length(smcdata)-1;
for i=1:nn
psi=smcdata(:,1);
thetaObs=smcdata(:,i+1);
%Make an initial guess:
Pini=[0.5 0.1 -1 1.5];
P=fminsearch(@ObFun,Pini,[],psi,thetaObs);
disp(['result',num2str(i),': P=',num2str(P)]);
theta=Gettheta(P,psi);
end
function OF=ObFun(P,psi,thetaObs)
theta=Gettheta(P,psi);
OF=sqrt(mean((theta - thetaObs).^2));
function theta=Gettheta(P,psi)
SoilPars.theta_S=P(1);
SoilPars.theta_r=P(2);
SoilPars.psib=P(3);
SoilPars.lamda=P(4);
[theta]=thetaFun(psi,SoilPars);
function [theta]=thetaFun(psi,SoilPars)
theta_S=SoilPars.theta_S;
theta_r=SoilPars.theta_r;
psib=SoilPars.psib;
lamda=SoilPars.lamda;
theta=theta_r+((theta_S-theta_r)*((psib./psi).^lamda));
theta(psi>psib)=theta_S;
You can modify the P line with
P(i,:) = fminsearch(@ObFun,Pini,[],psi,thetaObs);
P will store each result (a 4-element vector) in a new row.
You may also initialise P before the for loop with P = nan(nn, 4);
Then write P in an Excel file using xlswrite.
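To show the pattern in isolation, here is a sketch with a toy objective in place of ObFun and invented target parameters. The number of fits nn is an assumption for the example, and the xlswrite call is left as a comment because it needs Excel (or a compatible writer) to be available:

```matlab
nn = 5;                         % number of data columns to fit (assumed)
Pini = [0.5 0.1 -1 1.5];        % initial guess, as in the question
P = nan(nn, numel(Pini));       % one row of fitted parameters per column

for i = 1:nn
    target = i * [1 2 3 4];                  % synthetic "true" parameters
    objective = @(p) sum((p - target).^2);   % toy objective, not ObFun
    P(i, :) = fminsearch(objective, Pini);   % store the result in row i
end

% P is now nn-by-4 and can be written in a single call, e.g.
% xlswrite('results.xlsx', P);
```

Since every fit lands in its own row, nothing is overwritten and the file is written once, after the loop.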
I haven't studied your code in-depth, but as far as I can tell, you have two options:
Create a matrix P and use xlswrite on the entire matrix. This seems to me like the most reasonable approach.
Use xlswrite1 from the File Exchange in a loop. This will increase execution time a bit, but not nearly as much as using regular xlswrite, as it is specially designed to be used inside loops. It is so much faster because it only opens and closes the Excel file once, whereas the regular xlswrite opens and closes it every time you call the function.
You seem to know how to use indexing, so I'm not sure why you're not simply doing something like this:
P = zeros(4,nn); % one column of 4 fitted parameters per iteration
for i=1:nn
    ...
    P(:,i) = fminsearch(@ObFun,Pini,[],psi,thetaObs);
    disp(['result',num2str(i),': P=',num2str(P(:,i).')]);
    theta = Gettheta(P(:,i),psi); % Why is this here? Are you writing it to file too?
end
xlswrite('My_FileName.xls',P.'); % transpose to get nn rows by 4 columns
Or you could call xlswrite on each iteration of the loop (probably slower) and append the new data using something like this:
for i=1:nn
    ...
    P = fminsearch(@ObFun,Pini,[],psi,thetaObs);
    disp(['result',num2str(i),': P=',num2str(P)]);
    theta = Gettheta(P,psi); % Why is this here? Are you writing it to file too?
    xlswrite('My_FileName.xls',P,1,['A' int2str(i)]); % P is 1x4, so iteration i fills row i
end
Of course your code isn't runnable so you'll have to debug any other little errors. Also, since smcdata seems to be a matrix rather than a vector, you should be careful using length with it. You probably should use size.
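A quick illustration of the length versus size point, using two made-up matrix shapes: length returns the largest dimension, so it only counts columns when the matrix happens to be wider than it is tall.

```matlab
tall = zeros(200, 163);   % more rows than columns
wide = zeros(12, 163);    % more columns than rows

lenTall = length(tall);   % largest dimension: the row count here
lenWide = length(wide);   % largest dimension: the column count here

colsTall = size(tall, 2); % always the number of columns
colsWide = size(wide, 2);
```

For a loop over data columns, something like nn = size(smcdata,2) - 1 states the intent directly and does not depend on the shape of the data.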
In my code I'm trying to use load with entries from a cell, but it is not working. The portion of my code below produces a 3 dimensional array of strings. The strings represent the paths to file names.
for i = 1:Something
for j = 1:Something Different
for k = 1: Yet Something Something Different
DataPath{j,k,i} = 'F:\blah\blah\blah\fileijk'; % file changes based on i, j, and k
end
end
end
In the next part of the code I want to use load to open the files using the path names defined in the code above. I do this using the code below.
Dummy = DataPath{l,(k-1)*TSRRange+m};
Data = load(Dummy);
The idea is for Dummy to take the string content out of DataPath so I can use it in load. By doing this I thought that Dummy would be defined as a string and not a cell, but this is not the case. How do I pull the string out of DataPath so I can use it with load? Thanks.
I have to load the data this way because the data is located in multiple folders. I can post more of the code if needed, but it is complex.
Dummy is a cell because you created a 3D cell array but are accessing it with only two indices in Dummy = DataPath{l,(k-1)*TSRRange+m}.
I don't believe you can expect to reach every cell element this way. Instead, use three indices, just as you did when creating it.
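A small sketch of indexing a 3D cell array with three subscripts, using invented file names and sizes. It also shows the difference between parentheses, which return a cell, and curly braces, which return the string stored inside:

```matlab
% Build a small 2x3x2 cell array of hypothetical file names.
DataPath = cell(2, 3, 2);
for i = 1:2
    for j = 1:2
        for k = 1:3
            DataPath{j, k, i} = sprintf('file_%d_%d_%d.mat', i, j, k);
        end
    end
end

asCell = DataPath(1, 2, 2);   % parentheses: a 1x1 cell
asChar = DataPath{1, 2, 2};   % braces: the char array inside the cell
```

A value obtained with braces, like asChar, is a plain character array and can be passed straight to load.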