In Excel, my data looks like this:
A B C D
15 16 17 18
11 12 13 14
7 8 9 10
I am looking for a solution (without VBA) to transform my data into a single row, like this:
15 16 17 18 11 12 13 14 7 8 9 10
For those who have access to the TOROW function:
=TOROW(A1:D3)
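Use this formula and copy it across, assuming Matrix is a named range covering the data and $A$7 is the first cell of the output row:
=OFFSET(Matrix,TRUNC((COLUMN()-COLUMN($A$7))/COLUMNS(Matrix)),MOD(COLUMN()-COLUMN($A$7),COLUMNS(Matrix)),1,1)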
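The index arithmetic needed for the row-by-row order asked for in the question can be sketched in Python. This is only an illustration of the idea, not part of the Excel solution; the function name is hypothetical.

```python
def flatten_row_major(matrix):
    """Return the cells of a rectangular matrix as one row, reading row by row."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    flat = []
    for n in range(n_rows * n_cols):
        row = n // n_cols  # row offset: integer part of n / column count
        col = n % n_cols   # column offset: remainder of n / column count
        flat.append(matrix[row][col])
    return flat


data = [
    [15, 16, 17, 18],
    [11, 12, 13, 14],
    [7, 8, 9, 10],
]
print(flatten_row_major(data))
```

Here n plays the role of the 0-based position in the output row, which in a worksheet would come from the column distance to the first output cell.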
If you have TEXTJOIN(), try:
=TRANSPOSE(FILTERXML("<t><s>"&TEXTJOIN("</s><s>",TRUE,A1:D3)&"</s></t>","//s"))
This is what my input looks like in Excel:
days_took_to_equip cumu_percent
1 0.017418302
2 0.020625735
3 0.023148307
4 0.025237133
5 0.026972115
6 0.028752754
7 0.030350763
8 0.032040087
9 0.033603853
10 0.035270349
11 0.036788458
12 0.037518976
13 0.038283738
14 0.039379516
15 0.040189935
16 0.040783481
17 0.041685215
18 0.042347247
19 0.043032109
20 0.043739798
21 0.044230616
22 0.04476709
23 0.045269322
24 0.045725896
25 0.046250956
26 0.046684701
27 0.047129861
28 0.047620678
29 0.047997352
30 0.048396854
My expected output is:
Range     Avg cum Percent
1 to 10   0.027
1 to 20   0.033
1 to 30   0.038
I tried pivot tables, but labelling the ranges is tricky here.
I need this output to plot a graph.
Try:
=MAP(SEQUENCE(3,1,10,10),LAMBDA(x,AVERAGE(INDEX(B2:B31,SEQUENCE(x)))))
Here are three alternatives, assuming the range labels (such as 1 to 10) are in column D. The cells contain these formulas:
E3: =AVERAGE(INDEX($B$2:$B$31,SEQUENCE(RIGHT($D3,2))))
F3: =AVERAGE(INDEX($B$2:$B$31,ROW(INDIRECT("1:"&RIGHT($D3,2)))))
G3: =AVERAGE(OFFSET($A$1,1,1,RIGHT(D3,2)))
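What these formulas compute is the average of the first 10, 20, 30, ... values, with every range starting at row 1. A minimal Python sketch of that logic follows; the function name and the small sample list are hypothetical stand-ins, not the data from the question.

```python
def prefix_averages(values, step):
    """Average of the first step, 2*step, ... values (each range starts at item 1)."""
    results = []
    running_total = 0.0
    for count, value in enumerate(values, start=1):
        running_total += value
        if count % step == 0:
            # Label mirrors the "1 to N" style used in the expected output.
            results.append((f"1 to {count}", running_total / count))
    return results


sample = [1, 2, 3, 4, 5, 6]  # hypothetical stand-in for column B
for label, average in prefix_averages(sample, 2):
    print(label, average)
```

Keeping a running total avoids summing the same leading values again for every range, which is what the repeated AVERAGE calls do.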
There are two parts of my query:
1) I have multiple .xlsx files stored in a folder, a total of 1 year's worth (~365 .xlsx files). They are named according to date: 'A_ddmmmyyyy.xlsx' (e.g. A_01Jan2016.xlsx). Each .xlsx has 5 columns of data: Date, Quantity, Latitude, Longitude, Measurement. The problem is that each .xlsx file contains about 400,000 rows of data, and although I have scripts in Excel to merge them, the inherent row limit in Excel prevents me from merging all the data together.
(i) Is there a way to read the data from each .xlsx sheet into MATLAB in a loop, specifying the variable name (i.e. Date, Quantity etc.) for each column (variable) within MATLAB (there are no column headings in the .xlsx files)?
(ii) How can I merge the data for each column from each .xlsx together?
Thank you
Jefferson
Let's take this step by step.
First, I do not recommend merging the data from all your files into single columns. There is no need to hold all this information together; you can work with the files separately, for example by using a datastore.
Working in MATLAB in my directory:
>> pwd
ans =
/home/anquegi/learn/matlab/stackoverflow
I have a folder containing a subfolder with two sample Excel files:
>> ls
20_hz.jpg big_data_store_analysis.m excel_files octave-workspace sample-file.log
40_hz.jpg chirp_signals.m NewCode.m sample.csv
>> ls excel_files/
A_01Jan2016.xlsx A_02Jan2016.xlsx
The content of each file is:
Date Quantity Latitude Longitude Measurement
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
7 7 7 7 7
8 8 8 8 8
9 9 9 9 9
10 10 10 10 10
11 11 11 11 11
12 12 12 12 12
13 13 13 13 13
14 14 14 14 14
15 15 15 15 15
16 16 16 16 16
17 17 17 17 17
18 18 18 18 18
19 19 19 19 19
20 20 20 20 20
21 21 21 21 21
22 22 22 22 22
This is only to show how it will work.
Reading the data:
>> ssds = spreadsheetDatastore('./excel_files')
ssds =
SpreadsheetDatastore with properties:
Files: {
'/home/anquegi/learn/matlab/stackoverflow/excel_files/A_01Jan2016.xlsx';
'/home/anquegi/learn/matlab/stackoverflow/excel_files/A_02Jan2016.xlsx'
}
Sheets: ''
Range: ''
Sheet Format Properties:
NumHeaderLines: 0
ReadVariableNames: true
VariableNames: {'Date', 'Quantity', 'Latitude' ... and 2 more}
VariableTypes: {'double', 'double', 'double' ... and 2 more}
Properties that control the table returned by preview, read, readall:
SelectedVariableNames: {'Date', 'Quantity', 'Latitude' ... and 2 more}
SelectedVariableTypes: {'double', 'double', 'double' ... and 2 more}
ReadSize: 'file'
Now you have all your data available as tables. Let's see a preview:
>> data = preview(ssds)
data =
Date Quantity Latitude Longitude Measurement
____ ________ ________ _________ ___________
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
7 7 7 7 7
8 8 8 8 8
The preview is a good way to get sample data to work with.
You do not need to merge; you can work through all the elements:
>> ssds.VariableNames
ans =
'Date' 'Quantity' 'Latitude' 'Longitude' 'Measurement'
>> ssds.VariableTypes
ans =
'double' 'double' 'double' 'double' 'double'
% Get all the Latitude elements that have Date equal to 1. In this case the two files are the same, so we will get two elements with value 1.
>> reset(ssds)
accum = [];
while hasdata(ssds)
T = read(ssds);
accum(end +1) = T(T.Date == 1,:).Latitude;
end
>> accum
accum =
1 1
So you need to work with datastores and tables. It is a bit tricky but very useful. You may also want to control ReadSize and other properties of datastore objects. This is a good way of working with large data files in MATLAB.
For older versions of MATLAB you can use a more traditional approach:
folder='./excel_files';
filetype='*.xlsx';
f=fullfile(folder,filetype);
d=dir(f);
for k=1:numel(d);
data{k}=xlsread(fullfile(folder,d(k).name));
end
Now you have the data stored in the cell array data:
data
data =
[22x5 double] [22x5 double]
data{1}
ans =
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
7 7 7 7 7
8 8 8 8 8
9 9 9 9 9
10 10 10 10 10
11 11 11 11 11
12 12 12 12 12
13 13 13 13 13
14 14 14 14 14
15 15 15 15 15
16 16 16 16 16
17 17 17 17 17
18 18 18 18 18
19 19 19 19 19
20 20 20 20 20
21 21 21 21 21
22 22 22 22 22
But be careful when there are a lot of large files, since this approach holds all the data in memory at once.
I have a table that looks like this:
ID Total
3 3
3 3
3 3
4 11
4 11
4 11
4 11
4 11
4 11
6 9
6 9
7 13
7 13
7 13
7 13
7 13
7 13
7 13
7 13
7 13
7 13
7 13
7 13
7 13
I would like to calculate the median of column B (Total), excluding duplicate combinations of columns A and B. This could be achieved by constructing a table as below, and calculating the median from that table.
ID Total
3 3
4 11
6 9
7 13
Is there any way of obtaining the median without having to go through this process of manually deleting duplicates?
=MEDIAN(IF(FREQUENCY(MATCH(A2:A25&"|"&B2:B25,A2:A25&"|"&B2:B25,0),ROW(A2:A25)-MIN(ROW(A2:A25))+1),B2:B25))
There is a way with two additional columns. The first column is the concatenation of ID and Total; the second counts the occurrences of each individual combination. Then you just take the median over those rows where the combination occurs for the first time.
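The "first occurrence only" idea behind both answers can be sketched in Python, using the rows from the question. The function name is hypothetical, and this illustrates the logic rather than replacing the Excel formula.

```python
from statistics import median


def median_of_unique_pairs(rows):
    """Median of Total, counting each (ID, Total) combination only once."""
    seen = set()
    totals = []
    for row_id, total in rows:
        if (row_id, total) not in seen:  # keep only the first occurrence
            seen.add((row_id, total))
            totals.append(total)
    return median(totals)


# The 24 rows from the question: (ID, Total) with their repeat counts.
rows = [(3, 3)] * 3 + [(4, 11)] * 6 + [(6, 9)] * 2 + [(7, 13)] * 13
print(median_of_unique_pairs(rows))  # 10.0
```

The unique totals are 3, 11, 9 and 13, so the median is the mean of the two middle values, 9 and 11.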
In sheet 1 I have:
Sno Description
1 uproc_incident_X
2 sys_win_disque_e
3 sys_unx_disk
4 process_unx_event_wait
5 process_unx_Uproc
6 process_win_china
7 http_get_zom_facturation
8 http_get_zom_stars
9 services_win_TaskScheduler
10 check_sos_out
11 check_LOG
12 app_unx_check
13 app_unx_11000
14 app_win_mqmngr
15 app_lnx_log_syslog
16 app_ora_alertlog
17 ora_tbs_usage
sheet 2 contains:
Sno Description Time
1 uproc 20
2 sys_win 20
3 sys_unx 15
4 process 12
5 http_get 12
6 services 10
7 check 10
8 app_unx 15
9 app_win 15
10 app_lnx 10
11 app_ora 10
12 ora 10
I want a formula for sheet 1, next to each description, that matches the description against sheet 2 and returns the corresponding Time from sheet 2. The final result should look like this:
Sno Description Time
1 uproc_incident_X 20
2 sys_win_disque_e 20
3 sys_unx_disk 15
4 process_unx_event_wait 12
5 process_unx_Uproc 12
6 process_win_china 12
7 http_get_zom_facturation 12
8 http_get_zom_stars 12
9 services_win_TaskScheduler 10
10 check_sos_out 10
11 check_LOG 10
12 app_unx_check 15
13 app_unx_11000 15
14 app_win_mqmngr 15
15 app_lnx_log_syslog 10
16 app_ora_alertlog 10
17 ora_tbs_usage 10
Can anyone help me?
I suggest adding a mapping column to sheet 1 (Column C) that holds the matching sheet 2 description for each row, and then entering this in D2 and copying it down:
=VLOOKUP(C2,'sheet 2'!B:C,2,0)
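The mapping step is essentially a prefix match: each sheet 1 description starts with one of the sheet 2 descriptions. A minimal Python sketch of that matching is below, assuming the key is always followed by an underscore and preferring the longest matching key. The names TIMES and lookup_time are hypothetical.

```python
# Sheet 2 as a dictionary: description prefix -> time.
TIMES = {
    "uproc": 20, "sys_win": 20, "sys_unx": 15, "process": 12,
    "http_get": 12, "services": 10, "check": 10, "app_unx": 15,
    "app_win": 15, "app_lnx": 10, "app_ora": 10, "ora": 10,
}


def lookup_time(description, times=TIMES):
    """Return the time for the longest key that prefixes the description, else None."""
    matches = [
        key for key in times
        if description == key or description.startswith(key + "_")
    ]
    if not matches:
        return None
    return times[max(matches, key=len)]


for text in ["uproc_incident_X", "process_win_china", "ora_tbs_usage"]:
    print(text, lookup_time(text))
```

Requiring the underscore after the key avoids accidental matches where one key is merely the beginning of an unrelated word.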
In Excel, I want to lookup/index a table that matches both the station_number and the month.
Say I have the following data on sheet1:
Jan Feb Mar Apr May
station1 1 8 17 14 0
station5 4 5 8 10 14
station7 18 7 4 9 10
station10 5 11 15 12 4
On sheet2, I want to fill in the details below:
Station1 Station2 Station3 Station4 Station5 Station6
Jan 1 4
Feb 8 5
Mar 17 8
Apr 14 10
May 0 14
What is the formula I use in order to look up sheet1 and complete sheet2? I tried =VLOOKUP(B1&A2,'Sheet1'!A1:F5,2,FALSE) which is obviously incorrect. Any help would be great.
You could use HLOOKUP, something like the following for the Station1 column:
=+HLOOKUP(A2,Sheet1!$A$1:$F$2,2,0)
where Sheet1 is the actual source of your input data. This should work for the Station1 column. Of course, the references must change for each column, so for the station10 column the formula would be:
=+HLOOKUP(A2,Sheet1!$A$1:$F$5,5,0)
Please try:
=IFERROR(INDEX(sheet1!$B$2:$F$5,MATCH(J$1,sheet1!$A$2:$A$5,0),MATCH($I2,sheet1!$B$1:$F$1,0)),"")
in sheet2 where your 1 is (assumed to be J2), and copy across and down to suit.
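The INDEX/MATCH combination is a two-way lookup: one match finds the station row, the other finds the month column, and a missing station gives an empty string. A minimal Python sketch of the same logic, using the sheet1 data from the question, is below. The names MONTHS, TABLE and lookup are hypothetical.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May"]

# sheet1: one list of monthly values per station.
TABLE = {
    "station1": [1, 8, 17, 14, 0],
    "station5": [4, 5, 8, 10, 14],
    "station7": [18, 7, 4, 9, 10],
    "station10": [5, 11, 15, 12, 4],
}


def lookup(station, month):
    """Value for a station and month, or "" when either is missing (like IFERROR)."""
    key = station.lower()  # MATCH ignores case, so Station1 finds station1
    if key not in TABLE or month not in MONTHS:
        return ""
    return TABLE[key][MONTHS.index(month)]


print(lookup("Station1", "Jan"))
print(lookup("Station5", "May"))
print(repr(lookup("Station2", "Jan")))
```

Stations that do not exist on sheet1, such as Station2, come back blank, which matches the empty cells in the desired sheet2 layout.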