Hi there I am looking to combine two data ranges/arrays into one in order to feed them into excel FREQUENCY function.
Example:
First data range - B5:F50
Second data range - J5:N50
Bins data range - I5:I16
Function definition - FREQUENCY(data_array; bins_array)
Basically I am lazy and I don't want to reshuffle my excel script to spit out both datasets side by side so that I can reference them using something like B5:K50 range. Is there any way I can combine both datasets into data_array using some kind of formula? Maybe to end up with something along the line of =FREQUENCY((B5:F50,J5:N50); I5:I16) ?
BTW: Either of
=FREQUENCY(B5:F50; I5:I16)
=FREQUENCY(J5:N50; I5:I16)
work just file on their own for me.
Update
Actual formula definition FREQUENCY(data, classes)
2013 MS Excel (unrelated)
In MS Excel FREQUENCY function accepts a "union" as the first argument, i.e. a list of references separated by commas and enclosed in parentheses e.g.
=FREQUENCY((B5:F50,J5:N50),I5:I16)
Note: the "bins array" can also be a union if required
In "Google sheets" I don't think the same thing is possible - there may be a clever workaround, but I'm not aware of it
The Using Arrays page has some details that worked for me:
https://support.google.com/docs/answer/6208276?hl=en
It says:
"You can join multiple ranges into one continuous range using this same punctuation, which works the same way as VMERGE. For example, to combine values from A1-A10 with the values from D1-D10, you can use the following formula to create a range in a continuous column: ={A1:A10; D1:D10}"
I have two named ranges, so I was able to use {namedRange1;namedRange2} and it gave me one continuous range.
Related
I would like to combine three different tables in Excel. I am struggling with the fact that the tables can vary in length.
For example:
What I would like to achieve is all the tables' data in one table without empty spaces. So first the two entries from the first table then the three entries from the second table and lastly the entry from the third table. But the amount of rows in each table can vary.
How can I do this dynamically so when the amount of entries in the tables change it can handle this? I'm using Mac with Office365. Thanks!
EDIT:
Output with Ron Rosenfeld's solution, the range of the list goes down from cell 5 - cell 103. Could this be reduced to 5 - 15?:
If you have Excel 2019 or Office 365, with the FILTERXML and TEXTJOIN functions, you can use:
=FILTERXML("<t><s>" & TEXTJOIN("</s><s>",TRUE,Table1,Table2, Table3) & "</s></t>","//s[.!=0]")
If those zero's are really blanks, you can omit [.!=0] from the xPath argument, but it won't hurt to leave it there
Edit:
With MAC versions of Office 365 that do not have the FILTERXML function, I believe the following will work:
=LET(
a,299,
x,IF(SEQUENCE(99,,0)=0,1,SEQUENCE(99,,0)*a),
y,TEXTJOIN(REPT(" ",a),TRUE,Table19,Table20,Table21),
z, TRIM(MID(y,x,a)),FILTER(z,(z<>"0")*(z<>""))
)
Note the a parameter in the above function
Because of how the splitting algorithm works, the sequence for each cell will not always start at the beginning of a string.
Hence, if there are enough letters in the various strings, the start number may eventually get offset enough to cause a split in the wrong location
One fix is to use an arbitrarily large number of space's to insert.
99 is frequently large enough, but not for this data set.
299 seems to be large enough for the data set as shown in your actual data.
I believe the minimum number should be the sum of the lengths of all the characters in the original tables (including the 0's) plus one (1). But not sure of this.
You can certainly adjust it as needed
If the number becomes too large, you could run into the 32,767 character limitation. If that happened, an error message would occur.
So, if you wanted to compute a, dynamically, you could try something like:
=LET(
a,SUM(LEN(Table19[Column1]),LEN(Table20[Column1]),LEN(Table21[Column1]))+1,
x,IF(SEQUENCE(99,,0)=0,1,SEQUENCE(99,,0)*a),
y,TEXTJOIN(REPT(" ",a),TRUE,Table19,Table20,Table21),
z, TRIM(MID(y,x,a)),FILTER(z,(z<>"0")*(z<>""))
)
but no guarantees.
Assuming the data is in A:C, and empty cell is blank (not 0).
In E1 put :
=IF(ROW()>COUNTA(A:C),"",
INDEX(A:C,
IF(ROW()<=COUNTA(A:A),ROW(),IF(ROW()<=COUNTA(A:B),ROW()-COUNTA(A:A),ROW()-COUNTA(A:B))),
IF(ROW()<=COUNTA(A:A),1,IF(ROW()<=COUNTA(A:B),2,3)))
)
Idea : use row() to guide in selection in index. counta() is used guide converting 'row()' to usable index numbers. Also make the output cell blank "" for row() > counta(a:c).
Please share if it works/not.
Problem is straightforward, but solution is escaping. Hopefully some master here can provide insight.
I have a big data grid with prices. Those prices are ordered by location (rows) and business name (cols). I need to match the location/row by looking at two criteria (location name and a second column). Once the matching row is found (there will always be a match), I need to get the minimum/lowest price from two ranges within the grid.
The last point is the real challenge. Unlike a normal INDEX or MINIFS scenario, the columns I need to MIN aren't contiguous... for example, I need to know what the MIN value is between I4:J1331 and Q4:U1331. It's not an intersection, it's a contiguous set of values across two different arrays.
You're probably saying "hey, why don't you just reorder your table to make them contiguous"... not an option. I have a lot of data, and this spreadsheet is used for a bunch of other stuff. So, I have to work with the format I have, and that means figuring out how to do a lookup/min across multiple non-contiguous ranges. My latest attempt:
=MINIFS(AND($I$4:$J$1331,$K$4:$P$1331),$B$4:$B$1331,$A2,$E$4:$E$1331,$B2)
Didn't work, but it should make it more clear what I'm trying to do. There has GOT to be an easy way to just tell excel "use these two ranges instead of one".
Thanks,
Rick
Figured it out. For anyone else who's interested, there doesn't seem to be any easy way to just "AND" arrays together for a search (Hello MS, backlog please). So, what I did instead was to just create multiple INDEX/MATCH arrays inside of a MIN function and take the result. Like this:
MIN((INDEX/MATCH ARRAY 1),(INDEX/MATCH ARRAY 2))
They both have identical criteria, the only difference is the set of arrays being indexed in each function. That basically gives me this:
MIN((match array),(match array))
And Min can then pull the lowest value from either.
Not as elegant as I'd like... lots of redundant code, but at least it works.
-rt
I would like to solve this either in Excel or in SPSS:
I have categorical data (each number representing a medical diagnosis) that are combined into single cells. In other words, a row (patient) has multiple diagnoses. However, I would like to know the frequencies of each diagnosis. What is the best way to go about this? (See picture for reference)
For SPSS:
First just creating some sample data to demonstrate on:
data list free/e_cerv_dis_state (a20).
begin data
"{1/2/3/6}" "{1/2/4}" "{2/4/5}" "{1/5/6}" "{4}" "{4/5/6}" "{1/2/3/4/5/6}"
end data.
Now the following code will create a separate variable for each possible diagnosis, and will put a 1 in it if the diagnosis exists in the original variable.
do repeat vr=diag1 to diag9/vl=1 to 9.
compute vr=char.index(e_cerv_dis_state, string(vl, f1) ) > 0.
end repeat.
freq diag1 to diag6.
Note this will only work for up to 9 diagnoses. If you have more than that the solution will have to be adapted to multiple digits.
Assuming that the number of columns is fairly regular, I would suggest using text to columns, and then using COUNTIF on the cells if they are the value wanted. However there is a more robust and reproducible solution that would involve using SQL. If you download the free version of SQL Express here: https://www.microsoft.com/en-gb/sql-server/sql-server-downloads
Then you can import your table of data, here's how to do that: How to import an Excel file into SQL Server?
Then you could use the more friendly SQL database to get the answers you want. For example you can use a select statement that would say:
SELECT count(e_cerv_dis_state)
WHERE e_cerv_dis_state = '6'
It would also be possible to use a CASE WHEN statement to add-in the names of the diagnoses.
Basically I am trying to improve a spreadsheet that current uses fixed IF functions within IF functions to determine where to find data, then originally used the VLOOKUP function to return the appropriate cable cleat size. Where "Cleat Diameter">"Cable Diameter".
I've been using this for a while, however excel quickly runs out of resources with all the remaining calculations being performed. As a result, I've opted to put all data a single table, and try to use the match function to retrieve the necessary row. Then Simply use the =INDIRECT function to retrieve data from the appropriate column of the associated row.
Unfortunately I believe the issue relates to the fact that I first need to perform at MATCH Type 0 (exact match), followed by a type -1 for the size to identify the next size up that can accommodate a specific cable size.
I've managed a simple lookup on another dataset using (for exact matches):
=MATCH($B3,'Current Raw Data'!A:A,0)+ROW('Current Raw Data'!A:A)-1
However when I attempt the same thing with two types of matches I get errors. The closest I get it using the following array formula, but it does not work unless the data set is arranged so that the contents of Cell C3 is the first occurring item in the dataset in column A:A:
{=MATCH(C3,($B3='Lookup - Cleats'!A:A)*('Lookup - Cleats'!B:B),-1)}
Main sheet:
Dataset Example:
With this array formula (click Ctrl + Shift + Enter together inside formula bar), you should be able to get your results:
=IFERROR(INDEX('Lookup - Cleats'!C$3:C$26,MATCH($B3&$C3,'Lookup - Cleats'!$A$3:$A$26&'Lookup - Cleats'!$B$3:$B$26,0)),"")
I tried my best to use your data setup but maybe miss one or two things that you will need to adjust accordingly. Let me know if this is not working.
I have the following example list:
Note: In my real list I have around 200 options and 400 suboptions
And I would like to have 2 dropdownlists to select any option and it's suboptions:
For options, I used data validation - list with range =$A$8:$A$12
And for suboptions I tried the following:
Named ranges
It works but it needs a lot of manual work to maintain as the suboptions list is updated kind of frequently and AFAIK I would need to create and maintain many named ranges as many options I have.
Example
Named Range: _ABC05
Refers To: =Sheet1!$D$9:$D$10
Data validation: = INDIRECT(CONCATENATE("_";SUBSTITUTE(A2;"-";"")))
Again, this works but I am trying to avoid to maintain 200 named ranges.
Any solution without using named ranges or vba?
Finally I solve it using dynamic data validation:
In hidden column D, I have the following formula:
=CONCATENATE("D";MATCH(A2;$C$8:$C$15;0)+7;":D";MATCH(A2;$C$8:$C$15;1)+7)
And the data validation like this:
=INDIRECT(D2)
Edit: As mentioned by Aprillion, this will only work if the options list is sorted alphabetically ascending. In my case it is always like that but it would be interesting to know another solution with unsorted data. Also, In this example it is possible to avoid the hidden column and use =indirect(concatenate... in the data validation but in my case I have the lists in a separate worksheet and it is not possible to reference a data validation list in external workbooks or worksheets.
One issue is that once the user select an option and corresponding suboption, and then change again the option, the suboption is still selected even if it is not mapped to the new option. I found one solutions that consists in using a fake list as source of the data validation when C2 already have a value:
=IF(C2="";INDIRECT("$A$8:$A$12");INDIRECT("FakeList"))