Match and Conditional Formatting from Matrix Table - excel

I am looking for help with my matrix table: is there a good (or best) approach to matching dependent entries in a matrix using drop-downs?
This picture represents my matrix table (Picture 1):
As you can see there are a lot of entries, but horizontally and vertically there is the same number of "headers". In my case the 1s actually mark incompatibility, but let's simply call it a "match". The matrix sits on one sheet, which will be populated with new values from time to time.
Another sheet, which displays the data and their compatibility possibilities, is equipped with drop-downs. There you have "Groups" (Group1, Group2...), meaning main parts, and "dependent groups" (AA1, BB2...), the small components that belong to the main parts. To avoid misunderstanding, here are the explanations; for the sake of this example I used fictional values:
Groups aka. Main Parts
Dependent groups aka. components
Below is my fictional table; it follows exactly the same concept that I need to use in my real case.
I put an explanation in Picture 2 so you can follow along and see exactly where and what I did.
First I used =MATCH functions, one for the vertical position (A3) and one for the horizontal (B4). The boolean row is computed with =OR(INDEX), referring to the match positions as you can see. From there I use TRUE/FALSE to color my group boxes when compatibility is possible; that's all the science.
So my question is whether there is another approach to this problem. As you can see, I have three rows of functions in one place; with more "groups", that would grow into many more rows and calculations.
Picture 2
EDITED:
This is a screenshot of the original sheet. I hid some rows that contained info, which is why the row numbers are not consecutive. As you can see, it is almost the same as the dummy example I provided above. Underneath every "box" there are three rows of calculations, as I mentioned before. The number "2" that appears twice here is the position of a value that I found using the =MATCH function; one is for the horizontal lookup and the other for the vertical. In this case it is the model type: 070FX is at position 2, 100FX at position 3 and 200FX at position 4 in the matrix table, and so on for all the other groups. Those groups (Model, Endpoint, Gas sensor...) are defined separately on another sheet, where I had to build a unique list and a dependent list so I can reference them from my drop-down lists.
EDIT no. 4: this is the formula I used for TRUE/FALSE:
=SUMPRODUCT(('0359-matrix'!$A$2:$A$101=F10)*(('0359-matrix'!$B$1:$CW$1=$B$10)+('0359-matrix'!$B$1:$CW$1=$C$10)+('0359-matrix'!$B$1:$CW$1=$D$10)+('0359-matrix'!$B$1:$CW$1=$E$10)+('0359-matrix'!$B$1:$CW$1=$F$10)+('0359-matrix'!$B$1:$CW$1=$G$10)+('0359-matrix'!$B$1:$CW$1=$H$10)+('0359-matrix'!$B$1:$CW$1=$I$10)+('0359-matrix'!$B$1:$CW$1=$J$10)+('0359-matrix'!$B$1:$CW$1=$K$10)+('0359-matrix'!$B$1:$CW$1=$L$10)+('0359-matrix'!$B$1:$CW$1=$M$10)+('0359-matrix'!$B$1:$CW$1=$N$10)+('0359-matrix'!$B$1:$CW$1=$O$10)+('0359-matrix'!$B$1:$CW$1=$P$10)+('0359-matrix'!$B$1:$CW$1=$Q$10)+('0359-matrix'!$B$1:$CW$1=F13)+('0359-matrix'!$B$1:$CW$1=G13)+('0359-matrix'!$B$1:$CW$1=H13)+('0359-matrix'!$B$1:$CW$1=I13)+('0359-matrix'!$B$1:$CW$1=J13))*'0359-matrix'!$B$2:$CW$101)>0
I copied only the last part, from where the second row starts, because the whole function is too long to paste; it gets cut off automatically.
('0359-matrix'!$B$1:$CW$1=$Q$10)+('0359-matrix'!$B$1:$CW$1=$B$13)+('0359-matrix'!$B$1:$CW$1=$C$13)+('0359-matrix'!$B$1:$CW$1=$D$13)+('0359-matrix'!$B$1:$CW$1=$E$13)+('0359-matrix'!$B$1:$CW$1=$F$13))*'0359-matrix'!$B$2:$CW$101)>0
But in the marked cells I am getting the same results: B22 - F22 shows the same (boolean) values as B21 - F21, which shouldn't be the case. To follow the colors: green is FALSE. It has to be something to do with an array reference.

Check out the following. A1 to E5 is the matrix that shows which pieces are incompatible (=1). All other cells have to be empty or 0.
In cell I8 I used the following formula (and copied it down to I11):
=SUMPRODUCT(($A$2:$A$5=H8)*(($B$1:$E$1=$H$8)+($B$1:$E$1=$H$9)+($B$1:$E$1=$H$10)+($B$1:$E$1=$H$11))*$B$2:$E$5)
The formula result shows the number of incompatibilities a part has. For example, AA1 has one incompatibility (with BB2), but BB2 is incompatible with two parts: AA1 and CC3.
To get the TRUE/FALSE use the same formula and append >0: like =SUMPRODUCT(…)>0
For any additional "group" (Model, Endpoint, …) you need to add another +($B$1:$E$1=$H$12), where $B$1:$E$1 points to the header row of your matrix data and $H$12 to your selected group value.
Overview of the formula ranges:
Note that this kind of calculation can only tell you the number of incompatibilities a part has, not the names of the parts it is incompatible with.
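To see what the SUMPRODUCT is doing, the same count can be sketched in Python. This is only an illustration and not part of the workbook: the part names follow the example above, while DD4, the dictionary layout and the function names `incompatibility_count` and `is_incompatible` are assumptions made for the sketch.

```python
# Incompatibility matrix as a sparse mapping: (row part, column part) -> 1.
# Pairs that are absent count as 0, like the empty cells in the sheet.
# DD4 is a made-up fourth part with no incompatibilities.
MATRIX = {
    ("AA1", "BB2"): 1,
    ("BB2", "AA1"): 1,
    ("BB2", "CC3"): 1,
    ("CC3", "BB2"): 1,
}


def incompatibility_count(part, selected, matrix=MATRIX):
    """Count how many of the selected parts are flagged against `part`.

    Mirrors SUMPRODUCT((rows=part) * ((cols=sel1)+(cols=sel2)+...) * matrix):
    `part` picks the row, each selected value picks one column.
    """
    return sum(matrix.get((part, other), 0) for other in selected)


def is_incompatible(part, selected, matrix=MATRIX):
    """Equivalent of appending >0 to the formula."""
    return incompatibility_count(part, selected, matrix) > 0


# The values chosen in the drop-downs (H8:H11 in the example layout).
selected_parts = ["AA1", "BB2", "CC3", "DD4"]
counts = {p: incompatibility_count(p, selected_parts) for p in selected_parts}
```

Because the selection is just a list here, another group means one more list entry rather than another +(...) term.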
Edited horizontal version
Formula in the selected cell is
=SUMPRODUCT(($A$2:$A$5=G8)*(($B$1:$E$1=$G$8)+($B$1:$E$1=$H$8)+($B$1:$E$1=$I$8)+($B$1:$E$1=$J$8))*$B$2:$E$5)
You can drag it to the right.

Related

Excel interpolation with results in situ

Extrapolation in Excel is easy: have a list of numbers (and optionally their paired "X-values"), and it can easily generate further entries in the list with the GROWTH() function.
GROWTH() works for interpolation too: you just need to tell it the intermediate X-values that you want it to calculate for. My problem with it is the appearance of the data in the spreadsheet. Here's an example:
Say I have some inputs, and through some process get some outputs. Only, there were gaps in the experiment so no outputs were generated for some values:
Out of curiosity, I copied the data to the right, and used Excel's "Extend with Growth Trend": I highlighted the first two entries (only), then right-click-dragged-down the little square over the next four cells (overriding the final value there) and chose "Growth Trend" in the context menu. To remind myself that the values were Excel-generated, I gave them a grey background:
Hmm. The generated values (unsurprisingly) aren't a good extrapolation, since they don't factor in the later value. It's out by over 40%! Also note that this Extend feature of Excel is an ease-of-input mechanism, not a calculation tool in its own right - Excel enters the data as raw numbers (to multiple decimal places).
So I formalised the Extend column by using the GROWTH() function - again only factoring in the first two values, but also using their paired X-values and the desired interpolation entry as parameters:
D4: =GROWTH(D$2:D$3,$A$2:$A$3,$A4)
D5: =GROWTH(D$2:D$3,$A$2:$A$3,$A5)
D6: =GROWTH(D$2:D$3,$A$2:$A$3,$A6)
Thankfully, the results mimic those of the previous column (Microsoft use the same mechanism for both features!) I didn't overwrite the last entry, since after all it has the value that I actually want! The fact that the calculated values are the same as before is the problem I'm trying to fix, and that this question is about.
To improve the calculated values, I need to incorporate the last value - but at the same time I want the "natural" sequence of input values to be maintained. In other words, I want the interpolated values to be placed in situ. That implies that the arguments to the GROWTH() function need to be discontiguous ranges, which Excel does by using the (Range,Range,...) syntax. I tried it, and got #REF! errors. I then tried using a named discontiguous range: same result.
After a bit of Googling (and StackOverflowing!) I found references to using INDIRECT() - a particularly problematic 'solution', since it evaluates strings that would need to be manually maintained. Nevertheless:
E4: =GROWTH(INDIRECT({"E2:E3","E7"}),INDIRECT({"A2:A3","A7"}),A4)
E5: =GROWTH(INDIRECT({"E2:E3","E7"}),INDIRECT({"A2:A3","A7"}),A5)
E6: =GROWTH(INDIRECT({"E2:E3","E7"}),INDIRECT({"A2:A3","A7"}),A6)
…and after all that it didn't work anyway! The values remained the same as the previous version, that didn't incorporate the last value. Maybe the last value doesn't make for better interpolation results? So, as an experiment, I ignored the "in situ" requirement and generated an "ex situ" version, with the known values followed by the desired values, allowing me to use simple ranges. Success! But to highlight that the data is in the wrong order, I asked Excel to create an X-Y plot of the data too:
B13: =GROWTH(B$10:B$12,$A$10:$A$12,$A13)
B14: =GROWTH(B$10:B$12,$A$10:$A$12,$A14)
B15: =GROWTH(B$10:B$12,$A$10:$A$12,$A15)
Of course, the results are exponential not linear, so setting the Y-axis to logarithmic generates a very readable result - and it effectively masks the back-and-forth of the data. But deep down, we both know that the data is wrong - just look at the table!
Maybe, just maybe, if I used Excel's "Sort Data" feature it would break up the range for me, and show me how I should have written the formulae? Sadly, although it looks like it worked, I get a "Circular reference" error for B12 - the range wasn't modified to make it discontiguous, and now B12's result is dependent on the original range which includes itself! I coloured it below to indicate that this isn't a viable solution:
So, my "final" solution is to maintain the previous "ex situ" version, and simply have an "in situ" column as well that does a VLOOKUP() on the ExSitu (named) table - and I needed to tell it to do an exact match with the FALSE parameter, since the list isn't sorted:
F4: =VLOOKUP($A4,ExSitu,2,FALSE)
F5: =VLOOKUP($A5,ExSitu,2,FALSE)
F6: =VLOOKUP($A6,ExSitu,2,FALSE)
Note that I labelled the column with an asterisk since it's a cheat: the values are only in situ by copying from another table.
Phew! After all that, my question:
Is there a way to directly interpolate the "in situ" values, without having to have an "ex situ" lookup table to generate the results? The above example was deliberately straightforward: you can easily imagine a longer list with more gaps to be filled in.
Since you have a good sense for data, I'll share my discovery path for this case. I'm more of a visual person; I don't see things that clearly in tables. Here is what I did with your data points:
Input Raw
360 7.16
370 28.9
380
390
400
410 5,380.00
Highlight everything and press my favorite button, F11. I choose the line chart type. Then, with the plus button on the top left of the chart, I add Trendline > More Options. From there I choose 'Polynomial' and 'Exponential', plus a tick on 'Display Equation on chart'. As you can see in the links, both fits seem OK. Just take the equation and plug in the other values as needed.
Three things I've noticed :
The polynomial and exponential fits are close enough to what I need, but they don't exactly pass through the (410, 5380.00) point.
Having the formula makes it easier to judge whether the trendline 'proposed' by Excel is a close fit for my needs. As you play around you can see how far off the linear and logarithmic trendlines can be.
The trendline equation doesn't use 360, 370, 410... as the x values; it assumes x is 0, 1, 2, 3... (try testing it with the 'equation' of the trendline Excel proposes).
IMHO, use Excel trendlines with care. My next best fitting tool is the WolframAlpha logarithmic fit.
For the original question :
Is there a way to directly interpolate the "in situ" values, without having to have an "ex situ" lookup table to generate the results?
I think my simple answer will be : Indirectly, Yes. Directly? not sure.
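As a rough sketch of the idea outside Excel: an exponential trend can be fitted to the known points only and then evaluated at the gaps, leaving the measured values where they are. The Python below assumes the fit is an ordinary least-squares line through ln(y), which is my understanding of what GROWTH() does; the function names `growth` and `fill_gaps` are made up for illustration.

```python
import math


def growth(known_ys, known_xs, new_xs):
    """Fit y = b * m**x by least squares on ln(y), then evaluate at new_xs."""
    n = len(known_xs)
    logs = [math.log(y) for y in known_ys]
    mean_x = sum(known_xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in known_xs)
    sxy = sum((x - mean_x) * (lg - mean_log) for x, lg in zip(known_xs, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_x
    return [math.exp(intercept + slope * x) for x in new_xs]


def fill_gaps(xs, ys):
    """Return ys with every None replaced by the fitted value, in place order."""
    known_xs = [x for x, y in zip(xs, ys) if y is not None]
    known_ys = [y for y in ys if y is not None]
    return [
        y if y is not None else growth(known_ys, known_xs, [x])[0]
        for x, y in zip(xs, ys)
    ]


# The data points from the question; None marks a missing output.
inputs = [360, 370, 380, 390, 400, 410]
raw = [7.16, 28.9, None, None, None, 5380.0]
filled = fill_gaps(inputs, raw)
```

The known values stay untouched and in their natural order; only the gaps are computed, and all three known points take part in the fit.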
Hope this heals/helps in some ways.. ( :

Relabeling Duplicates in Excel of Cells in direct proximity

Apologies for the title gore - I was trying to be descriptive.
I have a large lab result data set, and it has been found that one analyte was screened for twice per sample, and I need to capture both sets of results. This leaves me with a table similar to the one below, where Antimony is listed twice. Is there a way to automate something, either to flag the instances where I have two rows like that or to rename them to antimony-1 and antimony-2? Since I have 300 sites screened for the same things, everything shows up as a duplicate and I can't use the simple methods. The main trigger is the proximity to another row where everything matches except the results.
Assuming your data is as in your screenshot, starting in cell A1 (with Soil as your site), I'd add two columns. The first combines Site & Element (column F in my example):
=A1&C1
Result: SoilAluminium
In the column next to that I'd have a formula:
=F1&COUNTIF($F$1:F1,F1)
Result - SoilAluminium1, SoilAntimony1, SoilAntimony2 etc
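Note: pay attention to the $'s. $F$1 stays fixed while the end of the range moves as you copy down, so the count only covers the rows up to the current one.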
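The running count can be sketched in Python to show why the second Antimony row gets a 2. This is an illustration only; the function name `label_duplicates` and the sample rows are assumptions based on the results listed above.

```python
from collections import defaultdict


def label_duplicates(rows):
    """Append a running occurrence number to each Site+Element key.

    Same idea as =F1&COUNTIF($F$1:F1,F1): only the rows seen so far
    are counted, so repeated keys get 1, 2, 3, ...
    """
    seen = defaultdict(int)
    labels = []
    for site, element in rows:
        key = f"{site}{element}"   # the helper column, =A1&C1
        seen[key] += 1             # how many times this key has appeared so far
        labels.append(f"{key}{seen[key]}")
    return labels


sample_rows = [("Soil", "Aluminium"), ("Soil", "Antimony"), ("Soil", "Antimony")]
labels = label_duplicates(sample_rows)
```

Any label ending in 2 or higher marks a repeated analyte for that site.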
I hope that works

Ranking with subsets

I'm trying to rank values and have managed to work out how to sort ties. My data looks at the total number of entries, ranks based on that and, if there is a tie, looks to the next column of values to sort them out. However, I have two classes of data within my dataset (I've called them East and West) and want to rank them separately (but stick to the rules above). So, if I had seven entries, 3 of them West and 4 of them East, I want West to have rankings 1,2,3 based on all the values that lie in that subset, and East to have rankings 1,2,3,4. Can you explain what your formula is doing so I can understand how to apply your answer better in the future?
Effectively I'm asking what formula I need to achieve my result.
Cheers
Paul
There are a few related ways to do this, most involving SUMPRODUCT. If you don't like the solution below and would like to research other ways/explanations, try searching for "rankif".
The formula looks at the Class and Value columns and, for every row, returns TRUE (1) if the Class matches the current Class AND the Value is larger than the current Value, and FALSE (0) otherwise. The SUM adds up all these 1s, and the 1+ makes the top entry rank 1 instead of 0. Remember to enter it as an array formula using Ctrl+Shift+Enter before dragging down.
I used the array formula and SUM above to explain, but the following also works and might even be faster since it's not an array formula. It's the same idea, except we hijack SUMPRODUCT's ability to spit out a single value from an array.
=1+SUMPRODUCT(($A$2:$A$8=A2)*($B$2:$B$8>B2))
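The same counting logic can be sketched in Python, which may make the formula easier to follow. The sample classes and values are invented for illustration, and `rank_within_class` is a hypothetical name.

```python
def rank_within_class(classes, values):
    """Rank each value among the rows that share its class (1 = largest).

    For each row, count the rows with the same class and a strictly
    larger value, then add 1. This is what
    =1+SUMPRODUCT(($A$2:$A$8=A2)*($B$2:$B$8>B2)) computes per row.
    """
    return [
        1 + sum(1 for c2, v2 in zip(classes, values) if c2 == c and v2 > v)
        for c, v in zip(classes, values)
    ]


# Seven made-up entries: three West, four East.
classes = ["West", "East", "West", "East", "East", "West", "East"]
values = [10, 40, 30, 20, 50, 5, 15]
ranks = rank_within_class(classes, values)
```

Rows with equal values in the same class receive the same rank, since neither is strictly larger than the other.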
EDIT
To extend the rank-if, you could add more subsets to rank by multiplying more conditions:
You can also easily add tiebreakers by adding another SUMPRODUCT to treat the ties as an additional subset:
The first SUMPRODUCT is the 'base rank', while the second SUMPRODUCT is tiebreaker #1.
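A tiebreaker of that kind can be sketched in Python as follows. This is an assumed reading, not a transcription of the formula: the base rank counts rows in the same class with a larger value, and the tiebreaker adds rows in the same class with an equal value but a larger second value. The names and sample data are hypothetical.

```python
def rank_with_tiebreak(classes, values, tiebreaks):
    """Rank within class by value, breaking ties on a second column."""
    rows = list(zip(classes, values, tiebreaks))
    ranks = []
    for c, v, t in rows:
        # Base rank: same class, strictly larger value.
        base = sum(1 for c2, v2, _ in rows if c2 == c and v2 > v)
        # Tiebreaker #1: same class, equal value, strictly larger second value.
        tie = sum(1 for c2, v2, t2 in rows if c2 == c and v2 == v and t2 > t)
        ranks.append(1 + base + tie)
    return ranks


# Made-up data: the two West rows tie on value and are split by the tiebreak column.
tb_classes = ["West", "West", "West", "East"]
tb_values = [10, 10, 5, 10]
tb_breaks = [1, 2, 9, 0]
tb_ranks = rank_with_tiebreak(tb_classes, tb_values, tb_breaks)
```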

Break-Down Data in Excel without VBA (Formula Only)

I am often required to provide some type of break-down to customers; an example is shown in the attached figure.
I have a table of data ("TABLE DATA", which is a kind of pivot), and the customer provides its official form, whose structure must be preserved (highlighted in yellow). Basically, I need to separate the cost details of CODE "A" and CODE "B" into two separate sections.
The customer requires me to provide details for each individual part (the example shows Part A, "Break-Down Part A").
Is there any way to put an "ITEM" from "TABLE DATA" into Code A and Code B? The rest can be solved by VLOOKUP (Price, Quantity). Note: the "ITEM" values are not duplicated. Thank you very much.
Number your rows in the breakout using =1 and =A1+1 and then just use the formula ="B-ITEM"&TEXT(A1,"000"). If you want to skip making a counter column you could use ="B-ITEM"&TEXT(ROW()-1,"000") to just use the current row number (minus 1 or however many you need).
If your items aren't sequential like that, but are still unique, I would recommend adding counters on the original tab similar to what you have: something that counts the previous instances of the current type and then adds 1, which lets you quickly find the 5th A or the 7th B. For row 6 you could use =COUNTIF(A$1:A5,A6)+1.
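The counter idea can be sketched in Python to show how "the 5th A" becomes a simple lookup. The codes and item names below are placeholders, and all three function names are hypothetical.

```python
def occurrence_numbers(codes):
    """For each row, count earlier rows with the same code and add 1.

    Same idea as =COUNTIF(A$1:A5,A6)+1 evaluated on every row.
    """
    counts = {}
    numbers = []
    for code in codes:
        counts[code] = counts.get(code, 0) + 1
        numbers.append(counts[code])
    return numbers


def nth_item(codes, items, code, n):
    """Return the item on the n-th row that carries `code`, or None."""
    for c, k, item in zip(codes, occurrence_numbers(codes), items):
        if c == code and k == n:
            return item
    return None


def item_label(prefix, n):
    """Equivalent of =prefix&"-ITEM"&TEXT(n,"000") for sequential items."""
    return f"{prefix}-ITEM{n:03d}"


# Placeholder data standing in for the TABLE DATA rows.
codes = ["A", "B", "A", "A", "B"]
items = ["x1", "x2", "x3", "x4", "x5"]
```

Each section of the form can then ask for items 1, 2, 3, ... of its own code, and the remaining columns follow from a lookup on the item.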

Excel: How to parse/cast text as a formula?

Is it possible to parse/cast text (like "=A1+A2") as a formula in MS Excel? I want to build a formula from pieces of text - some of which will only be typed in later by a user.
If the INDIRECT() function worked for more than cell references, I could have typed =INDIRECT("=A1+A2").
I know you can work around this problem by simply adding a lot more hidden columns to do sub-calculations. But for the sake of scalability and efficiency, I would rather do something like the above.
I found similar questions here and here, yet they don't solve my problem.
The Real-world problem:
Read on for a better understanding as to why you would want to do the above
Scenario
Each item in the list consists of a string, which contains anywhere from 1 to 5 account names. Each account name is followed by an account number in brackets. The length of the number determines the type of account. Part of the account number is a date, whose format depends on the type of account. Furthermore, each account type may have more than 1 account-number length associated with it, although each number-length[*] is associated with only 1 account type.
Objectives
Extract account-names and their respective account-numbers and account-types from a list.
Make an assumption as to the account-type from the account-number
Validate this assumption by inspecting the build of the number and elements in the name
Check the validity of the account-numbers depending on their type.
The tricky part (this is where my problem lies)
The account-types and their respective account-number-lengths are not known beforehand; they are typed into a table by the user of the sheet, specifying a type of account and the number-lengths associated with this account-type. The user should type this into a list, not go and tinker around with delicate formulas.
Done so far
Column A: Contains the raw data (each cell has up to 5 names and numbers)
Columns B..F: Each column extracts 1 name, remains empty if all are already extracted
Columns G..K: Each column extracts 1 number corresponding to its name in columns B..F, remains empty if all are already extracted
Columns L..P: Each column calculates the length of the corresponding number in columns G..K
Now the user would type the following details into a table which assigns certain number-lengths an account type:
TYPE2, BUSINESS, (OR(length=13,length=6))
where length will later be replaced with the cell address which contains the calculated account number-length.
What I want to do now
Columns Q..U:
Should all indicate the account-type of the corresponding account-number in columns G..K. The idea is to build a nested if-elseIf-elseIf formula using the criteria typed in by the user as specified above. Example of one of the elseIF statements:
SUBSTITUTE(CONCATENATE("=IF(",criteria,",",type,",",errCode,")"),"length","O10")
All of these elseIf statements will then be concatenated together to form a master formula which will then need to be parsed/cast as a formula to calculate the account-type
This proposal uses only 5 columns (1 for each account-number, containing the master formula) and a table specifying account-types and criteria, also keeping the user away from formulas. Editing 1 line of code (the criteria) will update all formulas. Efficient & Scalable.
Since the user should never tinker around with the formulas under the hood, a simple 1 column if-elseIf-elseIf is out of the question. The alternative to the above would be to make a separate column to test for each account-type for each account-number. Separating/Abstracting out each test to its own column results in much better readability, easier editing & much less debugging - Unless you like multi-screen-wide-formulas. Example: 5 account-numbers * 10 possible account types = 50 extra columns.
Each edit to any criteria needs to be copied to 4 other non-adjacent columns and drag-filled down 10,000 rows (the columns cannot be adjacent since it is effectively a 5x5 array of columns). Neither efficient nor scalable, unless I'm missing some elegant way of updating non-adjacent formulas in a single click.
The rest of the validations error indications are trivial.
Sample data
Tshepo Trust (6901/2005) Marlene Mead (8602250646085)
Great Force Inv 67 Pty Ltd (200602258007)
Jane (870811) Livingstone (6901/2005) Janette Appel (8503250647056) James (900111)
I know all this would probably be much easier to achieve with clever usage of VBA, eliminating all the need to simulate abstraction, encapsulation, multi-dimensional arrays and functional programming on a spreadsheet. But until I can program in VBA, worksheet formulas will be my refuge.
[*]: account number-length could also be described as the number of digits in the number, or as given by this formula: LEN(accNumber)
In VBA you have access to a cell's formula through its Formula property.
I usually use Range to get at a cell by its address, e.g. Range("A1").Formula.
I'm not sure if this would answer your question (it's a very detailed question!), but suppose your user entered the account number lengths in a table (I'm calling it 'RefTable') like this:
Length of account number | business type
----------------------------------------
6 | Accountant
8 | Advisor
Then you could just use a vlookup on the length of the account number, given you've already separated them out.
=VLOOKUP(LEN(accNumber), RefTable, 2, FALSE)
Make sure that you either use a dynamic range name, or specify plenty of space below in RefTable, so that when your users add types, they don't get lost.
Also, if two different account types share the same number length, this could get you into trouble.
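The lookup-table idea can be sketched in Python, together with the extraction step, using the sample data from the question. Everything here is illustrative: the regular expression, the function names and the `REF_TABLE` dictionary are assumptions, and the two table rows are simply the ones shown above.

```python
import re

# User-maintained table: length of account number -> type (rows from the table above).
REF_TABLE = {6: "Accountant", 8: "Advisor"}

# A name is any run of text without brackets, followed by a number in brackets.
ACCOUNT_PATTERN = re.compile(r"\s*([^()]+?)\s*\(([^()]+)\)")


def parse_accounts(raw):
    """Split one raw cell into (name, number) pairs."""
    return ACCOUNT_PATTERN.findall(raw)


def account_type(number, ref_table=REF_TABLE):
    """Exact-match lookup on the number's length, like VLOOKUP(..., FALSE).

    Returns None where the sheet would show #N/A.
    """
    return ref_table.get(len(number))


sample = "Jane (870811) Livingstone (6901/2005) Janette Appel (8503250647056) James (900111)"
pairs = parse_accounts(sample)
types = [(name, account_type(number)) for name, number in pairs]
```

Adding an account type is then one more entry in the table, with no formula to edit.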

Resources