Excel: Running occurrence of items in table column - excel

I am trying to achieve this in a table in Excel:
https://exceljet.net/formula/running-count-of-occurrence-in-list
So, counting each item in the list in an ordered sequence:
(I want it for Red and Green too, of course)
If I however convert it to a table, and add something like these variants, then I get a total instead for each. Is there a way to achieve this in a table column? The only other answer I found to this was within Power Query, which I won't be using for this table:
=IF([@Color]="blue",COUNTIF([Color]:[@Color],"blue"),"")
=IF([@Color]=[@Color],COUNTIF([Color]:[@Color],[@Color]),"")
=IF([@Color]="blue",COUNTIF([@Color],"blue"),"")
=IF([@Color]=[@Color],COUNTIF([@Color],[@Color]),"")

You can use INDEX() to refer to the range at hand per row:
Formula in B2:
=SUM(--(INDEX([Color],1):[@Color]=[@Color]))
Meaning:
INDEX([Color],1) - Always refers to the first row in the "Color" column;
: - Continues the expression into a valid range reference;
[@Color] - Refers to the current row's value;
=[@Color] - Compares every cell in that range against the current row's value;
-- - Turns the TRUE/FALSE array into 1/0 values;
SUM() - Sums the resulting array to a total.
Note: the website you linked is quite handy, and it covers exactly what you are after on another page.
Also, I guess this needs to be entered as an array formula with Ctrl+Shift+Enter (CSE) in pre-365 versions of Excel.
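To make the logic of the running count concrete outside Excel, here is a minimal Python sketch of what the formula computes for each row: the number of times the current value has appeared from the first row down to the current one. The function name and the sample colors are made up for illustration, and unlike Excel's comparison this sketch is case-sensitive.

```python
def running_count(values):
    """For each item, count its occurrences from the first row up to and including this row."""
    seen = {}
    result = []
    for value in values:
        seen[value] = seen.get(value, 0) + 1
        result.append(seen[value])
    return result


# Hypothetical sample column
colors = ["Blue", "Red", "Blue", "Green", "Blue", "Red"]
counts = running_count(colors)
print(counts)
```

Each value gets its own counter, so Red and Green are numbered independently of Blue.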

Related

Find closest value in a Poisson distribution table

I am trying to find the closest value in this Poisson distribution table to help create some variation in my NHL game simulator. Right now I have the minimum value set to =(MAX(H4:R14)/1.25) and the maximum set to =MAX(H4:R14). My random value is set to =RAND()*($V$4-$U$4)+$U$4. My question is: how do you find the value in the table that is closest to the random value? Returning the closest match (a percentage) is ideal, and the table's values will change whenever the teams change. Eventually, I want to return the respective column and row values (goals for each team), but this is step 1 and I haven't been able to figure it out yet, even with INDEX and MATCH. I feel like it should be fairly straightforward...
You may benefit from SUMPRODUCT and array formulas to get the closest match and the goal values from the row and column headers. Just as an example:
The upper table would be your data. The bottom table is only there to check that it works, and you can delete it. Each cell of the bottom table is the absolute value of the corresponding data cell minus your random value. The minimum is highlighted in green, and that is the closest match. Notice how the formulas get the position of that match and the goal values:
Formula to get ROW VALUE:
=SUMPRODUCT(--(ABS(H4:R14-U6)=MIN(ABS(H4:R14-U6)))*G4:G14)
Formula to get COLUMN VALUE:
=SUMPRODUCT(--(ABS(H4:R14-U6)=MIN(ABS(H4:R14-U6)))*H3:R3)
Both of these formulas are array formulas, so you must enter them with CTRL+SHIFT+ENTER or they won't work!
Getting the closest match is then just the classic INDEX formula (we add +1 because goals are indexed as 0,1,2...):
=INDEX(H4:R14,U9+1,V9+1)
I've uploaded the file in case you want to check the formulas:
https://docs.google.com/spreadsheets/d/1wV8bn6B-jZ4jCxAonGdWFMluAEybzwkA/edit?usp=share_link&ouid=114417674018837700466&rtpof=true&sd=true
Please note that this works properly only as long as there is a single closest match. If two cells happen to tie as the closest match, the formulas add up the headers of both cells and return an incorrect result.
And remember, as I said before, the second table is only there to explain the solution; you can delete it and everything will still work.
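The tie caveat is easier to see with a small worked example. The following Python sketch mimics the two SUMPRODUCT formulas on a made-up 2x2 grid (the grid values, header lists, and function name are assumptions for illustration only): it sums the row and column headers of every cell whose distance to the target equals the minimum distance.

```python
def closest_by_header_sum(grid, row_headers, col_headers, target):
    """Sum the headers of all cells that tie for the smallest |cell - target|."""
    diffs = [[abs(value - target) for value in row] for row in grid]
    smallest = min(min(row) for row in diffs)
    row_value = sum(
        row_headers[r]
        for r, row in enumerate(diffs)
        for d in row
        if d == smallest
    )
    col_value = sum(
        col_headers[c]
        for row in diffs
        for c, d in enumerate(row)
        if d == smallest
    )
    return row_value, col_value


# Hypothetical probabilities; headers are goal counts 0 and 1
grid = [[0.10, 0.20],
        [0.30, 0.20]]
goals = [0, 1]

unique = closest_by_header_sum(grid, goals, goals, 0.29)  # only 0.30 is closest
tied = closest_by_header_sum(grid, goals, goals, 0.20)    # two cells hold 0.20
print(unique, tied)
```

With a unique closest cell the result is the expected pair of goal values. With the tie, the column headers of both matching cells are added together, giving a goal count that does not exist in the table.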
You could create a 2nd table where you calculate the absolute or squared difference between your Poisson table values and the random value.
Then you can simply check for the smallest value in the 2nd table.
In your example, the 2nd table of differences could be created with the formula
=ABS($H$4:$R$14-$U$6)
Let's assume that this 2nd table is just below the 1st table, i.e. in H16:R26
Then the smallest value in this 2nd table is obtained by
=SMALL($H$16:$R$26,1)
From there you can get the row number through
=SUM(($H$16:$R$26=SMALL($H$16:$R$26,1))*($G$4:$G$14))
and the column number through
=SUM(($H$16:$R$26=SMALL($H$16:$R$26,1))*($H$3:$R$3))
Then, to find the closest match in the original table, you can use the INDEX() function and refer to the row and column calculated in the previous step.
You can of course skip all the intermediary steps and write everything into a single formula:
=LET(
table,$H$4:$R$14,
diff,ABS(table-$U$6),
closest,SMALL(diff,1),
closest_row,SUM((diff=closest)*SEQUENCE(ROWS(table))),
closest_col,SUM((diff=closest)*SEQUENCE(1,COLUMNS(table))),
INDEX(table,closest_row,closest_col)
)
Which looks like this:
Note: the last formula uses dynamic arrays and functions (LET, SEQUENCE) that are only available in Excel 365.
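As a cross-check of the single-formula idea, here is a minimal Python sketch of the same search. It is not a translation of the LET formula: instead of summing positions, it keeps the first cell with the smallest difference, which is one possible way to stay well-defined when two cells tie. The function name and sample grid are hypothetical.

```python
def closest_cell(grid, target):
    """Return (row_index, col_index, value) of the first cell closest to target (0-based)."""
    best = None
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            diff = abs(value - target)
            # Strict '<' keeps the earliest cell when several are equally close
            if best is None or diff < best[0]:
                best = (diff, r, c, value)
    if best is None:
        raise ValueError("grid is empty")
    _, r, c, value = best
    return r, c, value


# Hypothetical probabilities
grid = [[0.10, 0.20],
        [0.30, 0.20]]
print(closest_cell(grid, 0.29))
print(closest_cell(grid, 0.20))
```

Because the indices are 0-based, they line up directly with goal counts that start at 0.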

XLOOKUP Formula Issue Horizontal & vertical lookup

I am trying to pull data that has a vertical and horizontal lookup. The Vertical lookup provides the product that I'm trying to match and the horizontal lookup is for the dates. I have this lookup working across two sheets. It seems to be working okay, except that lookup for some reason pulls the value beneath the right value. This is really weird. Can someone please take a look at the formula and let me know what the issue might be?
$E59 contains the lookup value, and column A of Forecast2 contains the data that should match $E59. On the main page, $N$1 contains the date; in Forecast2, $G$4:$O$4 contains the date headers and $G$5:$O$52 contains the actual data.
My formula currently performs the lookup, but instead of bringing in the right value, it brings in the value beneath the right value. To give an example Row 5 in the Forecast2 sheet is the match that $E59 corresponds to, yet for some reason it's giving me the value from Row 6. If I then change the value of $E59 to match with what row 6's value is - the formula produces the value for row 7?
=XLOOKUP($E59,Forecast2!$A$4:$A$51,XLOOKUP('Main Page'!N$1,Forecast2!$G$4:$O$4,Forecast2!$G$5:$O$52))
Actually, your ranges are misaligned. The lookup range A4:A51 starts one row above the return range G5:O52, so every match lands one row too low. To correct it, make sure your values are in A5:A52 and then refer to A5:A52 in place of A4:A51. Your formula then becomes
=XLOOKUP($E59,Forecast2!$A$5:$A$52,XLOOKUP('Main Page'!N$1,Forecast2!$G$4:$O$4,Forecast2!$G$5:$O$52))
Another method is to apply an offset of -1, but it is always better to fix the source rather than the result.
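The off-by-one effect can be reproduced with a tiny Python sketch. XLOOKUP finds the position of the key inside the lookup range and returns the item at the same position in the return range, so two ranges that start on different rows pair each key with its neighbour's value. The sheet contents below are invented for illustration.

```python
def lookup(key, lookup_cells, return_cells):
    """Return the item of return_cells at the position where key sits in lookup_cells."""
    return return_cells[lookup_cells.index(key)]


# Hypothetical sheet: row 4 is the header row, rows 5..7 hold the products
column_a_rows_4_to_7 = ["Product", "P1", "P2", "P3"]
data_rows_5_to_7 = [10, 20, 30]

# Misaligned: lookup range starts at row 4, return range starts at row 5
wrong = lookup("P1", column_a_rows_4_to_7[0:3], data_rows_5_to_7)

# Aligned: both ranges start at row 5
right = lookup("P1", column_a_rows_4_to_7[1:4], data_rows_5_to_7)

print(wrong, right)
```

The misaligned call returns the value belonging to P2, which matches the "value beneath the right value" symptom described in the question.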

How do I display all my results for a specific search. When the specific search has multiples results. Not using vLookup

I have a sheet with 2 columns. ID and SearchTerm
ID has the same ID# for multiple SearchTerms.
I am trying to search for, for example, ID# 25 and then show all of its results on a separate sheet, without having to search for the ID number and then copy and paste the column.
I tried doing a VLOOKUP, but it only gives me back the first SearchTerm for the ID.
For only 7225 rows of data, an array formula isn't too bad, speed-wise (enter it as an array formula with Ctrl+Shift+Enter in a range that is 100 rows long and one column wide):
=INDEX(B1:B7225,SMALL(IF(A1:A7225=4,ROW(A1:A7225)),ROW(INDIRECT("1:100"))))
Change the 4 to the desired search value (or a cell with the desired search value). You can get more/less than 100 results by changing the 100.
I just tested it against a non-array version that you fill down, e.g.,
=INDEX(B1:B7225,AGGREGATE(14,6,ROW(B$1:B$7225)/((A$1:A$7225)=4),ROW(A1)))
and the array version is more than an order of magnitude faster.
Let's assume your search ID is in E4 and you want your search results to be in F4:F21. In F4, place the following formula and copy it down as many rows as you think you might need.
=INDEX(B:B,AGGREGATE(14,6,ROW(B$1:B$7225)/((A$1:A$7225)=$E$4),ROW(A1)))
I was going to add a caveat about not using full column references within the AGGREGATE function, because it performs array calculations and that will slow things down, but I believe Scott Craner's comment covered that.
Having said all that I believe using filters is the faster approach.
UPDATE
To avoid errors being displayed once the matches run out, wrap the whole thing in an IFERROR function:
=IFERROR(INDEX(B:B,AGGREGATE(14,6,ROW(B$1:B$7225)/((A$1:A$7225)=$E$4),ROW(A1))),"")
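The idea behind these formulas is "give me the k-th row whose ID matches, or nothing if there are fewer than k matches". A minimal Python sketch of that idea follows; the function name and sample data are hypothetical. It returns matches top-down, like the SMALL-based array formula. Note that AGGREGATE option 14 is LARGE, so the fill-down formulas above list matches from the bottom up; option 15 (SMALL) would give the top-down order.

```python
def kth_match(ids, terms, wanted, k):
    """Return the term of the k-th row (1-based, top-down) whose id equals wanted, else ''."""
    rows = [i for i, id_ in enumerate(ids) if id_ == wanted]
    if 1 <= k <= len(rows):
        return terms[rows[k - 1]]
    return ""  # plays the role of IFERROR(..., "")


# Hypothetical two-column sheet
ids = [25, 4, 25, 7, 25]
terms = ["apple", "pear", "plum", "fig", "kiwi"]

results = [kth_match(ids, terms, 25, k) for k in range(1, 6)]
print(results)
```

Filling the formula down corresponds to increasing k; rows beyond the last match come back empty.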

I want to use sumproduct with two different tables based on selection

I am working on a statistical model where we use sumproduct to generate forecast values by multiplying coefficients in one table with variables in another. Right now it is being done manually and that is taking time. I would like to automate it but I'm not able to figure this out.
We are using concatenate to identify different rows to use for vlookup. The variable columns are the same in number for both tables. I need to multiply each variable cell respectively in both tables and sum them, hence sumproduct.
This is what I am trying to do:
Forecast model 1 sales for product A in phones in USA = sumproduct([variables by year from table 1 for USA for phones], [Variables for USA phone product A model 1 from table 2] )
I hope someone can help me.
Proof of Concept
You will need to update the references to suit your spreadsheet table locations.
In cell E21 use the following and copy right and down as required:
=SUMPRODUCT(INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0),INDEX($F$15:$H$18,MATCH($A21&$C21&$D21&MID(E$20,16,1),$A$15:$A$18,0),0))
This process was simplified because you had a unique ID tag on each of the previous two tables that could be built from the information in the third table. If you ever get into double digit forecast models the MID() function part of the formula will need to be modified. The 16 in the mid function refers to the character location of the number in the forecast model sales header name in Table 3. As such you either need to keep that header format exactly the same or modify the position of the number in the MID() function.
UPDATE 1
Explanation of Formulas
The following functions were used in this solution:
SUMPRODUCT
INDEX
MATCH
MID
Concatenate
I will start with the assumption that you already understand SUMPRODUCT(), as you were already using it before you ran into your problem. One thing to note about SUMPRODUCT is that it causes array-like calculation to occur on everything within its brackets. In this case we fed it two ranges of equal size. The difficult part was determining those ranges.
Using your ID columns as a lookup row we used the match() function to determine which row to use. For the first set of variables we used the following to determine which row to look in:
=MATCH($B21&$A21&$C21,$A$3:$A$12,0)
Match is made up of three arguments inside the brackets:
MATCH(what to look for, where to look, type of match)
What we need to look for in table is a concatenation of various cells in Table 3 to build the ID in Table 1. It could have been written using the full formula:
=CONCATENATE($B21,$A21,$C21)
but the short form using & was used instead:
=$B21&$A21&$C21
Once we had what to look for we needed the range of where to look and supplied the ID column from table 1:
$A$3:$A$12
This now leaves the third and final argument of what type of search to perform. An exact match seemed to be the most appropriate match to perform so the value of 0 was supplied. What match returns is the row within the supplied range. It is relative to the range supplied and not the actual row in the spreadsheet. If it cannot make a match it will return an error instead of a row number.
Now that we know what row we want, we can use this information with the INDEX() function. The INDEX() function is made up of 3 arguments as well with the third argument being optional depending on if a 1D or 2D range is being indexed:
INDEX(Range to work with, 2D Row or 1D Position reference, 2D Column reference)
In the case of the first table, the range to work with is your list of variables:
$G$3:$I$12
This is a 2D range. As such we need to tell INDEX() both what Row to look in as well as which Columns to look in. For the row to look in, we used the previously discussed MATCH() function. Since we want all columns and not just a specific column we use the value of 0. If Match returns an error, or if a number greater than the number of rows or columns selected is supplied, INDEX() will return an error. Based on the information discussed, the index function would look like:
=INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0)
You can try entering the above in a single cell, but it will give you an error. If you select three adjacent cells in the same row and press CONTROL+SHIFT+ENTER when entering the formula, Excel will add {} around the formula to make it an array formula, and it should show you the three variables being used.
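The MATCH/INDEX pairing described above can be sketched in Python as follows. This is only an illustration under assumptions: the ID strings and variable values are invented, the helper names are hypothetical, and the comparison is case-sensitive whereas Excel's MATCH is not.

```python
def match_exact(key, keys):
    """Mimic MATCH(key, keys, 0): 1-based position within keys, or an error if absent."""
    try:
        return keys.index(key) + 1
    except ValueError:
        raise LookupError(f"{key!r} not found") from None


def index_row(table, row_number):
    """Mimic INDEX(table, row_number, 0): return every column of the given 1-based row."""
    if not 1 <= row_number <= len(table):
        raise IndexError("row_number is outside the table")
    return table[row_number - 1]


# Hypothetical ID column (built by concatenation) and three variable columns
ids = ["USAphones2020", "USAphones2021", "UKphones2020"]
variables = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]

key = "USA" + "phones" + "2021"       # same idea as $B21&$A21&$C21
row_number = match_exact(key, ids)    # relative to the range, not the sheet
selected = index_row(variables, row_number)
print(row_number, selected)
```

As in Excel, the position returned by the match is relative to the supplied range, and a failed match surfaces as an error rather than a row number.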
The same process can be used to determine the second range of variables from Table 2. The only difference is that the forecast model number is not in a column of its own but sits in the header row, surrounded by text. The MID() function is therefore needed to go into the header, skip the surrounding text, and pull the model number out so it can be used as part of the concatenation that forms the "what to look for" argument of MATCH():
=MID(E$20,16,1)
The MID() function also works with three arguments:
MID(Text to look in, which character to start at, how many characters to pull)
So in this case we are looking at the header in E20. Note the lock $ on the row number, so the formula always looks in row 20 no matter how far down it gets copied. It then goes to the 16th character, in this case the character "1", and pulls 1 character. If the headers had just been 1 and 2, there would be no need for the MID function and the cell (with the proper lock) could have been used directly.
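To tie the pieces together, here is a small Python sketch of MID and SUMPRODUCT. The header text "Forecast model 1 sales" is an assumed example of the Table 3 header format, and the coefficient and variable values are invented; the helper names are hypothetical.

```python
def mid(text, start, length):
    """Mimic MID(text, start, length) with a 1-based start position."""
    return text[start - 1:start - 1 + length]


def sumproduct(xs, ys):
    """Multiply two equally sized ranges element by element and sum the products."""
    if len(xs) != len(ys):
        raise ValueError("ranges must be the same size")
    return sum(x * y for x, y in zip(xs, ys))


# Assumed header format; the model number is the 16th character
model = mid("Forecast model 1 sales", 16, 1)

# With a two-digit model number, pulling one character is no longer enough
truncated = mid("Forecast model 12 sales", 16, 1)
full = mid("Forecast model 12 sales", 16, 2)

# Invented coefficients and variables for one forecast row
forecast = sumproduct([0.5, 2.0, 1.0], [10.0, 3.0, 4.0])
print(model, truncated, full, forecast)
```

The second call shows why the MID() part needs adjusting once model numbers reach double digits: a single character returns only the first digit.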

Using SUMIFS with multiple AND OR conditions

I would like to create a succinct Excel formula that SUMS a column based on a set of AND conditions, plus a set of OR conditions.
My Excel table contains the following data and I used defined names for the columns.
Quote_Value (Worksheet!$A:$A) holds an accounting value.
Days_To_Close (Worksheet!$B:$B) contains a formula that results in a number.
Salesman (Worksheet!$C:$C) contains text and is a name.
Quote_Month (Worksheet!$D:$D) contains a formula (=TEXT(Worksheet!$E:$E,"mmm-yy"))to convert a date/time number from another column into a text based month reference.
I want to SUM Quote_Value if Salesman equals JBloggs and Days_To_Close is equal to or less than 90 and Quote_Month is equal to one of the following (Oct-13, Nov-13, or Dec-13).
At the moment, I've got this to work but it includes a lot of repetition, which I don't think I need.
=SUM(SUMIFS(Quote_Value,Salesman,"=JBloggs",Days_To_Close,"<=90",Quote_Month,"=Oct-13")+SUMIFS(Quote_Value,Salesman,"=JBloggs",Days_To_Close,"<=90",Quote_Month,"=Nov-13")+SUMIFS(Quote_Value,Salesman,"=JBloggs",Days_To_Close,"<=90",Quote_Month,"=Dec-13"))
What I'd like to do is something more like the following but I can't work out the correct syntax:
=SUMIFS(Quote_Value,Salesman,"=JBloggs",Days_To_Close,"<=90",Quote_Month,OR(Quote_Month="Oct-13",Quote_Month="Nov-13",Quote_Month="Dec-13"))
That formula doesn't error, it just returns a 0 value. Yet if I manually examine the data, that's not correct. I even tried using TRIM(Quote_Month) to make sure that spaces hadn't crept into the data but the fact that my extended SUM formula works indicates that the data is OK and that it's a syntax issue. Can anybody steer me in the right direction?
You can use SUMIFS like this
=SUM(SUMIFS(Quote_Value,Salesman,"JBloggs",Days_To_Close,"<=90",Quote_Month,{"Oct-13","Nov-13","Dec-13"}))
The SUMIFS function will return an "array" of 3 values (one total each for "Oct-13", "Nov-13" and "Dec-13"), so you need SUM to sum that array and give you the final result.
Be careful with this syntax: you can have at most two criteria with "OR" conditions in the formula, and if there are two, the array for one must be separated with commas and the array for the other with semicolons.
If you need more you might use SUMPRODUCT with MATCH, e.g. in your case
=SUMPRODUCT(Quote_Value,(Salesman="JBloggs")*(Days_To_Close<=90)*ISNUMBER(MATCH(Quote_Month,{"Oct-13","Nov-13","Dec-13"},0)))
In that version you can add any number of "OR" criteria using ISNUMBER/MATCH
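For readers who want to check the AND/OR logic independently of Excel, the sketch below reproduces both approaches in Python on invented sample rows: one total per month that is then summed (as with SUM(SUMIFS(...{...}))), and a single pass with a membership test (as with SUMPRODUCT and ISNUMBER/MATCH). The row data and variable names are assumptions for illustration.

```python
# Hypothetical rows: (quote_value, days_to_close, salesman, quote_month)
rows = [
    (100, 30, "JBloggs", "Oct-13"),
    (200, 95, "JBloggs", "Nov-13"),   # fails the <=90 condition
    (300, 60, "JBloggs", "Dec-13"),
    (400, 10, "ASmith", "Oct-13"),    # wrong salesman
    (500, 45, "JBloggs", "Jan-14"),   # month not in the OR list
]
months = ["Oct-13", "Nov-13", "Dec-13"]

# One total per month, like the array that SUMIFS returns for {"Oct-13","Nov-13","Dec-13"}
per_month = [
    sum(v for v, d, s, m in rows if s == "JBloggs" and d <= 90 and m == month)
    for month in months
]
total_from_array = sum(per_month)

# Single pass with a membership test, like ISNUMBER(MATCH(...))
total_single_pass = sum(
    v for v, d, s, m in rows if s == "JBloggs" and d <= 90 and m in months
)
print(per_month, total_from_array, total_single_pass)
```

Both routes give the same total because each row belongs to at most one month, so nothing is counted twice.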
You can use DSUM, which is more flexible. If you want to change the Salesman name or the Quote Month, you do not need to change the formula, only some criteria cells. Please see the link below for details. The criteria can even be formulas, or be copied from other sheets.
http://office.microsoft.com/en-us/excel-help/dsum-function-HP010342460.aspx?CTT=1
You might consider referencing the actual date/time in the source column for Quote_Month; you could then turn your OR into a pair of AND conditions, something like this (assuming the date is in a range I've chosen to call Quote_Date):
=SUMIFS(Quote_Value,Quote_Date,">="&DATE(2013,10,1),Quote_Date,"<="&DATE(2013,12,31),Salesman,"=JBloggs",Days_To_Close,"<=90")
(I moved the interesting conditions to the front.)
This approach works here because that "OR" condition is actually specifying a date range - it might not work in other cases.
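A short Python sketch of the date-range idea, on invented rows, shows why the lower bound has to be the first day of October for the three months in the question. The function and variable names are hypothetical. If the real column carries a time of day as well as a date, an upper bound of "before 1 January 2014" may be the safer comparison; that is an assumption about the data, not something the question states.

```python
from datetime import date

# Hypothetical rows: (quote_value, days_to_close, salesman, quote_date)
quotes = [
    (100, 30, "JBloggs", date(2013, 10, 1)),
    (200, 95, "JBloggs", date(2013, 11, 15)),  # fails the <=90 condition
    (300, 60, "JBloggs", date(2013, 12, 31)),
    (400, 10, "ASmith", date(2013, 10, 20)),   # wrong salesman
    (500, 45, "JBloggs", date(2014, 1, 2)),    # after the range
    (600, 20, "JBloggs", date(2013, 9, 30)),   # before the range
]


def sum_in_range(rows, salesman, max_days, start, end):
    """Sum values where all AND conditions hold and start <= quote_date <= end."""
    return sum(
        value
        for value, days, who, quoted in rows
        if who == salesman and days <= max_days and start <= quoted <= end
    )


q4_total = sum_in_range(quotes, "JBloggs", 90, date(2013, 10, 1), date(2013, 12, 31))
nov_dec_only = sum_in_range(quotes, "JBloggs", 90, date(2013, 11, 1), date(2013, 12, 31))
print(q4_total, nov_dec_only)
```

Starting the range in November silently drops the October quote, which is easy to miss when the totals are not checked by hand.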
You can express OR by adding the conditions together with + inside SUMPRODUCT:
=SUMPRODUCT((Quote_Value)*(Salesman="JBloggs")*(Days_To_Close<=90)*((Quote_Month="Cond1")+(Quote_Month="Cond2")+(Quote_Month="Cond3")))
Speed
SUMPRODUCT is faster than SUM arrays, i.e. having {} arrays in the SUM function. SUMIFS is 30% faster than SUMPRODUCT.
{SUM(SUMIFS({}))} and SUMPRODUCT(SUMIFS({})) both work fine, but SUMPRODUCT feels a bit easier to write because it does not need CTRL-SHIFT-ENTER to create the {}.
Preference
I personally prefer writing SUMPRODUCT(--(ISNUMBER(MATCH(...)))) over SUMPRODUCT(SUMIFS({})) for multiple criteria.
However, if you have a drop-down menu where you want to select either a specific characteristic or all of them, SUMPRODUCT(SUMIFS()) is the only way to go. (To select "all", the criterion should be "<>" followed by any word that is not one of the specific characteristics.)
To get the formula to work, place the cursor inside the formula and press Ctrl+Shift+Enter.
With the following, it is easy to link to cell addresses:
=SUM(SUMIFS(FAGLL03!$I$4:$I$1048576,FAGLL03!$A$4:$A$1048576,">="&INDIRECT("A"&ROW()),FAGLL03!$A$4:$A$1048576,"<="&INDIRECT("B"&ROW()),FAGLL03!$Q$4:$Q$1048576,E$2))
You can use the ADDRESS, SUBSTITUTE, and COLUMN functions as required to make the cell addresses fully dynamic.
