This question builds on my previous post about how to identify and output multiple instances of a found value. My previous post:
Array to lookup multiple columns and take another columns information
This question is slightly different: what if I wanted to sum the values found in an array? I have presented an example below. The table in I2:J4 is the reference table, which assigns a value to each Plant #. Cells B3:F9 are already filled in, and the goal is to sum the value of each distinct plant in the row, counting it once. Column A will show the solution for each row. The formula should allow the column to be populated automatically.
For example, I have filled in the answer for the first three rows of Column A. In Row 3, we see that there is a Plant 1002B1 and a Plant 1003B1, so the answer is 200+300. We neglect any additional instances of a Plant #. For example, rows 4 and 5 both have a sum of 100, regardless of how many instances of the same plant are found.
I found myself trying to use INDEX, but this returns TRUE/FALSE values, which can't be summed directly. The alternative would be to use the TRUE/FALSE results to locate the value in the reference table and sum from there, which I don't know how to do.
Use this array formula
=SUM(IF(B2:F2<>"",IF(MATCH(B2:F2,B2:F2,0)=COLUMN(B2:F2)-MIN(COLUMN(B2:F2))+1,SUMIFS(J:J,I:I,B2:F2))))
Being an array formula, it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
If you have the new dynamic array formulas (currently only available to Office 365 Insiders), it becomes much simpler to get an array of unique entries:
=SUMPRODUCT(SUMIFS(J:J,I:I,UNIQUE(TRANSPOSE(B2:F2))))
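To cross-check what both formulas are meant to compute, here is a minimal Python sketch of the same logic. It is not Excel code, and the plant code `1001B1` with value 100 is an assumed example; only 1002B1 (200) and 1003B1 (300) come from the question.

```python
# Sketch of the logic behind the formulas above (not Excel code).
# PLANT_VALUES stands in for the reference table I2:J4; the "1001B1" entry
# is an assumed example, the other two values come from the question.
PLANT_VALUES = {"1001B1": 100, "1002B1": 200, "1003B1": 300}


def row_total(cells, values=PLANT_VALUES):
    """Sum the value of each distinct plant in one row, counting it once."""
    total = 0
    for position, plant in enumerate(cells):
        if plant == "":
            continue  # mirrors B2:F2<>"" (blank cells are skipped)
        if cells.index(plant) != position:
            continue  # mirrors MATCH(...)=position (first occurrence only)
        total += values.get(plant, 0)  # SUMIFS yields 0 for an unknown plant
    return total


print(row_total(["1002B1", "1003B1", "1002B1", "", ""]))  # 500
print(row_total(["1001B1", "1001B1", "1001B1", "", ""]))  # 100
```

The first-occurrence test plays the role of the MATCH/COLUMN comparison in the array formula, and of UNIQUE in the dynamic array version.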
Related
I have a sheet with multiple parts belonging to different affiliates and areas. I want to write a formula that brings up the unique parts from this list. I created a formula for the data set to get the row number of each unique part, as shown below:
I want to pull just the rows whose value in column G begins with a 1. I tried the formula below, but it pulls all the rows in column A:
INDEX(DATA!$A:$A,MATCH(1&"-"&$A$2&"-"&DATA!A2,DATA!G:G,0))
It produces this result:
01949765
01949765
04581664AA
04581664AA
04581914AC
04581914AC
04581914AC
04581914AD
04581915AB
Below is what I want to see:
01949765
04581664AA
04581914AC
04581914AD
04581915AB
Any formula I can use to get just the unique values?
As I understand it, the row number in G already distinguishes the unique values, so you just have to pull anything in A whose G counterpart starts with 1.
=INDEX($A$1:$A$8,AGGREGATE(15,6,((LEFT($B$1:$B$8,1)="1")*1/(LEFT($B$1:$B$8,1)="1")*1)*ROW(INDIRECT("1:"&COUNTA($B$1:$B$8))),ROW()-10))
You'll have to adapt the different ranges to your data and the "-10" at the end depending on where you put your result.
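As a rough illustration of what the AGGREGATE/INDEX combination does, the following Python sketch picks the k-th entry whose key starts with "1". The key strings (such as `1-X-01949765`) are hypothetical stand-ins for the values in column G; the part numbers are taken from the question.

```python
def kth_match(col_a, col_b, k):
    """Return the k-th (1-based) entry of col_a whose col_b key starts with "1".

    Mirrors INDEX(A, AGGREGATE(15, 6, <row numbers of matching keys>, k)).
    Returns None where Excel would show an error (k out of range).
    """
    rows = [i for i, key in enumerate(col_b) if str(key).startswith("1")]
    if k < 1 or k > len(rows):
        return None
    return col_a[rows[k - 1]]


# Part numbers from the question; the keys are hypothetical examples.
col_a = ["01949765", "01949765", "04581664AA", "04581664AA",
         "04581914AC", "04581914AC", "04581914AC", "04581914AD"]
col_b = ["1-X-01949765", "2-X-01949765", "1-X-04581664AA", "2-X-04581664AA",
         "1-X-04581914AC", "2-X-04581914AC", "3-X-04581914AC", "1-X-04581914AD"]

print([kth_match(col_a, col_b, k) for k in range(1, 5)])
```

Note that testing only the first character also matches keys that begin with 10, 11 and so on, in the sketch just as in the formula.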
My proposal: add a helper column and put in
=COUNTIF($A$2:$A2,A2)
Then filter on the value 1. You'll see the results straight away.
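The helper column is a running count of each part, so keeping the rows where it equals 1 keeps the first occurrence of every part. A small Python sketch of that idea, using the part numbers from the question (the function name is only illustrative):

```python
from collections import Counter


def running_counts(parts):
    """Mirror =COUNTIF($A$2:$A2,A2): how often each part has appeared so far."""
    seen = Counter()
    counts = []
    for part in parts:
        seen[part] += 1
        counts.append(seen[part])
    return counts


parts = ["01949765", "01949765", "04581664AA", "04581664AA", "04581914AC",
         "04581914AC", "04581914AC", "04581914AD", "04581915AB"]

counts = running_counts(parts)
# "Filter on the value 1": keep only the first occurrence of each part.
unique_parts = [part for part, count in zip(parts, counts) if count == 1]
print(unique_parts)
```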
As for the part names in G, you can load them with the INDEX() and MATCH() functions, e.g. =INDEX(G:G,MATCH(A2,A:A,0)).
Hope it helps.
I've got a column which consists of numbers of the form 000.XX.XX
=COUNTIFS(temporary!$A1:$A200,">=000.11.35",temporary!$A1:$A200,"<=000.11.39")
This formula counts the values between 000.11.35 and 000.11.39, but I want to count only unique values. How can I do this?
There is no built-in function for this, as you can see from the several suggestions for accomplishing it on the Office support site. If you can switch to Google Sheets, it has a COUNTUNIQUE() function.
As described at the link provided above, identify the unique items, either using a filter (this is static) or through repeated use of the FREQUENCY() function. Then count the unique items in a separate step.
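The two steps (identify the unique items, then count them) can be sketched in Python as follows. This is only a model of the logic, and the sample values are made up. It assumes every code has the same zero-padded 000.XX.XX width, so that plain string comparison orders the codes correctly.

```python
def count_unique_between(values, low, high):
    """Count the distinct values v with low <= v <= high.

    Assumes all codes share the fixed-width 000.XX.XX form, so comparing
    them as strings gives the same order as comparing them as numbers.
    """
    unique_in_range = {v for v in values if low <= v <= high}  # step 1: unique items
    return len(unique_in_range)                                # step 2: count them


# Hypothetical sample data in the 000.XX.XX form.
codes = ["000.11.35", "000.11.36", "000.11.36",
         "000.11.39", "000.11.40", "000.10.99"]

print(count_unique_between(codes, "000.11.35", "000.11.39"))  # 3
```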
Let's say you have the data set in the first column. First you need to remove the repetitions in a second column with the following array formula (confirm the formula with Ctrl+Shift+Enter):
=IF(SUM((A2=$A$2:A2)*1)>1,"",A2)
This formula lists only the unique values.
I would then remove the first 4 digits of your string to create a float number and count the values with the following array formula:
=SUM((IF((RIGHT($B$2:$B$14,4)>=RIGHT(G3,4))*(RIGHT($B$2:$B$14,4)<=RIGHT(G4,4)),$B$2:$B$14,"")<>"")*1)
Please look at the two images for clarification:
[image: view with formulas]
[image: normal printscreen]
I am working on a statistical model where we use sumproduct to generate forecast values by multiplying coefficients in one table with variables in another. Right now it is being done manually and that is taking time. I would like to automate it but I'm not able to figure this out.
We are using concatenate to identify different rows to use for vlookup. The variable columns are the same in number for both tables. I need to multiply each variable cell respectively in both tables and sum them, hence sumproduct.
This is what I am trying to do:
Forecast model 1 sales for product A in phones in USA = sumproduct([variables by year from table 1 for USA for phones], [Variables for USA phone product A model 1 from table 2] )
I hope someone can help me.
Proof of Concept
You will need to update the references to suit your spreadsheet table locations.
In cell E21 use the following and copy right and down as required:
=SUMPRODUCT(INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0),INDEX($F$15:$H$18,MATCH($A21&$C21&$D21&MID(E$20,16,1),$A$15:$A$18,0),0))
This process was simplified because you had a unique ID tag on each of the previous two tables that could be built from the information in the third table. If you ever get into double digit forecast models the MID() function part of the formula will need to be modified. The 16 in the mid function refers to the character location of the number in the forecast model sales header name in Table 3. As such you either need to keep that header format exactly the same or modify the position of the number in the MID() function.
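To make the idea concrete, here is a minimal Python sketch of the same lookup-then-multiply logic. The IDs and numbers are invented for illustration and do not come from the original tables; the dictionaries simply stand in for Table 1 and Table 2 keyed by their ID columns.

```python
def forecast(table1, table2, id1, id2):
    """SUMPRODUCT of the Table 1 row with ID id1 and the Table 2 row with ID id2.

    A missing ID raises KeyError, much as MATCH would return #N/A in Excel.
    """
    row1 = table1[id1]  # like INDEX($G$3:$I$12, MATCH(id1, $A$3:$A$12, 0), 0)
    row2 = table2[id2]  # like INDEX($F$15:$H$18, MATCH(id2, $A$15:$A$18, 0), 0)
    if len(row1) != len(row2):
        raise ValueError("both rows must have the same number of variables")
    return sum(a * b for a, b in zip(row1, row2))


# Hypothetical IDs and values, for illustration only.
table1 = {"2020USAPhones": [1.0, 2.0, 3.0]}   # variables by year
table2 = {"USAPhonesA1": [0.5, 0.25, 2.0]}    # coefficients for product A, model 1

print(forecast(table1, table2, "2020USAPhones", "USAPhonesA1"))  # 7.0
```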
UPDATE 1
Explanation of Formulas
The following formulas were used in this solution:
SUMPRODUCT
INDEX
MATCH
MID
Concatenate
I will start with the assumption that you already understand SUMPRODUCT(), as you were already using it before you ran into your problem. One thing to note about SUMPRODUCT is that it causes array-like calculation to occur on the portion within its brackets. In this case we fed it two ranges of equal size. The difficult part was more an issue of determining those ranges.
Using your ID columns as a lookup row we used the match() function to determine which row to use. For the first set of variables we used the following to determine which row to look in:
=MATCH($B21&$A21&$C21,$A$3:$A$12,0)
Match is made up of three arguments inside the brackets:
MATCH(what to look for, where to look, type of match)
What we need to look for is a concatenation of various cells in Table 3 that builds the ID used in Table 1. It could have been written using the full formula:
=CONCATENATE($B21,$A21,$C21)
but the short form using & was used instead:
=$B21&$A21&$C21
Once we had what to look for we needed the range of where to look and supplied the ID column from table 1:
$A$3:$A$12
This now leaves the third and final argument of what type of search to perform. An exact match seemed to be the most appropriate match to perform so the value of 0 was supplied. What match returns is the row within the supplied range. It is relative to the range supplied and not the actual row in the spreadsheet. If it cannot make a match it will return an error instead of a row number.
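The point about the result being relative to the supplied range can be illustrated with a short Python sketch. The ID strings are placeholders, and unlike Excel's MATCH, which ignores case for text, this sketch compares values exactly.

```python
def match_exact(needle, haystack):
    """Return the 1-based position of needle within haystack (match type 0).

    Raises LookupError where Excel would return #N/A.
    """
    for position, value in enumerate(haystack, start=1):
        if value == needle:
            return position
    raise LookupError(f"{needle!r} not found")


# Placeholder IDs standing in for part of the ID column that starts at A3.
ids = ["id-a", "id-b", "id-c"]
first_sheet_row = 3

position = match_exact("id-b", ids)          # position inside the range
sheet_row = first_sheet_row + position - 1   # actual row on the sheet
print(position, sheet_row)  # 2 4
```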
Now that we know what row we want, we can use this information with the INDEX() function. The INDEX() function is made up of 3 arguments as well with the third argument being optional depending on if a 1D or 2D range is being indexed:
INDEX(Range to work with, 2D Row or 1D Position reference, 2D Column reference)
In the case we are dealing with for the first table, the range to work with was your list of variables:
$G$3:$I$12
This is a 2D range. As such we need to tell INDEX() both what Row to look in as well as which Columns to look in. For the row to look in, we used the previously discussed MATCH() function. Since we want all columns and not just a specific column we use the value of 0. If Match returns an error, or if a number greater than the number of rows or columns selected is supplied, INDEX() will return an error. Based on the information discussed, the index function would look like:
=INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0)
You can try entering the above in a cell, but it will give you an error. If you select three adjacent cells in the same row and use CONTROL+SHIFT+ENTER when entering the formula, Excel will add {} around the formula, making it an array formula, and it should show you the three variables being used.
The same process as described above can be used for determining the second range of variables from Table 2. The only difference here is that the forecast model number was not in a column of its own but instead in the header row, surrounded by text. As such, the MID() function needed to be used to go into the header row, bypass the surrounding text and pull the model number out so it could be used as part of the concatenation used for the "what to look for" in MATCH():
=MID(E$20,16,1)
The MID() function again works with three arguments:
MID(Text to look in, which character to start at, how many characters to pull)
So in this case we are looking at the header in E20. Note the lock $ on the row number so the formula is always looking in row 20 no matter how far down it gets copied. It then goes to the 16th character, in this case the character "1", and pulls 1 character. If the header had just been 1 and 2, there would be no need for the MID function and the cell (with proper lock) could have been used.
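A quick Python sketch of MID() shows both the extraction and the double-digit caveat mentioned earlier. The header text `Forecast model 1 sales` is an assumption based on the wording in the question; the actual header in E20 is not shown.

```python
def mid(text, start, length):
    """Mirror Excel's MID(): start is 1-based, length is the number of characters."""
    return text[start - 1:start - 1 + length]


# Assumed header texts; the real header in E20 may differ.
print(mid("Forecast model 1 sales", 16, 1))   # "1"
print(mid("Forecast model 12 sales", 16, 1))  # "1" (the second digit is lost)
print(mid("Forecast model 12 sales", 16, 2))  # "12"
```

With a double-digit model number, a fixed length of 1 silently drops the second digit, which is why the MID() part would need to be modified in that case.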
I don't know if I'm going about this the wrong way, but it seems like it should be simple. Column A has a list of names. Along each row are several "W"s. Another separate field has a drop-down representing the Column A names. I want to count the number of "W"s in the row corresponding to the name I select. I've tried using VLOOKUP and COUNTIF, but I can't figure out how to select the entire array and then single out the one row that matches my selected name. I can get it working with a bunch of IF statements, but that's far too time-consuming as I'm manually matching the name to the row (and it isn't future-proof).
There are a few ways to first 'narrow in' on the row you're looking for, after which point you can use a simple COUNTIFS to check the number of W's in that row.
One method would be to simply use INDIRECT, and create the row reference on the fly, like so [assumes your search cell is C1]:
=COUNTIFS(INDIRECT(MATCH(C1,A:A,0)&":"&MATCH(C1,A:A,0)),"W")
This first uses MATCH to find the appropriate row, and then builds a reference to that row [like "24:24"], which becomes the row that INDIRECT passes to COUNTIFS, which counts that row for W's.
For only one use of INDIRECT, the high computing costs of INDIRECT should not be an issue.
Another method would be to point out the full possible box that data could be contained in [let's assume that at most only column H would be used], and then use INDEX to give us the appropriate row number, like so:
=COUNTIFS(INDEX(A:H,MATCH(C1,A:A,0),0),"W")
This again uses MATCH to find the row which contains the value found in C1 within column A. Then it takes the full possible box from INDEX, and returns all columns from the particular row [note that telling index to return 0 for the column # actually returns all columns instead].
Other methods would be possible [for example OFFSET], but I believe these two show the principle fairly well.
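The principle shared by both methods (find the row first, then count within it) can be sketched in Python as below. The names and cell values are invented for illustration, and the sketch compares cells exactly, whereas COUNTIFS ignores case.

```python
def count_w(rows, name):
    """Find the row whose first cell equals name, then count its "W" cells.

    Returns None when no row matches (Excel's MATCH would return #N/A).
    """
    for row in rows:
        if row and row[0] == name:                        # the MATCH step
            return sum(1 for cell in row[1:] if cell == "W")  # the COUNTIFS step
    return None


# Hypothetical sheet: column A holds the name, the rest of the row holds results.
rows = [
    ["Ann", "W", "", "W", "L"],
    ["Bob", "W", "W", "W", ""],
]

print(count_w(rows, "Bob"))  # 3
print(count_w(rows, "Ann"))  # 2
```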
You could use the "Helper" Column method:
In the helper column:
=COUNTIF(B2:H2,"W")
Then use SUMIF() in the totals column:
=SUMIF($A$2:$A$9,K2,$I$2:$I$9)
How can I compare records in a table to make sure they are not duplicates? I am using Excel 2007, and I don't want the duplicates deleted after the comparison.
Duplicate rows should be colored. My table has columns A to P and 500 rows. I want to apply the condition to columns A, B, E, F, G, and I.
If you don't want to sort your column, you can try with a matrix formula (http://www.stanford.edu/~wfsharpe/mia/mat/mia_mat4.htm).
Practically, you can compare your current row to every row above. Something like:
=MIN(ROW(B1)*(IF(A2=A1;1;0))*(IF(B2=B1;1;0)))*(...)
Validated with CTRL-SHIFT-ENTER, it checks whether all the conditions are true and otherwise returns 0.
Please send a file (with anonymous data) if you want a practical example.
Hope that helps
Edit: here is a better solution, provided you want to compare the data in column Q (LIGNE and EQUIV are the French-locale names of ROW and MATCH):
=MIN(LIGNE($Q$5:Q6)*EQUIV(Q6;$Q$5:Q6;0))
The formula above returns the first line where the value appears.
=MIN(LIGNE($Q$5:Q5)*EQUIV(Q6;$Q$5:Q5;0))
This second version returns #N/A instead if there is no duplicate before that line.
Still validate with CTRL-SHIFT-ENTER.
Sort by the columns you are interested in then use a formula to compare each row with the one above. You can then use conditional formatting to colour the results.
I may sound stupid here, but the simple answers are usually the best.
I did this recently, by literally using the CONCATENATE() function with the TEXT() function to combine all the columns I wanted to compare into a single cell. So in effect I am creating a cell with a unique "key" that holds all the data I want to be unique.
I then sort that column and create another empty column next to it.
Then use this formula to compare the row with the row above it: =IF(A2=A1,0,1)
This simply puts a 0 where the row is the same as the one above and a 1 where it's different.
I then filter on the 0s, and there are my duplicates (the 1s mark the first occurrence of each key)!
It's also useful as an alternative way of doing a COUNT(DISTINCT ...), where I want to count how many unique references exist in my data. SUBTOTAL(3...) is not enough.
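Here is a small Python sketch of the key-column approach: build one key per record, sort the keys, and flag each key against the one above it. The records are made up, only three of the compared columns are shown, and the `|` separator is my own addition to keep values such as "ab"+"c" and "a"+"bc" from producing the same key.

```python
def flag_sorted_keys(records, columns):
    """Build one key per record, sort the keys, and flag each against the one above.

    Flag 1 = differs from the key above (first occurrence), 0 = same as above,
    mirroring =IF(A2=A1,0,1) on the sorted key column.
    """
    keys = sorted("|".join(str(rec[col]) for col in columns) for rec in records)
    flags = [1 if i == 0 or key != keys[i - 1] else 0
             for i, key in enumerate(keys)]
    return keys, flags


# Hypothetical records, reduced to three of the compared columns.
records = [
    {"A": "x", "B": 1, "E": "k"},
    {"A": "y", "B": 2, "E": "k"},
    {"A": "x", "B": 1, "E": "k"},
]

keys, flags = flag_sorted_keys(records, ["A", "B", "E"])
duplicates = [key for key, flag in zip(keys, flags) if flag == 0]
distinct_count = sum(flags)  # the COUNT(DISTINCT ...) equivalent
print(duplicates, distinct_count)
```

Summing the flags gives the number of distinct keys, which is the COUNT(DISTINCT ...) use mentioned above.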