For my question I am trying to reduce a very large amount of data using the =countif function in regards to a specific Employee ID (using =vlookup).
In Column 'A' I have every employee ID listed only once. In columns B, C, D, E, and F I would like to count every time that employee has been Hired, Promoted, received a Pay Increase, been Demoted and Fired, respectively.
In column I, I again have a list of employee IDs, and in column J the action that was taken each time.
Since there are more than 10,000 employee entries in column I, I am trying to condense these down to numeric values in columns B:F.
ACTUAL QUESTION: Is there any way to 'nest' these two functions in order to get the results I want?
Thanks in advance.
You can use COUNTIFS, which accepts multiple conditions (unlike COUNTIF, which takes only one).
Consider the following screenshot. The formula in cell B2 is
=COUNTIFS($I:$I,$A2,$J:$J,B$1)
Copy across and down. Note the position of the $ signs. They are important. The column references for columns I and J are absolute, and will not change when the formula is copied across. The reference to $A2 will always refer to column A, but the row will adjust when copied down. The reference to B$1 will always refer to row 1, but the column will adjust when the formula is copied across.
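If you want to sanity-check the logic outside Excel, here is a rough Python sketch of what the COUNTIFS grid computes. The employee IDs and the sample log are made up for illustration; only the two-condition counting mirrors the formula.

```python
from collections import Counter

# Hypothetical sample of columns I (employee ID) and J (action).
log = [
    ("E001", "Hired"),
    ("E002", "Hired"),
    ("E001", "Promoted"),
    ("E001", "Pay Increase"),
    ("E002", "Fired"),
    ("E001", "Promoted"),
]

# Row 1 headers (B1:F1) and the unique IDs from column A.
actions = ["Hired", "Promoted", "Pay Increase", "Demoted", "Fired"]
employee_ids = ["E001", "E002"]

# One pass over the log counts every (ID, action) pair.
pair_counts = Counter(log)

# Each list entry plays the role of one COUNTIFS($I:$I,$A2,$J:$J,B$1) cell.
summary = {
    emp: [pair_counts[(emp, action)] for action in actions]
    for emp in employee_ids
}

for emp, counts in summary.items():
    print(emp, counts)
```

A Counter returns 0 for pairs that never occur, which matches COUNTIFS returning 0 for an employee who was never, say, demoted.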
You can do a similar thing without any formulas at all, using a pivot table. Click a cell anywhere in the data in columns I or J, then click Insert > Pivot Table. In the pivot table pane that appears on the right, drag the Employee ID to the Rows area, drag the action to the Columns area and drag either of the fields to the Values area. The result looks like this:
Look Ma, no formulas!!
I am currently automating a dashboard creation and I've hit a bit of a roadblock. I need some code that will go through about 7000 rows of data and return the highest value in a certain column for each specific item. The data is copied from a pivot table and so is broken down into row sections, I have attached a mock of what it looks like.
I need the highest value in Column G for each portfolio, and will need to use the portfolio code (e.g. XY12345 - They are always 7 characters) to map that value to the dashboard.
My issue is that each portfolio has a different number of rows for the values, and some have blank cells between them, so I am stumped. I was hoping to use Column J to count the number of rows for each portfolio (as there are no breaks for the portfolios in this column), then loop through each portfolio's rows of values based on the Column J count and return the highest value for each portfolio. The problem is that I'm new to VBA and have been teaching myself as I go, and I've yet to use a loop.
Many thanks,
Harry
If I understand correctly, you're looking for the largest value in Column G.
I'm not sure why you think you would need VBA for this.
Get the maximum value of a column
You mentioned that you're concerned about each column not having the same number of cells, but that's irrelevant, as MAX ignores blank cells. So just "go long" and find the maximum of the entire column.
To return the largest number in Column G you could use worksheet formula :
=MAX(G:G)
The only catch is that you can't place that formula anywhere in column G, or else it would create a circular reference (the formula's range would include its own cell). Let's put that formula in cell F1 for now (but anywhere besides column G would do fine).
Find the location of a value
Now that you know the largest value, you can determine where it is using a lookup function such as MATCH or VLOOKUP. Like with so many things in Excel, there are several ways to accomplish the same thing. I'll go with MATCH.
Replace the formula from above (in F1) with:
=MATCH(MAX(G:G),G:G,0)
This will return the row number of the first exact match of the maximum value of Column G.
As for the third part of your question: returning the code (like XY12345) for the row where the value exists will be a little tricky, since your data is not organized in a logical tabular style (tabular meaning "like a table").
Your data is organized for humans to look at, not for machines to easily read and manipulate it. (See: Office Support: Guidelines for organizing and formatting data on a worksheet)
Basically, when organizing data in rows, all relevant information should be on the same row (not an arbitrary number of rows above). Also, you have the code combined with other information in the same cell.
My suggestion for a quick fix:
Right-click the heading of Column C and choose Insert to insert a blank column.
In C2 enter formula: =IF(B2="",C1,LEFT(B2,7))
Copy cell C2
Select cells in column C all the way to the "end" of your data, wherever that is (not the end of the worksheet). For example, you might select cells C2:C1000.
Paste the copied cell into all those cells.
Now, you can again modify the formula in F1:
=INDEX(C:C,MATCH(MAX(G:G),G:G,0))
This will return the value from Column C in the same row that the maximum value of Column G is located.
This is known as an INDEX/MATCH formula.
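For anyone who finds the steps easier to follow as code, here is a small Python sketch of the same idea. The rows are invented, and I'm assuming the portfolio code is the first 7 characters of the text in column B, as in the helper formula. The last part goes one step beyond the formulas above and collects the largest value per code, which is what the question was ultimately after.

```python
# Hypothetical rows: (column B text, column G value). Blank cells are "" or None.
rows = [
    ("XY12345 - Alpha", None),
    ("", 10.0),
    ("", 42.5),
    ("AB67890 - Beta", None),
    ("", 7.0),
    ("", None),   # blank cell inside a block
    ("", 99.0),
]

# Helper column C: =IF(B2="",C1,LEFT(B2,7)) filled down.
codes = []
previous = ""
for text, _ in rows:
    previous = previous if text == "" else text[:7]
    codes.append(previous)

# =MATCH(MAX(G:G),G:G,0): position of the first occurrence of the largest value.
values = [value for _, value in rows]
largest = max(value for value in values if value is not None)
position = values.index(largest)

# =INDEX(C:C,MATCH(...)): the code sitting on that same row.
code_of_largest = codes[position]

# Beyond the formulas above: the largest value for each code.
max_per_code = {}
for code, value in zip(codes, values):
    if value is not None:
        max_per_code[code] = max(value, max_per_code.get(code, value))

print(code_of_largest, max_per_code)
```

Once the code is repeated on every row, a per-portfolio maximum is just a grouped lookup, which is why fixing the layout first pays off.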
Hopefully this works for you in the interim until you can organize your data more logically. There's lots of related information and tutorials online.
So, I've searched for an answer to this, but I can't find anything. Hopefully some Excel guru out there has an easy answer.
CONTEXT
I have a sheet that has two columns; a list of airport codes (Col A) and a list of fuel gallons (Col B). Column A has a bunch of duplicate entries, column B is always different. It's basically a giant list of fill-up events for aircraft over time at different airports. The airports can be the same, because it's one row per fill-up event.
PROBLEM
What I want to do is have a formula that takes the entire data set, finds all identical entries in Col A, sums the Col B values for the matches, and spits out the result on a separate sheet with one entry for every set/match.
OTHER STUFF
I do not have a reference list for Column A and I would rather not create one since there are thousands of entries. I would like to just write a formula that does all this at once, using the data itself as the reference.
All the answers I find are "create a reference list on a separate sheet", and it's driving me crazy!
Thanks in advance for any help!
-rt
It sounds like you need a formula version of Remove Duplicates for column A, and a simple SUMIF for column B.
Column A
=IFERROR(INDEX(Data!A$1:A$1000,SMALL(IF(
MATCH(Data!A$1:A$1000,Data!A$1:A$1000,0)=ROW(Data!A$1:A$1000),ROW(Data!A$1:A$1000)),ROW())),"")
This is an array formula, so press Ctrl + Shift + Enter to complete it. Afterwards, Excel shows curly brackets {} around the formula.
Column B
=SUMIF(Data!A$1:A$1000,A2,Data!B$1:B$1000)
Just change the range for your data.
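As a cross-check of what the two columns should produce, here is a minimal Python sketch of "unique list plus SUMIF". The airport codes and gallon figures are made up; the point is only that each airport appears once, in first-seen order, with its gallons summed.

```python
# Hypothetical fill-up events: (airport code, gallons), i.e. columns A and B.
events = [
    ("JFK", 1200.0),
    ("LAX", 800.0),
    ("JFK", 300.5),
    ("ORD", 950.0),
    ("LAX", 100.0),
]

# A dict keeps insertion order, so airports come out in first-seen order,
# like the formula-based unique list in column A.
totals = {}
for airport, gallons in events:
    # Equivalent of SUMIF over all rows with this airport code.
    totals[airport] = totals.get(airport, 0.0) + gallons

for airport, gallons in totals.items():
    print(airport, gallons)
```

This is a single pass over the data, which is a useful contrast with the array formula's repeated MATCH over the whole range.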
Reminder: The formula in column A should start in row 1, or you have to add an offset constant to adjust.
MATCH() returns the position of the key within the given array, not its worksheet row number. To compare that position with ROW(), you have to add a constant when the array does not start at row 1. For data in B3:B1000, the adjusted formula is below.
=IFERROR(INDEX('Event Data'!B$3:B$1000,SMALL(IF(
MATCH('Event Data'!B$3:B$1000,
'Event Data'!B$3:B$1000,0)+2=ROW('Event Data'!B$3:B$1000),
ROW('Event Data'!B$3:B$1000)),ROW())-2),"")
Furthermore, the range should be exactly the same as the data range. If you need it to be larger than the data range for future expansion, an inner IFERROR() should be added to the formula.
=IFERROR(INDEX('Event Data'!B$3:B$1000,SMALL(IFERROR(IF(MATCH(
'Event Data'!B$3:B$1000,'Event Data'!B$3:B$1000,0)+2
=ROW('Event Data'!B$3:B$1000),
ROW('Event Data'!B$3:B$1000)),FALSE),ROW())-2),"")
Lastly, I strongly recommend using the Remove Duplicates feature built into Excel, since the array formula has roughly O(n^2) time complexity and heavy memory usage. In addition, if the calculation option in Excel is set to automatic, all formulas are recalculated every time you enter data, even in other cells, which drags down performance.
I apologize if the title is misleading, but I have an issue where I need to generate a sequential number in a third column based on comparing data from two different columns.
My data looks like this:
Before
The entry with the 1 is the first point, I need to use the value in the 'Back' column to find the same value in the 'Front' Column, then add +1 to the point, so the result looks like:
After
Because of the naming conventions used, sorting either column by value will not work.
Appreciate the help!
Assuming you have the initial 1, and your number column is C, front is D, back is E, this would start at row 2:
=INDEX(C:C,MATCH(INDEX(D:D,MATCH(D2,E:E,0),1),D:D,0),1)+1
Image: http://i.imgur.com/0XfdLrk.png
Did you establish whether your data has duplicates or incomplete sequences?
Here's another formula which should achieve what you want and also doesn't rely on you knowing where the sequence starts. Every sequence will start with 1.
This formula follows your image layout, putting values into column A with data in columns B and C. Please replace the ranges in the formula for columns A and C to cover all of your data. (Ideally, you would do this by inserting a table first and then selecting the data rows, which will cause Excel to put in the table column name instead.)
This is the formula to go into cell A2, assuming you have data in B2:C7
=IF(ISERROR(MATCH(B2,$C$2:$C$7,0)),1,INDEX($A$2:$A$7,MATCH(B2,$C$2:$C$7,0))+1)
Put this formula in D2 and fill down to identify which rows are the ends of sequences:
=ISERROR(MATCH(C2,$B$2:$B$7,0))
Put this formula in E2 and fill down to identify duplicates in the Front column:
=COUNTIF(B$2:B$7,B2)
You can then fill it right one column to also identify duplicates in Back.
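To show what the numbering formula is doing, here is a short Python sketch of the same chain-following rule. The Front/Back labels are invented, and it assumes each Back value appears at most once and that the chain has no loops (the same conditions the duplicate checks above are meant to verify).

```python
# Hypothetical (Front, Back) pairs, deliberately out of order.
segments = [
    ("P3", "P4"),
    ("P1", "P2"),
    ("P2", "P3"),
]

# Map each Back value to its row: the MATCH(B2,$C$2:$C$7,0) lookup.
row_by_back = {back: i for i, (_, back) in enumerate(segments)}


def sequence_number(i):
    """Return 1 if this row's Front is nobody's Back, else predecessor + 1."""
    front = segments[i][0]
    if front not in row_by_back:
        return 1  # the ISERROR branch: start of a sequence
    return sequence_number(row_by_back[front]) + 1


numbers = [sequence_number(i) for i in range(len(segments))]
print(numbers)
```

Excel resolves the same dependency chain on its own when it recalculates column A; the recursion here just makes that order explicit.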
I'm trying to use Excel to extract figures based on multiple criteria and their location within columns.
For example, if I wanted to do a SUMIF to retrieve the figure associated with the First class, the formula would retrieve the figure in a specified row.
But if I wanted to retrieve the figure associated with England, the formula would need multiple criteria: look for the First class, then look for the country England, and retrieve the figure on its row in a specified column.
These columns will grow and shrink each month. Meaning I need it to be somewhat dynamic.
I've tried to do this using SUMIF and SUMIFS with no luck.
=SUMIFS(D2:D10,A2:A10,"First",B2:B10,"England")
The challenge you have is that in columns A, B and C, the values are not repeated downwards into the now blank cells. So values do not appear next to each other in the same row.
Assuming that the example you gave is quite simple, and you could also have multiple International Products for a given Class and Country, I would go for the following solution:
Reserve two columns (E and F) for intermediate calculations. If they are currently used, move those used columns to the right, making room for an empty E and F column. You could of course also choose two other columns for this purpose. But I will assume they are E and F.
Then in E2 put this formula and copy it further down the E column as far as needed.
=IF(A2<>"", A2, OFFSET(E2,-1,0))
In F2 put this formula and copy it down as well:
=IF(B2<>"", B2, IF(A2<>"", "", OFFSET(F2,-1,0)))
This should give the following display (the header titles in E1 and F1 are cosmetic only):
Now you can do formulas on those columns in combination with the C column. For instance:
=SUMIFS(D2:D10, E2:E10,"First", F2:F10,"England", C2:C10,"")
And this would output 2. Note that if you really only want to match one row, you should specify a condition for each column (E, F and C).
The intermediate formulas in the E and F columns are quite resistant to deletion of rows, due to the use of OFFSET. If you insert rows, you should of course make sure the formulas in E and F are copied into it.
If you will ever use more than 3 columns for the source data, you'll need to also add more intermediate columns with similar formulas. Also your SUMIFS would need extra conditions then.
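Here is a rough Python sketch of the helper-column idea, in case it helps to see the fill-down rules spelled out. The rows and figures are invented (columns A to D as class, country, product, figure), chosen so that the final sum matches the 2 mentioned above.

```python
# Hypothetical rows: (A class, B country, C product, D figure). Blanks are "".
rows = [
    ("First", "", "", 10),      # class row
    ("", "England", "", 2),     # country row
    ("", "", "Widget", 2),      # product row
    ("", "France", "", 8),
    ("", "", "Gadget", 8),
    ("Second", "", "", 5),
    ("", "England", "", 5),
]

filled = []
cls, country = "", ""
for a, b, c, d in rows:
    if a != "":
        # New class: E takes A, and F restarts from B (which may be blank).
        cls, country = a, b
    elif b != "":
        # New country under the current class: F takes B.
        country = b
    # Otherwise both helpers carry down the value from the row above.
    filled.append((cls, country, c, d))

# =SUMIFS(D2:D10, E2:E10,"First", F2:F10,"England", C2:C10,"")
total = sum(
    d for e, f, c, d in filled
    if e == "First" and f == "England" and c == ""
)
print(total)
```

Note how the country is reset whenever a new class starts; without that reset, "England" from the First class would leak into the Second class's own row.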
You could use the following SUMPRODUCT() For Class and Country:
=SUMPRODUCT(($A$2:$A$10=$F$1)*($B$3:$B$11=$G$1)*($D$3:$D$11))
Then for all three:
=SUMPRODUCT(($A$2:$A$10=$F$1)*($B$3:$B$11=$G$1)*($C$4:$C$12=H1)*($D$4:$D$12))
A picture for reference.
The idea is that each column must move down one row in its reference. And the Sum column must start on the same row as the last column being referenced.
I am working on a project within an Excel database and am trying to match 4 different properties, which all have their own columns (A, B, C, D), to find a corresponding value on a different sheet (Sheet2). On Sheet2 the values are once again found in their own columns (B, C, D, E), and if all of the values match I then want the value in column A of Sheet2 to be displayed in column E on Sheet1.
The problem is that the values on Sheet1 will often match as many as 12 different unique rows on Sheet2, making this incredibly difficult with only intermediate experience in VBA. There can be duplicates that match all of the criteria. When this happens, I would like to return the first item that matches, as long as a previous match was not made on that item.
To give you more information, we have given products different values that designate where they belong based on their velocity. This has split them up into Section#, ShelvingType, Vertical, and Horizontal Location. We are looking to match these values to the values of our previously existing locations that have corresponding (matching) numbers or text values.
To go into even more detail, on sheet one we have the products with values for where they should go. On sheet two we have pre-existing locations where products can go, with values that are representative of each location. So, we want to take the products' NEW location values from sheet one and match them to the existing location values on sheet two. The problem is that for every location there are up to 12 products that could go there. So, we want to go in order, saying that product1 goes in the first location with matched values, product2 goes in the next location with matched values, and so on and so forth.
Edited to remove previous responses
Based on your further elaboration, if I understand correctly, I agree with the comment left by @Aaron Contreras. You should create helper columns which show a 'unique ID' where all criteria match, as well as an additional helper column which increases as more items with the same criteria code are found. This will become the 'ultra-unique' ID for that item.
At this point I don't think array formulas will be possible, though I will leave in the answer which provides the result of the first matching criteria without further eliminating 'previously used' results. This could likely be further refined, but I doubt it would be more elegant than simply using the helper columns shown in my response below. At least, I can't figure out how to do it elegantly.
To summarize my assumptions:
- Your available space is in sheet1; column A contains something like the location of that available space, and columns B-E contain criteria for anything which will be stored there.
- Your new list of items to be placed in a location is in sheet2; column A will be where our formula goes, showing the available location to put that item.
Enter on Sheet1
In column F on sheet1, drag down this formula:
=B1&C1&D1&E1
This will create a unique ID key to be searched in the future.
However, as there will be multiple hits for the same criteria on sheet1 (because multiple locations can hold the same thing), we need to make each row 'more unique' by showing how many times that criteria combination has already occurred. This formula will thus go in column G on sheet1, starting in cell G1 and dragged down:
=F1&countif($F$1:F1,F1)
As you drag it down, this will count the nth time that the specific combination of criteria has appeared on sheet1.
Enter on Sheet2
Create the same columns in sheet2, in columns F & G. The formulas will be exactly the same, they will just refer to sheet2 instead of sheet1.
Then the formula in column A in sheet 2, dragged down from A1, would be:
=index(sheet1!A:A,match(G1,sheet1!G:G,0))
This will find the first time that all criteria match from sheet1, for the nth time that this criteria has been used on sheet 2.
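If it helps, here is a minimal Python sketch of the helper-column approach. The location names and criteria values are hypothetical; the helper function mirrors =B1&C1&D1&E1 followed by =F1&COUNTIF($F$1:F1,F1), and the dictionary lookup stands in for INDEX/MATCH.

```python
from collections import defaultdict


def occurrence_keys(criteria_rows):
    """Build 'criteria + running count' keys, like =F1&COUNTIF($F$1:F1,F1)."""
    seen = defaultdict(int)
    keys = []
    for criteria in criteria_rows:
        base = "".join(str(part) for part in criteria)  # =B1&C1&D1&E1
        seen[base] += 1
        keys.append(f"{base}{seen[base]}")
    return keys


# Hypothetical sheet1: (location in column A, criteria in columns B-E).
locations = [
    ("LOC-01", (1, "Shelf", "Top", "Left")),
    ("LOC-02", (1, "Shelf", "Top", "Left")),
    ("LOC-03", (2, "Bin", "Low", "Right")),
]

# Hypothetical sheet2: criteria for each item to be placed.
items = [
    (1, "Shelf", "Top", "Left"),
    (2, "Bin", "Low", "Right"),
    (1, "Shelf", "Top", "Left"),
    (1, "Shelf", "Top", "Left"),  # third such item, but only two locations
]

location_keys = occurrence_keys(criteria for _, criteria in locations)
location_by_key = {}
for key, (name, _) in zip(location_keys, locations):
    location_by_key.setdefault(key, name)  # MATCH(...,0) returns the first hit

# None plays the role of #N/A when no unused location is left.
assigned = [location_by_key.get(key) for key in occurrence_keys(items)]
print(assigned)
```

One caveat that applies to the formulas as well: joining values with no separator can make different combinations collide (1 and 12 versus 11 and 2), so adding a delimiter such as "|" between the parts is a safer variation.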
Let me know if there is anything here I've missed.
Unfinished array method
Again, array responses are possible, but for your purposes likely unnecessary; you should probably have a unique ID for all combinations anyway. However, in case you want to use the array method, you can do so like this (it does not account for multiple locations being used; left for reference only if you want to take this up):
In sheet2, enter the following formula [confirmed with CTRL + SHIFT + ENTER instead of just ENTER, every time the formula is changed] on the row 1, with the different criteria (and copied down):
=index(Sheet1!$A$1:$A$100,match(1,(Sheet1!$B$1:$B$100=B1)*(Sheet1!$C$1:$C$100=C1)*(Sheet1!$D$1:$D$100=D1)*(Sheet1!$E$1:$E$100=E1),0))
This uses the fact that multiplying booleans gives "TRUE*TRUE = 1; TRUE*FALSE = 0; FALSE*FALSE = 0", so the product is 1 only on rows where all criteria match, and MATCH(1,...,0) finds the first such row. The Sheet1 ranges are absolute so they do not shift when the formula is copied down. Note that I have not made this go all the way down all columns, as with array formulas this is a significant resource hog.
Assuming that your data starts from the 2nd row (1st row for labels):
{=MATCH(A1&B1&C1&D1,B2:B100&C2:C100&D2:D100&E2:E100,0)}
The above is an array formula, so do not type the curly brackets {} yourself.
Simply press Ctrl + Shift + Enter after typing the formula