long time reader, first time pos(t)er of questions.
I have an Excel 2013 worksheet of about 4,000 unique records (rows) of data. We'll call this the data dump. I've filtered the data dump using any one of about six different data elements (columns). After each filter I save the results to a new worksheet. I clear the filter to start over, and ultimately wound up with about six different worksheets.
I need to be able to account for each unique record in the data dump--each one should (in theory) appear on at least one of the filtered worksheets, and I need to identify any that don't.
My big problem is that the only way to uniquely identify each record is by concatenating a text string out of five consecutive cells in each row. I cannot add a column of concatenated text to these worksheets (for which reasons I'll presently spare you), so essentially I'm trying to build a formula that says the following:
For a given, unique, concatenated string of text of five consecutive cells from one record on this data dump worksheet, identify any exact matching strings from any of the other worksheets and return TRUE if found or FALSE if not.
I will, of course, have to apply this formula to every record in the data dump.
Thoughts or tips? Ultimately I think it comes down to a lot of small moving parts that I could manage individually, but that I'm not confident I could manage collectively.
Any help is appreciated and I'll be happy to clarify where needed. And forgiveness if a similar question has been asked previously--I searched pretty fruitlessly for an answer all afternoon.
Thank you!
You could use Index to create a concatenated range that serves as the lookup range for Match(). The lookup term can be concatenated inside Match() as well. Match() then returns a number if a match is found, or an error if not. Wrap error trapping around this to get the TRUE/FALSE result. Along the lines of
=iferror(match(sheet1!A1&sheet1!B1&sheet1!C1&sheet1!d1&sheet1!e1,index(Sheet2!$a$1:$a$1000&Sheet2!$b$1:$b$1000&Sheet2!$c$1:$c$1000&Sheet2!$d$1:$d$1000&Sheet2!$e$1:$e$1000,0),0),FALSE)
Note that any match will return a number (which will evaluate to a boolean TRUE in summarising formulas) and a non-match will return a FALSE.
This will get you the row number of the match for the first row of original data on sheet1, where the first extract lives on Sheet2 in the first 1000 rows. Use the same principle for the other four sheets and wrap the five formulas into an OR() statement to arrive at a final TRUE or FALSE.
Note that the Index ranges should not encompass whole columns, but only the rows with data. Otherwise the formula will be very slow to recalculate, especially if you use it 4000 times.
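For anyone who wants to check the logic outside Excel, here is a rough Python sketch of the same concatenate-and-match idea. It is only an illustration: the function names and the sample rows are made up, and the workbook does not need any of it.

```python
# Sketch of the concatenate-and-match check. All names and data are hypothetical.

def make_key(row, width=5):
    """Join the first `width` cells into one string, as A1&B1&C1&D1&E1 does."""
    return "".join(str(cell) for cell in row[:width])

def found_anywhere(record, filtered_sheets):
    """Return True if the record's key appears on at least one filtered sheet."""
    key = make_key(record)
    return any(
        key in {make_key(row) for row in sheet}
        for sheet in filtered_sheets
    )

# Hypothetical data dump: the first five cells identify a record.
data_dump = [
    ["a", "b", "c", "d", "e", 10],
    ["a", "b", "c", "d", "f", 20],
    ["x", "y", "z", "1", "2", 30],
]
sheet2 = [["a", "b", "c", "d", "e", 10]]
sheet3 = [["x", "y", "z", "1", "2", 30]]

flags = [found_anywhere(rec, [sheet2, sheet3]) for rec in data_dump]
print(flags)  # [True, False, True]
```

One thing the sketch makes easy to see: plain concatenation treats "ab" & "c" and "a" & "bc" as the same key. If that could happen in your data, put a separator such as "|" between the cells on both sides of the Match().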
Here is one way. If you have your datadump records from A1 downwards.
And assuming you can have your filter sheets similarly. Then adjust your filter ranges so that the formula calls the fixed ranges properly.
You might be able to name them...
This formula needs CSE to work.
Edit by teylyn: This formula is an array formula and needs to be confirmed with Ctrl-Shift-Enter. It will not work if you only hit the Enter key after editing the formula. Control-Shift-Enter is sometimes referred to as CSE. People also call it "array-entering" a formula. Excel will put curly braces around the formula, which you can see in the formula bar when the cell is selected.
=OR(
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet1!$A$1:$A$200&FilterSheet1!$B$1:$B$200&FilterSheet1!$C$1:$C$200&FilterSheet1!$D$1:$D$200&FilterSheet1!$E$1:$E$200, 0), FALSE),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet2!$A$1:$A$200&FilterSheet2!$B$1:$B$200&FilterSheet2!$C$1:$C$200&FilterSheet2!$D$1:$D$200&FilterSheet2!$E$1:$E$200, 0), FALSE),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet3!$A$1:$A$200&FilterSheet3!$B$1:$B$200&FilterSheet3!$C$1:$C$200&FilterSheet3!$D$1:$D$200&FilterSheet3!$E$1:$E$200, 0), FALSE),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet4!$A$1:$A$200&FilterSheet4!$B$1:$B$200&FilterSheet4!$C$1:$C$200&FilterSheet4!$D$1:$D$200&FilterSheet4!$E$1:$E$200, 0), FALSE),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet5!$A$1:$A$200&FilterSheet5!$B$1:$B$200&FilterSheet5!$C$1:$C$200&FilterSheet5!$D$1:$D$200&FilterSheet5!$E$1:$E$200, 0), FALSE))
I have put in hard returns so you can see more clearly what is going on. Join the lines back into a single formula when you enter it.
EDIT for new requirement: Ctrl+Shift+Enter required again
=CONCATENATE(
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet1!$A$1:$A$200&FilterSheet1!$B$1:$B$200&FilterSheet1!$C$1:$C$200&FilterSheet1!$D$1:$D$200&FilterSheet1!$E$1:$E$200, 0) & " - FilterSheet1", ""),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet2!$A$1:$A$200&FilterSheet2!$B$1:$B$200&FilterSheet2!$C$1:$C$200&FilterSheet2!$D$1:$D$200&FilterSheet2!$E$1:$E$200, 0) & " - FilterSheet2", ""),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet3!$A$1:$A$200&FilterSheet3!$B$1:$B$200&FilterSheet3!$C$1:$C$200&FilterSheet3!$D$1:$D$200&FilterSheet3!$E$1:$E$200, 0) & " - FilterSheet3", ""),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet4!$A$1:$A$200&FilterSheet4!$B$1:$B$200&FilterSheet4!$C$1:$C$200&FilterSheet4!$D$1:$D$200&FilterSheet4!$E$1:$E$200, 0) & " - FilterSheet4", ""),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet5!$A$1:$A$200&FilterSheet5!$B$1:$B$200&FilterSheet5!$C$1:$C$200&FilterSheet5!$D$1:$D$200&FilterSheet5!$E$1:$E$200, 0) & " - FilterSheet5", ""))
My edit for the new requirement just takes the matches found, as #Messy Jesse said, and appends the sheet name to each one. If no match is found in a sheet, a zero-length string (ZLS) is used for that sheet instead. The pieces are then concatenated into one string.
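To show what that second formula produces, here is a small Python sketch of the same idea. The function name, sheet names and rows are placeholders I made up for the example.

```python
# Sketch of the "row number - sheet name" report. Names and data are hypothetical.

def match_report(record, sheets, width=5):
    """Build the same string the CONCATENATE formula builds for one record.

    `sheets` maps a sheet name to its list of rows. For every sheet that
    contains the record's key, append "<position> - <sheet name>".
    """
    key = "".join(str(c) for c in record[:width])
    parts = []
    for name, rows in sheets.items():
        keys = ["".join(str(c) for c in row[:width]) for row in rows]
        if key in keys:
            # MATCH is 1-based, so add 1 to Python's 0-based index.
            parts.append(f"{keys.index(key) + 1} - {name}")
    # CONCATENATE joins the pieces with nothing in between.
    return "".join(parts)

sheets = {
    "FilterSheet1": [["a", "b", "c", "d", "e"]],
    "FilterSheet2": [["q", "q", "q", "q", "q"], ["a", "b", "c", "d", "e"]],
}
print(match_report(["a", "b", "c", "d", "e"], sheets))
# 1 - FilterSheet12 - FilterSheet2
```

As the output shows, hits on more than one sheet run together because nothing separates the pieces. If that bothers you, append a separator such as "; " after each sheet name in the formula.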
This is the problem I am facing with an Excel formula.
In column F, I want to find the text common to A2 through E2 (some of which may be blank).
My Question:
Is there a simple way to get the result without VB?
Any help is appreciated, thanks.
I found that Google Sheets has some really cool functions.
If you put the formula =SPLIT(A1, ",", TRUE,FALSE) in the cell after your row of common text (or probably even in a different sheet; "probably" because I haven't tried it, though it should work), the pieces of A1 will fill that cell and the cells to its right, one cell per comma-separated piece, because "," is the delimiter.
Then you can put the formula =IF(SUM(ARRAYFORMULA(if(REGEXMATCH($A$1:$D$1,F1),1,0)))=COUNTA($A$1:$D$1),F1,"") into an equal number of cells after that (it is probably simplest to fill the maximum number you expect), and =CONCATENATE(I1:L1) into the last cell.
OK. To tweak this for yourself: ARRAYFORMULA lets you pass an array where the function inside would normally take a single cell. I have read that it works like a for loop, but I can't really vouch for that. Here it lets REGEXMATCH (a Boolean check of whether the cell you give it matches the given regular expression) check each cell in the array.
The SUM adds up the matches, and the IF compares that total against COUNTA to find out whether the number of cells containing this string equals the number of non-empty cells.
The CONCATENATE at the end joins all the cells containing the REGEXMATCH formula, and since the only non-empty cells will be the ones holding a common string, that is what this cell will return (no spaces).
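If it helps to see the whole chain in one place, here is a rough Python sketch of the split / check-every-cell / concatenate steps. It uses a plain substring test as a stand-in for REGEXMATCH, and the function name and sample cells are made up.

```python
# Sketch of the SPLIT + REGEXMATCH + COUNTA + CONCATENATE chain.
# The function name and the sample cells are hypothetical.

def common_text(cells, delimiter=","):
    """Return the pieces of the first cell that occur in every non-empty cell."""
    non_empty = [c for c in cells if c != ""]      # what COUNTA counts
    pieces = [p for p in cells[0].split(delimiter) if p] if cells else []
    kept = []
    for piece in pieces:
        # Count the cells containing this piece (substring test, not a regex).
        hits = sum(1 for c in non_empty if piece in c)
        if hits == len(non_empty):
            kept.append(piece)
    # Like CONCATENATE: join with nothing in between.
    return "".join(kept)

print(common_text(["red,blue", "blue,green", "", "blue"]))  # blue
```

Because this is a substring test, a short piece such as "a" would also count as present in "banana"; the REGEXMATCH version behaves the same way unless you anchor the pattern.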
If you specifically need this in Excel... this won't help.
We can use Power Query to achieve the desired result.
Unpivot the columns in Power Query.
Split all the columns by the comma delimiter.
Create a custom column to see if the first column's records exist in the remaining columns.
Use the function Text.Contains.
Sample function: =Text.Contains([column.1],[column.1]&[column.2]&[column.3])
If the above function returns TRUE, take the first column's value (this is the expected result) and load the data back into Excel.
I have multiple rows of students on a spreadsheet and their grades across the top as column headers. Not all students do every subject I have listed, so some columns will be blank. I am trying to use these grades as a range and have the populated cells appear next to each other at the end of the spreadsheet. I will then need the subject header to come along with each non-blank cell's data.
See the screenshot below to understand what data needs to go where:
I found a kind of answer on Stack, but it's totally the wrong way round for how I have to work (see the image below). So, please help me flip this around so names would go down column A and subjects along the first row. (This is in a Google Sheet.)
Lee
Here is a mock up of what I am trying to achieve, all data will likely be on one student row, but I have organised like this for the screenshot.
Based on your annotated screenshot, here's how to get from the top table to the bottom table.
Put your Subject 1, etc. headings in manually.
Put ={G13:G16} into G20 to copy your student names.
In H20 use =INDEX(FILTER($H$12:$O$12, $H13:$O13 <> ""), 1, 1) to grab the first heading in $H$12:$O$12 that is over a non-blank cell in $H13:$O13.
In the above formula, FILTER() grabs all the headings over non-blank cells in the range, while INDEX() is used to grab the first result.
Repeat the formula for H21:H23 (letting Sheets update the references that aren't fixed with a $ prefix).
In I20 use =INDEX(FILTER($H13:$O13, $H13:$O13 <> ""), 1, 1) to grab the first non-blank value in $H13:$O13.
Repeat the formula for I21:I23.
Moving right, copy the formulas but increase the column argument of INDEX() by one for each new pair of columns, e.g. J20 would contain =INDEX(FILTER($H$12:$O$12, $H13:$O13 <> ""), 1, 2) and K20 would contain =INDEX(FILTER($H13:$O13, $H13:$O13 <> ""), 1, 2).
This should get you well on your way.
Happy spreadsheeting!
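For anyone who prefers to see the idea in code, here is a rough Python sketch of what the FILTER()/INDEX() pairs do for a single student row. The function name, subject names and grades are made up for the example.

```python
# Sketch of "keep only the non-blank cells, each with its heading".
# The function name and the sample row are hypothetical.

def compact_row(headers, values):
    """Return [heading, value, heading, value, ...] for the non-blank cells."""
    # FILTER(...) keeps the headings/values that sit over a non-blank cell.
    pairs = [(h, v) for h, v in zip(headers, values) if v != ""]
    flat = []
    for heading, value in pairs:
        # INDEX(..., 1, n) picks the n-th survivor; here we just emit them in order.
        flat.extend([heading, value])
    return flat

subjects = ["Maths", "Art", "Music", "French"]
grades = ["A", "", "C", ""]
print(compact_row(subjects, grades))  # ['Maths', 'A', 'Music', 'C']
```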
The problem I am trying to solve is extracting rows out of a spreadsheet. The spreadsheet has over 1200 entries. I have split them up into relevant worksheets so the information can be used. Each of the individual codes will be a worksheet.
A sample of the data looks like
The formula that I have found, and am trying to run from a separate worksheet, is
INDEX(Master!$A$2:$D$13, SMALL(IF((INDEX(Master!$A$2:$D$13,,4,1)="wap"),
MATCH(ROW(Master!$A$2:$D$13),ROW(Master!$A$2:$D$13)), “”), ROWS(A2:$A$2)), ,1)
It fails on
MATCH(ROW(Master!$A$2:$D$13),ROW(Master!$A$2:$D$13))
Getting the dreaded #N/A
I need some help in solving the problem and a brief explanation of the solution would be helpful.
Unfortunately, it has to be done by formula, as I don't have access rights for a VBA query.
Here you go:
=IFERROR(
INDEX(Master!$A$2:$D$13,
SMALL(
IF(
INDEX(Master!$A$2:$D$13,,4)="wap",
INDEX(Master!$A$2:$D$13,,1),
COUNTA(Master!$A$2:$A$13)+1
),
ROWS(A2:$A$2)
),
COLUMN()
)
,"")
This must be entered as an array formula. Instead of clicking out of the cell or pressing Enter to exit the formula, press Ctrl+Shift+Enter.
The IF function iterates through every cell in column 4 of the Master array. If the Code is a match, the ID is passed. Otherwise, the number of master rows plus one is passed. This is important because it will produce an error in the final INDEX function, which will be escaped to "" (a blank string).
The SMALL function returns the k-th smallest value in the generated array, where k is the current output row (counted by ROWS(A2:$A$2)). The final INDEX function gets the intersection of the current column and the chosen row. Just copy this across however many columns you have on each sheet and down the number of rows that should be returned. Any additional rows (or columns) will return the blank string. To be safe, I'd copy it down all 1200 rows, but this could slow your processing (if it does, just set the calc mode to manual).
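Here is a rough Python sketch of what the formula is doing, for anyone who wants to check the logic. The function name, column layout and sample rows are made up. Note one assumption in the formula itself: because IF passes the ID from column 1 and INDEX then uses it as a row number, it appears to rely on the IDs running 1, 2, 3, ... in the same order as the rows. The sketch filters the rows directly, so it does not need that.

```python
# Sketch of "return the k-th row whose code matches, else blanks".
# The function name, column layout and data are hypothetical.

def kth_match(master, code, k, code_col=3):
    """Return the k-th row (1-based) whose code column equals `code`.

    If there are fewer than k matches, return a row of empty strings,
    which is what the IFERROR(..., "") wrapper gives in the sheet.
    """
    width = len(master[0]) if master else 0
    matches = [row for row in master if row[code_col] == code]
    if 1 <= k <= len(matches):
        return list(matches[k - 1])
    return [""] * width

master = [
    [1, "Ann", "x", "wap"],
    [2, "Bob", "y", "zed"],
    [3, "Cy", "z", "wap"],
]
print(kth_match(master, "wap", 2))  # [3, 'Cy', 'z', 'wap']
print(kth_match(master, "wap", 3))  # ['', '', '', '']
```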
I have an Excel document with two sheets, data and edu-plan. The sheet data has the following information:
The sheet edu-plan looks like this:
My question is: how do I create an Excel formula that checks whether the target group on a given row in edu-plan! has the course name in question on the same row as the target group in sheet data!, i.e. whether Sales and Sales course are on the same row in the sheet data!?
In reality, the data sheet has a couple of hundred rows and will change over time, so I am trying to develop a formula that I can apply easily to all rows/columns in edu-plan!.
The desired result in edu-plan would look like this:
A pivot table might be a good way to go.
If you would like to do it by formula, then you can just use a COUNTIFS
=IF(COUNTIFS(data!$A$2:$A$10,$A2,data!$B$2:$B$10,B$1),"X","")
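To make the COUNTIFS idea concrete, here is a small Python sketch of the same check: a cell gets an "X" when its (target group, course) pair exists as a row in the data sheet. The function name and sample data are made up.

```python
# Sketch of the COUNTIFS cross-check. Names and data are hypothetical.

def build_plan(data, groups, courses):
    """Return {group: ["X" or "" per course]} for the edu-plan grid."""
    pairs = set(data)  # every (group, course) row from the data sheet
    return {
        group: ["X" if (group, course) in pairs else "" for course in courses]
        for group in groups
    }

data = [("Sales", "Sales course"), ("Sales", "Safety"), ("IT", "Safety")]
plan = build_plan(data, ["Sales", "IT"], ["Sales course", "Safety"])
print(plan)  # {'Sales': ['X', 'X'], 'IT': ['', 'X']}
```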
A possible way to solve your issue with an array formula:
Write in B2 of sheet edu-plan
{=IFERROR(IF(MATCH('edu-plan'!$A2&'edu-plan'!B$1,data!$A$2:$A$6&data!$B$2:$B$6,0)>0,"x",""),"")}
Since it is an array formula, you need to confirm it with Ctrl+Shift+Enter. Excel adds the curly braces itself; do not type them.
Here is the formula broken down:
MATCH('edu-plan'!$A2&'edu-plan'!B$1,data!$A$2:$A$6&data!$B$2:$B$6,0)
checks whether the combination of row header and column header is in the data table. MATCH returns the index of the found combination. Since we are not interested in the location, we only ask IF(MATCH > 0, "x", "") to write an "x" if a match was found. If MATCH finds nothing, it returns an error, which is why we add an IFERROR(VALUE, "") around the construct.
For the sake of MWE, I have an array in $AP$4:$BO$20 with a single string in each cell. The data in each cell is an alphanumeric code, such as 1,1a,2b,3c, etc.
Row 22, starting in column AQ, contains a single string that matches one or more of the strings in the array named above. Goal: using each string in AQ22:AO22, create a formula that extracts EVERY row number of the cells in the array $AP$4:$BO$20 that contain exactly the value in AQ22:AO22.
Bonus for doing it without using an array formula. VBA is not an option since this is Google Sheets, and I'd really prefer to avoid g-apps-script.
I've attempted using
=INDIRECT(ADDRESS(MIN(IF(NOT(ISERROR(FIND(AQ22,$AP$4:$BO$20,1))),ROW($AP$4:$BO$20),"")),1))
and
=IFERROR(INDEX($AP$4:$BO$20,SMALL(IF($AP$4:$BO$20=AQ22,ROW($AP$4:$BO$20)-4),ROW(A1)),2),"")
and even the illustrious
=IF(ISERROR(INDEX($AP$4:$BO$20,SMALL(IF($AP$4:$BO$20=AQ22,ROW($AP$4:$BO$20)),ROW(1:1)),2)),"",INDEX($AP$4:$BO$20,SMALL(IF($AP$4:$BO$20=AQ22,ROW($AP$4:$BO$20)),ROW(1:1)),2))
Here is a toy sheet to test out ideas with using this information. Note the comment on the cell where the formula will begin.
Not sure if I understand the desired result correctly, but
= IfError( Filter( Row($AL$4:$AL$16), RegExMatch( $AL$4:$AL$16, "\b" & AQ22 & "\b" ) ), "")
results in 7, and 9 in a separate cell below it. \b is a word boundary that matches between alphanumeric and non-alphanumeric character. If you want the result in one cell, you can join them:
=IfError(Join(",", Filter(Row($AL$4:$AL$16), RegExMatch($AL$4:$AL$16, "\b"&AQ22&"\b"))), "")
You can also match multiple values:
=IfError(Filter(Row($AL$4:$AL$16), RegExMatch($AL$4:$AL$16, "\b(" & Join("|", AQ22:AZ22) & ")\b")), "")
In Excel you can do it without a CSE entered formula, but I don't know if the AGGREGATE function is available in Sheets:
=IFERROR(AGGREGATE(15,6,1/((AQ$22=arr)*(LEN(arr)>0))*ROW(arr),ROWS($1:1)),"")
For an array entered formula:
=IFERROR(SMALL(IF((AQ$22=arr)*(LEN(arr)>0),ROW(arr),""),ROWS($1:1)),"")
For either one of those formulas, enter in AQ24, then fill down until you get blanks, and across. When you fill down, none of the "target rows" can be hidden (or else the result of the formula will be hidden).
arr refers to $AP$4:$BO$20
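For reference, this is roughly what both formulas compute, written as a Python sketch. The function name and the small grid are made up; `first_row` stands for the sheet row where the array starts (4 for $AP$4:$BO$20).

```python
# Sketch of "list every row number whose cell equals the target exactly".
# The function name and the sample grid are hypothetical.

def matching_rows(grid, target, first_row=4):
    """Return sheet row numbers, smallest first, one per matching cell."""
    hits = []
    for offset, row in enumerate(grid):
        for cell in row:
            # (AQ$22=arr)*(LEN(arr)>0): exact match on a non-empty cell.
            if cell != "" and cell == target:
                hits.append(first_row + offset)
    return hits  # rows are scanned top to bottom, so this is already ascending

grid = [
    ["1", "1a"],
    ["2b", ""],
    ["1a", "3c"],
    ["1a", "1a"],
]
print(matching_rows(grid, "1a"))  # [4, 6, 7, 7]
```

As the last row of the sample shows, a row number comes back once per matching cell, so a row containing the value twice is listed twice. The formulas behave the same way.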
Though this is a very specific application in response to this question, for the sake of the knowledge base, I'd like to show how I dealt with an instance of multiple match values. There is likely a much better way, but here is one way.
To give this context, imagine that LIST_CELL is a list of question numbers on a test (the question numbers are entered as a header row; call that range QUESTIONS) that correspond to certain standards, and the goal is to average, for each student, only the questions that correspond to the standard next to which the list is written. Using
=iferror(join(",",ArrayFormula(match(split(LIST_CELL,","),QUESTIONS,FALSE))),"")
The split function splits the hand-entered list of questions on commas, the match function returns the column number of each question in QUESTIONS, and the join function joins the results back together. ArrayFormula allows the match to be performed on the whole array instead of just the first value.
Another single row heading lists the standards to which each question has been matched (possibly to more than one standard) by the comma separated list in LIST_CELL. For a column list of students in A:A, each standard needs to average the scores of every question that is listed next to the standard. This is accomplished by the nifty (if clunky):
average(ArrayFormula(hlookup(split(vlookup(LOOKUP_VAL,SEARCH_RANGE,COL_W_LIST),","),DATA_SOURCE,row(CURRENT_CELL))))
Breakdown from center outward:
LOOKUP_VAL is the value being looked up (the one that has multiple matches); in the example context, it's the standard.
SEARCH_RANGE is a range of cells containing both the list of lookup values (the standards, in context) and the comma-separated lists of column numbers generated by the first function. COL_W_LIST is the column number in the array SEARCH_RANGE that contains the list of column numbers matched from LIST_CELL.
Split takes the elements apart and places them in a temporary array so that hlookup can be performed on each element. Via ArrayFormula, the hlookup grabs each value on the same row in the appropriate QUESTIONS column; in context, it grabs the point scores for each question matched to the standard.
Finally, average is self-explanatory, and apparently does take an array as input.
These two functions in combination allow the use of indirect cell references in an array formula, and solve the much-asked "how do I include multiple matches in a calculation" question. At least in this specific context.
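For readers who find the nesting hard to follow, here is a rough Python sketch of the two steps together for one student. The function name, the question labels and the scores are all made up.

```python
# Sketch of split -> match -> lookup -> average for one student.
# The function name and the sample data are hypothetical.

def average_for_standard(list_cell, questions, scores):
    """Average the scores of the questions listed in `list_cell`.

    list_cell: comma-separated question labels, e.g. "1,3"
    questions: header row of question labels (the QUESTIONS range)
    scores:    this student's score for each question, same order
    """
    wanted = [q.strip() for q in list_cell.split(",")]      # split(LIST_CELL, ",")
    columns = [questions.index(q) for q in wanted]           # match(..., QUESTIONS, FALSE)
    picked = [scores[c] for c in columns]                    # hlookup on the student's row
    return sum(picked) / len(picked)                         # average(...)

questions = ["1", "2", "3", "4"]
student_scores = [4, 2, 3, 1]
print(average_for_standard("1,3", questions, student_scores))  # 3.5
```

Unlike the sheet version, this raises an error if a listed question is not in the header row, where the formula's iferror returns a blank.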