Searching two excel columns for occurrence of two strings together


I'm looking for a way to count the number of times two cells appear side by side in Excel, like intersections. Sometimes in my data (about 550 records) A Road appears next to B Road, which is a count of 1. If that occurs again later in the data, the count would be 2. But if B Road appears in the first column and A Road appears in the second column, I can't find a way to make the count 3.
I've tried concatenating the data, but I need to be able to write this formula without inserting specific criteria (like searching for A Road), because in that case it would be easier to do this manually. Does anyone know of a formula to find the occurrences of the same two values across two columns without specific criteria?

If the order of values in the two columns were important (i.e. A Road followed by B Road is different than B Road followed by A Road), then a simple pivot table would provide the counts you need. You would just put Col M on the rows, Col N on the columns and the count of any field as the value.
But the OP has said that A Road followed by B Road should be counted in the same total as B Road followed by A Road. Let's modify the concatenation in Col O to become =IF(M2<N2,CONCAT(M2," & ",N2),CONCAT(N2," & ",M2)). That provides a canonical form of each combination regardless of order. Having done that, then it is again an easy matter to create a pivot table that shows all the required counts -- just put the concatenated value on the rows and the count as the value.
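For anyone who wants to sanity-check the pivot table totals outside Excel, here is a minimal Python sketch of the same canonical-form idea. The sample rows and the helper name `canonical` are made up for illustration; only the "put the smaller value first" rule comes from the formula above.

```python
from collections import Counter

# Hypothetical sample rows standing in for columns M and N.
rows = [
    ("A Road", "B Road"),
    ("C Road", "A Road"),
    ("A Road", "B Road"),
    ("B Road", "A Road"),
]

def canonical(first, second):
    """Return the pair in sorted order so (A, B) and (B, A) share one key."""
    return tuple(sorted((first, second)))

# Count each canonical pair, which is what the pivot table does on Col O.
pair_counts = Counter(canonical(m, n) for m, n in rows)

for (left, right), count in sorted(pair_counts.items()):
    print(f"{left} & {right}: {count}")
```

Sorting the two values plays the role of the IF(M2<N2, ...) test: whichever order the streets appear in, they end up under the same key.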

If I understand your intent correctly, try this array formula: =SUM(IF(EXACT(B$2:B3&C$2:C3,B3&C3),1,""))+SUM(IF(EXACT(B$2:B3&C$2:C3,C3&B3),1,"")). The second SUM accounts for any reverse-order occurrences of adjacent streets. Enter it with Ctrl+Shift+Enter, then copy it down the column as needed.
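The running count this formula produces can be sketched in Python as below. This is only an illustration with invented sample rows, and `running_pair_counts` is a hypothetical name. One deliberate difference: when both cells in a row hold the same street, the two SUMs in the formula would each count the same rows, whereas this sketch counts each row once.

```python
def running_pair_counts(rows):
    """For each row, count how often its pair has appeared so far, in either order."""
    seen = {}
    counts = []
    for first, second in rows:
        # A frozenset ignores order, so (A, B) and (B, A) map to the same key.
        key = frozenset((first, second))
        seen[key] = seen.get(key, 0) + 1
        counts.append(seen[key])
    return counts

# Hypothetical sample rows standing in for columns B and C.
sample_rows = [
    ("A Road", "B Road"),
    ("C Road", "D Road"),
    ("B Road", "A Road"),
    ("A Road", "B Road"),
]

print(running_pair_counts(sample_rows))
```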

Related

How to search on multiple columns in excel "vlookup" and return value from another column in the same row?

I have an excel file.
Its purpose is to hold a list of rooms (working rooms) and people's names, so I can know TWO things:
Who is assigned to each room.
Which room a specific person is in.
The file has 2 spreadsheets in it.
The first one (called “people”) is a list of names, with another column which says what is the room of each person. Note: There are NOT two identical names. Every name is different.
The second one (called “rooms”) is a list of 5 columns:
the first column is the room number
the other 4, are 4 names of people (because each room has 4 people in it). Note: the order of the people in the room (in the 4 columns) is not important.
The problem starts when I want to swap 2 people from different rooms. Then, I have to change it in both spreadsheets.
I want to find a formula that finds a name in “rooms” spreadsheet, and writes the room number (column A) in the correct row in “people” spreadsheet in column B.
I have tried to use one of the lookup formulas (and especially vlookup) but the problem is the fact that those functions search only in the first column and return value from the other columns, whereas I need exactly the opposite: I want to find value (specific name) from one of the 4 columns, and return the value of the first column (the room number - column A). That way I can go to B column in “people” spreadsheet, and insert a formula for each name in it, so it can find the correct room number.
Another problem is that the room numbers in the “rooms” spreadsheet are not necessarily sorted. Sometimes I want to take a group and move it to a different room. In that case, all I do is change the room number in column A, which leaves the list unsorted.
How can I do this with a formula, in a way that I will need to change only the people in "rooms" spreadsheet, and not to change room number in "people" spreadsheet? Is there a formula that does what I need, or should I use more than one formula to achieve my goal?

Using INDEX & AGGREGATE Function
• Formula used in cell B4
=INDEX(rooms!$A$4:$A$8,AGGREGATE(15,6,(ROW(rooms!$B$4:$E$8)-ROW(rooms!$B$4)+1)/(people!$A4=rooms!$B$4:$E$8),1))
The AGGREGATE() function returns the position of each employee, that is, the relative row within the range in which the name is found:
AGGREGATE(15,6,(ROW(rooms!$B$4:$E$8)-ROW(rooms!$B$4)+1)/(people!$A4=rooms!$B$4:$E$8),1)
Wrapped within an INDEX() function, it then returns the room number as desired.
See the image below for the result after the names are altered.
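To make the logic easier to follow, here is a rough Python equivalent of the reverse lookup. The room numbers, the names and the function `room_of` are all invented for the example. AGGREGATE with function 15 (SMALL) and option 6 (ignore errors) picks the smallest matching row, which the sketch mirrors with `min`.

```python
# Hypothetical stand-in for the "rooms" sheet: room number, then four names.
rooms = [
    (101, ["Name 1", "Name 2", "Name 3", "Name 4"]),
    (205, ["Name 5", "Name 6", "Name 7", "Name 8"]),
    (103, ["Name 9", "Name 10", "Name 11", "Name 12"]),
]

def room_of(person, rooms):
    """Return the room number of the first row whose names include person."""
    # Non-matching rows are skipped, like the errors AGGREGATE ignores.
    matching_rows = [i for i, (_, names) in enumerate(rooms) if person in names]
    if not matching_rows:
        return None  # the Excel formula would return an error here
    return rooms[min(matching_rows)][0]

print(room_of("Name 7", rooms))
```

Because the lookup scans the name columns and only then reads the room number, it does not matter whether the room numbers are sorted.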

Formula in column B is:
=INDEX($B$3:$B$7;SUMPRODUCT(--($C$3:$F$7=A12)*ROW($B$3:$B$7))-2)
Notice the -2 at the end. This is because data starts at row 3 so we need to adjust.
If we swap 2 names (names 7 and 8) or 2 complete rooms (rooms 1 and 5), the results update automatically.
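The arithmetic behind the -2 can be sketched like this in Python. The layout and names are hypothetical. The sketch also shows why the approach relies on every name appearing exactly once: if a name occurred twice, the row numbers would be added together and the index would be wrong.

```python
# Hypothetical layout: sheet row number -> the four names in C:F of that row.
names_by_sheet_row = {
    3: ["Name 1", "Name 2", "Name 3", "Name 4"],
    4: ["Name 5", "Name 6", "Name 7", "Name 8"],
    5: ["Name 9", "Name 10", "Name 11", "Name 12"],
}
FIRST_SHEET_ROW = 3  # data starts at row 3, hence the -2 in the formula

def sumproduct_position(person, names_by_sheet_row, first_sheet_row):
    """Sum the sheet row of every matching cell, then shift to a 1-based index."""
    row_sum = sum(
        sheet_row
        for sheet_row, names in names_by_sheet_row.items()
        for name in names
        if name == person
    )
    return row_sum - (first_sheet_row - 1)

print(sumproduct_position("Name 10", names_by_sheet_row, FIRST_SHEET_ROW))
```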

For Excel 365, using LET for readability and easier modification, you can use:
=LET(data, rooms!B4:E8, index, MIN(IF(data=A4, ROW(data)+1))-ROW(INDEX(data,1,1)), INDEX(rooms!A4:A8, index))
Otherwise:
=INDEX(rooms!A4:A8,MIN(IF(rooms!B4:E8=A4,ROW(rooms!B4:E8)-3)))

Spreadsheet summing based on substring match

I am struggling with one part of this exercise. I'm able to sum based on the unique values of one column (i.e. column G). I'm also able to extract unique names from columns with multiple names in the same cell (column I). What I am not able to do is get the work assigned to a person from multiple rows. For simplicity, the work is just divided equally between the number of people in that row.
Desired outcome is in column L. Sample sheet to work with is here
https://docs.google.com/spreadsheets/d/1xwv8IV-XNMArSFZEdT8mR-ZbOFHRFFZMfAm7dwm3bbE/edit#gid=741595390

Try it as (in J3),
=SUMPRODUCT((D$3:D$8)/(LEN(C$3:C$8)-LEN(SUBSTITUTE(C$3:C$8, " ", ""))+1), --ISNUMBER(SEARCH(I3, C$3:C$8)))
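If it helps to see what the formula is doing, here is a small Python sketch with invented data. It assumes, as the formula does, that names in a cell are separated by single spaces. Note one difference: SEARCH finds the name anywhere in the cell as a substring, while this sketch matches whole names only.

```python
# Hypothetical rows: (names in column C, work in column D).
assignments = [
    ("Ann Bob", 10.0),
    ("Bob", 4.0),
    ("Ann Bob Cid", 9.0),
]

def work_for(person, assignments):
    """Sum each row's work divided equally among the people named in that row."""
    total = 0.0
    for names_cell, work in assignments:
        names = names_cell.split()  # spaces + 1 = number of people in the row
        if person in names:
            total += work / len(names)
    return total

for person in ("Ann", "Bob", "Cid"):
    print(person, work_for(person, assignments))
```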

Excel Formula or function that returns the Nth value from a dynamically generated grouping of cells

I am trying to assemble an INDEX/MATCH combination and am having trouble figuring out how to make it work. I have experience with a lot of the formula types in Excel, but unfortunately I am pretty ignorant when it comes to these functions.
I will explain what I am trying to do first, but I have attached 3 images at the end that will probably make things more clear.
In order to identify the specific values I want, I am having to use helper cells. These helper cells are denoted with the (helper) tag in the pictures. These cells go through and grab the adjusted closing price of the stock (column A) at the beginning (column C) and the end (Column D) of a dynamically calculated period.
I would like to consolidate these values into numerical order in columns F and G. The thought is that the first non-zero number in C/D belongs to the first predefined period and should go into columns F/G beside the #1 (column E). This carries on through all of the periods (e.g. the 2nd non-zero number goes beside the number 2, the third non-zero number goes beside the number 3, etc.)
This is just an example of one stock. I need the function or formula to be dynamic enough to work on a wide variety of distributions. Sometimes there are up to 100 dynamically calculated periods within the stock analysis.
Below are the images that should provide more clarity
Image 1 is an example of what the data looks like
Image 2 is a crudely drawn example of how I would like the data to move
Image 3 is the desired result
Updated image for Scott Craner showing out of order results
Please let me know if I can clarify any confusion.

If you just need to return the first value of each period (column C) and the last value of each period (column D), you could use index match and lookup to do this without even using helper columns.
Try this in cell F2
=INDEX(A2:A50,MATCH(E2,B2:B50,0))
And this in cell G2
=LOOKUP(E2,B2:B50,A2:A50)
Depending on how much variance there is in your overall number of rows, you could use indirect references in the formulas to dynamically update the ranges.
Example:
=INDEX(A2:INDIRECT("A"&COUNTA(A:A)),MATCH(E2,B2:INDIRECT("B"&COUNTA(A:A)),0))
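As a rough illustration of what the two formulas return, here is a Python sketch with made-up prices. It assumes column B holds the period number for each row, sorted in ascending order, which is what the LOOKUP form relies on. MATCH with 0 finds the first row of a period, and LOOKUP lands on the last one.

```python
# Hypothetical data: column A (adjusted close) and column B (period number).
prices = [10.0, 11.0, 12.5, 12.0, 13.0, 14.5]
periods = [1, 1, 1, 2, 2, 3]

def first_and_last(period, periods, prices):
    """Return (first price, last price) for one period, or None if absent."""
    values = [price for key, price in zip(periods, prices) if key == period]
    if not values:
        return None
    return values[0], values[-1]

for period in (1, 2, 3):
    print(period, first_and_last(period, periods, prices))
```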

You will need to record a macro. While recording, do the following:
+ Filter only the non-null values in C/D
+ Select the whole of columns C/D, then copy them
+ Turn off the filter
+ Paste the copied C/D values into F/G
+ Stop the macro
Good luck

Put this formula in F2:
=INDEX(INDEX(C:C,MATCH($E2,$B:$B,0)):INDEX(C:C,MATCH($E2,$B:$B,0)+COUNTIF($B:$B,$E2)-1),MATCH(1E+99,INDEX(C:C,MATCH($E2,$B:$B,0)):INDEX(C:C,MATCH($E2,$B:$B,0)+COUNTIF($B:$B,$E2)-1)))
Copy over one column and down the list.

Excel Instance Parsing

I have a list of data "instances" within one column within an excel sheet.
Each instance can have numerous copies. Here is an example:
abcsingleinstanceblah0001
cdemultipleinstanceexample0001
cdemultipleinstanceexample0002
cdemultipleinstanceexample0003
cdemultipleinstanceexample0004
....
Unfortunately the numbering scheme was not preserved across all of this data. So in some cases copies will have randomized numbers. However, the root instance name is always the same.
QUESTION: What would be a good strategy for creating a function that will parse a list of these instances and, in a new column, list all duplicates past the second copy? In relation to the example above, the new column would list:
cdemultipleinstanceexample0003
cdemultipleinstanceexample0004
I need to have the two duplicates with the lowest integer values preserved out of each set of duplicates, which is why in the example above 3 and 4 would have to go. So in the case of randomized numbers, the two instances with the lowest integer values.
What I have thought of
I was thinking to first organize the column by alphabetical order, which should automatically put duplicates in ascending order. I could then basically strip the number value from all instances, and find where there are more than 2 exact duplicates from the core instance name, which would give me the instances with more than 2 duplicates so that I could perform a function on the original data set... but I don't know if there is a better way of doing this or where to go from here.
I'm looking for formula-based solutions.

Assuming your sorted list is in Column A and that you have a row of headers you could use the following formulas in the neighboring columns.
In B:
=LEFT(A2,LEN(A2)-4)
In C (although not really necessary):
=RIGHT(A2,4)
In D starting with row 3:
=IF(AND(B3=B2,COUNTIF(B1:B3,B3)>2),"Del","Keep")
This formula doesn't work in row 2, but you can hard code the first result.
Then filter the list on Column D for "Del" and delete all the rows.
How's that?
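If you want to double-check the "Del" rows, the same rule can be sketched in Python. The function name and sample list are invented, and the sketch assumes the number is always the last four characters, as the LEFT/RIGHT formulas do. Unlike the formulas, it does not need the list to be sorted first.

```python
from collections import defaultdict

# Hypothetical sample with out-of-order, non-consecutive numbers.
instances = [
    "abcsingleinstanceblah0001",
    "cdemultipleinstanceexample0007",
    "cdemultipleinstanceexample0002",
    "cdemultipleinstanceexample0004",
    "cdemultipleinstanceexample0003",
]

def extras_beyond_two(instances, suffix_len=4):
    """List every instance except the two lowest-numbered ones of each root."""
    groups = defaultdict(list)
    for name in instances:
        root, number = name[:-suffix_len], int(name[-suffix_len:])
        groups[root].append((number, name))
    extras = []
    for root in sorted(groups):
        ranked = sorted(groups[root])  # lowest numbers first
        extras.extend(name for _, name in ranked[2:])
    return extras

print(extras_beyond_two(instances))
```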

Sort your list in column A. You'll want column headings for later, so put those in row 1 (or leave it blank). In B2, type =left(A2,len(A2)-4) and drag the formula down to strip the integers. In C3, type =vlookup(B3,$B$2:$B2,1,0). Fill the formula in C3 right one cell and then down the length of the data. Now in D3 you'll have a list that has errors for any entry that has only 2 or fewer instances and will have the name for any that have 2 or more. Sorting this list with a filter on column D for #N/A will allow you to delete all the rows with less than two entries.
Remove your filter. Then re-sort the list in column A in reverse order so the high numbers are first. Replace the contents of C2 and D2 with #N/A. Refilter the list on column D for everything but #N/A and delete all the entries that have an instance listed.

HOWTO add a unique id to each unique row?

I have data in two columns:
a 1
a 1
a 2
b 3
b 4
In the list there are 4 unique rows. I would like to add a unique id to each unique row.
Like this:
1 a 1
1 a 1
2 a 2
3 b 3
4 b 4
Of course I have many more rows and columns, and the data are more complex than in this example.
Is there any way to do this in Excel?
Regards, Kresten Buch

I have the same issue, I have developed a three formula approach to this. I could probably concatenate it if I nested them, but whatevs, this works.
Assume the data you want to 'number' is in column A, and the first row of the table is row 3.
The first column (in column B) counts occurrences of the 'value' and the range expands from the top of the table downwards as the table grows:
=COUNTIF($A$3:A3,A3)
The second column's formula (in column C) also expands as the row count does, and simply adds 1 to the running id every time it encounters a 1 (i.e. the first occurrence of a new unique value) in column B:
=IF(B3=1,MAX($C$2:C2)+1,"")
This one worked for me even in the first row of the table, btw. I was expecting to have to manually input a 1 to start the list. Having it work without the manual entry is a good thing: it means the formulas all work even if you re-sort the table data into a different order.
The third one in column D uses a vlookup to find the value. Note that when vlookup finds more than one number, it always pulls the first occurrence.
=VLOOKUP(A3,$A$3:C3,3,FALSE)
Note that this will renumber all the data outcomes dynamically if you do re-sort the entire thing, i.e. the formulas all work, but the number 'assigned' to a particular set of data might be different, as it all works from whatever order the list of items is in.
My use case for these formulas assumes that every month I paste a new set of data at the bottom of the table, some items of which are repeats from previous months (i.e. are already in the table) and some of which are new.
If the dynamic renumbering is a problem, use a 'row key' so you can re-sort back to the original order at the end.

Assuming your data is in B2:C6 please try =IF(AND(B1=B2,C1=C2),A1,A1+1) in A2, copied down
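In Python terms, the formula does roughly the following: keep the previous id when the row equals the one above it, otherwise add 1. This is a sketch using the question's sample data, and it only gives one id per unique row when identical rows are adjacent, i.e. the data is sorted. The function name is made up.

```python
# The sample data from the question (columns B and C), already sorted.
rows = [("a", 1), ("a", 1), ("a", 2), ("b", 3), ("b", 4)]

def ids_for_sorted_rows(rows):
    """Give each row the previous id if it equals the row above, else a new id."""
    ids = []
    previous = None
    current_id = 0
    for row in rows:
        if row != previous:
            current_id += 1  # a new unique row starts here
        ids.append(current_id)
        previous = row
    return ids

print(ids_for_sorted_rows(rows))
```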

If your data is not sorted, it's more complicated... but you can use something like this in A2:
=IF(COUNTIFS($B$1:B2,B2,$C$1:C2,C2)>1,INDEX($A$1:A1,IFERROR(MATCH(B2&"-"&C2,$B$1:B1&"-"&$C$1:C1,0),1)),MAX($A$1:A1)+1)
I'm assuming that there are no headers and you have already put 1 in cell A1 for the first record.
It basically checks the whole columns above the formula and if there's already a similar record, it'll assign the previously given unique ID and if not, it'll give a new ID.
This is an array function and as such will work if you use Ctrl+Shift+Enter and not Enter alone.
The IFERROR() is there because MATCH(B2&"-"&C2,$B$1:B1&"-"&$C$1:C1,0) would return an error if it is on row 2 (the first record to check).
Once you put that in the first cell, you can fill down the formula.
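The same idea, sketched in Python for unsorted data: remember the id given to each row the first time it is seen and reuse it afterwards. The sample rows and the function name are invented for the example.

```python
# Hypothetical unsorted sample rows (columns B and C).
rows = [("b", 3), ("a", 1), ("b", 3), ("a", 2), ("a", 1)]

def ids_by_first_appearance(rows):
    """Assign 1, 2, 3, ... in order of first appearance and reuse ids for repeats."""
    assigned = {}
    ids = []
    for row in rows:
        if row not in assigned:
            # Same role as MAX($A$1:A1)+1 in the formula.
            assigned[row] = len(assigned) + 1
        ids.append(assigned[row])
    return ids

print(ids_by_first_appearance(rows))
```

Using the whole (B, C) tuple as the key avoids the need for the "-" separator that the formula uses when it joins the two columns into one string.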

I deal with this issue all the time when structuring a data set into panel data. Say you have multiple columns of data, and each row is identified by someone's name, like:
ANNE
ROSE
ANNE
FRANK
TOM
ROSE
ANNE
but instead of having each column related to Anne, Rose, Frank, or Tom, you want it to look like this:
1 ANNE
2 ROSE
1 ANNE
3 FRANK
4 TOM
2 ROSE
1 ANNE
So that each name now has a unique numerical identifier that can be used in place of the name.
Make a pivot table of your data and only place the column that has the names (or whatever the identifier may be) into the Rows section. This will single out all the different names used within the dataset. Copy and paste this pivot table anywhere on the sheet so that the names are in actual cells and not off of a pivot table. To the right of the names, enter 1 next to the first name, and then =B1+1, and so on so that you number each name with a unique value; then copy and paste this column as numbers so that their formulas are erased. Finally, just go to your original dataset and perform a VLOOKUP so that the names get attached with whatever unique value was assigned off of the pivot table. Make sure to copy and paste as numbers once done to remove the VLOOKUP formula.
Takes literally 2 minutes to do, depending on size of dataset, and is very easy. It will work perfectly every time.
