I'm trying to clean up some email lists. I've got eight CSVs of email addresses. I'm trying to find email addresses that appear in ALL eight lists.
I've tried copying and pasting them into a single Excel document (a column for each list). But I am only able to compare TWO columns using methods like this:
=IF(COUNTIF($B:$B, $A2)=0, "No match in B", "")
so it doesn't tell me if there's a match in ALL columns (and technically that formula tells me when there isn't a match). I can sort of work it out by comparing two lists, comparing those results to the third list, comparing those results to the fourth list, and so on until I get through all eight lists.
I've also tried using conditional formatting which formats duplicate values, but it again doesn't tell me if the match appears in ALL columns (it highlights if there's even one match).
Is there any way in Excel that I can indicate either with highlighting, or copy the value to a new column, any email addresses that appear in ALL columns?
So, just one formula:
=SUM(IF(IFERROR(FIND(J3,A$3:H$12,1),0),1,0))>=8
I just used names in my example, as you did not provide any data.
You can do it easily with Pivot Tables. Instead of creating a column for each CSV file (like the left part of the image above), copy all of them into the same column, one CSV after another (like column E in the image). Then insert a Pivot Table:
Put the Email field in both the Values section and the Rows section
Filter by value, using the number of CSV files as the criterion (3 in the image, 8 in your case because you have 8 CSV files)
Create a PivotTable to analyze worksheet data
The results are the emails that appear in all the CSV files.
Of course, this will work only if each email appears once in each CSV file. If there are duplicates, you'll need to deduplicate each CSV before combining them into the single column.
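If stepping outside Excel is acceptable, the same "appears in every list" check can be sketched as a set intersection in Python. This is only a minimal sketch under assumptions that are not in the question: each CSV holds one address per row in its first column with no header row, addresses are compared case-insensitively, and the folder name "lists" is hypothetical. Because each file is read into a set, duplicates inside a single file do not affect the result.

```python
import csv
from pathlib import Path


def read_emails(path):
    """Return the set of normalised addresses found in the first column of a CSV."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {
            row[0].strip().lower()
            for row in csv.reader(handle)
            if row and row[0].strip()
        }


def emails_in_all(paths):
    """Return the addresses present in every file, sorted alphabetically."""
    email_sets = [read_emails(path) for path in paths]
    if not email_sets:
        return []
    return sorted(set.intersection(*email_sets))


if __name__ == "__main__":
    # Hypothetical layout: the eight exports sit in a folder named "lists".
    csv_paths = sorted(Path("lists").glob("*.csv"))
    for email in emails_in_all(csv_paths):
        print(email)
```

The printed addresses could then be pasted back into a new column in the workbook.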
I'm having trouble filtering an Excel table. It is a set of two columns from two tables, in which I need to find duplicates.
2 rows with duplicates
Some idents are repeated: they are present in both the current and the previous month. In the example below, with the help of the function =IFERROR(MATCH(A2;B:B;0); "NO"), I found which values from last month are repeated in the current month and in exactly which row each one is located. The formula for determining whether a value is repeated is =COUNTIFS($A$2:$B$13;A2)>1
duplicates and if repeated
I would like to retrieve only the duplicates from the list. I tried the formula =IFERROR(INDEX(A:A;SMALL(IF(NOT(D$2:D$104=TRUE);ROW(B2)-ROW(INDEX(B2;1;1))+1);ROW(G:G)));" ERROR") to get the values that are repeated and skip those that aren't, but the result is not as desired. Column G shows what Excel returns for that formula. Column H shows what I would like: a new column containing only the duplicates.
Current vs. desired display
In this example, the columns are a bit small, but in reality there could be at least a thousand rows, so I would need help filtering those.
You implied these columns were present in two different tables. So I used Tables with structured references. You can convert to normal addressing if you require that instead.
If you have Windows Excel 2021 or later, you can use:
=FILTERXML("<t><s>" &TEXTJOIN("</s><s>",,UNIQUE(LastMonth[Last month marks],FALSE,TRUE),UNIQUE(CurrentMonth[Current Month],FALSE,TRUE))& "</s></t>","//s[following::*=.]")
Create a list of distinct items for each column
Create an XML document by concatenating the items from both lists with TEXTJOIN
Extract only those items that are followed by an identical item
With an earlier version of Excel, I would still use Tables and structured references, but I would also use a helper column:
D2: =IFERROR(MATCH(lastMonth[@[last month]],currentMonth[current month],0),"NO") (and fill down)
E2: =IFERROR(INDEX(currentMonth[current month], AGGREGATE(15,6,[Duplicates in Which Row],ROWS($1:1))),"")
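For comparison, the logic behind both formulas (keep only the idents that occur in both months) can be sketched outside Excel in a few lines of Python. This is an illustrative sketch, not a translation of the formulas: the function name and the sample values are made up, and it assumes an ident counts as a duplicate when it appears at least once in each column.

```python
def repeated_idents(last_month, current_month):
    """Return idents from last_month that also occur in current_month.

    Order follows last_month, and each ident is reported once.
    """
    current = set(current_month)
    seen = set()
    result = []
    for ident in last_month:
        if ident in current and ident not in seen:
            seen.add(ident)
            result.append(ident)
    return result


# Hypothetical sample data standing in for the two table columns.
last_month = ["A1", "B2", "C3", "D4", "B2"]
current_month = ["C3", "E5", "A1", "F6"]
print(repeated_idents(last_month, current_month))  # ['A1', 'C3']
```

Using a set for the current month keeps the lookup fast even with thousands of rows.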
I'm looking at writing something for the below in Python but I'm hoping that it can be done natively in Excel.
I have a sheet that contains multiple rows. Within column C, the recipient is listed. Each row basically contains data relating to the sender and recipientS of emails and so, there may just be one recipient or multiple:-
original data screenshot
What I'd like to do is split each row whose Column C (recipients) contains multiple recipients, so that every recipient gets its own row. Looking at the joe.bloggs@hotmail.com,jack@gmail.com example, I'd like this to be split on the , so that the two email addresses end up in their own rows, together with the other values that exist in the row.
So, using the example above, I'd run the function/whatever is required and I'd end up with 6 distinct rows containing:-
What i'm looking to achieve
together with the content of the other values in the source row being copied over
Hope this makes sense and thanks in advance
UPDATE - I've been Googling this and think a Power Query may be the way forward.... I'm researching this further now
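Since Python was mentioned as a fallback, here is a rough sketch of the row split using only the standard library. The column names (date, sender, recipients) and the sample addresses are assumptions for illustration, not the real headers or data from the sheet, and the sketch assumes the sheet has been exported to CSV with the multi-recipient cells quoted.

```python
import csv
import io


def split_recipients(rows, column="recipients", sep=","):
    """Yield one copy of each row per address in the chosen column."""
    for row in rows:
        addresses = [part.strip() for part in row[column].split(sep) if part.strip()]
        # A row with an empty recipient cell is passed through unchanged.
        for address in addresses or [row[column]]:
            yield {**row, column: address}


# Hypothetical sample; the column names are assumptions, not the real headers.
sample = io.StringIO(
    "date,sender,recipients\n"
    '2021-01-01,ann@example.com,"first@example.com,second@example.com"\n'
    "2021-01-02,bob@example.com,sue@example.com\n"
)
expanded = list(split_recipients(csv.DictReader(sample)))
for row in expanded:
    print(row["date"], row["sender"], row["recipients"])
```

Every other value in the source row is copied into each new row, which matches the behaviour described in the question.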
So I want to isolate all of the rows that are labeled "good" from all the rows that are labeled "bad".
I've tried to use the 'sort and filter' tool in Excel, but this hasn't worked, I think because of the index table that I've used to generate my formulas.
Here are the formulas being used to obtain a unique number for each row, which I then use to determine whether a value is "good" or "bad".
For reference, not all the boxes in the spreadsheet that are green are labeled 'good'.
This can be done by using an IF function that matches against the relevant number in the table and then returns the corresponding column data. This will need to be repeated for each column of data you want to use. An example formula is given below:
=IF(ISNUMBER(MATCH(A2,AB:AB,0)),AG:AG,"No Data")
I'm trying to combine all the identical part numbers in Excel, so that each part number appears once but still references its filenames. When there is more than one file, I need to separate them with a comma.
Most of the VLOOKUP and merging approaches I have tried only pull one item, and when I merge the part numbers I lose some filenames. Attached is the file showing the punctuated part number and file name. Some punctuated part numbers show up more than once while others only show up once. For every duplicate part number the filename is different, as it contains a different view of the product.
Does it have to be in one cell with commas in between? With a few clicks you can create a pivot table with part number and file name in the rows area. This will group the data by the part number.
You can use a formula if you have a list of unique part numbers. The formula in H2 uses TEXTJOIN like this:
=TEXTJOIN(", ",TRUE,IF(Table1[part]=G2,Table1[file],""))
Copy down.
This is an array formula and unless you have an Excel version with the new Dynamic Array formulas, you need to confirm it with Ctrl+Shift+Enter.
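To make the grouping explicit, the same idea (collect every filename for a part number, then join them with a comma) can be sketched in Python. The part numbers and filenames below are invented for illustration, and the function name is hypothetical.

```python
def join_files_by_part(rows, sep=", "):
    """Map each part number to its filenames joined by sep, in first-seen order."""
    grouped = {}
    for part, filename in rows:
        grouped.setdefault(part, []).append(filename)
    return {part: sep.join(files) for part, files in grouped.items()}


# Hypothetical part numbers and filenames for illustration only.
rows = [
    ("10-200", "10-200_front.jpg"),
    ("10-300", "10-300_front.jpg"),
    ("10-200", "10-200_side.jpg"),
]
joined = join_files_by_part(rows)
print(joined["10-200"])  # 10-200_front.jpg, 10-200_side.jpg
```

Part numbers that occur only once simply map to their single filename with no separator.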
Every month I get given a budget from one of our clients in a Google sheet, which I need to convert into a SQL query so it can be uploaded into our database. As the number of rows and columns changes, I want to write some formula to semi-automate the process for time saving and mistake elimination.
This budget has spends in multiple columns, which I've managed to write formulas to combine into one column, with the correct details in the columns next to it (see example links below).
How I've transformed the data so far
The issue is that this budget per country and partner then has to be split again across multiple options. This leaves me with three columns' worth of spend values that I'd really like to combine back into one column, ideally skipping all the zero values.
I've found an array formula on this site that will skip the zeroes, but I can't get it to work on more than one column.
=IFERROR(INDEX($U:$U,SMALL(ROW(myRange)*(myRange<>0),SUMPRODUCT(N(myRange=0))+ROWS($1:1))),"")
From this Question's Answer
Is it possible to write a formula that skips the zero values down one column and then starts on the next? One that will also let me keep the correct matching details from the other columns alongside it, and bring in the column headers for the options as entries in a new column?
Thanks
Edit:
Here is the final format I'm looking for:
There is a concatenated field off the end that combines all the columns. Most of the values are populated by various VLOOKUPs that transform the text version into the database IDs needed to fill the table.
It's also worth saying that not being able to skip the zeros is OK, as I can manually delete them fairly easily.
But as the number of countries and partners can and will change, I want the formula to be able to move to the next column when it reaches the end of the dataset.
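To pin down the transformation being asked for, here is a small Python sketch of the reshaping step: walk down one option column, skip the zeros, then start on the next column, carrying the matching country and partner along and turning the column header into a value. It is not a Sheets formula, and every column name and number in it is hypothetical.

```python
def unpivot_spend(rows, id_columns, option_columns):
    """Turn one row per country/partner into one row per non-zero option spend."""
    long_rows = []
    for option in option_columns:  # walk down one column, then the next
        for row in rows:
            spend = row[option]
            if spend == 0:
                continue  # skip zero spends
            record = {name: row[name] for name in id_columns}
            record["option"] = option  # the column header becomes a value
            record["spend"] = spend
            long_rows.append(record)
    return long_rows


# Hypothetical budget rows; every name and number here is made up.
budget = [
    {"country": "UK", "partner": "P1", "Option A": 100, "Option B": 0, "Option C": 50},
    {"country": "FR", "partner": "P2", "Option A": 0, "Option B": 75, "Option C": 0},
]
result = unpivot_spend(budget, ["country", "partner"], ["Option A", "Option B", "Option C"])
for record in result:
    print(record)
```

Because the option columns are passed in as a list, adding or removing countries, partners, or options changes only the input, not the logic.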