I have some data.
I would like to find the "unique" values in the data. That is, only the values that appear exactly once, not anything that was duplicated.
To be clear:
A
B
C
A
B
I want
C
I do not particularly care whether this makes a new column or modifies the existing column. I do not, however, want to get C highlighted: my data sets are very large and I really don't want to be scrolling along looking for hyper-color yellow entries.
(I have a sneaking suspicion this has been asked before, but given the dual connotations of "unique", it is kind of hard to search for here.)
It's a standard option of conditional formatting.
Duplicate is the default, but the box allows unique too.
That menu is accessed from Home tab > Conditional Formatting > New Rule.
Be sure to set a format for the cell then.
Conditional Formatting and Filtering is much the simplest, but if you really want a formula to obtain the unique values in a different column:
=IF(COUNTIF($A$5:$A$15,A5)=1,A5,"")
This assumes the original data is in A5:A15, and this formula is entered into another column and copied down the same number of rows. You'll end up with a load of blank cells though. You'll need to copy/paste-special values, then sort in descending order (or filter) this list so that the blanks are at the bottom and can be deleted.
Actually, slightly better would be:
=IF(COUNTIF($A$5:$A$15,A5)=1,A5,"zzz")
because, after copy/paste-special values, you can sort in ascending order and you can see the values (at the bottom) that you need to delete.
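For anyone checking the result outside the spreadsheet, the same "appears exactly once" test can be sketched in Python. This is only an illustration of the COUNTIF(...)=1 idea; the function name values_appearing_once is made up for the example.

```python
from collections import Counter

def values_appearing_once(values):
    """Return the values that occur exactly once, keeping their original order."""
    counts = Counter(values)  # plays the role of COUNTIF over the whole range
    return [v for v in values if counts[v] == 1]

# The sample data from the question.
print(values_appearing_once(["A", "B", "C", "A", "B"]))
```

Unlike the formula, this leaves no blank or "zzz" placeholders to sort away, because the filtering and the counting happen in one pass over the list.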
Let's say I have two sets of data and want to compare each row and column to make sure that they are identical.
Both sets of data have the same number of columns and rows. Say the first set is in columns A-G and the second set is on the same tab and goes from H-N (in reality I have 50+ columns in each set).
Typically, when I don't have a lot of columns, I do something like:
=IF(AND(A2=H2,B2=I2,C2=J2),"Good","Bad")
Once I have a formula, then I press the little square and drag it down across all rows. This is able to quickly show me whether there is data difference in any of the columns or not.
However in this case I have a lot of columns to compare. Is there a quicker way to do this, or generate dynamically somehow?
Thanks.
You could use SUMPRODUCT:
=IF(SUMPRODUCT(--(A2:G2<>H2:N2))=0,"Good","Bad")
=TEXTJOIN(,,A1:C1)=TEXTJOIN(,,h1:j1)
This will return either TRUE or FALSE.
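The counting idea behind the SUMPRODUCT answer can be sketched in Python: count the positions where the two halves of a row differ and report "Good" only when that count is zero. The function name rows_match and the width parameter are assumptions made for the example.

```python
def rows_match(row, width):
    """Compare the first `width` cells of a row with the next `width` cells."""
    left, right = row[:width], row[width:2 * width]
    # Count mismatching positions, like summing --(left<>right) in a sheet.
    mismatches = sum(1 for a, b in zip(left, right) if a != b)
    return "Good" if mismatches == 0 else "Bad"

print(rows_match([1, 2, 3, 1, 2, 3], 3))
print(rows_match([1, 2, 3, 1, 2, 4], 3))

# Joining cells without a delimiter can hide a difference:
print("".join(["ab", "c"]) == "".join(["a", "bc"]))
```

The last line is a caution for the concatenation approach: two different rows can produce the same joined text when no delimiter is used, so a cell-by-cell comparison is the safer check.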
The situation: I have an automatic procedure for gathering data from different input sheets and presenting it in a pivot-friendly format. It appears others need the same data, though they want it formatted slightly differently (and they are not friends with Excel). I therefore have a version of my table formatted as they want it (with empty columns where my extract does not contain any data).
Both tables have one line for each department, for each year, for each cost/income (from now on, cost) category. The raw data contains the cost for each year, though some of the users want it as the cost delta from the initial year. I want:
One column for raw cost (X). One column for delta cost (Y). One output column (Z) that contains one of those two values, depending on a dropdown selection. The first two columns are situated to the right of the "select with mouse and copy these" columns, so that I don't need to teach the other users how to select non-adjacent columns :P (just letting you know the level of understanding I have to work with here).
Now the naive approach to this would be to have an if-statement in column Z like this:
=IF(selected_Calc="Use raw cost";[#[X]];[#[Y]])
Alternatively nest more ifs (one for "Use difference to 2019", and potentially add more nesting if more ways to show the value should appear in future)
This works. However, it isn't as elegant as I would like, and if I do end up with more ways to calculate this for other people, it will be a lot of nested IFs.
I was therefore considering something like this:
=INDIRECT("[#["INDEX(mapTab_out;match(selected_Calc;mapTab_in;0))&]]")
But this gives a #REF! error, and to be honest I didn't really expect it to work.
The idea, though, is:
Have a range mapTab_in. This has the different selections for the dropdown box.
Have the adjacent range mapTab_out. This has the name of the column (X, Y, ...) that contains the desired calculation.
Have in column Z a formula for selecting which column's (X, Y, ...) value is to be displayed in Z.
Everything I have found on Google so far seems to be about using the INDIRECT function from outside the table, usually to sum an entire column. I have used that in the past. The "this row" references using # don't seem to work with INDIRECT, though. Any ideas, or have I simply made some beginner's error in my formula?
Assuming it's in the same table, you can take advantage of implicit intersection and simply use:
=INDEX(Tablename,,MATCH(selected_Calc,Tablename[#Headers],0))
where selected_Calc is the name of the column you want back. (You could make that the result of a further INDEX/MATCH if you want to use a lookup table for some reason.)
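The lookup-table variant mentioned in the parentheses can be sketched in Python as a plain mapping from dropdown text to column name, followed by a lookup in the current row. The dictionary COLUMN_FOR_SELECTION stands in for the mapTab_in/mapTab_out pair, and the row values are invented purely for illustration.

```python
# Hypothetical stand-in for mapTab_in -> mapTab_out.
COLUMN_FOR_SELECTION = {
    "Use raw cost": "X",
    "Use difference to 2019": "Y",
}

def output_value(row, selected_calc):
    """Pick the value for column Z from the column named by the selection."""
    column = COLUMN_FOR_SELECTION[selected_calc]  # like MATCH in the mapping
    return row[column]                            # like INDEX on the same row

row = {"X": 120.0, "Y": 20.0}  # made-up numbers for one table row
print(output_value(row, "Use raw cost"))
print(output_value(row, "Use difference to 2019"))
```

Adding another way of showing the value then means adding one entry to the mapping rather than another nested IF.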
I have this formula in Google Sheets:
VLOOKUP(upper(J2:J),colorState!A:B,{2}*sign(row(J2:J)),FALSE)
and I want it to sort the result in ascending order automatically when I add or edit data (the way ARRAYFORMULA updates).
Is there any way or any formula to do that? (I know there is a SORT function, but I'm not sure how to use the two together.)
Thanks.
I believe I understand what you need :)
Essentially what I understand is that you would like to recreate the "main" sheet but have it automatically ordered by the 'color' column when new data is added. I don't have any idea how to do this to the raw data but you can mirror the raw data by creating another sheet (name 'mainmirror') and in cell A1 just enter this formula:
=query(main!$A:$R,"select * order by P ASC",-1)
It will take you 2 seconds to reformat with a filter view, and you'll be left with a mirror of 'main' that is always sorted by column P and should remain current as data is added.
Hopefully this is an acceptable workaround. Other option would be to use a script but this is less tedious if it's suitable.
Side note: this method will turn your values into strings to mirror them on the duplicate sheet, so on the 'main' sheet I would recommend changing the cell format of column P to a custom number format, 00, which ensures there is a leading 0 if there is only one digit. This will cause the strings in the mirror to sort correctly, instead of 1, 11, 12, 2, 3, 4, etc. If you're expecting column P to have 3-digit values, make the number format 000 accordingly.
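The sorting issue in the side note is easy to reproduce in Python, which compares strings character by character in the same way. This is just a small demonstration with made-up numbers, not part of the Sheets solution.

```python
values = [1, 11, 12, 2, 3, 4]

# Sorting the numbers as plain text puts "11" and "12" before "2".
as_text = sorted(str(v) for v in values)

# Zero-padding to two digits (like the 00 number format) restores numeric order.
padded = sorted(f"{v:02d}" for v in values)

print(as_text)
print(padded)
```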
Hi, I have a list of values in a table that I would like to sort. The data in the column has been compiled from three separate tables, each with its own custom format applied. For example:
When I enter the value "2", the formatting adds the prefix "MCRY-", so the value I get is "MCRY-2".
But in another table it adds the prefix "ACCRY-".
When I copy this over into the "master" sheet, the formatting is copied over too, which is perfect. However, when I sort the column the formatting is ignored, as the cell values are only the numbers and not the prefixes.
My question is: how do I get the sorting process to acknowledge the prefixes as if they were a part of the cell value?
So that if I have "MCRY-01", "ACCRY-01", "MCRY-02", "ACCRY-02", it will order them ACCRY first and then MCRY.
I have tried paste special as values, but that doesn't work. Any help?
Thanks
I wouldn't use conditional formatting for this process. You will need to hardcode. If "2" is in A1 then put the following into B1 to get the required result
="MCRY-"&TEXT($A1,"00")
Will give you "MCRY-02"
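To see why hardcoding the prefix makes the sort behave, here is a minimal Python sketch of the same idea: build the label as real text, then sort the text. The helper name label is an assumption for the example.

```python
def label(prefix, number):
    """Build the text the cell should really contain, e.g. 'MCRY-02'."""
    return f"{prefix}{number:02d}"  # same role as "MCRY-"&TEXT($A1,"00")

labels = [label("MCRY-", 1), label("ACCRY-", 1),
          label("MCRY-", 2), label("ACCRY-", 2)]

# Because the prefix is now part of the value, sorting groups ACCRY before MCRY.
print(sorted(labels))
```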
I’ve had some ‘lulus’ but this may count as my nastiest hack ever (#CallumDS33 is basically correct).
Format the "ACCRY-" table with "ACCRY-"#;"ACCRY-"# then proceed as normal but add a helper column in your "master" sheet with:
=CELL("format",B1)
copied down to suit, where B is assumed to be the column to be ordered. Now sort by the helper column first, and within that by the column to be ordered.
How can I compare records in a table to make sure they are not duplicates? I am using Excel 2007, and I don't want the duplicates deleted after the comparison.
Duplicate rows should be colored. My table has columns A to P and 500 rows. I want to apply the condition to columns A, B, E, F, G and I.
If you don't want to sort your column, you can try with a matrix formula (http://www.stanford.edu/~wfsharpe/mia/mat/mia_mat4.htm).
Practically, you can compare your current row to every row above. Something like:
=MIN(ROW(B1)*(IF(A2=A1;1;0))*(IF(B2=B1;1;0)))*(...)
Entered with CTRL-SHIFT-ENTER, this checks whether all the conditions are true, and otherwise returns 0.
Please send a file (with anonymous data) if you want a practical example.
Hope that helps
Edit: here is a working solution, provided you want to compare data in the Q column (LIGNE and EQUIV are the French-locale names of ROW and MATCH):
=MIN(LIGNE($Q$5:Q6)*EQUIV(Q6;$Q$5:Q6;0))
Use the formula above if you want the first line where the value appears.
=MIN(LIGNE($Q$5:Q5)*EQUIV(Q6;$Q$5:Q5;0))
Use this second version if you'd rather have #N/A when there is no duplicate before that line.
Still enter it with CTRL-SHIFT-ENTER.
Sort by the columns you are interested in then use a formula to compare each row with the one above. You can then use conditional formatting to colour the results.
I may sound stupid here, but the simple answers are usually the best.
I did this recently by literally using the CONCATENATE() function with the TEXT() function to combine all the columns I wanted to compare into a single cell. So in effect I am creating a cell with a unique "key" that holds all the data I want to be unique.
I then sort that column and create another empty column next to it.
Then I use this formula to compare the row with the row above it: =IF(A2=A1,0,1)
This simply puts a 0 where the row is the same as the one above and a 1 where it's different.
I then filter on the 0s and there are my duplicates! (The 1s mark the first occurrence of each key.)
It's also useful as an alternative way of doing a COUNT(DISTINCT ...), where I want to count how many unique references exist in my data. SUBTOTAL(3...) is not enough.
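The key-column approach can also be sketched in Python. This is a minimal illustration, assuming each row is a list of cell values and that the columns to compare are given by position; the function name duplicate_rows and the sample rows are made up. It uses a tuple as the key instead of concatenated text, which avoids two different rows accidentally producing the same joined string.

```python
def duplicate_rows(rows, key_columns):
    """Return the indices of rows whose key columns repeat an earlier row."""
    seen = set()
    duplicates = []
    for index, row in enumerate(rows):
        key = tuple(row[c] for c in key_columns)  # the combined "key" cell
        if key in seen:
            duplicates.append(index)  # would get a 0 in the helper column
        else:
            seen.add(key)             # first occurrence, would get a 1
    return duplicates

rows = [
    ["north", 2019, "rent"],
    ["south", 2019, "rent"],
    ["north", 2019, "rent"],
]
dups = duplicate_rows(rows, key_columns=[0, 1, 2])
print(dups)                   # positions of the repeated rows
print(len(rows) - len(dups))  # number of distinct keys
```

No sorting is needed here because the set remembers every key seen so far, so the original row order is preserved.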