Select all rows where all cells equal one another - excel

I have a large table that only shows a single type of information: whether or not a species of plant was present at a particular study site. I have 500+ species listed in the first column, and 30 sites as column names. The table is populated with a simple "Y" or "N" to show presence. Example:
Scientific Name         Old Wives Beach   Dadi   Orote N Airstrip
Abelmoschus moschatus   N                 N      N
Abrus precatorius       Y                 N      Y
Abutilon indicum        N                 N      N
However, the species list contains some species that do not occur at any sites, rendering a row full of "N"s, like the 1st and 3rd rows in the example above. I need to delete those rows in order to make the table more manageable.
Is there any way to achieve this without a long IF AND statement?

Inspired by pnuts' comment: in a new column, use a COUNTIF() formula. For example, =COUNTIF(B2:AE2,"Y"), assuming the row/column headers are in row 1 and column A and the data is in the range B2:AE501+.
If you then select the entire range, including the headers and the new formula column, and add filters, you can show only the rows where the count of Y's is 0. Once only the 0's are showing, you can select those entire rows and delete them (Right-Click, Delete) without affecting the non-zero rows.
At this point, if you no longer need the count column, you can turn off the filter and delete the column, but I wouldn't be surprised if the count comes in handy for some other reason.
Alternatively, you could just use the filter to HIDE the 0 rows rather than delete them. That way you don't remove the data altogether, but it's no longer in your way.
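The same count-the-Y's idea can also be run as a macro, without a helper column. This is only a minimal sketch under the layout assumed above (species names in column A, site names in row 1, data from B2 onward, working on the active sheet); the names HasNoPresence and DeleteAllNRows are made up for illustration. It walks the rows from the bottom up so that deleting a row never shifts a row that has not been checked yet.

```vba
Option Explicit

' Returns True when no cell in the given range contains "Y".
Function HasNoPresence(ByVal siteCells As Range) As Boolean
    HasNoPresence = (Application.WorksheetFunction.CountIf(siteCells, "Y") = 0)
End Function

' Deletes every data row on the active sheet that has no "Y" in it.
' Assumption: names in column A, headers in row 1, data starts at B2.
Sub DeleteAllNRows()
    Dim ws As Worksheet
    Dim lastRow As Long, lastCol As Long, r As Long

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    lastCol = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column

    ' Bottom-up, so a deletion never moves an unchecked row
    For r = lastRow To 2 Step -1
        If HasNoPresence(ws.Range(ws.Cells(r, 2), ws.Cells(r, lastCol))) Then
            ws.Rows(r).Delete
        End If
    Next r
End Sub
```

Try it on a copy of the workbook first, since deleted rows cannot be recovered once the file is saved.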

The code below is one way to do this, assuming there are no gaps in the data. The .Select statements are there only so that you can watch it step through the sheet; you should remove them once you understand how it works.
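Sub deleteIfAllN()
    Dim plantR As Range, r As Range, allN As Boolean
    Set plantR = Range("A2")
    ' Walk down column A until the first blank species name
    While plantR <> ""
        plantR.Select
        Set r = plantR.Offset(0, 1)
        allN = True
        ' Scan right across the site columns; stop at the first non-"N"
        Do
            r.Select
            If r <> "N" Then
                allN = False
                Exit Do
            End If
            Set r = r.Offset(0, 1)
        Loop Until r = ""
        ' Move to the next row first, then delete the row just checked if it was all "N"
        Set plantR = plantR.Offset(1, 0)
        Rows(plantR.Row - 1).Select
        If allN Then Rows(plantR.Row - 1).Delete
    Wend
End Sub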

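You can use the Advanced Filter.
Set up your data and a criteria range. For a formula criterion, leave the criteria header cell blank (or give it a label that matches none of your column headers) and put the formula in the cell beneath it.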
For the example you posted, the formula would be:
=COUNTIF($B8:$D8,"N")<>3
For 30 columns, just modify the range and the count.
I chose to filter in place.
Note that there is also an option, Copy to another location, which places the results of the filter somewhere else and leaves the original data as it is.
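The Advanced Filter can also be applied from a macro with Range.AdvancedFilter. The sketch below is hedged: the addresses are assumptions, not taken from the original post. It assumes the three-site example with headers in row 7 and data in A8:D10 (which matches the $B8:$D8 reference in the formula), and it uses F7:F8 as a hypothetical criteria range with a label, "Keep", that matches no data column.

```vba
Option Explicit

' Filters the example data in place so that all-"N" rows are hidden.
' Assumptions: headers in A7:D7, data in A8:D10, F7:F8 free for the criteria.
Sub FilterOutAllN()
    With ActiveSheet
        ' Criteria label must not match any column header
        .Range("F7").Value = "Keep"
        ' Relative row reference points at the first data row (row 8)
        .Range("F8").Formula = "=COUNTIF($B8:$D8,""N"")<>3"
        .Range("A7:D10").AdvancedFilter _
            Action:=xlFilterInPlace, _
            CriteriaRange:=.Range("F7:F8")
    End With
End Sub
```

To show all rows again, call ActiveSheet.ShowAllData. For 30 site columns, widen both ranges and change the 3 to 30, as described above.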

Related

Find max for values of the same group and indicate the value in a separate column

I'm trying to find the max in the second column when they belong to the same group defined in the first column and write the maximum value on a third column on the same line as it appears in column 2.
I wrote a piece of code but can't get further.
Sub MarkMax()
    Dim i As Double
    Dim x As Double
    For i = 1 To 1000
        For x = 1 To 4
            If Cells(i, 1) = x Then
                Cells(i, 3) = Application.WorksheetFunction.Max(Cells(i, 2))
            End If
        Next x
    Next
End Sub
Thank you!
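There is no reason to use VBA for that. Excel has a basic built-in feature, Subtotal (on the Data tab), which covers this.
As the function, obviously don't use Sum or Average, but Max.
You don't need VBA for this. If I were you, I would create a table so that the formulas stay dynamic when you enter new rows.
Supposing that your column headers are in A1:C1 and that you have only 20 rows, paste this into the first cell of the MAX column and copy it down. MAXIFS needs a recent version of Excel (2019 or Microsoft 365), and the formula uses ; as the argument separator, so replace it with , if your regional settings require that.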
=IF(MAXIFS($B$2:$B$20;$A$2:$A$20;A2)=B2;MAXIFS($B$2:$B$20;$A$2:$A$20;A2);"")
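If a macro is still wanted, for example on an Excel version without MAXIFS, a two-pass loop does the same job. This is a minimal sketch, not the asker's code: it assumes headers in row 1, group ids in column A, numeric values in column B, and output in column C of the active sheet. It uses a late-bound Scripting.Dictionary, which is available on Windows only, and the name MarkGroupMax is made up for illustration.

```vba
Option Explicit

' Writes each group's maximum into column C on the row(s) where it occurs.
' Assumptions: headers in row 1, groups in column A, numbers in column B.
Sub MarkGroupMax()
    Dim ws As Worksheet
    Dim best As Object          ' group id -> largest value seen so far
    Dim lastRow As Long, i As Long
    Dim groupId As String

    Set ws = ActiveSheet
    Set best = CreateObject("Scripting.Dictionary")
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    ' Pass 1: find the maximum of every group
    For i = 2 To lastRow
        groupId = CStr(ws.Cells(i, 1).Value)
        If Not best.Exists(groupId) Then
            best(groupId) = ws.Cells(i, 2).Value
        ElseIf ws.Cells(i, 2).Value > best(groupId) Then
            best(groupId) = ws.Cells(i, 2).Value
        End If
    Next i

    ' Pass 2: mark the rows that hold their group's maximum
    For i = 2 To lastRow
        groupId = CStr(ws.Cells(i, 1).Value)
        If ws.Cells(i, 2).Value = best(groupId) Then
            ws.Cells(i, 3).Value = best(groupId)
        Else
            ws.Cells(i, 3).ClearContents
        End If
    Next i
End Sub
```

Like the MAXIFS formula, this marks every row that ties for the maximum, not just the first one.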

Feeding pairwise compared values into a large table

I've got a few large (500-1000 entries) datasets in the following format, using only the first two columns.
id     value
a-b    number
a-c    number
a-d    number
...    number
b-c    number
b-d    number
and so on
They compare two values and save their difference while skipping previously done comparisons. I want to put them in a table like this:
id   a        b        c        d        e
a    /        number   number   number   number
b    number   /        number   number   number
c    number   number   /        number   number
d    number   number   number   /        number
e    number   number   number   number   /
The lower left half of this table is easily prepared with offset, but how do I feed the values into the upper right half?
Is there a way to mostly automate doing this?
If I understand what you are trying to do, I would suggest doing this in two steps:
Set up formulas in the results table to "read" data
Have a macro to "save" data
First, fill your results table with this formula (example for cell B3, with the column ids in row 2 and the row ids in column A). Keep your offset formula in the bottom-left half.
=IF(B$2=$A3, "\", VLOOKUP(IF(B$2>$A3,$A3&"-"&B$2, B$2&"-"&$A3), $H$3:$I$35, 2, FALSE))
This gives you \ if the row id and column id are the same. Otherwise it does a VLOOKUP on the two ids joined with a hyphen, always ordered "low-high", which gives you the mirror across the diagonal. Some might argue that duplicating the VLOOKUP is inefficient, but is it more inefficient than an OFFSET plus figuring out how to paste one formula into one half and a different one into the other? Who really knows. It is simple and it works, IMHO.
Next, put the data for the current "pass" into the observations table ($H$3:$I$35 in the formula above), and your values will appear in the results table via the VLOOKUP.
Finally, you need a little macro that replaces each formula with its value. Run it after every set of results is acquired, so that you don't lose the data when you put a subsequent set of new values into your observations table.
Option Explicit
Sub replace_formula_with_value()
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Sheets("Sheet1")
    Dim r_in As Range
    ' this is where my input data start
    Set r_in = ws.Range(ws.Range("B3"), ws.Range("B3").End(xlDown).End(xlToRight))
    Dim r As Range
    For Each r In r_in
        ' cells whose lookup has not found anything yet show an error; leave their formula alone
        If Not IsError(r.Value) Then
            If r.Value <> "\" Then
                'overwrite formula with value by setting value to the evaluation of the formula in the cell
                r.Value = r.Value
            End If
        End If
    Next r
End Sub
If you don't want to have a macro, then you need to keep ALL data in your observations table and just build it up over time.
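The two steps can also be folded into a single macro that looks each pair up and writes the value directly, so no formulas are left to convert. This is a sketch under the same layout assumptions as the formula above (row ids from A3 down, column ids from B2 across, observations in H3:I35 on "Sheet1", and a blank column between the results table and the observations). PairKey and FillPairTable are hypothetical names.

```vba
Option Explicit

' Builds the lookup key for two ids, always ordered low-high (e.g. "a-c").
Function PairKey(ByVal id1 As String, ByVal id2 As String) As String
    If StrComp(id1, id2, vbTextCompare) <= 0 Then
        PairKey = id1 & "-" & id2
    Else
        PairKey = id2 & "-" & id1
    End If
End Function

' Fills both halves of the results table from the observations list.
' Cells whose pair is not in the current list are left untouched,
' so values written in earlier passes survive.
Sub FillPairTable()
    Dim ws As Worksheet, obs As Range
    Dim r As Long, c As Long, lastRow As Long, lastCol As Long
    Dim hit As Variant

    Set ws = ThisWorkbook.Sheets("Sheet1")
    Set obs = ws.Range("H3:I35")                     ' assumed observations table
    lastRow = ws.Range("A3").End(xlDown).Row         ' assumes at least two ids
    lastCol = ws.Range("B2").End(xlToRight).Column

    For r = 3 To lastRow
        For c = 2 To lastCol
            If ws.Cells(r, 1).Value = ws.Cells(2, c).Value Then
                ws.Cells(r, c).Value = "\"
            Else
                hit = Application.VLookup( _
                    PairKey(CStr(ws.Cells(r, 1).Value), CStr(ws.Cells(2, c).Value)), _
                    obs, 2, False)
                If Not IsError(hit) Then ws.Cells(r, c).Value = hit
            End If
        Next c
    Next r
End Sub
```

Because PairKey orders the two ids before the lookup, the cell for (c, a) and the cell for (a, c) both read the "a-c" entry, which is what mirrors the values across the diagonal.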

Excel Match Numbers in 2 Columns to a Number in a 3rd

I've run into a bit of a road block. I get a .PDF output from an accounting program and copy/paste the data into excel, then convert text to columns. I am trying to match the GL code with the totals for that specific account. Columns A, B, and C show the state of my data prior to sorting it, and the lines under Intended Output show how I would like the data to output.
I am trying to automate this process, so I can paste data into columns A, B, & C in the raw format and have it automatically spit out the required numbers in the format of the Intended Output. The GL codes remain the same, but the numbers and the number of rows will change. I've color coded them for ease of review.
Thank you very much in advance!
Using a combination of the following functions you can create a list of filtered results. It works on the principle that the Data1 text you want to pull is the only text with a "-" in it, and that the totals you are pulling from Data2 and Data3 are the only numbers in their columns. Any change to that pattern will most likely break the system. Note that the formulas will not copy formatting.
IFERROR
INDEX
AGGREGATE
ROW
ISNUMBER
FIND
Let's assume the output will be placed in a small table with E2 being the upper-left data location.
In E2 use the following formula and copy down as needed:
=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW($A$1:$A$30)/ISNUMBER(FIND("-",$A$1:$A$30)),ROW(A1))),"")
In F2 use the following formula and copy to the right 1 column and down as needed:
=IFERROR(INDEX(B:B,AGGREGATE(15,6,ROW($A$1:$A$30)/ISNUMBER(B$1:B$30),ROW(A1))),"")
AGGREGATE performs array-like calculations. As such, do not use full column references such as A:A inside it, as that can lead to excess calculations. Be sure to limit it to the range you are actually looking at.
Try this procedure:
Public Sub bruce_wayne()
    'Assumptions
    '1. Data spreadsheet will ALWAYS have the structure shown in the question
    '2. The key word "Total" (or whatever else it might be) is otherwise NOT found
    '   anywhere else in the 1st data column
    '3. output is written to the same sheet as the data
    '4. As written, invoked when data sheet is the active sheet
    '5. set the 1st 3 constants to the appropriate values
    Const sData2ReadTopLeft = "A1" 'Top left cell of data to process
    Const sData2WriteTopLeft = "J2" 'Top left cell of where to write output
    Const sSearchText = "Total" 'Keyword for summary data
    '*******************
    Const sReplaceText = "Wakanda"
    Dim r2Search As Range
    Dim sAccountCode As String
    Dim rSearchText As Range
    Dim iRowsProcessed As Integer
    Set r2Search = Range(sData2ReadTopLeft).EntireColumn
    sAccountCode = Range(sData2ReadTopLeft).Offset(1, 0).Value
    iRowsProcessed = 0
    Do While Application.WorksheetFunction.CountIf(r2Search, sSearchText) > 0
        Set rSearchText = r2Search.Find(sSearchText)
        Range(sData2WriteTopLeft).Offset(iRowsProcessed, 0) = sAccountCode
        Range(sData2WriteTopLeft).Offset(iRowsProcessed, 1) = rSearchText.Offset(0, 1).Value
        Range(sData2WriteTopLeft).Offset(iRowsProcessed, 2) = rSearchText.Offset(0, 2).Value ' add this if there are more summary columns to return
        'last two lines could be collapsed into a single line; at the expense of readability..
        rSearchText.Value = sReplaceText 'so that next search will find the next instance of the trigger text
        iRowsProcessed = iRowsProcessed + 1
        sAccountCode = rSearchText.Offset(1, 0).Value
    Loop
    r2Search.Replace what:=sReplaceText, Replacement:=sSearchText
End Sub

Excel - finding unmatched data in order

I have 2 tabs of data with a unique identifier. The identifier is not in any particular order. I need my vlookup / index / match to show me all the identifiers that are not present in tab 2.
Reason: I am working where the systems they used failed a data transfer. I have to see what data there was compared to what data is currently on the system. Any data that is missing, i will need to add to the new system.
Example;
Tab1 Column A:
123456,
654321,
789456,
456789.
Tab2 Column B:
654321,
123456,
456789.
In Tab 3, I want excel to tell me that 789456 is not present in Tab 2.
As you can see in the above example, the unique identifier could be in any order, therefore i cannot put both columns side by side and ask to do a match between the 2 - i need it to look through the whole column.
All the tutorials i have seen assume that column A matches in order of column B
I have 70,000 rows to go through.
Any help would be appreciated.
Thanks in advance.
To do it with a formula you will want a helper column in the First tab.
In an empty column, I used column B, put the following in the second row:
=IF(ISERROR(VLOOKUP(A2,Sheet2!B:B,1,FALSE)),MAX($B$1:B1)+1,"")
This will create a column of numbers that increment on the ones not found in sheet two.
At this point you can simply filter on the new column for anything that is non-blank and get your list.
If you want to do it with a formula in the Third tab then use this formula that refers to the helper column on the first tab:
=IFERROR(INDEX(Sheet1!A:A,MATCH(ROW(1:1),Sheet1!B:B,0)),"")
Then copy/drag it down far enough that you start getting blanks.
With 70,000 items I would avoid array formulas as it will slow the calculation down and may even crash excel.
You could try using something like this:
=IFERROR(VLOOKUP(<value cell>, 'Tab2'!B:B, 1, FALSE), FALSE)<>FALSE
Copy all the values from tab 1 column A into tab 3 column A. In tab 3 column B, paste the above formula in every row where there is a value in column A, replacing <value cell> with the column A cell in the same row. The formula attempts to look up that value in tab 2. If it is missing, VLOOKUP generates an error, which IFERROR catches and replaces with FALSE. Finally, the result is compared against FALSE, which gives TRUE if the value is present in tab 2 and FALSE if it is missing.
From this point you can use a column filter in tab 3. Filtering on TRUE shows only the values that are present in both tab 1 and tab 2; filtering on FALSE shows the identifiers that are missing from tab 2.
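The solution for this is COUNTIF(). In a helper column on Sheet1, next to the identifiers in column A, the formula would be:
=COUNTIF(Sheet2!B:B,A1)
After filling that down for all rows, just filter for the rows that have the value 0; those identifiers do not appear in Sheet2.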
This macro will produce a compact list in Sheet3:
Sub WhatsMissing()
    Dim s1 As Worksheet, s2 As Worksheet, s3 As Worksheet
    Dim r2 As Range, N As Long, K As Long, i As Long
    Dim v As Variant
    Set s1 = Sheets("Sheet1")
    Set s2 = Sheets("Sheet2")
    Set s3 = Sheets("Sheet3")
    ' Column to search on Sheet2
    Set r2 = s2.Range("B:B")
    K = 1
    ' Last used row of the identifiers on Sheet1
    N = s1.Cells(Rows.Count, "A").End(xlUp).Row
    With Application.WorksheetFunction
        For i = 1 To N
            v = s1.Cells(i, "A").Value
            ' Not found on Sheet2: append it to the list on Sheet3
            If .CountIf(r2, v) = 0 Then
                s3.Cells(K, "A").Value = v
                K = K + 1
            End If
        Next i
    End With
End Sub
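With 70,000 rows, calling CountIf once per identifier means scanning column B 70,000 times. A possible alternative, sketched here under assumptions, is to load both columns into arrays and check membership with a late-bound Scripting.Dictionary (Windows only). The sheet names and columns follow the macro above; MissingValues and ListMissing are hypothetical names, and both columns are assumed to hold at least two cells so that Range.Value returns an array.

```vba
Option Explicit

' Returns the values of sourceVals that never occur in referenceVals.
' Both arguments are 2-D, one-column arrays as returned by Range.Value.
Function MissingValues(sourceVals As Variant, referenceVals As Variant) As Collection
    Dim seen As Object, result As New Collection
    Dim i As Long

    Set seen = CreateObject("Scripting.Dictionary")
    For i = LBound(referenceVals, 1) To UBound(referenceVals, 1)
        seen(CStr(referenceVals(i, 1))) = True
    Next i

    For i = LBound(sourceVals, 1) To UBound(sourceVals, 1)
        If Len(CStr(sourceVals(i, 1))) > 0 Then          ' skip blank cells
            If Not seen.Exists(CStr(sourceVals(i, 1))) Then result.Add sourceVals(i, 1)
        End If
    Next i
    Set MissingValues = result
End Function

' Writes the identifiers of Sheet1!A that are absent from Sheet2!B to Sheet3!A.
Sub ListMissing()
    Dim s1 As Worksheet, s2 As Worksheet, s3 As Worksheet
    Dim v As Variant, k As Long

    Set s1 = Sheets("Sheet1")
    Set s2 = Sheets("Sheet2")
    Set s3 = Sheets("Sheet3")

    For Each v In MissingValues( _
            s1.Range("A1", s1.Cells(s1.Rows.Count, "A").End(xlUp)).Value, _
            s2.Range("B1", s2.Cells(s2.Rows.Count, "B").End(xlUp)).Value)
        k = k + 1
        s3.Cells(k, "A").Value = v
    Next v
End Sub
```

Identifiers are compared as text here, so a number stored as 123456 and the text "123456" count as the same value.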

How to randomize Excel rows

How can I randomize lots of rows in Excel?
For example I have an excel sheet with data in 3 rows.
1 A dataA
2 B dataB
3 C dataC
I want to randomize the row order. For example
2 B dataB
1 A dataA
3 C dataC
I could make a new column and fill it with random numbers using =RAND() and sort based on that column.
But is this the best way to do it? The RAND equation will provide up to a million random numbers and I have a quarter of a million rows so it seems like it would work.
Thanks
I searched for a bit and while this answer about randomizing columns is close it seems like way overkill.
Perhaps a whole column full of random numbers is not the best way to do it, but it is probably the most practical, as @mariusnn mentioned.
On that note, this stumped me for a while with Office 2010. While answers like the one on Lifehacker generally work, I just wanted to share an extra step required for the numbers to be unique:
1. Create a new column next to the list that you're going to randomize.
2. Type =RAND() in the first cell of the new column. This will generate a random number between 0 and 1.
3. Fill the column with that formula. The easiest way to do this may be to:
   - click the first cell, then scroll down the new column to the last cell that you want to randomize
   - hold down Shift and click on the last cell
   - press Ctrl+D
4. Now you should have a column of identical numbers, even though they are all generated randomly. The trick here is to recalculate them! Go to the Formulas tab and click on Calculate Now (or press F9). Now all the numbers in the column will actually be generated randomly.
5. Go to the Home tab and click on Sort & Filter. Choose whichever order you want (Smallest to Largest or Largest to Smallest); either one will give you a random order with respect to the original order. Then click OK when the Sort Warning prompts you to Expand the selection.
Your list should be randomized now! You can get rid of the column of random numbers if you want.
I usually do as you describe:
Add a separate column with a random value (=RAND()) and then perform a sort on that column.
There might be more complex and prettier ways (using macros etc.), but this is fast enough and simple enough for me.
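The helper-column routine can be wrapped in a small macro if you need to do it often. This is a minimal sketch under assumptions: the data is a contiguous block starting at A1 on the active sheet, it has no header row, and the column immediately to its right is empty. ShuffleRows is a made-up name.

```vba
Option Explicit

' Shuffles the rows of the block around A1 using a temporary RAND() column.
' Assumptions: no header row, and the column right of the data is empty.
Sub ShuffleRows()
    Dim ws As Worksheet
    Dim data As Range, helper As Range

    Set ws = ActiveSheet
    Set data = ws.Range("A1").CurrentRegion
    Set helper = data.Columns(data.Columns.Count).Offset(0, 1)

    ' Fill the helper column with random numbers and freeze them as values
    helper.Formula = "=RAND()"
    helper.Value = helper.Value

    ' Sort data plus helper column by the random numbers
    data.Resize(, data.Columns.Count + 1).Sort _
        Key1:=helper.Cells(1, 1), Order1:=xlAscending, Header:=xlNo

    ' Remove the helper column again
    helper.ClearContents
End Sub
```

Freezing the random numbers as values before sorting keeps them from recalculating while the sort runs.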
Here's a macro that allows you to shuffle selected cells in a column:
Option Explicit
Sub ShuffleSelectedCells()
    'Do nothing if selecting only one cell
    If Selection.Cells.Count = 1 Then Exit Sub
    'Save selected cells to array
    Dim CellData() As Variant
    CellData = Selection.Value
    'Shuffle the array
    ShuffleArrayInPlace CellData
    'Output array to spreadsheet
    Selection.Value = CellData
End Sub
Sub ShuffleArrayInPlace(InArray() As Variant)
    ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
    ' ShuffleArrayInPlace
    ' This shuffles InArray to random order, randomized in place.
    ' Source: http://www.cpearson.com/excel/ShuffleArray.aspx
    ' Modified by Tom Doan to work with Selection.Value two-dimensional arrays.
    ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
    Dim J As Long, _
        N As Long, _
        Temp As Variant
    'Seed the generator so that each session gives a different order
    Randomize
    For N = LBound(InArray) To UBound(InArray)
        'Pick J uniformly from N..UBound; Int avoids the rounding bias of CLng
        J = N + Int((UBound(InArray) - N + 1) * Rnd)
        If J <> N Then
            Temp = InArray(N, 1)
            InArray(N, 1) = InArray(J, 1)
            InArray(J, 1) = Temp
        End If
    Next N
End Sub
You can read the comments to see what the macro is doing. Here's how to install the macro:
Open the VBA editor (Alt + F11).
Right-click on "ThisWorkbook" under your currently open spreadsheet (listed in parentheses after "VBAProject") and select Insert / Module.
Paste the code above and save the spreadsheet.
Now you can assign the "ShuffleSelectedCells" macro to an icon or hotkey to quickly randomize your selected rows (keep in mind that you can only select one column of rows).
Another option is Google Sheets (not Excel Online) with the Power Tools add-on installed. In Google Sheets, go to the Add-ons tab and start Power Tools. Then choose Randomize from the Power Tools menu and select Shuffle. Select the cells in each row that you want shuffled and click Shuffle. This shuffles each row's selected cells independently of one another.
For example, suppose this is our data set.
Then type this formula and fill it into cells B1 to B9.
Now you can go to the Data tab and select Sort smallest to largest or Sort largest to smallest, as you need.
A dialog pops up; check the Expand the selection option and click Sort.
The data range has now been shuffled randomly by rows. You can then remove the formula cells (column B).
