Copying and pasting nonzero rows and adjacent cells - excel

I am dealing with a lot of GIS metadata that I am importing into Excel, with many rows and columns of blank or zero values. I am trying to take data like this:
(Left Column is the name, columns to the right are the values associated with that name)
and by only selecting the columns that have a value, end up with a new set of tables showing the nonzero rows and its corresponding name to the left:
I have tried doing it by filtering the data in a table so that it only shows nonzero values, copying that column and the far left column and pasting it onto a new sheet. This is easy if I only have a few columns, but given the amount I have it is very painstaking. I have to filter each column separately because the rows in each column may or may not have blanks or zeros depending on the column.
Can the LOOKUP function be used for this, or would using VBA be better?

why can't use 2 simple loops (rows, columns)??
i suppose that the data start from "A1", I count rows and columns and play this code.
Sub tras()
Dim lastRw As Integer
Dim lastCol As Integer
Dim ResultRow As Integer
ResultRow = 20 '1th row for result
lastRw = Range("A1").End(xlDown).Row
lastCol = Cells(1, ActiveSheet.Columns.Count).End(xlToLeft).Column
Set rgn = Range("A1", Cells(lastRw, lastRw))
For x = 2 To lastCol
For y = 1 To lastRw
If Cells(y, x) <> "" Then
Cells(ResultRow, 1) = Cells(y, 1).Value
Cells(ResultRow, 2) = Cells(y, x).Value
ResultRow = ResultRow + 1
End If
Next y
ResultRow = ResultRow + 1
Next x
End Sub

A filtering in a similar case can be performed with array formulas such as in the example below.
See an example:
http://i.stack.imgur.com/YlyUN.jpg
(I cannot post images yet - please edit my post so that the image is integrated)
To use it select range E1:E6, write the shown formula and press ctrl+shift+enter.
The logic is as follows:
1) based on range B1:B6 create two vectors: 1 with row numbers of each row which simply is a v1 = {1,2,3,4,5,6} in this case and a vector of TRUEs and FALSEs for each row with TRUE if a row is nonempty and FALSe if empty: in this case it is v2 = {TRUE,FALSE,TRUE,TRUE,FALSE,FALSE}.
2) multiply v1 and v2 element wise which gives us vector v3 = {1,0,3,4,0,0}
3) using SMALL function extract values from the smallest to the largest using COUNTBLANK function to skip all the zeros (their number is equal to number of blanks) - ROW function used on range E1:E6 serves as our iterator in SMALL function
4) after point 3) you end up with row numbers of non-empty cells based on range B1:B6, now you have to call INDEX function to extract values from range A1:A6
5) add IFERROR on top of everything so that it returns "" when going out of SMALL's input array's range
To assign values in column F simply use INDEX + MATCH (I encourage everyone to forget about *LOOKUP).
What you end up here with is something that I think addresses your core problem. If you want to make it all "draggable" or "fillable", use it on ranges other that beginning with 1 or put it all in setup row after row rather than column after column, you'll have to make some modifications to the formulas used here but the logic will be the same.

Related

How can I calculate the sum of the numbers of two columns for only a portion of the rows?

Say I have an Excel sheet with 10,000 rows and two columns. All 20,000 cells are filled with numbers and there is no missing data. I want to have a third column, the values of which are the sum of Column A and Column B. For example, C70 = A70 + B70, and C82 = A82 + 82, and the like.
The only problem is I want to do it for only a portion of the rows, say from row 125 to row 8954. I don't care about the rest of the values. And I don't want to do it by dragging the grid using the mouse. Is that possible?
If you have access to SEQUENCE() (Currently only available to Office 365 Insiders) then yes it is possible:
=INDEX(A:A,SEQUENCE(1000,,ROW(),1))+INDEX(B:B,SEQUENCE(1000,,ROW(),1))
Where 1000 is the number of rows desired. Place the formula in the first cell desired and it will automatically fill the rest.
I believe you need some logic about what is going on, related to the start and end row.
You can use an if-statement or sumifs() for this... will do an if-statement so i can specify not meeting the requirements as null.
With Start row as 2 and end row as 4 (see image), you can use this formula, and drag it down to the bottom of all columns:
=IF(AND(ROW(A2)<=F$2,ROW(A2)>=E$2),SUM(A2:B2),"")
Notice in the image that C5 has no value; this is due to the conditions of the if-statement being false.
Another idea, a simple macro that will do what you want by asking the user what the starting and end row is.
Sub test()
Dim startrow As Integer 'variable to hold first row
Dim endrow As Integer 'variable to hold last row
startrow = InputBox("Enter the start row")
endrow = InputBox("Enter the end row")
'loops through you desired range calculating what you want
Dim i As Integer
For i = startrow To endrow
Cells(i, 4).Value = Cells(i, 1).Value + Cells(i, 2).Value
Next
End Sub
Just change the values to suit your needs in what cells you want to add up and where you want the sum to go.

In Excel, need to match the numbers in each row and add the number of cells containing the match

In Excel I am trying to match the numbers in a list to the numbers in another list on the same worksheet, then count how many cells contain the matched numbers.
Column A is a list of names. Column B thru F is the numbers assigned to each name (the master list). Columns I thru N have numbers entered (the entry list). I need see how many cells in each row of the master list match all of the cells in the entry list (several rows).
For example, on the master list is John, he has 2 numbers in cells B2 thru F2 that match the cells in columns I thru N. So the result (2) is displayed in G2. Please help!
I have tried countif in conditional formatting, and am able to highlight the matched cells in color, but I don't know how to add the number of cells containing the matched numbers.
=countif($H:$N,B2)>0
You can also put the Countif's together in a single formula
=SUMPRODUCT(COUNTIF(I:N,B2:F2))
or if there is a possibility of duplicates in columns I:N and you only want to count them once,
=SUMPRODUCT(--(COUNTIF(I:N,B2:F2)>0))
Here is an example of the second formula counting only distinct occurrences
Try this:
Sub macro()
Dim lastRow As Long, count As Long
lastRow = Cells(Rows.count, 1).End(xlup).Row
count = 0
For xrow = 2 To lastRow
For xcol = 2 To 6
For i = 9 To 14
If Cells(xrow, xcol) = Cells(xrow, i) Then
count = count + 1
Else:
End If
Next i
Next xcol
Cells(xrow, 7).Value = count
count = 0
Next xrow
End Sub
EDIT:
Changed the Cells(i, xcol) in the if statement to Cells(xrow, i) and it fixed it. Works on my machine.

Combine Adjacent Row Cells Depending on their Left Column Cell

I'd like to combine cells in the right column into one cell according to the adjacent cell on the left. I tried Merging but I could only get so far. And after searching online I couldn't find anything that can parse each row and combine for the length of the left cell's span. I know it's a CONCATENATE function, but how would I get it to parse the whole spreadsheet?
This is an example of the results I would want for the above:
This may be too complicated - in which case I would go back to the drawing board and do a full VBA version, but initially I was looking for a challenge to construct a solution only using formulas. Unfortunately, there appears to be no standard formula-based approach to concatenate a variable number of cells.
So, to accomplish this, I added one function:
Function CombineRange(ByRef rng As Range, ByVal delim As String)
Dim arr
Dim i As Long
arr = rng.Value
CombineRange = ""
For i = 1 To UBound(arr)
If i > 1 Then
CombineRange = CombineRange & delim
End If
CombineRange = CombineRange & arr(i, 1)
Next i
End Function
Assumptions:
your data is in a sheet called "YourData"
Your merged data is column A
Your "single row" data is column B
Row 1 is some kind of header row.
Next, set up four columns on a new sheet (I call it "Collapsed")
A - Start Row = (first row) whatever row your data starts on (2, in our case)
A - Start Row = (all others) A2+B2
B - Offset = {IFERROR(MATCH(FALSE,ISBLANK(INDIRECT(ADDRESS(A2+1,1,,,"YourData")&":A200")),0),0)}
Note this is an array function, so you need to do shift+Enter when entering it
C - Level1 = =INDEX(YourData!A:A,A2)
D - Combined Level 2 = =IF(B2<=1, INDIRECT(ADDRESS(A2,2,,,"YourData")), CombineRange(INDIRECT(ADDRESS(A2,2,,,"YourData")&":"&ADDRESS(A2+B2-1,2)),"; "))

Highlighting duplicate rows

I would like to highlight duplicate rows in Excel VBA. Assume I have the following exemplary table with columns A, B, C and D for testing:
A B C D (Strings)
1 1 1 dsf
2 3 5 dgdgdgdg
1 1 1 dsf
2 2 2 xxx
6 3 4 adsdadad
2 2 2 xxx
The duplicate rows should be highlighted in any colour, e.g. grey. I am looking ideally for fast performing code, as it will be used for rather big tables.
Note there are solutions available for highlighting duplicate cells (but not duplicate rows). I don't know how to identify if rows are duplicates and at the same time how to do that fast, i.e. without nested looping. The solution should be in VBA (not Excel).
What is the best/fastest way to achieve that?
add a conditional formatting with the following sumproduct formula (or a countifs)
=SUMPRODUCT(($A$1:$A$6&$B$1:$B$6&$C$1:$C$6=$A1&$B1&$C1)*1)>1
Explanation:
SUMPRODUCT is handy to work with ranges which you need to manipulate prior to checking a condition. In this case I concatenate A, B & C columns across the range and compare it with the concatenation of the current row. I then convert the TRUE/FALSE array to a 1/0 array by multiplying by 1 and the SUM part of SUMPRODUCT sums the rows where the condition is true, giving me the duplicate rows (all occurences). If you have a small range, using the formula evaluation you can clearly see how this works.
It's a quick fix, but performance is not ideal, I use it a lot for detecting duplicates or generating sequential numbers.
Solution from comments suggested by ponydeer - higher performance
based on sorting suggesting, requires to add key column, put in auto filters and sort on key, then do conditional on key column:
I have tested 3 different approaches on the sample file link from OP's comment. Probably the VBA implementations were not optimal, but below are the results with average time of 100 passes:
1) Conditional formatting using:
a)SUMPRODUCT concatenating columns - 3s
b) COUNTIFS with full column reference - 1.9s
c) COUNTIFS referencing used ranges - 0.2s
2) Sorting the range on all columns, comparing row by row, sorting back - 0.3s
3) Using advanced filter 3.5s
Here is the code for the fastest method:
Sub CF1()
Application.ScreenUpdating = False
Dim sFormula As String
Dim rRng As Range
Dim nCol As Integer, i As Integer
Set rRng = Range("A1").CurrentRegion
nCol = rRng.Columns.Count
'build the formula
sFormula = "=COUNTIFS("
For i = 1 To nCol
sFormula = sFormula & rRng.Columns(i).Address & "," & _
rRng.Cells(1, i).Address(False, True)
If i < nCol Then sFormula = sFormula & ","
Next
sFormula = sFormula & ")>1"
'write the formula in helper cell to get it's local version
rRng.Cells(1, nCol + 1).Formula = sFormula
rRng.FormatConditions.Delete
With rRng.FormatConditions.Add(Type:=xlExpression, _
Formula1:=rRng.Cells(1, nCol + 1).FormulaLocal)
.Interior.ThemeColor = xlThemeColorAccent3
End With
rRng.Cells(1, nCol + 1).Clear
Application.ScreenUpdating = True
End Sub
Sort your range first regarding all columns
Workbooks(1).Sheets(1).Range("A:C").Sort Key1:=Workbooks(1).Sheets(1).Range("A:A"), Order1:=xlAscending, Key2:=Workbooks(1).Sheets(1).Range("B:B"), Order2:=xlAscending, Key3:=Workbooks(1).Sheets(1).Range("C:C"), Order3:=xlAscending, Orientation:=xlSortRows
Then loop through all rows and compare them with the one above them
Dim a As Application
Set a = Application
For i=1 to 1000 ' here you need to set the number of rows you have
if Join(a.Transpose(a.Transpose(ActiveSheet.Rows(i).Value)), Chr(0)) = _
Join(a.Transpose(a.Transpose(Sheets(1).Rows(i+1).Value)), Chr(0)) then
Sheets(1).Range(i+1 & ":" & i+1).EntireRow.Interior.Color = 49407
end if
Next i
The comparison of two rows is based on this thread: How to compare two entire rows in a sheet
Please insert the names of your Workbook, Sheet and set your range and the limits in the code yourself.
I think fastest/best will depend upon the proportion of duplicates – only one row should be quicker than 50% as in the example – and on the actual size of the array (how many columns from which to create a key, etc).
Given that it is rarely possible to beat inbuilt functions with ‘pure’ VBA I suspect using the UI, within VBA if desired, will be faster in some circumstances. Eg:
Add an index column (series fill would serve), copy entire sheet (say to Sheet2), apply Remove Duplicates to all but index column, then apply as CF formula rule of this kind the relevant range of the original sheet:
=$A1=MATCH($A1,Sheet2!$A$1:$A$3000,0)>0
Assuming the start point is like so:
and a ColumnA inserted with numeric series fill starting 1, Sheet2 should look so after Remove Duplicates:
I have assumed ColumnE is to be ignored as far as duplication is concerned.
In source sheet, select array (from A1: - see!), eg A1:I6 and HOME >Styles - Conditional Formatting, New Rule..., Use a formula to determine which cells to format, Format values where this formula is true::
=$A1=MATCH($A1,Sheet2!$A:$A,0)>0
Format..., Fill, grey, OK, OK.
For me results in:
Sub HighlightDuplicateRows_2()
'This one is more modifiable and can handle multiple columns of data
'Just add another *WorksheetFunction.CountIf(Range("A2:A" & LastRow),Cells(r,1).Value) > 1* and change the column values
Dim LastRow As Long
LastRow = Cells(Rows.Count, "A").End(xlUp).Row
For r = 2 To LastRow
If WorksheetFunction.CountIf(Range("A2:A" & LastRow), Cells(r, 1).Value) > 1 And WorksheetFunction.CountIf(Range("B2:B" & LastRow), Cells(r, 2).Value) > 1 Then
Rows(r).Interior.ColorIndex = 6
End If
Next r
End Sub

Macro to insert blank cells below if value >1 and copy/paste values from cell above

This site already has something similar: Copy and insert rows based off of values in a column
but the code doesn't take me quite where I need to go, and I haven't been able to tweak it to make it work for me.
My user has a worksheet with 4 columns, A-D. Column A contains specific contract numbers, column B is blank, column C has part numbers, and column D has the entire range of contract numbers. My user wants to count the number of times the entire range contract numbers has duplicates so I entered the formula =countif($D$2:$D$100000,A2) in cell E2 and copied down, giving me the number of times the specific contract in column A appears in column D. The numbers range from 1 to 11 in this workbook but the number may be higher in other workbooks this method will be used in.
The next thing I need to do is to enter blank cells below all values in column E that are greater than 1, very much like the example in the previously asked question. I then also need to copy in the same row and insert copied cells exactly to match in the same row in column A. Example: Cell E21 has the number 5 so I need to shift cells in column E only so that there are 4 blanks cells directly below it. In column A, I need to copy cell A21 and insert copied cells in four rows directly below.
Just trying to get the blank cells to insert has been a trial, using the code as given in the previous question.
Dim sh As Worksheet
Dim lo As ListObject
Dim rColumn As Range
Dim i As Long
Dim rws As Long
Set sh = ActiveSheet
Set lo = sh.ListObjects("Count")
Set rColumn = lo.ListColumns("Count").DataBodyRange
vTable = rColumn.Value
For i = rColumn.Rows.Count To 1 Step -1
If rColumn.Cells(i, 1) > 1 Then
rws = rColumn.Cells(i, 1) - 1
With rColumn.Rows(i)
.Offset(1, 0).Resize(rws, 1).Cells.Insert
.EntireRow.Copy .Offset(1, 0).Resize(rws, 1).Cells
.Offset(1, 0).Resize(rws, 1).EntireRow.Font.Strikethrough = True
End With
End If
Next
I would be very grateful for any help as I have been fighting with this monster for a week.
While this is indeed possible to do, it might be a good idea to look into moving the list of all contract numbers from column D to a different sheet. Even though it is quite simple to loop through a range and insert rows based on cell values - it'll also create holes in columns D and E.
Here's code for simply adding the rows and copying the values as you specified.
Sub Main()
'---Variables---
Dim source As Worksheet
Dim startRow As Integer
Dim num As Integer
Dim val As String
Dim i As Long
'---Customize---
Set source = ThisWorkbook.Sheets(1) 'The sheet with the data
startRow = 2 'The first row containing data
'---Logic---
i = startRow 'i acts as a row counter
Do While i <= source.Range("E" & source.Rows.Count).End(xlUp).Row
'looping until we hit the last row with a value in column E
num = source.Range("E" & i).Value 'Get number of appearances
val = source.Range("A" & i).Value 'Get the value
If num > 1 Then 'Number of appearances > 1
Do While num > 1 'Create rows
source.Range("A" & i + 1).EntireRow.Insert 'Insert row
source.Range("A" & i + 1) = val 'Set value
num = num - 1
i = i + 1 'Next row
Loop
End If
i = i + 1 'Next row
Loop
End Sub
Of course you could also remove the holes from column D after inserting the new rows and modify the formula in column E so that it remains copyable and doesn't calculate for the copied rows.
Generally it makes things easier if a single row can be thought of as a single object, as creating or deleting a row only affects that one single object. Here we have one row represent both a specific contract and a contract in the all contracts list - this could end up causing trouble later on (or it could be totally fine!)

Resources