Optimize my search and copy code - excel

I have an Excel project which has a few thousand rows containing strings which need sorting out.
Typically one cell in each row should have a six digit number 123456 but many are 123456/123456/234567 etc. which need to have the / deleted and then be separated onto individual rows. There is other information in the surrounding columns which needs to stay with these six digit numbers.
I decided to approach this by first copying each row the appropriate number of times and then deleting the surplus information.
The code below deals with the copying part, and it works, but it's really slow. Is there a quicker way to achieve what I'm trying to do?
Thanks for any help.
Chris
Sub Copy_extra_rows()
Application.ScreenUpdating = False
s = 2
Do Until s = Range("N20000").End(xlUp).Row
'checks for / in Mod list
If InStr(1, Range("N" & s), "/") Then
'determines number of /
x = Len(Range("N" & s)) - Len(Replace(Range("N" & s), "/", ""))
'loops x times and copies new row
For a = 1 To x
Range("J" & s & ":O" & s).Select
Selection.Copy
Range("J" & s + 1).Select
Selection.Insert Shift:=xlDown
s = s + 1
Next a
Else
End If
s = s + 1
Loop
End Sub

I would have approached this differently to optimize the process and improve the overall efficiency of code.
Firstly, I would load the entire column into an array. Accessing the elements of an array is faster than referring to Cells() repeatedly in a loop, because the work happens in memory and Excel doesn't need to, for example, update the UI on every access. Reading an array element by index is O(1), which means you can access the data stored at a specific index in constant time.
Let's consider an SSCCE.
Then the code (note: I have added comments in the code in the right places; hopefully that helps you understand what is going on):
Sub Main()
Dim columnArray As Variant
' create an array from the range starting at N2 down to the last row filled with data
columnArray = Range("N2:N" & Range("N" & Rows.Count).End(xlUp).Row)
Dim c As New Collection
' add separate 6 digit numbers to the collection as separate items
' iterate the columnArray array and split the contents
Dim element As Variant
For Each element In columnArray
If NeedSplitting(element) Then
Dim splittedElements As Variant
splittedElements = Split(element, "/")
Dim splittedElement As Variant
For Each splittedElement In splittedElements
c.Add splittedElement
Next
Else
c.Add element
End If
Next
' print the collection to column Q
PrintToColumn c, "Q"
End Sub
Private Sub PrintToColumn(c As Collection, ByVal toColumn As String)
Application.ScreenUpdating = False
' clear the column before printing
Columns(toColumn).ClearContents
' iterate collection and print each item on a new row in the specified column
Dim element As Variant
For Each element In c
Range(toColumn & Range(toColumn & Rows.Count).End(xlUp).Row + 1) = element
Next
Application.ScreenUpdating = True
End Sub
Private Function NeedSplitting(cell As Variant) As Boolean
' returns true if the cell needs splitting
If UBound(Split(cell, "/")) > 0 Then
NeedSplitting = True
End If
End Function
After running the code, all your numbers should appear as separate elements in column Q.
NOTE: Why use a Collection?
Collections in VBA are dynamic: you don't have to know the size of a collection in order to use it, unlike arrays. You can ReDim an array multiple times to increase its size, but doing that repeatedly is generally considered bad practice. You can add nearly as many items to a Collection as you want with a simple Collection.Add call, and you don't have to worry about increasing the size manually; it's all done for you automatically. In this scenario the processing happens in memory, so it should be much quicker than replacing cell contents inside a loop.
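The question also asks for the surrounding columns to stay with each six-digit number. A minimal sketch of the same in-memory idea that keeps columns J:O together is shown here. It assumes the data starts in row 2, that the slash-separated numbers are in column N, and that the cells hold plain text or numbers. The names SplitRowsInMemory and NUM_COL are made up for illustration. It overwrites J:O in place, so try it on a copy first.

```vba
Sub SplitRowsInMemory()
    ' Assumption: data sits in J2:O<last row>; column N is the 5th column of J:O
    Const NUM_COL As Long = 5
    Dim lastRow As Long, src As Variant, out() As Variant
    Dim parts() As String, s As String
    Dim r As Long, c As Long, p As Long, total As Long, n As Long

    lastRow = Range("N" & Rows.Count).End(xlUp).Row
    If lastRow < 2 Then Exit Sub ' no data rows

    ' read the whole block once
    src = Range("J2:O" & lastRow).Value

    ' first pass: one output row per number (slashes + 1; an empty cell still keeps its row)
    For r = 1 To UBound(src, 1)
        s = CStr(src(r, NUM_COL))
        total = total + Len(s) - Len(Replace(s, "/", "")) + 1
    Next r

    ' second pass: copy each source row once per number
    ReDim out(1 To total, 1 To UBound(src, 2))
    For r = 1 To UBound(src, 1)
        parts = Split(CStr(src(r, NUM_COL)), "/")
        If UBound(parts) < 0 Then ReDim parts(0 To 0) ' empty cell: keep the row, empty value
        For p = 0 To UBound(parts)
            n = n + 1
            For c = 1 To UBound(src, 2)
                out(n, c) = src(r, c)
            Next c
            out(n, NUM_COL) = parts(p)
        Next p
    Next r

    ' write everything back in a single assignment
    Range("J2").Resize(total, UBound(src, 2)).Value = out
End Sub
```

Reading the block once and writing it back once avoids inserting rows and touching cells one by one, which is usually where the time goes.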

Try this:
Dim s As Long 'Long rather than Integer, which overflows past row 32767
Dim i As Long
Dim splitted_array() As String
s = 2 'Assuming data starts at row 2
Do Until Range("N" & s).Value = vbNullString Or s >= Rows.Count
'Split the array
splitted_array = Split(Range("N" & s).Value, "/")
If UBound(splitted_array) > 0 Then
'Set the first value on the first row
Range("N" & s).Value = splitted_array(0)
For i = 1 To UBound(splitted_array)
'Add subsequent rows
Rows(s + i).Insert xlDown
Range("J" & s + i & ":O" & s + i).Value = Range("J" & s & ":O" & s).Value
Range("N" & s + i).Value = splitted_array(i)
Next
End If
s = s + 1 + UBound(splitted_array)
Loop
This code turns this:
into this:

Related

How To Consolidate Multiple Rows Into One Row

My situation is as follows. I have a list of around 2k student accounts and need to sort the information into a specific format suitable for our new CRM. The way the data is presented initially makes that problematic.
As you can see on the first screenshot, every student's university choice is presented in a separate row. So if a student has chosen more than one university, data about it is stored in 2-6 rows (each student can select 1 to 6 universities) repeating his personalID, name, forename and DoB every line.
What I need to achieve is to remove repeating information and store all data about each student in one row per student(example on screenshot 2).
I have no idea how to achieve this using VBA. I was trying with IFs, loops and arrays but without any progress. I need help on how to accomplish that using VBA.
Please let me know if you need more information. I will try to explain it in more details if required.
Screenshot 1
Screenshot 2
EDIT: This is the part of the report. I am working on a macro that will format it to our needs and will give us more info about the student's accounts. That is why I am asking for help in VBA.
No need to use VBA for this. Power Query will help you better. Have a look here: https://excelgorilla.com/power-bi/power-query/aggregate-text-values-using-group-by/
This seems to work. I'm new to VBA and programming in general so it's possibly not the most efficient solution and can definitely be improved.
Instead of working with a blank sheet, it transforms the current data to the format you wanted. You can add field headings where you want.
Edit: It assumes that each Student has 5 universities in the list. The code can be adjusted to account for any number by just adjusting the target range dynamically.
Edit 2: I added the change to account for students who've entered any number of universities between 1 to 5. Let me know if this gets it done!
Sub ReArrange_Data()
Dim lrow As Long
lrow = Cells(Rows.Count, 1).End(xlUp).Row
Dim First As Integer
Dim Last As Integer
Dim test As Integer
Dim test1 As Integer
Dim student_range As Range
Dim student_rows As Integer
Dim target_range As Range
First = 2
For i = 2 To lrow
Last = First
If Cells(First, "D").Value = "" Then GoTo Break 'reached end of data
While Cells(Last, "D").Value = Cells(Last + 1, "D").Value
Last = Last + 1
Wend
If Last <> First Then 'check if mulitiple uni and build range
Set student_range = Range("E" & First & ":" & "E" & Last)
student_rows = student_range.Rows.Count
'target is a single row, as many columns wide as the student has universities
Set target_range = Cells(First, "E").Resize(1, student_rows)
Else
GoTo Skip 'student entered one uni, go to next loop
End If
target_range = Application.WorksheetFunction.Transpose(student_range.Value) 'row to column
Rows(First + 1 & ":" & Last).EntireRow.Delete
Skip: 'delete repeated entries
First = First + 1
Next i
Break:
End Sub

Join rows based on unique ID

I have 32,000 rows of data. Some of the data is in a different place, and I want to join it using something that I can apply to all rows rather than doing it manually. Each "group" has the same ID; in this example it is "XPTO".
I have something like this now (but with more columns):
I want it to be like this:
The problem is that I need a clever way, because they are not always exactly like this example. Some of them have 10 rows with the same ID "XPTO" (example)
I am struggling with this =/ ty
Here's how I would approach this.
1) From your comment, I understand that the logic is positional: the first one on the left (Casteloes de) goes with the first one on the right (R Dr Antonio) for the matching value in column A. If that is true, then I would insert a column where you start numbering sequentially, then Fill Down to get sequential numbers all the way to the end. This will help preserve the positional logic if you need to sort or rearrange your data. It will also help you with the logic of "first match", "second match", etc.
2) My next step would be to separate the two sets of data into separate tables/tabs (with the sequentially numbered column appearing in each) and use INDEX/MATCH. The recent answer here will help you with how to increment the match: Is there such thing as a VLOOKUP that recognises repeated numbers?
3) Alternative - this may even be easier, although you'll want to do extensive data checking to make sure nothing got screwed up. With the two tables from step 2, sort by any column with data in it, then delete the blank rows from each table. Then, sort each by the sequentially numbered column to return to the original order. At that point you may be able to just copy and paste. Check carefully for errors if you do this.
I am positive that the solution given above by CriketBird works, or at least has good logic behind it, but since I am a newbie in Excel, I couldn't figure out how to solve it that way.
So I solved it using VBA in Excel (maybe I went too far for this simple problem, but it was my only option).
I will leave the code here in case someone wants it for a similar situation. (Just select the cell in the first column and row where your table starts and hit run.)
Function Area(medico As String) As Integer
Do While countOk < 1
If medico = ActiveCell.Value Then
ActiveCell.Offset(1, 0).Select
rowCount = rowCount + 1
Else: countOk = 1
End If
Loop
Area = rowCount
End Function
Sub Teste()
Dim PaginaMedico As String
Dim totalrowCount As Integer
Dim rowCount As Integer
Dim countOk As Integer
Dim right As Integer
Dim left As Integer
Dim listaleft As New Collection
Dim listaright As New Collection
rowCount = 1
rowOk = 0
totalrowCount = 0
right = 0
left = 0
Do While ActiveCell.Value <> 0
PaginaMedico = ActiveCell.Value
rowCount = Area(PaginaMedico)
totalrowCount = totalrowCount + rowCount
Range("A" & (totalrowCount - (rowCount - 1))).Select
For i = ((totalrowCount + 1) - rowCount) To totalrowCount
If IsEmpty(Range("E" & (i)).Value) And IsEmpty(Range("F" & (i)).Value) Then
Range("T" & (i)).Value = "Empty"
ElseIf Not IsEmpty(Range("E" & (i)).Value) And Not IsEmpty(Range("F" & (i)).Value) Then
Range("T" & (i)).Value = "Full"
ElseIf Not IsEmpty(Range("E" & (i)).Value) And IsEmpty(Range("F" & (i)).Value) Then
left = left + 1
listaleft.Add i
ElseIf IsEmpty(Range("E" & (i)).Value) And Not IsEmpty(Range("F" & (i)).Value) Then
right = right + 1
listaright.Add i
End If
Next i
If Not (right = left) Then
Range("T" & totalrowCount).Value = "BOSTA"
right = 0
left = 0
End If
If listaleft.Count = listaright.Count Then
For i = 1 To listaleft.Count
Range("F" & listaright(1) & ":" & "S" & listaright(1)).Cut Range("F" & listaleft(1) & ":" & "S" & listaleft(1))
listaright.Remove (1)
listaleft.Remove (1)
Next i
End If
Set listaleft = New Collection
Set listaright = New Collection
Range("A" & (totalrowCount + 1)).Select
Loop
End Sub

Find all values in a column and retrieve the addresses in an array

I want to use the .Find function in VBA to find instances of a value in a column; however, calculations are made based on criteria in the same rows as where the value is found. This is problematic because although the value I am looking for might be the same, the criteria used to create the overall score are different. As a result, I need to loop through all the values found in the column, and I was wondering how to do that in VBA. I know the FindNext function, but I can never get it to work properly.
counted = Application.WorksheetFunction.CountIfs(cl.Range(finletter & "9:" & finletter & "317"), "Value", cl.Range("H9:H317"), wl.Range("A" & y.row).Value)
'Pol small low complex
If counted > 0 Then
MsgBox wl.Range("A" & y.row).Value
If cl.Range("C" & y.row).Value < 3 And cl.Range("D" & y.row).Value = 1 And cl.Range("E" & y.row).Value = "Interim" Then
wl.Range(y.Address) = 3.75 * counted
Here is an example. Say we are looking for the text "LOVE" in column A and process the data on those rows:
Option Base 1
Sub LookingForLove()
Dim s As String, rng As Range, WhichRows() As Long
Dim rFound As Range
ReDim WhichRows(1)
s = "LOVE"
Set rng = Range("A1:A25")
Set rFound = rng.Find(What:=s, After:=rng(1))
If rFound Is Nothing Then Exit Sub ' Find returns Nothing when there is no match
WhichRows(1) = rFound.Row
Do
Set rFound = rng.FindNext(After:=rFound)
If rFound.Row = WhichRows(1) Then Exit Do
ReDim Preserve WhichRows(UBound(WhichRows) + 1)
WhichRows(UBound(WhichRows)) = rFound.Row
Loop
msg = UBound(WhichRows) & vbCrLf & vbCrLf
For i = 1 To UBound(WhichRows)
msg = msg & WhichRows(i) & vbCrLf
Next i
MsgBox msg
End Sub
NOTE:
the Exit Do prevents looping forever
your code would continue by looping the elements of WhichRows() and processing the items on those rows.
your code could alternatively create a dynamic array of ranges or cell addresses.
Another alternative approach would be to use VBA to establish an AutoFilter and process the visible rows.
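The AutoFilter alternative mentioned above could look roughly like this. It is only a sketch under stated assumptions: A1 holds a header (AutoFilter treats the first row as one), the data is in A2:A25, and the name ProcessFilteredRows is made up for illustration.

```vba
Sub ProcessFilteredRows()
    ' Assumptions: header in A1, data in A2:A25, exact text "LOVE" wanted
    Dim rng As Range, visibleCells As Range, cell As Range

    Set rng = Range("A1:A25")
    rng.AutoFilter Field:=1, Criteria1:="LOVE"

    ' SpecialCells raises an error when no data row is visible, so guard the call
    On Error Resume Next
    Set visibleCells = rng.Offset(1).Resize(rng.Rows.Count - 1) _
                          .SpecialCells(xlCellTypeVisible)
    On Error GoTo 0

    If Not visibleCells Is Nothing Then
        For Each cell In visibleCells.Cells
            ' replace this with the per-row processing
            Debug.Print cell.Row
        Next cell
    End If

    ' remove the filter again
    rng.Parent.AutoFilterMode = False
End Sub
```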

How to hide rows in VBA based on values in row, quickly

this is my first time using the site, so forgive me for any inept explaining. I have a working macro to hide/unhide rows based on content of the rows, I just want it to be faster. Using a check box, when the box is checked, all rows with an "x" in column D get unhidden, those without an "x" get hidden. Same thing happens when it is unchecked, except it references column C, not D.
Right now, this code works. It's just a little slower than I'd like, since I'm sharing this with a bunch of people. Any ideas for how to speed it up? I'm pretty darn new to VB (the internet is astoundingly wise and a good teacher), but that doesn't matter. I already improved the code - before it selected each row, then referenced the column, and it was awful. Any ideas to speed it up (preferably without moving the screen) would be great.
Thanks so much folks,
DS
Sub NewLuxCheck()
Dim x As Integer
NumRows = Range("A42", "A398").Rows.Count
Range("A42").Select
If ActiveSheet.Shapes("checkbox2").OLEFormat.Object.Value = 1 Then
For x = 42 To NumRows + 41 Step 1
If Worksheets("Base").Range("D" & x).Value = "x" Then
Worksheets("Base").Range(x & ":" & x).EntireRow.Hidden = False
Else
Worksheets("Base").Range(x & ":" & x).EntireRow.Hidden = True
End If
Next
Else
For x = 42 To NumRows + 41 Step 1
If Worksheets("Base").Range("C" & x).Value = "x" Then
Worksheets("Base").Range(x & ":" & x).EntireRow.Hidden = False
Else
Worksheets("Base").Range(x & ":" & x).EntireRow.Hidden = True
End If
Next
End If
MsgBox ("Done")
End Sub
You could use an array formula and let Excel return an array of the row numbers where the value 'x' occurs. It will be quicker, but you'll have to reorganise your code, create separate functions, etc.
Here is an example where an array formula finds the rows where the cell in column 'D' has the value 'x'. A string of these row numbers is then created in the form "A1,A5,A10", meaning 'x' was found in rows 1, 5 and 10. Finally, Range(rowsJoined).EntireRow.Hidden is applied to all of those rows, so they are hidden or unhidden in one step.
For rows with a value different from 'x' you'll have to use a formula like '=IF({0}<>""x"", ROW({0}), -1)'.
Sub test()
Dim inputRange As Range
Dim lastRow As Long
Dim myFormula As String
Dim rowsJoined As String, i As Long
Dim result As Variant
With Worksheets("Base")
lastRow = .Range("D" & .Rows.Count).End(xlUp).Row
Set inputRange = .Columns("D").Resize(lastRow)
Application.ReferenceStyle = xlR1C1
myFormula = "=IF({0}=""x"", ROW({0}), -1)"
myFormula = VBA.Strings.Replace(myFormula, "{0}", inputRange.Address(ReferenceStyle:=xlR1C1))
result = Application.Evaluate(myFormula)
result = Application.Transpose(result)
Application.ReferenceStyle = xlA1
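For i = LBound(result) To UBound(result)
If (result(i) > -1) Then
' prepend the separator so the string never ends with a stray comma
If Len(rowsJoined) > 0 Then rowsJoined = rowsJoined & ","
rowsJoined = rowsJoined & "A" & result(i)
End If
Next i
' Range() fails on an empty address, so only unhide when something matched
If Len(rowsJoined) > 0 Then .Range(rowsJoined).EntireRow.Hidden = False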
End With
End Sub
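An alternative that avoids building a string of addresses is to read the marker column into an array and collect the rows with Union, then hide them in a single call. This is a sketch rather than a drop-in replacement. It takes the sheet name "Base", rows 42 to 398 and the "x" marker in column D from the question, and it assumes the cells hold text or are blank. The name HideRowsInOneStep is made up for illustration.

```vba
Sub HideRowsInOneStep()
    ' Assumptions: sheet "Base", rows 42 to 398, marker "x" in column D
    Const FIRST_ROW As Long = 42, LAST_ROW As Long = 398
    Dim ws As Worksheet, flags As Variant
    Dim toHide As Range, i As Long

    Set ws = Worksheets("Base")
    ' read the marker column once instead of once per row
    flags = ws.Range("D" & FIRST_ROW & ":D" & LAST_ROW).Value

    Application.ScreenUpdating = False
    ' start from a known state: everything visible
    ws.Rows(FIRST_ROW & ":" & LAST_ROW).Hidden = False

    For i = 1 To UBound(flags, 1)
        If flags(i, 1) <> "x" Then
            If toHide Is Nothing Then
                Set toHide = ws.Rows(FIRST_ROW + i - 1)
            Else
                Set toHide = Union(toHide, ws.Rows(FIRST_ROW + i - 1))
            End If
        End If
    Next i

    ' hide all collected rows in one call
    If Not toHide Is Nothing Then toHide.EntireRow.Hidden = True
    Application.ScreenUpdating = True
End Sub
```

For the unchecked case, the same routine would read column C instead of column D.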

Delete Excel record when value ends with plus sign ('+')

We have a blank workbook which I would like the user to be able to paste a list of reference numbers into column A. Some of these reference numbers will have a "+" at the end.
Sub texter1()
With Sheets("texter")
ll = .UsedRange.SpecialCells(xlCellTypeLastCell).Row
For i = 1 To ll
If InStr(1, .Range("a" & i).Value, "+", 1) Then
.Range("b" & i).Formula = "=LEFT(A" & i & ", LEN(A" & i & ")-1)"
.Range("c" & i).Value = Sheets("texter").Range("b" & i).Value
.Range("d" & i).Formula = "=VLOOKUP($c" & i & "," & _
"[Current_Master.xlsm]Master!$A$3:$BB$20000,14,FALSE)"
.Range("e" & i).Formula = "=VLOOKUP($c" & i & "," & _
"[Current_Master.xlsm]Master!$A$3:$BB$20000,15,FALSE)"
Else
Cells(i, "a").EntireRow.Delete
End If
Next i
End With
End Sub
I would like reference numbers without the "+" to have the whole row deleted. Reference numbers with a "+" work fine.
This seems to work, but it has to be run multiple times to delete all the rows without a "+", and I cannot figure out why. Please help.
Thank you
Deleting a row inside a forward loop throws off the iterations. Imagine this: you have 4 rows; the second row meets the condition and is deleted; in the next iteration the counter is 3, but what is now row 3 was originally the fourth row (when you delete a row, all the ones below move up one position); consequently, the original third row is never analysed. One simple solution is to clear the row instead of deleting it:
Cells(i, "a").EntireRow.Clear
If you want to actually delete the whole row, you would have to do it outside the main loop. For example: store all the rows to be deleted in an array and iterate through this array right after completing the main loop.
Another alternative would be to run the main loop in reverse order (from the last row to the first), although this option is not always applicable (I'm not sure about your case) and might cause further problems. The two options above are good enough; I mention this last alternative just as something worth knowing.
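For reference, looping from the last row to the first, as mentioned above, could be sketched like this. It assumes the sheet name "texter" and column A from the question, and it tests only the last character, following the question's title. The name DeleteRowsWithoutPlus is made up for illustration.

```vba
Sub DeleteRowsWithoutPlus()
    ' Assumptions: sheet "texter", reference numbers in column A
    Dim i As Long, lastRow As Long

    With Sheets("texter")
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        ' going bottom-up means a deletion only shifts rows that were already checked
        For i = lastRow To 1 Step -1
            If Right$(CStr(.Cells(i, "A").Value), 1) <> "+" Then
                .Rows(i).Delete
            End If
        Next i
    End With
End Sub
```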
--- UPDATE
To delete the rows after the main loop you can use something on these lines:
'Declaration of variables
ReDim allRows(ll + 1) As Long
Dim allRowsCount As Long: allRowsCount = 0
In your main loop you store the given rows (where you have now Cells(i, "a").EntireRow.Delete):
For i = 1 To ll
'etc.
Else
allRowsCount = allRowsCount + 1
allRows(allRowsCount) = i
After the loop is completed, you go through all the stored rows (in inverse order) and delete them:
If (allRowsCount > 0) Then
Dim curRow As Long: curRow = allRowsCount + 1
Do
curRow = curRow - 1
.Rows(allRows(curRow)).Delete
Loop While (curRow > 1)
End If
End With
End Sub

Resources